System Development Engineer II, Telemetry…
- Amazon (Arlington, VA)
Description
Does the idea of playing a crucial role in building telemetry collection and active network monitoring solutions for fleets of devices like Linux servers, network devices, and industrial Android devices in thousands of facilities around the world sound like fun to you? The Infrastructure Reliability Engineering organization is looking for talented Systems Development Engineers to join us in our mission to build solutions for monitoring and observing network infrastructure that is deployed globally across Amazon's fulfillment operations network. In this role you’ll design, build and lead delivery of monitoring systems that detect and help pinpoint network disruptions and outages through active monitoring and data collection.
The software you build will run on millions of devices used by one million-plus Amazon associates every day, at thousands of sites around the world, supporting hundreds of teams with a stake in delivering shipments to Amazon retail customers. Data is at the heart of everything Amazon does and this role ensures our operations support and infrastructure teams can gather insights to make good decisions. Your work will contribute to proactively detecting network and infrastructure problems before they affect associates. Our facilities stretch from the smallest last-mile delivery stations in emerging markets to one million-plus square foot robotics facilities in North America, Japan, India and the Middle East.
In this role you’ll ensure our teams have eyes on every facet of delivering shipments to Amazon retail customers worldwide. You should be excited about learning every day and delighting customers by solving problems that impact order fulfillment. You are passionate about software quality, repeatability, testability and maintainability. You understand the challenges associated with operating a large-scale system in production, and your designs and implementations reflect that understanding. Our mission is to design systems and platforms that set the global standard for performance, availability, security, and cost, enabling our customer fulfillment and logistics operations to deliver customer orders on time, every time.
Key job responsibilities
As a System Development Engineer on the team, you will:
- Design and implement scalable software solutions for active monitoring and network device logging
- Collaborate with team members to build and maintain distributed services
- Develop automation to support the growing number of devices in Amazon's Fulfillment network
- Optimize existing systems for performance and reliability
- Participate in code reviews and contribute to best practices
- Troubleshoot and resolve complex technical issues
- Stay up-to-date with the latest AWS technologies and apply them to solve business challenges
Prior knowledge of networking concepts is not required.
A day in the life
Amazon offers a full range of benefits that support you and eligible family members, including domestic partners and their children. Benefits can vary by location, the number of regularly scheduled hours you work, length of employment, and job status such as seasonal or temporary employment.
The benefits that generally apply to regular, full-time employees include:
- Medical, Dental, and Vision Coverage
- Maternity and Parental Leave Options
- Paid Time Off (PTO)
- 401(k) Plan
If you are not sure that every qualification listed below describes you exactly, we'd still love to hear from you!
At Amazon, we value people with unique backgrounds, experiences, and skillsets. If you’re passionate about this role and want to make an impact on a global scale, please apply!
About the team
The Telemetry Engineering team supports the Infrastructure Reliability Engineering organization's goals to increase availability by providing visibility and services to detect and prevent disruptions to the network in addition to providing the data to resolve disruptions that do occur. The team's main products include solutions to generate active monitoring metrics and collect network device logs and metrics. To support multiple device platforms, the team uses multiple coding languages including Rust, Java and Python.
Basic Qualifications
- Experience in automating, deploying, and supporting large-scale infrastructure
- Experience programming with at least one modern language such as Python, Ruby, Golang, Java, C++, C#, Rust
- Experience with Linux/Unix
- Experience with CI/CD pipeline build processes
Preferred Qualifications
- Experience with distributed systems at scale
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.