- Verint Systems, Inc. (Olympia, WA)
- …performance. + Support the design and implementation of fault-tolerant, self-healing systems. + Lead and contribute to blameless postmortems and continuous ... in on-call rotations, incident response, and root cause analysis to improve system reliability. + Collaborate with development and operations teams to…
- MongoDB (Seattle, WA)
- …provide hybrid work accommodation. **Role Overview** We are seeking a talented Site Reliability Engineer (SRE) Lead with a strong networking background to join ... for a range of critical infrastructure and operational functions that support the broader engineering organization. Among these are our multi-cloud-provider…
- Amazon (Seattle, WA)
- …and/or solid understanding of computer systems to influence design for reliability. * Lead identifying and validating product/component risks and work with ... organization under MLA focused on Hardware Development, Software Development, Fleet Ops Systems, and Manufacturing, Quality, and Reliability. This position is in…
- Amazon (Seattle, WA)
- …products deployed across Amazon's global operations. Key job responsibilities - Define reliability baselines of products and systems and develop monitoring ... mechanisms for prioritizing failure and reliability investigations. - Develop Reliability Centered Maintenance procedures for systems and products. Identify…
- Amazon (Bellevue, WA)
- …for given asset. Drive resolution of critical asset issues as well as support processes/systems deep dives to ensure root cause analysis and correction ... develop processes, documentation and communications for program/process rollout and ongoing support. Write whitepapers to gain initial buy-in and alignment as well…
- Nordstrom (Seattle, WA)
- …and track SLOs, SLAs, and error budgets. Continuously refine processes to improve system reliability and team efficiency. What You Bring + Experience 5+ ... for a strategic and hands-on Senior Manager of Site Reliability Engineering to lead our SRE team in delivering resilient, scalable, and high-performing systems. This role is central to our mission of…
- Google (Kirkland, WA)
- …+ 2 years of experience designing, analyzing, and troubleshooting large-scale distributed systems. Site Reliability Engineering (SRE) combines software and ... SRE ensures that Google's services-both our internally critical and our externally-visible systems-have reliability, uptime appropriate to users' needs and a…
- Amazon (Bellevue, WA)
- …global strategy for spare parts inventory management within the Central Reliability Maintenance Engineering (RME) Decision Science and Technology (DST) team. This ... scale. You will not only be responsible for overseeing stock, but will also lead process improvement projects, such as enhancing the tracking of parts and optimizing…
- Amazon (Seattle, WA)
- …infrastructure. In other words, we're the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and ... delivering on customer promises today and establish the framework, systems, and mechanisms for tomorrow. They will also have ... for a team of individual contributors. * Set up, lead and continuously improve the Supply and Operational Execution…
- LiveRamp (Seattle, WA)
- …LiveRamp offers complete flexibility to collaborate wherever data lives to support the widest range of data collaboration use cases-within organizations, between ... forefront of rapidly evolving compliance and privacy requirements. **You will:** + Support and/or own the deployment of global products including setting up…