- Walmart (Sunnyvale, CA)
- …generation Last Mile Delivery software applications, including very high capacity, guaranteed availability, and mass market usability without compromising quality. ... Grafana etc. Perform operational excellence using Prometheus, Grafana, xMatters for monitoring and alerting. Work on Continuous Integration and Continuous Delivery…
- Lockheed Martin (Sunnyvale, CA)
- …Division Multiple Access (WCDMA) waveform, giving warfighters 10 times the communications capacity of the legacy UHF SATCOM system. This position will be part ... changes are formally processed through a change‑order mechanism. * Cost & Schedule Monitoring: Track actual costs and progress against the fixed price and baseline…
- Cardinal Health (Sacramento, CA)
- …to production outages. + Analyze production system operations using tools such as monitoring, capacity analysis and outage root cause analysis to identify and ... process improvements and back-end solutions for commercial technologies to maximize performance and suitability for business needs. This job family manages…
- Amazon (Cupertino, CA)
- …the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you'll experience ... you will oversee the fleet of servers you develop, monitoring their quality and how they are meeting the...uniqueness. *Mentorship and Career Growth* We're continuously raising our performance bar as we strive to become Earth's Best…
- Amazon (Cupertino, CA)
- …the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you'll experience ... stop embracing our uniqueness. Mentorship and Career Growth We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why…
- Palo Alto Networks (Santa Clara, CA)
- …architecture to improve scalability in networking like BGP, OSPF, service reliability, capacity, and performance + Collaborate with development teams to ensure ... work follow using Python or Go code + Build BGP and networking monitoring/remediation tools + Engage with customers on escalations to provide remediation +…
- NVIDIA (Santa Clara, CA)
- …challenged, improving, and evolving for the better. You will help advance NVIDIA's capacity to build and deploy leading infrastructure solutions for a broad range of ... software related to managing fleets of GPU nodes. + Implementing monitoring and health management capabilities that enable industry leading reliability,…
- Walmart (Sunnyvale, CA)
- …SRE tools and monitoring systems with built-in observability and performance monitoring. + **Establish platform engineering excellence** by building reusable ... that coordinate between different AI agents for automated incident response, capacity planning, and performance optimization across e-commerce, supply chain,…
- PennyMac (Westlake Village, CA)
- …for overseeing the Site Reliability Operations (SRO) team that provides 24/7 monitoring and support of the company's IT infrastructure. This leader will develop ... a culture of excellence, collaboration, and continuous learning while managing performance, career development, and succession planning. + Operational Oversight -…
- Sacramento Municipal Utility District (Sacramento, CA)
- …SMUD's transmission and distribution substation infrastructure has the required capacity, maintenance, and reliability using specialized technical and professional ... and administering the budget for the assigned lines of business) by monitoring unit budget expenditures against plan; identifying and reconciling budget anomalies;…