- NVIDIA (Santa Clara, CA)
- …(SLIs) to measure and ensure infrastructure quality. + Write high-quality Root Cause Analysis (RCA) reports for production-level incidents and work towards ... preventing future occurrences. + Supporting our researchers to run their flows on our clusters including performance analysis and optimizations of deep learning workflows and participate in the team's on-call rotation to support critical infrastructure. +… more
- NVIDIA (Santa Clara, CA)
- …at product development stage + Collaborating with cross-functional teams to determine root causes of failures + Crafting and maintaining detailed documentation of ... reliability tests and results + Analyzing data to identify trends and areas for improvement + Ensuring compliance with industry standards + Providing technical support and mentorship to other departments What we need to see: + A bachelor's or higher degree in… more
- Amazon (Sunnyvale, CA)
- …and delivers new and innovative software solutions and concepts. * Help root-cause and solve the hardest intrinsic challenges which the organization is facing. ... * Suggest and develop tools and mechanisms which greatly help your peers and application developers writing applications for Fire TV * Deliver high quality software through working in a dynamic, team-focused Agile/Scrum environment. A day in the life In the… more
- Fiserv (Los Angeles, CA)
- …analytical skills, with the ability to diagnose complex infrastructure issues, identify root causes, and implement effective solutions **How you'll work:** + This ... role requires being on-call during non-standard and/or overnight hours on a rotational basis + This role requires use of a computer and audio equipment \#SystemsEngineer \#MainframeJobs \#MCP \#Unisys \#WFL \#COMS \#DASDL \#LegacySystems \#LI-CD **Salary… more
- Amazon (Santa Clara, CA)
- …business needs for AWS colocation products. - Investigating and documenting root-cause failure analysis associated with infrastructure and equipment failures. - ... Working with local agencies having jurisdiction to ensure compliance with federal, state and municipal requirements and codes. - Collaborating with other engineering, operations, and commissioning teams to properly test and validate the installation,… more
- Kimley-Horn (San Diego, CA)
- …pools, and server utilization; recommend scaling or optimization actions. + Track defect/root cause trends, run post-incident reviews, and sponsor automation to ... reduce repetitive tasks. + Actively work with project teams to understand their needs and create innovative solutions that can be crafted internally with internal teams or placed on software vendors' roadmaps **Qualifications** + While a bachelor's degree in… more
- Leidos (San Diego, CA)
- …to perform technical troubleshooting and diagnosis of failed equipment and support root cause analysis. + Perform data analysis and write technical reports. ... **Education:** + BS degree or higher in Engineering, Physics, Mathematics, or related field from an accredited college/university. **Basic Qualifications:** + BS with 8+ years of relevant experience as an electrical engineer in an R&D environment or MS with 6+… more
- Amazon (San Francisco, CA)
- …initiatives, and mentoring programs that strengthen team capabilities and resolve root causes of endemic issues Scale technical impact by actively coaching ... multiple engineers, providing career guidance, and driving adoption of AI development best practices across Amazon Music About the team Our mission is to accelerate Amazon Music engineers by providing advanced AI agents and automated workflows that turn… more
- WestRock Company (Salinas, CA)
- …improvement activities * Lead and monitor CAR (Corrective Action Request) and Root Cause Corrective Action (RCCA) activities for systemic issues to ensure robust ... product and process improvements * Plan, lead, and measure process and voice of customer performance and quality system effectiveness and make adjustments in strategy and/or procedures as needed * Conduct internal quality audits to oversee inspections of raw… more
- Ford Motor Company (Palo Alto, CA)
- …challenges (e.g., predictive maintenance, quality defect prediction, process optimization, root cause analysis, production scheduling) into actionable data science ... initiatives. 7. Apply a wide range of data science techniques, including advanced statistical modeling, machine learning, and deep learning, to deliver robust and scalable solutions. 8. **LLM Application & Innovation:** 9. Drive the exploration and… more