  • Senior AI Infrastructure Engineer - DGX Cloud

    NVIDIA (Santa Clara, CA)
    …to the existing system through careful preparation and planning while managing capacity and performance. NVIDIA's culture of diversity, intellectual curiosity, ... This role demands knowledge across different systems, networking, coding, database, capacity management, continuous delivery and deployment and open source cloud…
    NVIDIA (11/06/25)
  • IT Staff Systems Engineer (HPC)

    Cadence Design Systems, Inc. (San Jose, CA)
    …their working environment (Direct EDA experience desired) + Proven experience in capacity and performance management, optimizing performance, ensuring ... to customer success. + Driving the overall operational strategy for internal High-Performance Compute (HPC) farms in all Cadence locations. + Developing and…
    Cadence Design Systems, Inc. (09/26/25)
  • Senior Staff Software Engineer, SRE, ML Fleet…

    Google (Sunnyvale, CA)
    …of improvement. Additionally, SREs will keep an ever-watchful eye on our systems capacity and performance. Much of our software development focuses on optimizing ... or a related technical field. + Experience with infrastructure optimization, performance analysis, and cost reduction in large-scale environments. + Experience with…
    Google (12/12/25)
  • Senior Mainframe Systems Programmer - zVM

    Ensono (Los Angeles, CA)
    …of client environments + Manage z/VM systems: + Software + Availability + Network + Capacity and performance + Storage + Directory and security + Working in a ... Support Incident and Change Management processes + Provide metrics tracking operational performance and quality of services delivered + Assist Data Center Operations…
    Ensono (10/07/25)
  • Site Reliability Developer 4

    Oracle (Sacramento, CA)
    …anti-fragile systems. + Incident response, On-call management. Being on-call. + Capacity planning, performance and efficiency + Monitoring and metrics. ... to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and…
    Oracle (12/06/25)
  • Senior Site Reliability Developer

    Oracle (Pleasanton, CA)
    …standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, ... to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and…
    Oracle (12/04/25)
  • Site Reliability Engineer 2 DevOps | Remote (US…

    Oracle (Sacramento, CA)
    …deploying key services with deep focus on architecture, production operations, capacity planning, performance management, deployment, and release engineering. ... standards, and methods for large-scale distributed systems Facilitate service capacity planning and demand forecasting, software performance analysis,…
    Oracle (12/04/25)
  • Distinguished, Architect - AI/ML

    Walmart (Sunnyvale, CA)
    …that coordinate between different AI agents for automated incident response, capacity planning, and performance optimization across e-commerce, supply chain, ... SRE expertise including Service Management (Incident, Problem & Change), Performance Engineering, and capacity planning for mission-critical systems…
    Walmart (11/01/25)
  • Principal Program Manager - Demand Planning

    Oracle (Sacramento, CA)
    …processes into system models to determine and plan resource needs that meet capacity and performance needs. **Required Qualifications:** + Bachelor's degree in ... demand requirements. + Socialize demand plans with senior leaders in capacity planning and operations, supply chain, engineering, customer management, and finance…
    Oracle (11/25/25)
  • Supplier Quality Engineering Manager

    Oracle (Santa Clara, CA)
    …etc.), and driving corrective actions and continuous improvement of quality and capacity performance at the manufacturing sites + Assign and monitor ... Manage the people responsible for the development, readiness, and manufacturing performance of compute, storage, and networking hardware through the supply chain…
    Oracle (12/02/25)