• Senior Site Reliability Engineer, DGX…

    NVIDIA (CA)
    …budgets, and incident handling + Experience building and operating comprehensive observability stacks (monitoring, logging, tracing) using tools like ... NVIDIA's DGX Cloud team as a Senior Site Reliability Engineer to maintain high-performance DGX Cloud clusters for AI ... clusters with a focus on performance at scale, real-time monitoring, logging and alerting + Define SLOs/SLIs, monitor error…
    NVIDIA (08/30/25)
  • Forward Deployed Solution Engineer

    ServiceNow, Inc. (San Francisco, CA)
    It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ... and scalable software. As a **Senior Forward Deployed Solution Engineer (FDSE)**, you act as the CTO of ... are trusted by architects, PMs, and customer teams to lead implementation from zero to one **What You Bring:**…
    ServiceNow, Inc. (08/14/25)
  • AI Data Engineer

    Stanford University (Stanford, CA)
    AI Data Engineer **Business Affairs: University IT (UIT), Redwood City, California, United States** Information Technology Services Post Date Sep 08, 2025 ... # 107222 Job Purpose Are you an experienced AI/GenAI engineer who loves shipping real systems with a passion ... efficiency, and decision-making. You may serve as the technical lead for specific AI tracks and interrelated applications. This…
    Stanford University (09/09/25)
  • AI Applications Engineer

    Stanford University (Stanford, CA)
    AI Applications Engineer **Business Affairs: University IT (UIT), Redwood City, California, United States** Information Technology Services Post Date Sep 08, 2025 ... Requisition # 107213 **Job Purpose** Are you an experienced AI/GenAI engineer who loves shipping real systems? Join Stanford's Enterprise Technology team to design,…
    Stanford University (09/09/25)
  • Senior Software Engineer - SRE

    Intuit (Mountain View, CA)
    …platform team in Small Businesses at Intuit! We're looking for a passionate software engineer to lead operational excellence and site reliability to deliver the ... in one of the areas in site reliability (Automation, Monitoring tools, Cloud Operations) + Hands-on experience in at ... always doing the most glamorous tasks. **How you will lead** + Responsible for driving operational excellence for the…
    Intuit (08/30/25)
  • VP, Principal Engineer, AI Agents

    Teradata (Sacramento, CA)
    …role, you will be part of Teradata's Product Engineering leadership team and will lead the next gen agentic architecture for AI across Teradata's data and analytics ... governance. + Modernize the platform through microservices, Kubernetes-native deployment, observability, and API-first integration, ensuring AI services are secure,…
    Teradata (09/11/25)
  • Principal DevOps Engineer (AI/ML)

    Palo Alto Networks (Santa Clara, CA)
    …infrastructure and is one of the largest GCP customers. As a Principal Site Reliability Engineer, you will be part of a team supporting the services running on this ... infrastructure. This includes automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure Platform stack…
    Palo Alto Networks (08/15/25)
  • Staff Full Stack Engineer

    General Motors (Sacramento, CA)
    …at minimum._ **The Role:** We are seeking an experienced Staff full stack engineer with a strong ability to execute hands-on technical work. The AI Lifecycle Team ... models to the AV. As a Staff Full Stack Engineer, you will collaborate closely with machine learning engineers, ... across ML lifecycle + Raise the bar on system observability, debuggability, and operational excellence, and user experience. +…
    General Motors (09/24/25)
  • Principal, Software Engineer

    Walmart (Sunnyvale, CA)
    …delivery for large, mission-critical, cross-functional projects. + Experience with network observability, monitoring, and DevOps. + Advanced certifications such ... Summary** **What you'll do** As a Principal Engineer, you will play an important role in shaping ... networking, and interconnection. **What you'll do:** + Architect and lead the design of high-performance, scalable WAN and backbone…
    Walmart (10/02/25)
  • Staff, Software Engineer

    Walmart (Sunnyvale, CA)
    …seamless and secure APIs. + **Operational Excellence**: Champion CI/CD pipelines, observability, monitoring, and incident response practices. + **Innovation**: ... **Position Summary** As a Staff Software Engineer you are responsible for developing high performance ... with deep expertise in GraphQL and Node.js to lead the design and development of scalable, reliable, and…
    Walmart (09/06/25)