  • Staff Software Engineer, AI Platform

    LinkedIn (Mountain View, CA)
    …ability to work well in a diverse, team-focused environment with other SRE/SWE Engineers, Project Managers, etc. - Experience building ML applications, LLM serving, ... GPU serving. - Experience with search systems or similar large-scale distributed systems - Expertise in machine learning infrastructure, including technologies like MLFlow, Kubeflow and large scale distributed systems - Experience with distributed data processing…
    LinkedIn (08/08/25)
  • Senior AI Infrastructure Engineer - DGX Cloud

    NVIDIA (Santa Clara, CA)
    …and open source cloud enabling technologies like Kubernetes and OpenStack. DGX Cloud SRE at NVIDIA ensures that our internal and external facing GPU cloud services ... run maximum reliability and uptime and at the same time making changes to the existing system through careful preparation and planning while managing capacity and performance. NVIDIA's culture of diversity, intellectual curiosity, problem solving and openness…
    NVIDIA (08/08/25)
  • Principal Architect, Security Solutions

    Red Hat (Sacramento, CA)
    …networking or storage + Background in DevOps or site reliability engineering (SRE) The salary range for this position is $148,540.00 - $245,050.00. ... Actual offer will be based on your qualifications. **Pay Transparency** Red Hat determines compensation based on several factors including but not limited to job location, experience, applicable skills and training, external market value, and internal pay…
    Red Hat (08/08/25)
  • Principal Software Engineer in Test (Prisma…

    Palo Alto Networks (Santa Clara, CA)
    …SASE Functionality and Scale, working closely with Development, Product Management, SRE and Technical Marketing teams + Provide Thorough Technical Leadership in ... the areas of Cloud Based Orchestration, Cloud delivered Security, Cloud Networking and Automation Design + Participate in system design so that Quality Assurance is considered throughout the entire lifecycle of the Prisma Access Feature Development + Develop…
    Palo Alto Networks (08/08/25)
  • Advisory Solutions Architect

    MongoDB (San Francisco, CA)
    …and platforms into modern, scalable, and efficient technology stacks + Experience in SRE practices + Proven ability to up-level the broader organization - for ... example by sharing best practices or creating reusable assets + A MongoDB Certification + A Cloud Provider Certification **What you do at MongoDB:** In this role, you will work on complex opportunities where analysis of situations or data requires an in-depth…
    MongoDB (08/08/25)
  • Staff Software Engineer

    LiveRamp (San Francisco, CA)
    …regulations. + Work closely with various internal teams (engineering, product, QE, DevOps, SRE) on product delivery. + Build products that integrate to the rest of ... LiveRamp's Data Collaboration Platform + Continuously improving the quality, stability, simplicity and performance of our applications and libraries. About you: + 10+ years of experience writing and deploying high-quality production code + Solid Java or Scala…
    LiveRamp (08/07/25)
  • AVP, Technology Operations

    PennyMac (Westlake Village, CA)
    …observability. + Strong understanding of modern IT operations principles including SRE practices, DevOps methodologies, and ITIL frameworks. + Proven experience ... managing teams responsible for 24/7 critical infrastructure environments, preferably in financial services or other regulated industries. + Demonstrated success in driving operational transformation and improvement initiatives. + Exceptional leadership…
    PennyMac (08/07/25)
  • Software Engineer, Infrastructure

    Matroid (Palo Alto, CA)
    …Docker, Kafka, Redis, Prometheus, MongoDB, Nginx + Experience with DevOps, SRE and/or InfoSec + Familiarity with machine learning technologies, concepts and ... operation What we offer in return + Competitive pay and equity + The chance to constantly work on stimulating intellectual challenges + Gym membership reimbursement + Free lunch, healthy drinks, and snacks every day + Medical, dental, and vision insurance with…
    Matroid (08/07/25)
  • Director, Engineering Services

    Western Digital (San Jose, CA)
    …stakeholder engagement, and cross-functional collaboration. + Knowledge of SRE, platform engineering, and Agile/SAFe practices. **Additional Information** Western ... Digital is committed to providing equal opportunities to all applicants and employees and will not discriminate against any applicant or employee based on their race, color, ancestry, religion (including religious dress and grooming standards), sex (including…
    Western Digital (08/01/25)
  • Principal Observability Customer Success…

    Amazon (East Palo Alto, CA)
    …serverless) - Experience with DevOps practices and tools - Knowledge of SRE principles and practices About the team Diverse Experiences AWS values diverse ... experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't…
    Amazon (07/31/25)