• Principal Site Reliability Engineer, AI…

    NVIDIA (Santa Clara, CA)
    …system architecture. What We Need to See: + 15+ years of experience in SRE, Production Engineering, or Cloud Infrastructure, with a strong track record of leading ... platform-scale efforts and high-impact programs. + Deep expertise in Linux/Unix systems engineering and public/private cloud platforms (AWS, GCP, Azure, OCI). + Expert-level programming in Python and one or more languages such as C++, Go or Rust. +… more
    NVIDIA (07/31/25)
  • Senior DGX Cloud Software Engineer

    NVIDIA (Santa Clara, CA)
    …developing multi-cloud infrastructure services. Experience teaching reliability engineering (e.g., SRE) and/or other scale-oriented cloud systems practices to peers ... and/or other companies (e.g., CRE). Experience in running private or public cloud systems based on one or more of Kubernetes, OpenStack, Docker or Slurm. + Experience with accelerated compute and communications technologies such as BlueField Networking, InfiniBand… more
    NVIDIA (07/26/25)
  • Senior Software Engineer

    Aeris Communications (San Jose, CA)
    …Collaborate actively with other developers and other cross-functional teams like QA, SRE, and Operations. Assist in support of the existing code in production ... environments. Key Responsibilities + Investigate and evaluate advanced technologies, protocols, and architectures to identify scalable and efficient solutions that address system-level challenges and support secure, high-performance product development in the… more
    Aeris Communications (07/24/25)
  • Sr. System Development Engineer, Solid…

    Amazon (Cupertino, CA)
    …kernel drivers. - 5+ years or more in software development, systems development, SRE (Site Reliability Engineering), or Resilience Engineering - 5+ years of server ... systems debug experience; debugging and root causing complex server platforms - 5+ years of experience building software automation systems to increase durability, security, availability and scalability of Linux based storage systems - Experience with… more
    Amazon (06/11/25)
  • Supervisor, Site Reliability Engineering - Federal…

    ServiceNow, Inc. (San Diego, CA)
    It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ... holding a green card, will be considered. Our Site Reliability Engineering (SRE) team consists of highly skilled engineers responsible for maintaining and… more
    ServiceNow, Inc. (08/01/25)
  • Sr. AWS Cloud & DevOps Architect - Remote

    McAfee, Inc. (San Jose, CA)
    …alerting solutions to maintain system health and security + Drive SRE practices by implementing strategies that improve reliability, availability, and scalability ... and guide junior engineers in cloud architecture, DevOps, and SRE best practices. + Act as a subject matter ... Certified Solutions Architect - Professional or AWS Certified DevOps Engineer - Professional is highly desirable. + Familiarity with… more
    McAfee, Inc. (08/01/25)
  • Principal Inbound Product Manager - Performance…

    ServiceNow, Inc. (San Diego, CA)
    It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ... Engagement + Actively engage with customers, partners, and internal stakeholders (e.g., SRE, Support, Customer Success) to capture feedback and identify pain points. +… more
    ServiceNow, Inc. (08/08/25)
  • AVP, Technology Operations

    PennyMac (Westlake Village, CA)
    …capacity. + Advanced AWS certifications (Solutions Architect Professional, DevOps Engineer Professional, or similar). + Advanced knowledge and experience using ... observability. + Strong understanding of modern IT operations principles including SRE practices, DevOps methodologies, and ITIL frameworks. + Proven experience… more
    PennyMac (08/07/25)
  • Technical Lead, Cloud Networking, gRPC

    Google (Sunnyvale, CA)
    …and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google's needs with opportunities to ... strategy, new initiatives and define roadmap (e.g., partner teams would include SRE, Product Management, and key internal/external customers). Google is proud to be… more
    Google (07/02/25)
  • Technical Product Manager, KSM

    Keeper Security, Inc. (El Dorado Hills, CA)
    …deeply technical Product Manager, someone who codes, collaborates, and ships like an engineer to manage our KSM (Keeper Secrets Manager), one of our ... and automation workflows + Understand user needs by engaging directly with DevOps, SRE, and engineering teams at enterprise customers + Monitor trends in secrets… more
    Keeper Security, Inc. (06/11/25)