• Engineering Manager - AI DevOps

    NVIDIA (Santa Clara, CA)
    infrastructure with GPU and Kubernetes capabilities to deliver high-throughput, low-latency inferencing solutions in distributed environments. Lead a team ... with expertise in AI inference infrastructure, test automation (SDET), and Infrastructure as Code (IaC) + Architect and implement scalable test automation…
    NVIDIA (09/26/25)
  • Public Cloud Linux Engineer

    NTT DATA North America (Austin, TX)
    …+ Implement and maintain system monitoring, alerting, and logging solutions to ensure high availability and reliability. + Lead root cause analysis and document ... Strong understanding of Linux-based backup strategies, disaster recovery planning, and high-availability configurations (e.g., Pacemaker, DRBD). + Familiar with…
    NTT DATA North America (10/08/25)
  • Public Cloud Linux Engineer

    NTT America, Inc. (Plano, TX)
    …Implement and maintain system monitoring, alerting, and logging solutions to ensure high availability and reliability. Lead root cause analysis and document ... understanding of Linux-based backup strategies, disaster recovery planning, and high-availability configurations (e.g., Pacemaker, DRBD). Familiar with security…
    NTT America, Inc. (10/02/25)
  • Sr Cloud Database Engineer

    S&P Global (Southfield, MI)
    …and manage automated backups, replication, and disaster recovery procedures. + Implement high-availability and failover strategies using AWS RDS and PostgreSQL ... non-production environments. This role requires deep expertise in PostgreSQL, cloud infrastructure (AWS), and automation, with a strong focus on performance,…
    S&P Global (09/09/25)
  • DevOps Solutions Engineer

    Lovingly (Hopewell Junction, NY)
    …+ Optimize cloud networking, security, and multi-tenant architecture. + Implement high-availability and disaster recovery solutions. + Enhance security ... model, blending remote flexibility with in-person collaboration, fostering a high-performance, innovation-centric culture. AI-First Thinking at Lovingly We believe…
    Lovingly (08/08/25)
  • Principal AI Software Engineer

    GE Vernova (Niskayuna, NY)
    …multi-tenant AI systems serving millions of users with high availability and performance requirements + Architect advanced RAG (Retrieval-Augmented ... + Track record of building and scaling AI-powered features for 10M+ users with high availability requirements + Expert knowledge of AI observability tools, cost…
    GE Vernova (09/27/25)
  • Senior Director - IT Operations/Security

    PagerDuty (San Francisco, CA)
    …Operations and Infrastructure to support a growing, hybrid workforce, ensuring high availability and performance of critical systems. + Drive the strategy, ... our IT Operations at scale. In this role, you will architect, build, and support PagerDuty's rapidly expanding SaaS/cloud-first information technology…
    PagerDuty (09/13/25)
  • Sr. Backend Engineer C++ - (Hybrid)

    Comcast (Chicago, IL)
    …+ Construct and optimize the infrastructure of the ad delivery system with high concurrency, high availability, and low latency ad delivery. + Continuous ... for passionate Sr. Backend Engineers to help design, build and support our high-quality, innovative video advertising platform. This position is based in New York.…
    Comcast (09/24/25)
  • Senior Azure Data Engineer

    MetLife (Cary, NC)
    …and cost optimization for the Azure data environment. * Ensure platform availability, data pipeline reliability, and high performance for downstream consumption. ... via APIs and RESTful services. * Proficiency with Terraform for managing Azure infrastructure via code. Preferred: * Azure Solutions Architect or Azure Data…
    MetLife (10/09/25)
  • Senior Cloud Engineer - WFH

    Shuvel Digital (Reston, VA)
    …of cloud-based systems and optimize resources to ensure cost-effectiveness and high availability. + Possess strong analytical and problem-solving abilities ... candidate will have a strong foundation in DevSecOps Principles, Cloud infrastructure (On-premises, Hybrid, and Cloud based), virtual technologies, networking, and…
    Shuvel Digital (09/04/25)