  • Site Reliability Engineer (SRE) - Data…

    Insight Global (Memphis, TN)
    Job Description As a Data Center Site Reliability Engineer (SRE) at client, you will play a pivotal role in ensuring the reliability, scalability, and performance of ... including HPC and GPU clusters. * Design, implement, and manage monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, PagerDuty). * Develop… more
    Insight Global (10/09/25)
  • Senior Data Engineer

    Integra Partners (Troy, MI)
    …and motivated individual to join our data engineering team. As a Senior Data Engineer, you will play a key role in building and maintaining robust data pipelines ... adhere to healthcare industry standards such as HIPAA and HITRUST. + Implement monitoring, alerting, and logging for data pipelines to proactively detect and resolve… more
    Integra Partners (10/08/25)
  • Site Reliability Engineer II

    Dentsply Sirona (Waltham, MA)
    …Skills, Knowledge, & Capabilities + Infrastructure as code (Terraform, Ansible), monitoring and observability, CI/CD pipelines, cloud architecture, incident ... worldwide. We are looking for a talented Site Reliability Engineer II to join our team. You will manage...scripts to reduce manual intervention in operations + Implement monitoring and alerting strategies for infrastructure and applications +… more
    Dentsply Sirona (10/07/25)
  • Python Software Engineer III…

    JPMorgan Chase (Houston, TX)
    …improve Mean Time to Resolve (MTTR) and Mean Time to Detect (MTTD) + Enhances observability by identifying gaps and building monitoring, logging and alerting to ... engineering career to the next level. As a Software Engineer III at JPMorgan Chase within the Corporate Technology...(SLA/SLOs, error budgets, MTTR, MTTD) + Demonstrated experience with monitoring tools such as Dynatrace, OTel, Grafana + Hands-on… more
    JPMorgan Chase (10/04/25)
  • Senior Software Engineer - AI-First…

    Rev.io (Atlanta, GA)
    …LLM-powered agents, orchestration platforms, sub-agent coordination, prompt engineering, AI observability/monitoring tools. + Languages: Python, C#, React, Lua, ... work environment! About the role: As a Senior Software Engineer at Rev.io, you will operate on the cutting...accelerate stages of the SDLC (requirements, coding, testing, deployment, monitoring). + Collaborate with architects to design and evolve… more
    Rev.io (10/04/25)
  • (USA) Software Engineer III

    Walmart (Bentonville, AR)
    **Position Summary** **What you'll do** As a **Software Engineer III**, you will design, build, and scale next-generation AI-powered applications that directly ... and inference, enabling seamless experimentation, model deployment, and performance monitoring. + **MLOps Integration**: Incorporate industry best practices in… more
    Walmart (10/03/25)
  • Senior Payments Software Engineer

    Truist (Atlanta, GA)
    …the forefront of real-time payments innovation. As a Senior Payments Software Engineer, you will lead the design and development of scalable, cloud-native ... implementation, maintenance, and support of highly complex solutions.** **3. Build observability into applications using logging, metrics, and alerting tools.** **4.… more
    Truist (10/02/25)
  • Principal Software Test Engineer - Ally.ai…

    Ally (Raleigh, NC)
    …**The Opportunity** Join Ally's Generative AI journey as a Principal Software Test Engineer on the Ally.ai platform. You'll define the quality strategy and test ... with GitLab quality gates and IaC-driven ephemeral environments. * Production Quality & Monitoring: partner with SRE to convert test signals into production SLOs and… more
    Ally (10/02/25)
  • Senior Integrations Engineer, Enterprise…

    Airtable (Austin, TX)
    …+ Bring strong technical judgment in selecting the right tools and patterns, ensure observability and monitoring are in place for integration health + Champion ... Airtable to transform how work gets done. We are seeking a Senior Engineer, Enterprise Integrations & Agentic AI to design, build, and extend Airtable's enterprise… more
    Airtable (09/24/25)
  • Network Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …the world. We are seeking a highly skilled and experienced Network Site Reliability Engineer (SRE) to join our Enterprise Network Operations and SRE team. In this ... using hands-on debugging and by focusing on network automation, observability, documentation, and operational excellence. Your primary responsibilities will include… more
    NVIDIA (09/23/25)