• Site Reliability Engineer II

    Insight Global (Jacksonville, FL)
    …operations, or command center roles. * Hands-on with monitoring and observability tools (Datadog, Dynatrace, Splunk, ServiceNow, etc.). * Strong scripting/automation ... skills (Python, PowerShell, Bash). * Knowledge of incident management frameworks (ITIL a plus). Problem-solver with strong communication skills, able to work in high-pressure environments
    Insight Global (12/03/25)
  • Backend Engineer - Generative AI

    Insight Global (Orange, CA)
    …platform reliability, security, and compliance (HIPAA, PHI). * Implement observability, testing, and governance for LLM-based applications. * Maintain infrastructure ... as code (IaC) and CI/CD pipelines. * Integrate open-source and third-party tools (LangChain, Weaviate, Pinecone, Azure OpenAI, Vertex AI). Eligible States for Remote Work: AZ, FL, CA, GA, KS, ID, MO, NV, NC, TN, TX, WY We are a company committed to creating…
    Insight Global (12/01/25)
  • AI Platform Engineer

    Expedient (Cleveland, OH)
    …auto-scaling to meet performance targets + Monitor & Optimize: Implement comprehensive observability using Elastic Stack; continuously tune for efficiency + Secure & ... Protect: Ensure infrastructure meets strict security and compliance standards + Collaborate & Support: Partner with developers and engineers; translate complex infrastructure into simple terms + Troubleshoot & Improve: Rapidly resolve issues and drive…
    Expedient (11/28/25)
  • Senior Software Engineer, AI Inference…

    NVIDIA (Santa Clara, CA)
    …cloud platforms (AWS/GCP/Azure), infrastructure as code, CI/CD, and production observability. + Contributions to open-source projects and/or publications; please ... include links to GitHub pull requests, published papers and artifacts. At NVIDIA, we believe artificial intelligence (AI) will fundamentally transform how people live and work. Our mission is to advance AI research and development to create groundbreaking…
    NVIDIA (11/27/25)
  • Senior Software Engineer

    Microsoft Corporation (Redmond, WA)
    …work closely with product teams to enhance availability, reliability, observability, and operability across our planet-scale systems. We prioritize long-term ... platform improvements through engineering over repetitive manual tasks while having data-driven approach to make investment decisions. Increasingly, we leverage AI to amplify our ability to scale reliability across Azure. Our teams contribute to product…
    Microsoft Corporation (11/26/25)
  • Principal Software Engineer - Opensearch

    Oracle (Indianapolis, IN)
    …use cases such as **application search**, **log analytics**, and **observability pipelines**. + Deep understanding of **distributed systems architecture**, ... including experience building and maintaining **high-throughput, highly available services** at scale. + Proficient in **high-level programming languages**, particularly **Java and Python**, with a strong emphasis on clean, maintainable, and testable code. +…
    Oracle (11/25/25)
  • Senior Principal Software Engineer

    Oracle (Trenton, NJ)
    …media tools). + Ensure services are built for scale, availability, observability, performance, and security, optimized for graphics and rendering pipelines. + ... Collaborate with distributed engineering teams to deliver cloud-native solutions for media production workflows. + Drive operational excellence for GPU-powered services, including performance monitoring, failure analysis, and workload optimization. + Stay…
    Oracle (11/25/25)
  • Senior Software Engineer - Security…

    Oracle (Nashville, TN)
    …learn. **Responsibilities** + Cloud service design for availability, scalability, observability, and testability. + Implementation, validation and documentation of ... services and their component micro-services. + Stay abreast of emerging technologies, industry best practices, ensuring compliance and driving innovation within the organization. + Work collaboratively to realize and achieve the technical vision of the team. +…
    Oracle (11/25/25)
  • Senior Software Engineer, GraphQL Platform

    New York Times (New York, NY)
    …caching, and multi-region operations + Experience troubleshooting and improving observability for a platform distributed across multiple systems + Experience ... deploying and maintaining applications on Kubernetes This is a hybrid role. #LI-Hybrid REQ- 018720 The annual base pay range for this role is between: $140,000 - $160,000 USD The New York Times Company is committed to being the world's best source of…
    New York Times (11/25/25)
  • (USA) Staff, Software Engineer

    Walmart (Bentonville, AR)
    …and Enterprise products + Working knowledge on any of the Observability tools and enterprise monitoring solutions like Dynatrace, AppDynamics, New Relic, ... Prometheus etc. + Root-cause analysis complex problems involving multiple parties, networks, hardware, and software that relate to scaling and performance. + Secure the system from issues, be they real, perceived, or notional. **What you'll bring:** +…
    Walmart (11/25/25)