- NVIDIA (Santa Clara, CA)
- …what is possible. Make the choice to join us today. As a software engineer in our internal infrastructure group, you will craft services, tools and libraries for ... track record of owning features end-to-end including testing, deployment, and monitoring. + Excellent planning, interpersonal and problem-solving skills. +…
- NVIDIA (Santa Clara, CA)
- …data systems like Ray, Spark Rapids + Familiarity with metrics collection, health monitoring, and observability tools + Building, operating and maintaining full ... ML platform for data scientists to use. As a data processing platform engineer, you will design, implement and operate Kubernetes-based GPU-accelerated data…
- Coinbase (Sacramento, CA)
- …root cause analysis, and blameless retrospectives * Define metrics and bolster monitoring/observability across corporate IAM systems * Participate in regular ... and fully supported. Coinbase is hiring! We are looking for an experienced system engineer (SE) to join the IT Operations Corporate Engineering team to build and…
- LiveRamp (San Francisco, CA)
- …with Engineering teams** + **Set up and maintain Infrastructure & Product Reliability monitoring and alerting** + **Maintain and enhance CI/CD Tooling and Terraform ... clouds (GCP or AWS)** + **Experience with deployment and monitoring of highly scalable products.** + **Hands-on experience ... + **Experience with SRE best practices, working knowledge of observability principles is a big plus** + **Ability to…
- Walmart (Sunnyvale, CA)
- …automation. + Deploy and monitor products on **cloud platforms with agent observability**, telemetry, and auditability in mind. + Develop and implement ... best-in-class **data health monitoring, traceability, and context enrichment** processes to ensure data used by agents is reliable and governed. + Lead technical…
- Rubrik (Sacramento, CA)
- …and exceeding availability and reliability goals * Manage and streamline monitoring systems to enhance observability and enable proactive identification ... of issues. * Coordinate and manage incidents, upgrades and changes for InfoSec's applications and services * Drive post-incident analysis with partner teams and/or vendors to identify root cause and ensure preventative measures are implemented promptly *…
- LinkedIn (Mountain View, CA)
- …senior technical leader driving the long-term reliability and observability strategy across LinkedIn's infrastructure + Re-architect LinkedIn's backend systems ... excellence and incident response + Define and build frameworks to improve monitoring, alerting, and observability across hundreds of services and systems…
- NVIDIA (Santa Clara, CA)
- …a secure operational environment. + Lead initiatives to improve network observability by integrating advanced monitoring and alerting systems, collaborating ... GeForce Now is looking for a Manager, Network Site Reliability Engineer (SRE) to enhance our network infrastructure and operations. We are looking for a leader who…
- NVIDIA (Santa Clara, CA)
- …tools for collecting, analyzing, and visualizing data for reporting, alerting, monitoring. + Collaborate with NVIDIA leadership, senior engineers, program ... building for performance and reliability at global scale, covering automation, monitoring, high availability, capacity planning, and lifecycle management. + Define…
- Insight Global (Santa Clara, CA)
- …and cloud). * Optimize training jobs for performance, resiliency, and cost. Monitoring & Reliability * Implement observability tools (logging, metrics, alerting) ... Job Description Insight Global is looking for a Sr. Data Scientist to work 4 days on-site ... scalability, and preparing for a future IPO. The Data Engineer will be responsible for designing, building, and maintaining…