- Microsoft Corporation (Mountain View, CA)
- …in security, reproducibility, and cost efficiency. + Implement end-to-end observability and operations through metrics, tracing, logging, dashboard development, ... monitoring, and automated alerts for model training and platform health (using Prometheus, Grafana, OpenTelemetry). + Architect and operate services on Azure cloud…
- The Walt Disney Company (New York, NY)
- …Public Cloud Provider (e.g., AWS, Microsoft Azure, Google Cloud) + Experience with observability tools for metrics, logging, and monitoring (e.g., Datadog, Splunk, ... Grafana) + Experience with storage & caching technologies: S3, S3 Compatible on-prem object storage, DynamoDB, RDBMS, Redis #DISNEYTECH The hiring range for this position in Santa Monica, CA is $155,700 to $208,700 per year, in Seattle, WA and New York, NY is…
- Amazon (Bellevue, WA)
- …technical solutions that deliver measurable impact. - Ensure reliability and observability, implementing automation, monitoring, and CI/CD pipelines using ... CloudFormation, and CodePipeline. - Contribute to long-term architecture vision, influencing build-vs-buy decisions, security posture, and cost optimization strategies for global learning platforms. About the team We operate at the intersection of technology,…
- Microsoft Corporation (Redmond, WA)
- …seek new knowledge that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in ... monitoring and operations at scale. **Qualifications** **Required Qualifications:** + Candidate must be enrolled in a full time bachelor's or master's program in area…
- NVIDIA (Santa Clara, CA)
- …distributed computing frameworks (SLURM, Ray), and multi-cloud environments + Observability & Automation: CI/CD, Infrastructure as Code, and GPU performance ... monitoring + Solutions architecture or consulting background with experience working across multiple customer engagements simultaneously. Technical pattern…
- Microsoft Corporation (Redmond, WA)
- …solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in ... monitoring and operations at scale and shares knowledge with other engineers. **Qualifications** **Required Qualifications:** + Bachelor's Degree in Computer Science…
- Microsoft Corporation (Redmond, WA)
- …seeks new knowledge that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in ... monitoring and operations at scale. **Qualifications** **Required Qualifications:** + Enrolled in a full time bachelor's or master's program in Computer Science,…
- Insight Global (Colonie, NY)
- …release. Own observability for the core product: Improve and maintain monitoring with Datadog, Sentry, and GCP Cloud Logs / Error Reporting. Design dashboards, ... alerts, and runbooks so that production behavior is visible, actionable, and predictable. Make updates with minimal review without introducing additional bugs: Move quickly but safely in an evolving codebase by leaning on tests, logs, and clear rollback paths.…
- Coinbase (Lansing, MI)
- …* Lead end-to-end delivery of projects through implementation, deployment, and monitoring * Improve and maintain operational excellence standards across the team, ... proactively addressing technical debt and driving improvements in reliability and observability * Participate in code reviews and on-call rotation, lead incident…
- Coinbase (Santa Fe, NM)
- …* Lead end-to-end delivery of projects through implementation, deployment, and monitoring * Improve and maintain operational excellence standards across the team, ... proactively addressing technical debt and driving improvements in reliability and observability * Participate in code reviews and on-call rotation, lead incident…