- Oracle (Olympia, WA)
- …layer security, and service-to-service communication patterns. + Experience in automation and configuration management tools (Terraform, Ansible, Puppet, Chef, ... etc.). + Proficient in scripting languages (Python, Bash, etc.). + Experience with monitoring and observability tools (Prometheus, Grafana, ELK, etc.). + Familiarity with cloud platforms (OCI, AWS, Azure, GCP) and container technologies (Kubernetes, Docker). +… more
- Oracle (Olympia, WA)
- …fundamental architectural changes for GPU delivery, health monitoring, triage automation, and diagnostic services. These are essential for running distributed ... AI/ML/HPC workloads across thousands of GPUs, leveraging technologies like RoCE and InfiniBand. **Why Join Us?** + **Innovative Projects:** Build groundbreaking solutions for our customers from the ground up. + **Exciting Times:** Be part of a young,… more
- Xplore Inc. (Bellevue, WA)
- …integrations, satellite tasking systems, visualizations for tasking and telemetry, automation of operations, image processing pipelines, and more + Integrate ... Major Tom with additional Ground Station Network providers via APIs. + Team Collaboration: Collaborate with the satellite operations team to understand their needs and translate them into functional software + Identify and troubleshoot customer-reported issues… more
- Oracle (Olympia, WA)
- …Serviceability): Experience with fault management, telemetry, and recovery. Test automation: Experience with CI/CD, unit/integration testing frameworks, and static ... analysis. Cloud-scale operations: Familiarity with cloud-scale operations and fleet management of BMC firmware. Other skills: Experience with GPU libraries like CUDA or ROCm is a plus. Disclaimer: **Certain US customer or client-facing roles may be required to… more
- Oracle (Seattle, WA)
- …fundamental architectural changes for GPU delivery, health monitoring, triage automation, and diagnostic services. These are essential for running distributed ... AI/ML/HPC workloads across thousands of GPUs, leveraging technologies like RoCE and InfiniBand. **Why Join Us?** + Innovative Projects: Build groundbreaking solutions for our customers from the ground up. + Exciting Times: Be part of a young, fast-growing team… more
- Oracle (Olympia, WA)
- …infrastructure using best-in-class engineering practices. Contribute to platform automation, observability, CI/CD pipelines, and operational excellence. Troubleshoot ... complex issues in distributed systems and participate in on-call rotations as needed. Mentor junior engineers and participate in design and code reviews. **What You'll Do** + Build cloud service on top of the modern Infrastructure as a Service (IaaS) building… more
- Oracle (Olympia, WA)
- …fundamental architectural changes for GPU delivery, health monitoring, triage automation, and diagnostic services. These are essential for running distributed ... AI/ML/HPC workloads across thousands of GPUs, leveraging technologies like RoCE and InfiniBand. **Why Join Us?** + **Innovative Projects:** Build groundbreaking solutions for our customers from the ground up. + **Exciting Times:** Be part of a young,… more
- Oracle (Olympia, WA)
- …optimize the SR (Service Request) flow using failure reporting, logs, and automation tools (ASR). + Drive improvements in service profitability and cost reduction; ... influence EOSL (End of Service Life) recommendations. + **Continuous Process Improvement:** + Lead and participate in continuous process improvement projects and ad hoc initiatives. + Provide a business view on product issues, quality, and performance… more
- Oracle (Olympia, WA)
- …Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to ... improve the availability, scalability, and efficiency of Oracle products and services. Design and develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software… more
- Amazon (Kirkland, WA)
- …engineering excellence for reliability and disaster recovery while ensuring optimal developer experience for the SCADA application developers and system integrators. ... issues affecting SCADA availability - Support operations through infrastructure automation, reducing manual intervention and improving reliability - Drive continuous… more