- Zoom (San Jose, CA)
- …large clusters; + Gain an in-depth understanding of ES or Kafka management features, and collaborate with the development department on architecture and functional ... for core services; + Conduct cutting-edge middleware research and apply it to current systems; + Train new employees and create plans to gradually enable them to… more
- Walmart (Sunnyvale, CA)
- … management system. You'll independently handle high impact, critical software/systems monitoring issues, troubleshoot business and production issues. As ... of changes. * Provides support to the business for new and existing systems by responding to user questions, concerns, and issues (for example, technical… more
- Rubrik (Sacramento, CA)
- …and exceeding availability and reliability goals * Manage and streamline monitoring systems to enhance observability and enable proactive identification ... enable teams at Rubrik to develop secure software and protect data and systems with appropriate security controls. Information Security also develops systems to… more
- Chevron Corporation (Bakersfield, CA)
- …assuring power system integrity and reliability, and monitoring power systems for efficient power management. Horizons engineers typically complete the ... and projects through the application of technical engineering knowledge and project management skills. These areas of support may include: * Optimizing facilities to… more
- Walmart (Sunnyvale, CA)
- …Strong expertise with Cloud Technologies like Azure and GCP. + Experience in monitoring production system and using different systems like Grafana, Prometheus. + ... you'll do:** + Design, build, and deploy large-scale, production-grade ML systems, including deep learning models, real-time inference services, and end-to-end ML… more
- NVIDIA (Santa Clara, CA)
- …advancement. What you will be doing: + Drive next generation fleet management solutions for scaling AI infrastructure using GPUs and Grace solution from ... Nvidia. Work with customers, product management and other architects to narrow down on requirements...+ Bring up clarity on architecture for fleet health monitoring and fault-remediation solution at scale. Work with customers… more
- NVIDIA (Santa Clara, CA)
- …custom software related to managing fleets of GPU nodes. + Implementing monitoring and health management capabilities that enable industry leading reliability, ... and programming using libraries and APIs exposed by baseboard management controllers. We welcome out-of-the-box thinkers who can provide...part of a DGX Cloud team responsible for production systems that enable large scalable GPU clusters to be… more
- NVIDIA (Santa Clara, CA)
- …or ARM Platforms including BMC-BIOS communication, thermal management, power management, firmware update, device monitoring, firmware security, etc. + Board ... of NVIDIA Server platforms. + Designing and developing performance optimized active monitoring BMC solutions using DMTF Standards including MCTP, Redfish, SPDM and… more
- TEKsystems (San Diego, CA)
- …team supporting the deployment and sustainment of advanced ground systems for tactical communication platforms. This role involves hands-on technical ... initial deployment, operations, and maintenance of airborne and terrestrial communication systems providing voice and data services. + Collaborate with engineering,… more
- TP-Link North America, Inc. (Irvine, CA)
- ABOUT US: Headquartered in the United States, TP-Link Systems Inc. is a global provider of reliable networking devices and smart home products, consistently ranked ... We believe technology changes the world for the better! At TP-Link Systems Inc, we are committed to crafting dependable, high-performance products to connect… more