- ServiceNow, Inc. (Santa Clara, CA)
- …expertise with CentOS/RedHat in large-scale production environments + Familiarity with monitoring and testing tools such as AppDynamics, Jenkins, Splunk, and New Relic + Solid background in agile software development methodologies, particularly Scrum + Experience with the ServiceNow platform, including custom development beyond out-of-the-box configurations + Knowledge of concurrency, multithreading, and distributed…
- NVIDIA (Santa Clara, CA)
- …lifecycle management for large-scale Machine Learning systems. + Implement monitoring and health management capabilities that enable industry-leading reliability, availability, and scalability of GPU assets. You will be harnessing multiple data streams, ranging from GPU hardware diagnostics to cluster and network telemetry. + Work on software that manages NVLINK topography across GPU clusters. + Build automated test…
- Amazon (Cupertino, CA)
- …data center. After launch you will oversee the fleet of servers you develop, monitoring their quality and how they are meeting the customer requirements. A day in the life Your day-to-day responsibilities will include interfacing with our internal and external customers to understand project requirements and facilitate system development on top of your server design. You will be responsible for learning operational…
- DoorDash (San Francisco, CA)
- …end-to-end including feature creation, model development and deployment, experimentation, monitoring and explainability, and model maintenance. + Develop production machine learning solutions batch and realtime to provide the world class merchant experience. You can find out more on our ML blog here (https://doordash.engineering/category/data-science-and-machine-learning/). We're excited about you because you have + 5+…
- DoorDash (San Francisco, CA)
- …to the stakeholders. + Aid in the development and implementation of continuous monitoring for key security controls. + Utilize data analytics to identify security trends and potential risks. We're excited about you because + You have 8+ years of experience in IT audit, cybersecurity, or a related field. + You have experience building a Security assessment program ground up and planning and executing Security Risk…
- DoorDash (Redwood City, CA)
- …kitchen equipment, etc.) Food Safety + Standard operating procedures compliance & monitoring. + Ensure safe food handling procedures + Manage the shared storage areas of the facility Required experience and skills + Ability to establish and maintain effective working relationships with management, other associates, hourly employees and dashers while making decisions quickly and delegating responsibilities accordingly.…
- Cardinal Health (Sacramento, CA)
- …materials and assets needed for B2B and D2C campaigns. + Assist with the monitoring and reporting of campaign performance, helping to track key metrics under the guidance of the B2B and D2C Marketing Managers. + Coordinate with external vendors or agencies on specific tasks, such as providing creative assets or tracking campaign progress. + Support the B2B and D2C Marketing Managers in managing project timelines and…
- Broadcom (Palo Alto, CA)
- …CI/CD pipelines for streamlined application packaging + Implement comprehensive monitoring, logging, and alerting solutions + Troubleshoot complex issues related to Kubernetes clusters, applications, and services + Apply advanced Kubernetes design patterns + Develop automation tools and scripts for Kubernetes deployments + Collaborate cross-functionally to ensure seamless integration of Kubernetes solutions…
- NVIDIA (Santa Clara, CA)
- …software related to managing fleets of GPU nodes. + Implementing monitoring and health management capabilities that enable industry-leading reliability, availability, and scalability of GPU assets. You will be harnessing multiple data streams, ranging from GPU hardware diagnostics to cluster and network telemetry. + Working with teams across NVIDIA to ensure production AI clusters run reliably and…
- Amazon (Irvine, CA)
- …scalable, highly available data solutions, including ETL/ELT pipelines, data quality, and monitoring processes. - 5+ years of experience and proficiency in SQL, Python, and at least one additional programming language (e.g., Java, Scala, etc.). Preferred Qualifications - AWS experience preferred, with proficiency in a wide range of AWS services (e.g., EC2, S3, RDS, Lambda, IAM, VPC, CloudFormation) - AWS Professional level…