- Celestica (Richardson, TX)
- …SAN, and NAS. + Strong understanding of server architectures (x86, ARM, GPU servers), CPU/memory subsystems, PCIe, and power management. + Strong understanding of ... server architectures (x86, ARM, GPU servers), CPU/memory subsystems, PCIe, power management, and Baseboard Management Controller (BMC) functionality. + Proficiency…
- Celestica (Richardson, TX)
- …benchmarking, and bottleneck identification across all layers of the rack (CPU, GPU, memory, PCIe, network fabric, storage I/O, power delivery). + and AI/ML ... (NVMe). + Experience with performance benchmarking and tuning tools for CPU, GPU, network, and storage. + Proficiency in Linux operating systems, including system…
- Meta (Austin, TX)
- …modern parallel environments (e.g., distributed clusters, multicore SMP, and GPU). + Telecommute from anywhere in the US permitted. **Minimum Qualifications:** ... Minimum Qualifications: + Master's degree (or foreign degree equivalent) in Computer Science, Computer Software, Computer Engineering, Applied Sciences, Mathematics, Physics, or related field and two years of work experience in the job offered or in a…
- American Airlines (Dallas, TX)
- …resource allocation + Experience with AI-specific platform technologies (e.g., NVIDIA GPU Cloud, Lambda Labs, or other AI-optimized infrastructure) + Experience ... developing and integrating APIs to support AI model access and deployment across applications + Proficiency with microservices architecture for scalable and modular AI services + Familiarity with continuous monitoring and logging tools (e.g., Prometheus,…
- CVS Health (Austin, TX)
- …and CDI (Containerized Data Importer), VM templates and golden images, SR-IOV and GPU passthrough, and hybrid workload orchestration mixing VMs and containers on the ... same platform + **Container Orchestration**: Multi-cluster management, cluster autoscaling, workload scheduling, resource quotas, and cross-cluster networking with service mesh (Istio, Linkerd) + **Infrastructure Automation**: Terraform/Terragrunt for…
- Amazon (Dallas, TX)
- …preprocessing, model hosting, feature selection, hyperparameter tuning, distributed & GPU training, deployment, monitoring, and retraining + Experience with MLOps ... (e.g., MLflow, Kubeflow) and orchestration (e.g., Airflow, AWS Step Functions). Experience building applications using GenAI technologies (LLMs, Vector Stores, LangChain, Prompt Engineering). Amazon is an equal opportunity employer and does not discriminate on the…
- Meta (Austin, TX)
- …Qualifications: + Experience with AI/ML infrastructure planning and large-scale GPU cluster deployments + Background in hyperscale data center operations ... + Knowledge of infrastructure automation + Understanding of network architecture and distributed systems at hyperscale **Public Compensation:** $342,000/year to $403,000/year + bonus + equity + benefits **Industry:** Internet **Equal Opportunity:** Meta is…
- Microsoft Corporation (Austin, TX)
- …will allow you to develop deeper business acumen on cloud supply chain, AI, and GPU as you interact across all levels of the organization. In the S&OE horizon, you ... will: + Create the AI-related demand plan in the system and keep it up to date + Coordinate supply supportability inputs to understand if there are any constraints which would impact commits + Perform pre-checks to ensure each demand request has all required…
- Microsoft Corporation (Austin, TX)
- …proven track record of delivering complex Central Processing Unit (CPU), Graphics Processing Unit (GPU) or System on Chip (SoC) IPs. + Demonstrated experience in one or ... more of the following: interconnect fabrics, Network on Chip (NoCs), AXI-4 protocol base other complex IP/blocks or subsystems. + Experience with IP/SoC verification for a full product cycle from definition to silicon, including writing IP/block or subsystem…
- Meta (Austin, TX)
- …ARC team works closely with Reality Labs Research teams to implement mid-sized GPU clusters tailored to meet individual team needs. ARC is engaged at nearly every ... layer of the deployments beginning with hardware selection and design, all the way through deployment, configuration, and life cycle management. As one of the leading engineering teams in the enterprise world, our goal is to provide a solid foundation and…