Alerted.org


  • Distinguished, Architect - AI/ML

    Walmart (Sunnyvale, CA)




    Position Summary...

     

    Building the right technology foundation for Infrastructure & platforms is vital to success at the scale of Walmart. Our team builds and maintains the foundational technologies that support the tech organization. Included in this are data platforms, enterprise architecture, DevOps, cloud computing, and infrastructure. All of these products and services are supported by scalable and powerful infrastructure, ensuring a secure and seamless employee and customer experience across stores, digital channels, and distribution centers.

     

    What you'll do...

     

    Join Walmart Global Tech's Site Reliability Engineering organization as a Distinguished AI/ML Engineer to architect revolutionary agentic AI systems that autonomously monitor, predict, and resolve issues across the world's largest retailer's technology ecosystem, impacting millions of customers and associates globally. You'll lead the transformation of traditional SRE practices into cutting-edge, self-healing platforms that serve as the intelligent backbone for reliability engineering across all of Walmart's systems, from e-commerce to stores to supply chain. You'll be responsible for designing and building Tier 0 high-availability, resilient agentic platforms for Walmart's systems, stores, and facilities across the US and international markets, while defining and implementing unified, intelligent, operationally robust technical solutions and tools for all Walmart Technology organizations across all channels and geographies.

    AI/ML & Agentic Systems Technical Leadership:

     

    + **Architect and develop advanced agentic AI systems** that can autonomously handle complex reliability engineering workflows, predictive failure analysis, and self-optimization across all Walmart technology systems.

    + **Design and implement multi-agent orchestration platforms** that coordinate between different AI agents for automated incident response, capacity planning, and performance optimization across e-commerce, supply chain, and in-store systems.

    + **Build intelligent observability and monitoring systems** using ML-driven anomaly detection, predictive analytics, and autonomous incident resolution capabilities that span all of Walmart's technology ecosystem.

    + **Develop self-healing infrastructure platforms** that leverage AI to predict, prevent, and automatically resolve system issues before they impact customers, associates, or business operations across any Walmart system.

    Site Reliability Engineering Technical Excellence:

    + **Design, write and build advanced tools to improve reliability, latency, availability, and scalability** of all Walmart Tech systems, including: 1) Engineer reliability and availability starting with metrics and measurements across all domains, 2) Enable scaling by providing technical solutions, developing automation and/or optimizing processes for all engineering teams, 3) Build tools and automation to prevent recurrence of problems across all mission-critical Walmart services, 4) Augment existing instrumentation to build a cohesive picture of system characteristics across the entire Walmart technology landscape, with special attention to points of failure.

    + **Architect and implement fault-tolerant systems and services** across Walmart's hybrid cloud infrastructure with focus on autonomous recovery and intelligent failure prediction for e-commerce, supply chain, financial services, and in-store technology.

    + **Collaborate with engineering teams and leadership** across all Walmart technology organizations to establish technical strategies and solutions to improve mean time to detect (MTTD) and mean time to restore (MTTR) through intelligent automation and predictive capabilities.

    + **Work with service owners across all domains** (e-commerce, supply chain, stores, fintech, etc.) to define SLOs and build SLIs to ensure all critical systems are meeting SLAs while maintaining optimal performance and user experience.

    + **Perform complex troubleshooting and analysis** of large-scale distributed systems across Walmart's entire technology stack, using expertise in coding, algorithms, and distributed system design.

    Strategic Technical Innovation:

    + **Partner closely with all engineering organizations** including E-commerce, Supply Chain, Store Technology, Fintech, and Data Platform teams to deliver autonomous reliability solutions through advanced machine learning, natural language processing, and computer vision technologies.

    + **Drive the development of MLOps and AIOps platforms** that enable continuous learning, model deployment, monitoring, and autonomous optimization of reliability engineering systems across all Walmart domains.

    + **Innovate in agentic AI technologies for SRE** including large language models (LLMs) for automated incident response, reinforcement learning agents for capacity optimization, multi-modal AI for infrastructure monitoring, and federated learning for cross-domain reliability insights.

    + **Implement advanced CI/CD pipelines for reliability systems** including automated deployment, validation, and rollback mechanisms for SRE tools and monitoring systems with built-in observability and performance monitoring.

    + **Establish platform engineering excellence** by building reusable SRE infrastructure, intelligent monitoring platforms, and developer productivity tools that serve all Walmart engineering teams.

    + **Provide technical mentorship and guidance** to engineering teams across all Walmart organizations on advanced SRE concepts, AI/ML for reliability, platform engineering best practices, and autonomous system design through code reviews, technical discussions, and knowledge sharing.

    What you'll bring:

    Advanced AI/ML & Agentic Systems Expertise

    + 12+ years of expert-level experience with machine learning algorithms, deep learning frameworks (TensorFlow, PyTorch), and production ML deployment at enterprise scale

    + Deep hands-on experience building agentic AI systems, multi-agent frameworks, LLM-based agents, and autonomous decision-making platforms

    + Proven ability to architect and implement AI-driven solutions for complex technical challenges

    Enterprise-Scale Site Reliability Engineering Mastery

    + Comprehensive SRE expertise including Service Management (Incident, Problem & Change), Performance Engineering, and capacity planning for mission-critical systems

    + Deep understanding of reliability KPIs (MTTD, MTTR, availability) with proven track record of improving system reliability at scale

    + Experience with chaos engineering, fault injection, and building self-healing systems across diverse technology stacks

    Cloud-Native Platform Engineering at Scale

    + Expert-level cloud engineering experience (Azure, GCP, AWS) with deep knowledge of containerization (Kubernetes, Docker) and serverless architectures

    + Strong platform engineering skills including Infrastructure as Code (Terraform, CloudFormation), service mesh architectures, and building developer productivity tools

    + Experience designing and implementing self-service ML deployment platforms and API gateways for enterprise environments

    Advanced Observability & Monitoring Excellence

    + Deep expertise with distributed tracing (OpenTelemetry, Jaeger), metrics collection (Prometheus, Grafana), and log aggregation (ELK stack, Splunk)

    + Hands-on experience building AI-driven anomaly detection, predictive monitoring systems, and ML-specific dashboards

    + Proven ability to implement comprehensive observability solutions for complex AI/ML pipelines and distributed systems

    About Walmart Global Tech

    Imagine working in an environment where one line of code can make life easier for hundreds of millions of people. That’s what we do at Walmart Global Tech. We’re a team of software engineers, data scientists, cybersecurity experts and service professionals within the world’s leading retailer who make an epic impact and are at the forefront of the next retail disruption. People are why we innovate, and people power our innovations. We are people-led and tech-empowered. We train our team in the skillsets of the future and bring in experts like you to help us grow. We have roles for those chasing their first opportunity as well as those looking for the opportunity that will define their career. Here, you can kickstart a great career in tech, gain new skills and experience for virtually every industry, or leverage your expertise to innovate at scale, impact millions and reimagine the future of retail. Walmart’s culture is a competitive advantage, and it’s fostered by being together. Working together in person allows us to collaborate, align quickly and innovate with greater speed. We use our campuses to create purposeful connection rooted in deepening understanding and investing in the development of our associates.

     

    Our hubs: Walmart is a global company with offices across the United States and around the world. Our global headquarters is in Bentonville, Arkansas, with primary hubs in the San Francisco Bay area and New York/New Jersey.

    Benefits:

    Beyond our great compensation package, you can receive incentive awards for your performance. Other great perks include 401(k) match, stock purchase plan, paid maternity and parental leave, PTO, multiple health plans, and much more.

    Equal Opportunity Employer:

    Walmart, Inc. is an Equal Opportunity Employer – By Choice. We believe we are best equipped to help our associates, customers, and the communities we serve live better when we really know them. That means understanding, respecting, and valuing unique styles, experiences, identities, ideas, and opinions – while being inclusive of all people.

     

    _The above information has been designed to indicate the general nature and level of work performed in the role. It is not designed to contain or be interpreted as a comprehensive inventory of all responsibilities and qualifications required of employees assigned to this job. The full Job Description can be made available as part of the hiring process._

     

    At Walmart, we offer competitive pay as well as performance-based bonus awards and other great benefits for a happier mind, body, and wallet. Health benefits include medical, vision and dental coverage. Financial benefits include 401(k), stock purchase and company-paid life insurance. Paid time off benefits include PTO (including sick leave), parental leave, family care leave, bereavement, jury duty, and voting. Other benefits include short-term and long-term disability, company discounts, Military Leave Pay, adoption and surrogacy expense reimbursement, and more. You will also receive PTO and/or PPTO that can be used for vacation, sick leave, holidays, or other purposes. The amount you receive depends on your job classification and length of employment. It will meet or exceed the requirements of paid sick leave laws, where applicable. For information about PTO, see https://one.walmart.com/notices. Live Better U is a Walmart-paid education benefit program for full-time and part-time associates in Walmart and Sam's Club facilities. Programs range from high school completion to bachelor's degrees, including English Language Learning and short-form certificates. Tuition, books, and fees are completely paid for by Walmart.

     

    Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to a specific plan or program terms.

     

    For information about benefits and eligibility, see One.Walmart (https://one.walmart.com/).

     

    The annual salary range for this position is $169,000.00 - $338,000.00. Additional compensation includes annual or quarterly performance bonuses. Additional compensation for certain positions may also include:

     

    - Stock

     


     

    Minimum Qualifications...

     

    _Outlined below are the required minimum qualifications for this position. If none are listed, there are no minimum qualifications._

     

    Option 1: Bachelor's degree in computer science, computer engineering, computer information systems, software engineering, or related area and 6 years’ experience in software engineering, architecture, or related area.

    Option 2: 8 years’ experience in software engineering, architecture, or related area.

     

    Preferred Qualifications...

     

    _Outlined below are the optional preferred qualifications for this position. If none are listed, there are no preferred qualifications._

     

    Master’s degree in computer science, computer engineering, computer information systems, software engineering, or related area and 4 years' experience in software engineering, architecture or related area. We value candidates with a background in creating inclusive digital experiences, demonstrating knowledge in implementing Web Content Accessibility Guidelines (WCAG) 2.2 AA standards, assistive technologies, and integrating digital accessibility seamlessly. The ideal candidate would have knowledge of accessibility best practices and join us as we continue to create accessible products and services following Walmart’s accessibility standards and guidelines for supporting an inclusive culture.

     

    Primary Location...

     

    1345 Crossman Ave, Sunnyvale, CA 94089-1114, United States of America

     

    Walmart and its subsidiaries are committed to maintaining a drug-free workplace and have a no-tolerance policy regarding the use of illegal drugs and alcohol on the job. This policy applies to all employees and aims to create a safe and productive work environment.

     

    Walmart, Inc. is an Equal Opportunity Employer – By Choice. We believe we are best equipped to help our associates, customers, and the communities we serve live better when we really know them. That means understanding, respecting, and valuing diversity – unique styles, experiences, identities, abilities, ideas and opinions – while being inclusive of all people.

     



© 2025 Alerted.org