"Alerted.org

  • Site Reliability Engineer

    IBM (Austin, TX)




    Introduction

    A career in IBM Software means you’ll be part of a team that transforms our customers’ challenges into solutions.

     

    Seeking new possibilities and always staying curious, we are a team dedicated to creating the world’s leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless global opportunities for our IBMers, so the door is always open for those who want to grow their career.

     

    IBM’s product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.

    Your role and responsibilities

    As an Entry-Level Site Reliability Engineer (SRE) on our Austin Development Lab Engineering Team, you will join a team dedicated to ensuring the reliability, scalability, and performance of IBM systems and infrastructure. In this role, you will play a critical part in advancing key IBM Power System development initiatives while gaining hands-on experience with both physical hardware and software environments.

    Your responsibilities will include, but are not limited to:

    * Assisting in the setup, configuration, and maintenance of IBM Power servers and related infrastructure.

    * Performing hands-on tasks in the lab, including racking, cabling, hardware troubleshooting, and physical system configuration.

    * Supporting software-related reliability initiatives such as automation, monitoring, performance tuning, and system optimization.

    * Participating in incident response, diagnostics, and root-cause analysis for both hardware and software issues.

    * Collaborating with cross-functional teams to ensure smooth integration between physical systems and application environments.

    * Supporting projects related to lab analytics—gathering, analyzing, and interpreting data to help guide better business and operational decisions.

    * Contributing to the deployment, scaling, and ongoing maintenance of production and test systems.

    * Writing clear, concise documentation for processes, configurations, and troubleshooting steps.

    * Learning and applying best practices in systems reliability, observability, and infrastructure operations.

    * You will be expected to grow into a well-rounded SRE capable of tackling challenges in both the physical, data-center-like environment and the software layer that powers our services.

    * Mentorship and hands-on training will be provided to help you develop the skills to excel in both domains.

    Required technical and professional expertise

    • To be successful in this role, the candidate must be hands-on, proactive, and talented at problem solving, with a willingness to challenge the norm and a strong desire to learn and work toward perfection.

    • Passion for eliminating repetitive manual processes using automation.

    • Strong attention to detail and excellent analytical capabilities.

    • Excellent troubleshooting, problem solving, and debugging skills.

    • Proficiency in programming concepts and frameworks.

    • Proficiency in scripting/coding for automation using Python, shell scripting (Bash, etc.), Ansible, and related tools and languages.

    • Familiarity with server operations, virtualization, and related infrastructure concepts.

    • Fundamental understanding of computer networks.

    • Fundamental understanding of data science/analytics frameworks.

    • An automation mindset: wherever possible, you should use scripting and automation.

    • Ability to work independently and as part of a team to achieve the SRE agenda.

    • Ability to complete project work, both supervised and unsupervised.

    • Ability to effectively prioritize and execute tasks in a high-pressure environment.

    • Good written, oral, and interpersonal communication skills.

    Preferred technical and professional experience

    * Fundamental understanding of Linux/Unix systems is a plus.

    * Fundamental knowledge of Red Hat OpenShift and Kubernetes is a plus.

    * Automation/Scripting: In-depth experience with Ansible, Python, Terraform, and CI/CD tools is a plus, but a fundamental understanding is a must.

    * Hands-on experience crafting alerts and dashboards using Python or any other language.

     

    IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.

     


© 2025 Alerted.org