- Amazon (Dallas, TX)
- …Data Centers in your region. - Troubleshoot, conduct Root Cause Analysis (RCA) and create Corrective Action (CA) documentation for site/equipment failures. - ... Directly support operational issues with ad-hoc training, complex operating procedure reviews, including critical equipment, and event support. - Own the design for existing data center upgrades and design-solutions, which add capacity, improve availability,…
- Deloitte (Fort Worth, TX)
- …performance tuning in ServiceNow, ability to troubleshoot and identify bottlenecks, establish RCA for system performance issues, and experience in using Health Scan ... Data, and Instance Observer to improve overall system performance. + Excellent software engineering and product architecture/design foundation with deep understanding of Business Context Diagrams (BCD), sequence/activity/state/ER/DFD diagrams, OOP/OOD,…
- SpaceX (Bastrop, TX)
- …repairs, overhauls, and equipment upgrade projects + Perform root cause analysis (RCA) and action corrective measures to address equipment, process, and behavioral ... gaps + Leverage feedback from maintenance technicians to drive improvements to work instructions and processes + Assist in creation, modification, and updates to preventative maintenance instructions to eliminate recurring failures or low value add tasks +…
- Fiserv (Frisco, TX)
- …executives professionally and effectively. Ability to extrapolate and deliver an RCA. Skilled at recognizing problems when small and proactively resolving before ... escalations occur. + Experience with APIs, and other relevant programming language and Tools. **What would be good to have**: + Experience in a B2B SaaS company. + Experience with Payment Processing life cycle such as Reversal, Chargeback, and Refund. + Deep…
- Cognizant (Austin, TX)
- …requirements and strategize planning from SRE and resiliency perspective * Triage and RCA of production incidents * Observability and monitoring with APM tools and ... creating dashboards/alerts and automation for incidents * Leadership qualities like cross teams collaboration and effective communication **Required Qualifications:** * Agile * JIRA * Budget and Resource Planning * ITSM tools like Service Now * Microsoft…
- CVS Health (Austin, TX)
- …actions. + Join forces with problem management in Root Cause Analysis (RCA), corrective actions through closure and proactive problem management. + Prepare and ... present metrics, status and service health reports. + Conduct training and knowledge-sharing sessions for various teams and new hires to support standardized processes. + Drive the continuous improvement of service management processes and procedures to…
- JPMorgan Chase (Plano, TX)
- …communication during critical network incidents. + Conduct Post-Mortem/Root Cause Analysis (RCA) on major incidents to identify underlying causes and implement ... corrective actions. + Monitor and maintain network infrastructure to ensure optimal performance and availability. + Troubleshoot and resolve complex network issues escalated from Tier-1 and Tier-2 support teams. + Perform network upgrades and maintenance…
- S&P Global (Dallas, TX)
- …proactive Service Improvement initiatives, Client Action Plans, Root Cause Analysis (RCA), and post-mortem reviews for Incidents or Service Level Agreement breaches, ... focusing on technical aspects. + Host regular reviews with Clients to provide updates on current technical activities and operating results, ensuring transparency and alignment with client expectations. + Proactively monitor and act against key metrics for…
- Cognizant (Austin, TX)
- …resiliency perspective + Specialized in building and managing automation + Triage and RCA of production incidents + Observability and monitoring with APM tools and ... creating dashboards and alerts and automation for incidents + Leadership qualities like cross teams collaboration and effective communication **Required Qualifications** + Possess strong technical skills in Cloud Basics Unix Linux Docker Kubernetes Openshift APM…
- Ryder System (Lancaster, TX)
- …best result at the most cost-effective solution. Conduct root cause analysis (RCA) on recurring faults or systemic issues; implement corrective and preventive ... actions. + Adhere to maintenance strategies. + Perform and manage preventative maintenance functions of team on lower, to mid complexity automated sites. **Additional Responsibilities** + Perform other duties as requested. **Skills and Abilities** + Ability to…