- 
        Senior Service Engineer - Livesite Manager - CTJ…
- Microsoft Corporation (Reston, VA)
- 
Do you want to work on the cutting edge of distributed systems and high-scale storage? Do you want to work on a meaningful and impactful project and make a difference to the U.S. government and country? Azure Storage for air-gapped clouds (AGC) is a foundational part of Azure and is entrusted with storing exabytes of data for everything from the virtual hard disks that back Azure Virtual Machines to customer blobs to SQL Server databases to OneDrive content, all while providing industry-leading availability and durability for that data.

We are recruiting for a Senior Service Engineer – Livesite Operations. This individual will lead critical production operations for Azure Storage Core services, driving observability, automation, and engineer readiness to ensure operational reliability and resilience at hyperscale.

What is Livesite?

At Microsoft, Livesite refers to a customer-first, production-focused mindset and set of practices aimed at keeping services always up and healthy. It includes incident management, proactive monitoring, automation, and continuous improvement to minimize customer impact and reduce Time-to-Mitigation (TTM) for critical issues. Livesite work spans the entire lifecycle:

+ Pre-Incident: Monitoring, alerting, and preventive measures.
+ During Incident: Rapid mitigation and communication under high-pressure conditions.
+ Post-Incident: Retrospectives, repair tracking, and amplifying learnings to prevent recurrence.

Why Join Us?

You’ll play a pivotal role in ensuring the reliability of Azure Storage services that power mission-critical workloads, including premium AI scenarios. Your work will directly impact customer trust and accelerate innovation across Microsoft’s cloud platform.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals.
Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

+ Lead the tracking and resolution of incidents, drive improvements in time-to-mitigation and operational reliability metrics, and drive repair items.
+ Manage Incident Manager (IM) and Core rotations, including onboarding, livesite support processes, and readiness for new on-call engineers.
+ Develop and coordinate training programs to ensure engineer preparedness for livesite responsibilities.
+ Define and implement observability standards and practices for Core services.
+ Partner with engineering teams to enhance telemetry, alerting, AI automation, and dashboards to reduce manual overhead and improve operational efficiency.
+ Drive parity efforts to align with security and compliance standards.
+ Work with engineering leads to develop support plans for new services onboarded into the AGC.
+ Embody our culture (https://careers.microsoft.com/v2/global/en/culture) and values (https://www.microsoft.com/en-us/about/corporate-values).

Qualifications

Required / Minimum Qualifications:

+ Bachelor's Degree in Computer Science, Information Technology, Mechanical Engineering, Electrical Engineering, Aerospace Engineering, Data Science, Cybersecurity, or related field AND 3+ years technical experience in software engineering, network engineering, service engineering, systems engineering, or industrial controls OR equivalent experience.

Other Requirements:

**Security Clearance Requirements:** Candidates must be able to meet Microsoft, customer, and/or government security screening requirements for this role. These requirements include, but are not limited to, the following specialized security screenings:

+ **The successful candidate must have an active U.S. Government Top Secret Clearance with access to Sensitive Compartmented Information (SCI) based on a Single Scope Background Investigation (SSBI) with Polygraph.** The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. Failure to maintain or obtain the appropriate U.S. Government clearance and/or customer screening requirements may result in employment action up to and including termination.
+ **Clearance Verification**: This position requires successful verification of the stated security clearance to meet federal government customer requirements. You will be asked to provide clearance verification information prior to an offer of employment.
+ **Microsoft Cloud Background Check**: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
+ **Citizenship & Citizenship Verification:** This position requires verification of U.S. citizenship due to citizenship-based legal restrictions. Specifically, this position supports United States federal, state, and/or local government agency customers and is subject to certain citizenship-based restrictions where required or permitted by applicable law.
To meet this legal requirement, citizenship will be verified via a valid passport, other approved documents, or a verified U.S. government clearance.

Preferred Qualifications:

+ Master's Degree in Computer Science, Information Technology, Mechanical Engineering, Electrical Engineering, Aerospace Engineering, Data Science, Cybersecurity, or related field AND 6+ years technical experience in software engineering, network engineering, service engineering, systems engineering, or industrial controls OR Bachelor's Degree in Computer Science, Information Technology, Mechanical Engineering, Electrical Engineering, Aerospace Engineering, Data Science, Cybersecurity, or related field AND 8+ years technical experience in software engineering, network engineering, service engineering, systems engineering, or industrial controls OR equivalent experience.
+ 3+ years technical experience working with large-scale cloud or distributed systems.
+ Hands-on experience with livesite operations, observability, and incident response.
+ Excellent communication and stakeholder management skills.
+ Experience and proficiency in automation frameworks and observability tooling (dashboards, query languages such as Kusto/SQL).
+ Experience with Azure or similar hyperscale cloud platforms.
+ Background in security, compliance, or parity initiatives.

Service Engineering IC4 - The typical base pay range for this role across the U.S. is USD $119,800 - $234,700 per year. A different range applies to specific work locations within the San Francisco Bay area and New York City metropolitan area, where the base pay range for this role is USD $158,400 - $258,000 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

Microsoft will accept applications for the role until October 31, 2025.

\#Silver

Microsoft is an equal opportunity employer.
Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations (https://careers.microsoft.com/v2/global/en/accessibility.html).
 
 