-
Site Reliability Engineer II
- Microsoft Corporation (Reston, VA)
-
Do you have a passion for high scale services and working with some of Microsoft’s most critical customers? We’re looking for a **Site Reliability Engineer II** with the right mix of software development, on-line services experience and passion for quality to envision, design, and deliver Office 365 government cloud service offerings.
Office 365 is at the center of Microsoft’s cloud first, devices first strategy as it brings together cloud versions of our most trusted communication and collaboration products like Exchange, SharePoint, and Teams with our cross-platform desktop suites and mobile apps. The Office 365 Enterprise Cloud team works with Microsoft’s largest enterprise and government customers to deliver features that meet their specific needs and enable cloud adoption. As you would expect, our customers have the highest expectations for feature quality, security, reliability, availability, and performance.
The Site Reliability Engineering (SRE) team provides leadership, direction and accountability for application architecture, system design, and end-to-end implementation. As a Site Reliability Engineer, you will identify and deliver software improvements using your skills in software development, complexity analysis, and scalable system design. Effective collaboration skills will be required to work closely with other engineering teams to ensure services/systems are highly stable and performant, meeting the expectations of our government customers and users.
At Microsoft, we can offer you a collaborative team, exciting challenges, and a fun place to work. The work environment empowers you to have a positive impact on millions of end users.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities
+ Demonstrates working knowledge of distributed systems design, interactions between cloud technology layers and components, common dependencies at scale, and the code that defines infrastructures. Can identify and recommend configurations optimal of cloud technology solutions and modify the code base that defines systems or cloud technologies to improve the reliability and operability of supported products with minimal guidance from other engineers.
+ Develops an understanding of the code, features, and operations of specific products at scale as required to contribute to incremental improvements in product availability, reliability, efficiency, observability, and/or performance; participates in on-boarding, code/design reviews, and regular meetings with the engineering teams that develop and/or manage those products.
+ Researches and maintains an awareness in industry trends, advances in distributed systems and cloud technologies, new tools, and/or processes for maintaining and improving product availability, reliability, efficiency, observability, and/or performance. Contributes to the implementation of new solutions within their team by identifying ways they can be applied to solve persistent problems.
+ Leverages experience with large scale distributed systems and specific products, as well as objective insights drawn from analyses of production telemetry data to suggest changes or add-ons to product features or code to improve the availability, reliability, efficiency, observability, and performance of product components or features supported by their team.
+ Develops and tests basic changes to optimize code and improve the observability, reliability and operability of a defined range of platform, system, or product components or features with direction from other engineers.
+ Engages with product engineering teams by participating code/design reviews, regular meetings, on-call rotations and incident responses throughout product development and operations cycles.
+ Leverages knowledge of underlying systems/platforms and insights drawn from engagements with product engineering teams and telemetry analyses to propose potential improvements in code base and designs across components and features of one or more products.
+ Designs, develops, and maintains telemetry pipelines and monitoring tools that detail operations metrics (e.g., availability, reliability, performance, efficiency) of product components and features operating at scale. Independently performs analyses using existing tools and/or models to identify insights and shares them with product engineering teams to directly contribute to improvements in product development and/or operations; monitors the impact of changes on operations metrics (e.g., Time-to-X).
Qualifications
Required Qualifications:
+ Master's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration
+ OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration
+ OR equivalent experience.
+ 3+ years of technical experience in software engineering, network engineering, or systems administration.
Other Requirements:
Security Clearance Requirements: Candidates must be able to meet Microsoft, customer and/or government security screening requirements that are required for this role. These requirements include, but are not limited to the following specialized security screenings:
+ Candidates must have an active Secret eligibility determination. This role will require candidates to maintain the Secret clearance. Ability to meet Microsoft, customer and/or government security screening requirements are required for this role. Failure to maintain or obtain the appropriate clearance and/or customer screening requirements may result in employment action up to and including termination.
+ Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
+ Clearance Verification: This position requires successful verification of the stated security clearance to meet federal government customer requirements. You will be asked to provide clearance verification information prior to an offer of employment.
+ Citizenship & Citizenship Verification: This position requires verification of U.S. citizenship due to citizenship-based legal restrictions. Specifically, this position supports United States federal, state, and/or local United States government agency customer and is subject to certain citizenship-based restrictions where required or permitted by applicable law. To meet this legal requirement, citizenship will be verified via a valid passport, or other approved documents, or verified US government Clearance.
Preferred Qualifications:
+ Master's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration
+ OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 5+ years technical experience in software engineering, network engineering, or systems administration
+ OR equivalent experience.
+ 2+ years technical experience working with large-scale cloud or distributed systems.
Site Reliability Engineering IC3 - The typical base pay range for this role across the U.S. is USD $100,600 - $199,000 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $131,400 - $215,400 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay
Single reqs: Microsoft will accept applications for the role until October 22, 2025.
\#M365Core
Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations (https://careers.microsoft.com/v2/global/en/accessibility.html) .
-