AI Capacity Engineering & Analysis
- Meta (Menlo Park, CA)
Summary:
Meta is seeking a Performance and Capacity Engineer to join the Capacity team and focus on AI strategy and planning projects. This role works cross-functionally with a number of teams to ensure optimal operation and growth of our AI computing resources, from both a cost and a technology perspective. The scale: tens of billions of user requests, hundreds of petabytes of data, and thousands of Gbps of network traffic. Help build one of the largest AI training and inference services in the world!
AI Capacity Engineering & Analysis Responsibilities:
1. Identify scaling issues and solve them for the largest AI capacity in the world
2. Work with Product Engineering, Infrastructure Engineering, and Data Engineering teams to find the optimal way to scale the AI infrastructure
3. Provide clear visibility into what is happening across all products on AI infrastructure: run capacity and performance experiments to determine scaling and utilization parameters for various service tiers
4. Work with financial analysts, operations, and engineering to investigate cutting-edge technologies and perform cost analysis for AI
5. Identify AI capacity-related issues proactively and work with systems, network, application operations, and engineering teams to find resolutions
Minimum Qualifications:
1. Bachelor's degree in Computer Science, Computer Engineering, a relevant technical field, or equivalent practical experience
2. 6+ years of experience working with distributed systems at scale
3. Proficiency in at least one programming language and in designing software systems
4. Experience learning new technical and business concepts on the job
5. Desire to learn about capacity planning and optimization
Preferred Qualifications:
1. MS or PhD degree in Computer Science, Electrical Engineering, Operations Research, or another technical field
2. Experience working with large-scale, GPU-based AI/ML systems
3. Direct experience in capacity planning for a major private or public cloud
4. Practical experience and demonstrated success in performance or capacity engineering
Public Compensation:
$147,000/year to $208,000/year + bonus + equity + benefits
**Industry:** Internet
Equal Opportunity:
Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.
Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at [email protected].