Network Operations Engineer - Regional Site Lead
Insight Global (New York, NY)
Job Description
Insight Global is seeking a highly skilled Network Operations Engineer to support a rapidly growing AI cloud and compute organization operating hyperscale datacenter campuses across the U.S. This position serves as a Regional Site Lead, combining hands‑on operational ownership with cross‑functional collaboration to ensure network reliability, performance, and production readiness at scale.
This is a full-time, permanent role with competitive salary, equity, and comprehensive benefits. The position follows a hybrid onsite model at one of the company’s regional datacenter locations.
The Network Operations Engineer will oversee end‑to‑end network operations for a designated campus, acting as the primary escalation point for network health, incident response, break‑fix coordination, and deployment validation. This role requires strong troubleshooting ability across physical and logical layers, operational leadership, and comfort working in fast-moving, large‑scale environments.
Key Responsibilities
Campus Operations Ownership: Serve as the primary point of contact for all network operations within a regional datacenter campus.
Tier 2/3 Troubleshooting: Diagnose and resolve complex issues across optical, physical, L2/L3 fabric, and control-plane layers.
Break‑Fix Coordination: Lead hardware interventions such as line card swaps, optic replacements, RMAs, and cabling corrections.
Deployment + Expansion Support: Validate new pods, fabrics, and clusters for production readiness; support turn-ups and expansions.
Runbook Execution + Improvement: Execute operational procedures, refine runbooks, and document lessons learned for scale.
Cross‑Functional Partnering: Work closely with onsite DC Ops, engineering teams, vendors, logistics partners, and leadership.
Team Support + Mentorship: Provide guidance and expertise to junior engineers as the regional team grows.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to [email protected]. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Skills and Requirements
5–8 years of network engineering experience, including direct operational ownership within hyperscale datacenter environments (AI/ML clusters, large-scale fabrics, cloud infrastructure, or multi‑site regional campuses).
Deep expertise with modern datacenter fabrics, including EVPN/VXLAN, BGP, CLOS topologies, and high‑radix switching systems.
Demonstrated Tier 2/3 incident response leadership, with the ability to troubleshoot under pressure in large-scale production environments.
Hands-on experience with datacenter hardware break‑fix, RMA workflows, structured cabling, optics handling, and physical-layer troubleshooting.
Strong understanding of datacenter logistics—hardware lifecycle, vendor coordination, staging, and deployment workflows at scale.
Ability to work in a hybrid onsite + remote model, adapting to operational needs across multiple campuses.
Comfort with fast-paced, high-growth operations, supporting AI, HPC, or cloud-scale network environments.
Nice-to-Haves
• AI/HPC fabric operations (RoCEv2, PFC, ECN).
• Regional/campus operations leadership experience.
• Familiarity with observability tools and basic automation (Python, Ansible).
• Experience in follow-the-sun support models.