Incident Manager
Insight Global (Detroit, MI)
Job Description
Insight Global is looking for an Incident Manager to join one of our Energy clients out of Detroit, MI. This individual’s primary responsibility will be Incident Management; secondary responsibilities include Event and Problem Management. This position requires you to be onsite at DTE HQ in Detroit at least 1 day per week; the remaining days are work from home. You will also need to be willing to be on call for incidents 1 week per month.
Responsibilities Include:
Incident Management/Major Incident Management (Primary responsibility)
Identify that an incident has occurred or is occurring and respond accordingly
Utilize KB articles, incident logs, etc. to quickly determine next steps
Engage the appropriate resources needed to contain the incident
Establish a communication bridge for resources to collaborate
Send out communications to the appropriate audience at regular intervals during the incident
Log all activities that occur during the incident
Moderate the communication bridge to keep the team focused on containment – minimizing mean time to restore (MTTR)
Escalate to management as needed
Event Management (secondary responsibility)
Use EM tools to improve detection and response times to incidents
Reduce downtime by proactively detecting performance anomalies before they become a widespread system-down incident
Recognize the need for additional alarms or modified alarm thresholds based on past incidents
Problem Management (secondary responsibility)
Detect that a problem exists – i.e. repeat incidents of the same type
Log the problem and assemble a team to work the problem
Log a known error and any workaround
Facilitate the technical team as they resolve the problem
Document the problem resolution and ensure any KB articles are updated
Maintain key performance metrics
MTTR (mean time to restore)
MTTK (mean time to know)
Incidents with defective or non-existent alarms
Mean Time to Detect/Know
Unplanned Outage count and duration
Planned Outage count and duration
Incident counts by Portfolio
Exact compensation may vary based on several factors, including skills, experience, and education.
Employees in this role will enjoy a comprehensive benefits package starting on day one of employment, including options for medical, dental, and vision insurance. Eligibility to enroll in the 401(k) retirement plan begins after 90 days of employment. Additionally, employees in this role will have access to paid sick leave and other paid time off benefits as required under the applicable law of the worksite location.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to [email protected]. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Skills and Requirements
• Bachelor’s Degree in IT, Computer Science, or related field
• 5-10 years of experience as an Incident Manager or similar role
o Hands-on experience managing major outages and coordinating cross-functional teams
• Strong communication & coordination skills
o Ability to moderate bridges and keep teams focused during high-pressure incidents.
o Strong written and verbal communication for status updates and stakeholder engagement.
• Strong understanding of IT infrastructure (servers, networks, applications, cloud environments)
• Familiarity with metrics tracking: MTTR, MTTK, outage counts, alarm effectiveness.
• Ability to interpret KB articles, incident logs, and system alerts quickly.
• ITIL Foundation certification