Incident Manager
Salary undisclosed
This position is responsible for managing service restoration calls during a Mission Critical Incident (MCI). The role facilitates and works collaboratively with Subject Matter Experts, Support Teams, and vendors to restore service as quickly as possible. This includes driving the call, determining next steps, coordinating parallel efforts, escalating internally and externally, and communicating service interruptions to stakeholders in a timely manner.
Key Responsibilities include, but are not limited to:
- Driving service restoration with command and control as quickly as possible to minimize business impact.
- Ensuring business impacts and scope of service interruption are identified.
- Engaging subject matter experts (SMEs), collecting proposed service restoration actions, determining action items, and delegating action plans to SMEs.
- Owning post-incident action items, including conducting a Post Incident Review (PIR) and working with service owners, product owners, DevOps Leads, and vendors to determine root cause and identify permanent fixes.
- Reviewing PIR key findings with stakeholders, ensuring tasks for related PRBs are addressed in a timely manner, publishing root cause, and providing Senior Leadership Team (SLT) reviews as needed.
- Ensuring that the best possible levels of service quality and availability are maintained.
- Facilitating the outage calls and ensuring that all the required resources are engaged to work on high-priority incidents.
- Ensuring that effective communication is maintained with the Executives and Business Leadership during a mission-critical incident.
- Monitoring and operating multiple IT platforms.
- Must have a solid understanding of all key ITIL disciplines and processes.
- Working towards continuous operational and process improvement while maintaining 100% compliance with quality and legal standards.
- Must have a high degree of technical knowledge to understand the environment and provide management updates when needed.
- Understanding IT impact on the business and raising alternative workarounds.
- Handling technical and service monitoring, detection, and incident handling for all technology-related incidents.
- Collaborating with other teams, customers, and vendors to improve service and increase the value of Command Center Operations.
- Reporting to Management on service interruptions, impacts, and scope.
- Supporting Mission Critical Incident Management reporting (KPIs and customer SLAs).
- Assisting the Mission Critical Incident Management Process Owners in driving Service Management best practices and ITIL process standardization.
- Assisting the Mission Critical Incident Management Process Owner in identifying and planning for Major Incident Management process improvement projects.
Qualifications:
- Highly motivated with strong leadership skills.
- 5+ years of work experience as an incident analyst/manager in a large enterprise environment, facilitating effective Mission Critical Incident (MCI) calls.
- Must have solid experience working with teams from different technical platforms and a working knowledge of automation and monitoring, including service level-based monitoring.
- Bachelor's degree in a related field preferred.
- In-depth knowledge and proven experience in troubleshooting, problem determination, root cause analysis, and rapid problem resolution.
- Must be knowledgeable in networking, open systems, and cloud computing technologies (Azure, GCP, etc.).
- Must have a thorough understanding of ITIL principles: Incident, Problem, Knowledge, Change, and Configuration Management.
- ITIL v3 or higher Foundations certification preferred.
- Previous retail experience desired.
- Must possess strong leadership abilities and be able to lead service restoration efforts across the organization.
- Good understanding of production IT environments and IT operations.
- Strong oral and written communication skills required.
- Demonstrates a high level of energy, is results-driven, and is able to work under pressure.
- Flexible work hours required.