Job description

  • Location:
    Sandton
  • Employee Type:
    Permanent
  • Department:
    ITS
  • Division:
    Central Services

Incident Manager (ITS) (13409)

Description

The Incident Manager is accountable for the end-to-end management of production incidents across Investec's technology landscape, driving rapid restoration of service, minimizing business impact, and enabling continuous improvement. The role orchestrates cross-functional responders, ensures clear stakeholder communication, and embeds best-practice processes that enhance operational resilience and client experience.

Key Responsibilities

•    Lead the lifecycle of Major Incidents (MI), from detection and triage through to resolution and closure.
•    Act as the single point of command during MIs, coordinating SMEs across Infrastructure, Networks, Applications, Cybersecurity, and third parties.
•    Ensure swift decision-making, clear ownership, and adherence to escalation paths and SLA/SLO targets.
•    Validate incident priority/severity, business impact, and restoration strategies; authorize workarounds where appropriate.
•    Partner with the Command Centre and Observability teams to refine alerting thresholds, correlation rules, and event noise reduction (e.g., Dynatrace, ServiceNow Event Management).
•    Maintain a tight feedback loop between monitoring signals and incident workflows to improve MTTA/MTTR and reduce false positives.
•    Champion automation for incident enrichment (runbooks, CMDB context, topology) to accelerate diagnosis and response.
•    Provide timely, audience-appropriate communications during incidents, including executive summaries, business impact statements, and recovery ETAs.
•    Coordinate client communications where applicable, ensuring accuracy and consistency across channels.
•    Run structured post-incident briefings for business and technology stakeholders.
•    Own and evolve the Incident Management process aligned to ITIL and Investec governance standards.
•    Ensure high-quality incident records (root cause hypotheses, recovery actions, learnings) and enforce closure criteria and documentation completeness.
•    Drive trend analysis and feed systemic insights into Problem Management, Change Enablement, Capacity, and Resilience programmes.
•    Define, track, and report metrics (e.g., MTTA, MTTR, MTTD, incident volume by category, repeat offenders, severity distribution) and recommend targeted improvements.
•    Maintain on-call rosters, escalation matrices, and runbooks; run scenario drills and game days for high-value services.
•    Collaborate with Architecture, Cyber, and Platform Owners to validate failover strategies, DR readiness, and performance under stress.
•    Ensure alignment with Business Continuity and Disaster Recovery (BC/DR) requirements, including post-exercise remediation.
•    Manage third-party engagement during incidents, ensuring contractual SLAs are met and that partners contribute to root cause and prevention actions.
•    Lead supplier service reviews for chronic incidents and performance concerns.
•    Curate knowledge articles, standard operating procedures, and quick-win workarounds to accelerate future restorations.
•    Ensure accurate CMDB/service mapping linkages to critical business services for impact assessment and prioritization.
•    Where applicable, lead a cohort of Incident Response Analysts and Major Incident Leads, setting objectives and KPIs and coaching for performance.
•    Foster a culture of calm under pressure, accountability, and continuous learning.
•    Align team capacity across shifts and on-call rotations to ensure 24×7 response coverage.

Qualifications, Experience and Skills

•    5–8 years of IT experience, including 3+ years leading Major Incidents or Incident Response in a complex enterprise environment.
•    Bachelor's degree in Information Technology, Computer Science, or Information Systems; an IT Diploma (NQF 6+); or relevant experience.
•    Strong knowledge of ITIL Incident Management and its integration with Change, Problem, and Service Level Management.
•    Hands‑on experience with observability and ITSM platforms (Dynatrace, ServiceNow ITSM/ITOM/Event Management).
•    Excellent facilitation and communication skills: clear, calm, and structured under pressure.
•    Demonstrated ability to analyse incident patterns and improve MTTA/MTTR.
•    Ability to collaborate with cross‑functional technical teams and influence outcomes.
•    ITIL certification (Foundation or higher).
•    Microsoft Certified: Azure Fundamentals (AZ-900).
•    Experience in financial services or other regulated industries.
•    Exposure to SRE practices (SLIs/SLOs, error budgets), automation, and AIOps.
•    Knowledge of BC/DR, high availability patterns, and resilience testing.

Investec Culture 

At Investec we seek creative, talented people with passion, energy and stamina, who collaborate unselfishly.

We are committed to diversity and inclusion when recruiting internally and externally.


 
Location
Sandton
100 Grayston Drive, Investec Bank Ltd, Johannesburg, South Africa, 2196

Meet the recruiter

Kiara Jenna Hendricks


Benefits

Pension
Private Medical Cover
Virtual GP
Gym Discounts
Psychologist Service
Annual Leave
Life Assurance