Job description

  • Location:
    Sandton
  • Employee Type:
    Permanent
  • Department:
    ITS
  • Division:
    Central Services

Problem Manager (ITS) (13411)

Description

The Problem Manager is responsible for identifying, analysing, and eliminating the underlying causes of recurring or high-impact incidents across Investec's technology environment. The role ensures that systemic issues are addressed proactively, risk is reduced, and overall service stability is improved. Through structured root cause analysis (RCA), strong collaboration with technical teams, and clear governance, the Problem Manager helps prevent repeated incidents and drives long-term resilience.

Key Responsibilities

•    Own and govern the end-to-end Problem Management process aligned to ITIL standards.
•    Ensure all problems are logged, prioritised, analysed, and tracked through to closure.
•    Define severity criteria and ensure proper classification of problems based on business and operational risk.
•    Maintain KPIs, dashboards, and reporting for trend analysis and leadership insight.
•    Lead post-incident reviews (PIRs) for Major Incidents, ensuring structured RCA and high-quality documentation.
•    Facilitate technical deep dives with SMEs across Applications, Infrastructure, Networks, Cybersecurity, Cloud, and third-party vendors.
•    Drive the creation and implementation of corrective and preventative actions (CPAs).
•    Validate RCA outputs and ensure accuracy, completeness, and alignment across teams.
•    Analyse incident patterns, alerts, failures, and service behaviour to identify emerging risks before they impact clients.
•    Improve predictive capabilities by working closely with the Observability, Monitoring, and Command Centre teams.
•    Identify chronic services, recurring incident types, and areas requiring infrastructure or application hardening.
•    Recommend improvements to architecture, monitoring thresholds, service mapping, and operational procedures.
•    Partner with engineering, architecture, and platform teams to reduce technical debt and systemic weaknesses.
•    Ensure implementation of long-term fixes, including re-architecture, automation, performance tuning, or resilience improvements.
•    Track and report remediation progress, escalating overdue actions.
•    Provide risk-based recommendations to leadership for investment and prioritisation.
•    Produce monthly, quarterly, and incident-linked Problem Management reports for technology and business stakeholders.
•    Present RCA findings, service stability concerns, and risk insights to senior leaders.
•    Communicate problem statuses, trends, and remediation risks across all relevant teams.
•    Ensure transparency and shared accountability across domains.
•    Engage external suppliers for joint RCAs where vendor systems contributed to the incident.
•    Ensure vendors deliver detailed RCA documentation and complete contractual remediation commitments.
•    Participate in service reviews to address chronic vendor-related failures.
•    Maintain a central repository of problem records, RCAs, patterns, and lessons learned.
•    Create knowledge articles and improvement recommendations to prevent repeat issues.
•    Align insights with Change Enablement to reduce change-induced incidents.

Qualifications, Experience and Skills

•    5–7 years of IT experience, including 3+ years in Problem Management or Incident/Service Management within a complex enterprise environment.
•    Bachelor's degree in Information Technology, Computer Science, or Information Systems, or an IT Diploma/NQF 6, or relevant experience in the field.
•    Strong understanding of ITIL, particularly Incident, Problem, and Change.
•    Proven analytical capability with deep experience in RCA methodologies.
•    Strong communication and documentation skills with the ability to present to senior stakeholders.
•    ITIL Intermediate/Expert certification.
•    Experience in financial services or other regulated industries.
•    Microsoft Certified: Azure Fundamentals (AZ-900).
•    Knowledge of cloud platforms (Azure, AWS), microservices, and distributed systems.
•    Understanding of AIOps, automation, and resilience engineering practices.
•    Experience using ITSM and observability platforms such as ServiceNow, Dynatrace, LogicMonitor, or equivalent.
•    Ability to lead technical discussions and influence across cross‑functional teams.

Investec Culture 

At Investec we seek creative, talented people with passion, energy and stamina, who collaborate unselfishly.

We are committed to diversity and inclusion when recruiting internally and externally. 

Location
Sandton
100 Grayston Drive, Investec Bank Ltd, Johannesburg, South Africa, 2196

Meet the recruiter

Kiara Jenna Hendricks

Benefits

Pension
Private Medical Cover
Virtual GP
Gym Discounts
Psychologist Service
Annual Leave
Life Assurance