718 Helpdesk Management jobs in Singapore

Incident Management Specialist

Singapore, Singapore beBeeIncidentManagement

Posted today

Job Description

Job Title: Incident Manager

Job Description: We are seeking a highly skilled Incident Manager to join our team. The successful candidate will be responsible for managing technology incidents that impact our business operations.

Key Responsibilities:

  1. Manage technology incidents that impact our business operations
  2. Work with relevant business and technology groups to comply with incident and problem management processes and procedures
  3. End-to-end ownership of major incidents to minimize downtime
  4. Establish strong command and control of an incident, establishing clear accountability and precise evaluation of complex issue scenarios
  5. Participate in all incident resolution calls to facilitate incident determination, recovery, and resolution
  6. Timely incident recognition, logging, assignment, and resolution with proper documentation
  7. Incident progression coordination and monitoring of incidents and potential areas through symptoms, trends, or deviations from standards
  8. Escalation of critical and unresolved incidents to appropriate levels of management
  9. Ensure accurate capture and documentation of incident data in the incident reporting tool
  10. Post-incident activity to ensure highest levels of service quality and improve service levels through identification of problem trends and causes
  11. Ability to communicate well and manage highly stressful situations during an incident

Required Skills and Qualifications:

  • Bachelor's degree in Business, Computer Science, or related discipline
  • ITIL certification
  • 8-10 years of experience managing complex IT initiatives in a matrix environment, or operational line manager experience
  • Excellent English communication skills (written and oral)
  • Experience in application support, with knowledge of EOD batch processing, infrastructure (storage, network, Unix/Linux), and web/application/middleware services; familiarity with payments flow is good to have

Benefits:

  • Competitive salary and benefits package
  • Opportunity to work with a dynamic team
  • Professional development and growth opportunities

Others:

  • Good knowledge of Macro, Excel, PowerPoint, ticketing tools, and data analysis
  • Ability to work in a fast-paced environment and prioritize tasks effectively
  • Strong analytical and problem-solving skills

Incident Management Analyst

069534 $6800 Monthly TANGSPAC CONSULTING PTE LTD

Posted 10 days ago

Job Description

GENERAL DESCRIPTION
The Incident and Service Level Analyst is the primary IT resource to monitor incidents, track problems and ensure SLAs are correctly in place and followed. The Incident Analyst will work closely with counterparts from other locations.

KEY FEATURES OF THE POSITION

  • Ensure major incidents are resolved in the shortest period of time.
  • Responsible for incident, problem and service level management.
  • Initiate and coordinate incident solving/resolution activities.
  • Perform incident review and make recommendations for improvement.
  • Part of the Major Incident Management Team (MIM).
  • Take joint responsibility in the governance of the end-to-end Incident and Problem Management process with cross-technology teams, ensuring all KPIs are met.
  • Facilitate post-mortem and RCA tasks for high-priority incidents.
  • Produce comprehensive incident and problem reports to all required audiences.
  • Co-own Problem Management activities for all managed incidents.
  • Identify individual and at-scale emerging problems and escalate issues into the Problem Management queue.
  • Conduct Root Cause Analysis for all escalated incidents and Problem Management tickets.
  • Triage high-priority/Major Incidents, work with Support teams for resolution and perform escalations of notable incidents.
  • Produce management information, including KPIs and reports.
  • Follow up, analyse and track Incidents and SLA breaches.
  • Drive and monitor the effectiveness of the incident, problem and service level management processes.
  • Perform incident trend analysis and propose recommendations to improve incident trends.

Job Requirements:

Personal and Social

  • Ability to work independently or as part of a team.
  • Ability to drive both teams and individuals.
  • Conscientious in ensuring defined SLAs are met.
  • Ability to interact and coordinate with various IT teams within a financial institution.
  • Good communication and organization skills.
  • Ability to analyse problems, troubleshoot, and provide short-term and long-term solutions.
  • Experienced in problem-solving and troubleshooting in an IT environment.
  • Ability to work under fire; handle stressful situations in a calm manner.

Professional and Technical

  • At least 4 years of experience working in a bank or financial institution.
  • Hands on experience in managing Service, Change and Incident management.
  • Experience in managing End user Support, Applications and Infrastructure services.
  • Possess good people management skills across all levels with the ability to manage multiple support pillars to identify the root cause of an incident.
  • Ability to prioritise and multitask when managing incidents with multiple layers. Candidate should have in-depth knowledge of regulatory TRM guidelines and of ITIL concepts.
  • Candidates with ServiceNow tool experience preferred but not mandatory.

Interested candidates please email your latest resume to


IT Incident Management Analyst (Shift)

Singapore, Singapore Assurity Trusted Solutions Pte Ltd

Posted today

Job Description

Assurity Trusted Solutions (ATS) is a wholly owned subsidiary of the Government Technology Agency (GovTech). As a Trusted Partner over the last decade, ATS offers a comprehensive suite of products and services ranging from infrastructure and operational services, authentication services, governance and assurance services as well as managed processes. In a dynamic digital and cyber landscape, where trust & collaboration are key, ATS continues to drive mutually beneficial business outcomes through collaboration with GovTech, government agencies and commercial partners to mitigate cyber risks and bolster security postures.
We are looking for an IT Incident Management Analyst to join us!
A brief summary of your job responsibilities:

  • Perform 24/7 threats and events monitoring for various domains and notify relevant stakeholders if needed
  • Support operation and emergency planning and preparedness with relevant authorities
  • Conduct fact finding on incident occurrence, impact assessment and severity rating, and triage with relevant agencies if needed
  • Correlate information from multiple sources to detect any anomaly of incident reporting and response
  • Monitor and log the incident fact finding and investigation, and the status progress
  • Inform relevant internal and external stakeholders, manage timely investigation and updates, and escalate investigation and post-incident reporting if needed
  • Prepare investigation reports and periodic updates with relevant stakeholders
Requirements
  • Preferably 2-5 years of relevant experience in incident management, investigation and report writing
  • Preferably familiar with IT/Info/Data/Cyber security or ICT incident management best practices
  • Able to work independently and contribute as a team player
  • Possess good communication and interpersonal skills
Join us and discover a meaningful and exciting career with Assurity Trusted Solutions!
The remuneration package will be commensurate with your qualifications and experience. Interested applicants, please click "Apply Now".
We thank you for your interest and please note that only shortlisted candidates will be notified.
By submitting your application, you agree that your personal data may be collected, used and disclosed by Assurity Trusted Solutions Pte. Ltd. (ATS), GovTech and their service providers and agents in accordance with ATS's privacy statement which can be found at: or such other successor site.
Benefits
  • A wholly-owned subsidiary of GovTech
  • We promote a learning culture and encourage you to grow and learn


Assistant Director - Crisis & Incident Management

$13000 Monthly MORGAN MCKINLEY PTE. LTD.

Posted 10 days ago

Job Description

Roles and Responsibilities:

  • Lead a team of Major Incident Managers, Problem Managers and Change Managers.
  • Lead and oversee major incidents (severity 1 & 2) for all IT systems to ensure timely recovery of services.
  • Ensure closure of incident & problem tickets, meeting the agreed SLAs.
  • Manage major incident communications, such as activating the War Room for triage and the Conference Bridge, sending incident broadcast communications to all Synapxe stakeholders, and providing regular updates via the incident tracking dashboard until incident closure.
  • Collaborate across multiple internal teams to restore services when incidents occur, and gather the required experts to perform root cause analysis for problem resolution.
  • Oversee the Problem Management process to produce reports on Root Cause Analysis, SLA measurement and/or performance of incident & problem management and present to Synapxe management when required.
  • Oversee the Change Management process and ensure flawless execution of the process.
  • Drive IT IPC contract/service provider performance for IT Operations, ensuring Service Level Agreements are fulfilled and improvement plans are established.
  • Drive the Major Change Review meetings to deconflict major changes.
  • Work closely with the monitoring teams to ensure potential major incidents are arrested at an early stage prior to impact.
  • Work closely with the Disaster Recovery team during yearly exercises and any real time invocation of DR services.
  • Maintain processes, templates and SOP, website and support information related to Incident, Problem & Change Management and manage relevant ad-hoc duties.

Requirements / Qualifications

  • B.S. in Computer Science or related diploma/degree with min 10 years’ experience
  • Familiarity with ITIL framework & methodologies.
  • Demonstrable experience and capability to interact with senior management.
  • Welcome new challenges, understand the sense of urgency and be able to manage different priorities.
  • Uses best practices and knowledge of internal or external business issues to improve products or services
  • Exercises judgment within defined procedures, practices and policies to obtain solutions
  • Experience working in an infrastructure technology environment highly desirable


---

Morgan McKinley Pte Ltd

May Thinzar Khine

EA Licence No: 11C5502

EA Registration No: R22110157


Officer, Information Security Incident Management Analyst, Global Information Security

Singapore, Singapore MERRILL LYNCH GLOBAL SERVICES PTE. LTD.

Posted today

Job Description

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day.
Being a Great Place to Work is core to how we drive Responsible Growth. This includes our commitment to being an inclusive workplace, attracting and developing exceptional talent, supporting our teammates’ physical, emotional, and financial wellness, recognizing and rewarding performance, and how we make an impact in the communities we serve.
Bank of America is committed to an in-office culture with specific requirements for office-based attendance and which allows for an appropriate level of flexibility for our teammates and businesses based on role-specific considerations.
At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!

Job Description:

As a Junior Cyber Security Analyst, you will play a crucial role in ensuring the security and integrity of our organization's digital assets.

Collaborating with a dynamic team, you will intake cybersecurity related requests from internal and external entities that require triage, remediation or escalation.

This entry level position provides an opportunity for learning and growth in the ever-evolving field of cybersecurity.

If you are passionate about safeguarding digital environments and eager to build a career in cybersecurity, we invite you to join our team.

Responsibilities:

  • Answer incoming calls to assist with information security inquiries or issues.
  • Document all interactions accurately for record-keeping and analysis.
  • Follow established processes and guidelines to handle common issues.
  • Collaborate with team members to resolve more complex problems or escalate as necessary.
  • Monitor your assigned queues and work cases in an efficient and effective manner

Skills:

  • Effective typing skills
  • Familiarity with Cyber Security and Information Technology
  • Strong problem-solving and critical thinking skills
  • Effective communication and interpersonal skills

Manager, Incident Response & Management

Singapore, Singapore Monograph

Posted 2 days ago

Job Description

Who we are
About Stripe

Stripe is a financial infrastructure platform for businesses. Millions of companies—from the world’s largest enterprises to the most ambitious startups—use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone’s reach while doing the most important work of your career.

About the team

The Incident Response team is a global 24/7 team responsible for driving incident response and management from detection to resolution. Stripe is proud of its five 9s API reliability and this team is at the forefront of ensuring we keep it that way - working hand-in-hand with Reliability Eng and across the Tech Org. This team of incident response managers (IRM) is defined by our sense of ownership and how we drive incidents to resolution - marshaling the necessary cross-functional resources to respond to and resolve service outages, critical bugs, security attacks and anything that significantly impacts the users of our products. The team is user-first and ensures appropriate external communications from Stripe and senior management to keep our users informed of disruption to their experience of Stripe. The team is highly skilled in incident troubleshooting, program management, incident classifications, incident communications, incident escalation and technical adeptness as incidents can arise from anywhere and cut across products and orgs in Stripe.

What you’ll do

This position entails leading and optimizing Stripe's incident management processes and automation, ensuring efficiency and adherence to stringent incident response metrics. As the head of the incident response team, you will establish and maintain a best-in-class incident response framework, upholding the reliability standards expected of Stripe. Responsibilities include but are not limited to incident classification, escalation, and notification management, along with accountability for key incident response metrics (TTx). You will generate actionable insights to drive continuous improvement, collaborating with engineering leadership to refine incident detection, response, user communication, and tooling efficacy. Leadership and development of a highly effective 24/7 global incident response management team, characterized by urgency, programmatic ownership of incidents and communications, and the capacity to engage engineering teams, are crucial. Additionally, you will manage incident communications across multiple channels for executive and end-user audiences, and identify automation opportunities to streamline incident response workflows, thereby safeguarding users and minimizing disruption to their operations.

Responsibilities
  • Lead the global 24/7 team of regional managers and incident response managers with ability to be hands-on and support frontline on-call with speed, cross-functional collaboration and escalation
  • Develop and own Stripe's incident response and management strategy and cross-functional roadmap, ensuring it aligns with the company's reputation for reliability.
  • Spearhead and manage Stripe's AI-First strategy for automation of incident response workflows, partnering with the engineering team to implement required tooling enhancements.
  • Enhance Stripe's incident response by leading and implementing improvements derived from analyzing user-facing incidents and extracting actionable insights and learnings.
  • Collaborate closely with executive leadership, engineering, and operations teams to lead significant programs and reshape workflows and metrics concerning reliability and incident operations.
  • Manage relevant TTx metrics, particularly those related to communication and escalation. Collaborate with engineering leadership to implement necessary improvements for each metric.
  • Develop user-focused metrics and data to guide Stripe's incident response, reliability strategy, and user communications (including RCAs), ensuring impactful decision-making.
Who you are

We’re looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.

Minimum requirements
  • 5+ years of management experience, including 2+ years of experience managing managers with a proven record in building, growing and transforming teams.
  • Extensive experience (4+ years) leading incident response for complex, large-scale distributed services with high SLOs/SLAs, coupled with deep expertise in crisis management.
  • Demonstrated ability to lead, influence other leaders and deliver complex strategic projects involving multiple stakeholders
  • Strong analytical skills, and the ability to use data to drive business decisions
  • Possesses proficiency in basic incident troubleshooting and a reasonable understanding of system architecture. Fluent in using SQL, Splunk, or similar query languages.
  • Exceptional communication abilities, capable of adapting incident updates for diverse audiences (executives, external users, internal teams).
  • Affinity for a fast paced work environment, crafting strategic and rapid fixes to high intensity problems with a keen eye for detail and a high bar for quality
  • Comfort navigating ambiguity, while identifying areas for process improvement and establishing best practices
Preferred qualifications
  • Experience managing geographically dispersed teams
  • Experience using infrastructure and application monitoring tools such as Prometheus, Sentry and others
  • Experience in incident response at a high-growth technology company, preferably within the payments or e-commerce sectors.
  • Proven ability to apply Agentic and Generative AI to revolutionize incident response, coupled with a strong grasp of current industry trends in the incident response domain.
  • Demonstrated history of driving engineering and process enhancements to improve incident response efficiency within a rapidly expanding technology organization.

Office-assigned Stripes spend at least 50% of the time in a given month in their local office or with users. This hits a balance between bringing people together for in-person collaboration and learning from each other, while supporting flexibility about how to do this in a way that makes sense for individuals and their teams.

The annual salary range for this role in the primary location is S$208,000 - S$312,000. This range may change if you are hired in another location. For sales roles, the range provided is the role’s On Target Earnings (“OTE”) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role. This salary range may be inclusive of several career levels at Stripe and will be narrowed during the interview process based on a number of factors, including the candidate’s experience, qualifications, and specific location. Applicants interested in this role and who are not located in the primary location may request the annual salary range for their location during the interview process.

Specific benefits and details about what compensation is included in the salary range listed above will vary depending on the applicant’s location and can be discussed in more detail during the interview process. Benefits/additional compensation for this role may include: equity, company bonus or sales commissions/bonuses; retirement plans; health benefits; and wellness stipends.

Office locations

Singapore

Team

Infrastructure & Corporate Tech

Job type

Full time


Manager, Incident Response & Management

Singapore, Singapore STRIPE PAYMENTS SINGAPORE PTE. LTD.

Posted 7 days ago

Job Description

Who we are
About Stripe

Stripe is a financial infrastructure platform for businesses. Millions of companies—from the world’s largest enterprises to the most ambitious startups—use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone’s reach while doing the most important work of your career.

About the team

The Incident Ops team is a global 24/7 team responsible for driving incident response and management of incidents from detection to resolution. Stripe is proud of its five 9s reliability and this team is at the forefront of ensuring we keep it that way - working hand-in-hand with Reliability Eng and across the Tech Org. This team of incident response managers (IRM) is defined by our sense of ownership and how we drive incidents to resolution - marshaling the necessary cross-functional resources to respond to and resolve service outages, critical bugs, security attacks and anything that significantly impacts the users of our products. The team is user-first and ensures appropriate external communications from Stripe and senior management to keep our users informed of disruption to their experience of Stripe. The team is skilled in program management, communications, incident handling and technical adeptness as incidents can arise from anywhere and cut across products and orgs in Stripe.

What you’ll do

As the Manager of Incident Response Managers, you’ll evolve a world class incident response team in APAC to maintain a high bar of reliability expected of Stripe and by Stripe’s users. You’ll work hand-in-hand with regional IRM teams in AMER and EMEA to ensure solid 24/7 coverage for how we detect, respond to incidents, communicate to users, improve related tooling and measure impact. You will lead and nurture a high-performing IRM team based in APAC who has a strong sense of urgency, focused on identifying incident impact, rapidly assembling incident responders, driving incident communications, and mitigating impact as quickly as possible. As a result, you’ll be seen as the protector of our users - in minimizing the impact of incidents on their business and ensuring that Stripe is always thinking of our users.

Responsibilities
  • Manage a team of frontline incident response managers
  • Provide coaching and development to each team member
  • Coordinate and manage incident resolution with speed, cross-functional collaboration, and accuracy, with a global and broad set of stakeholders.
  • Facilitate post incident reviews to identify technical or process problems which need to be remediated
  • Contribute to incident root cause analysis, identifying remediation opportunities for Incident Operations and for partner teams in operations and engineering to execute upon.
  • Formulate strategy and deliver on communications to both internal stakeholders and Stripe’s users.
  • Collaborate with engineering and operations teams to align on and execute upon on-going improvements to processes, tooling, metrics, and the Incident Management framework.
  • Influence and make decisions through interpretation of data and consolidation of input from multiple stakeholders.
Who you are

We’re looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.

Minimum requirements
  • Have 5+ years of direct people management experience and are an excellent coach
  • Have 3+ years of experience within a Major Incident Management team
  • Demonstrated employee and team development
  • Enjoy a fast paced work environment, crafting strategic and rapid fixes to high intensity problems with a keen eye for detail and a high bar for quality
  • Comfortable navigating ambiguity, while identifying areas for process improvement and establishing best practices
  • Strong written and verbal communication skills, able to deliver effective messaging to all levels of a technical organization
  • Can problem solve and translate complicated technical issues into solutions, while keeping a users-first mindset
  • Have an ability to execute on and deliver complex operational projects involving multiple stakeholders especially in partnering with engineering
Preferred qualifications
  • Have a technical background, are proficient in SQL, Splunk, or equivalent query languages, and have the ability to use data to drive business decisions based on analytical research
  • Experience using infrastructure and application monitoring tools such as Signalfx, Prometheus, Sentry, Grafana and others
  • Experience at a high-growth technology company, especially within the payments or e-commerce space, and in particular in incident response
  • Experience working with both cloud and third-party solution providers
  • Experience with managing user-facing communications strategy during sensitive situations such as outages

Hybrid work at Stripe
Office-assigned Stripes spend at least 50% of the time in a given month in their local office or with users. This hits a balance between bringing people together for in-person collaboration and learning from each other, while supporting flexibility about how to do this in a way that makes sense for individuals and their teams.


Manager, Incident Response & Management

Singapore, Singapore Refine Group

Posted 11 days ago

Job Description

Who we are
About Stripe

Stripe is a financial infrastructure platform for businesses. Millions of companies—from the world’s largest enterprises to the most ambitious startups—use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone’s reach while doing the most important work of your career.

About the team

The Incident Response team is a global 24/7 team responsible for driving incident response and management from detection to resolution. Stripe is proud of its five 9s API reliability and this team is at the forefront of ensuring we keep it that way - working hand-in-hand with Reliability Eng and across the Tech Org. This team of incident response managers (IRM) is defined by our sense of ownership and how we drive incidents to resolution - marshaling the necessary cross-functional resources to respond to and resolve service outages, critical bugs, security attacks and anything that significantly impacts the users of our products. The team is user-first and ensures appropriate external communications from Stripe and senior management to keep our users informed of disruption to their experience of Stripe. The team is highly skilled in incident troubleshooting, program management, incident classifications, incident communications, incident escalation and technical adeptness as incidents can arise from anywhere and cut across products and orgs in Stripe.

What you’ll do

This position entails leading and optimizing Stripe's incident management processes and automation, ensuring efficiency and adherence to stringent incident response metrics. As the head of the incident response team, you will establish and maintain a best-in-class incident response framework, upholding the reliability standards expected of Stripe. Responsibilities include but are not limited to incident classification, escalation, and notification management, along with accountability for key incident response metrics (TTx). You will generate actionable insights to drive continuous improvement, collaborating with engineering leadership to refine incident detection, response, user communication, and tooling efficacy. Leadership and development of a highly effective 24/7 global incident response management team, characterized by urgency, programmatic ownership of incidents and communications, and the capacity to engage engineering teams, are crucial. Additionally, you will manage incident communications across multiple channels for executive and end-user audiences, and identify automation opportunities to streamline incident response workflows, thereby safeguarding users and minimizing disruption to their operations.

Responsibilities
  • Lead the global 24/7 team of regional managers and incident response managers, with the ability to be hands-on and support frontline on-call with speed, cross-functional collaboration, and escalation.
  • Develop and own Stripe's incident response and management strategy and cross-functional roadmap, ensuring it aligns with the company's reputation for reliability.
  • Spearhead and manage Stripe's AI-First strategy for automation of incident response workflows, partnering with the engineering team to implement required tooling enhancements.
  • Enhance Stripe's incident response by leading and implementing improvements derived from analyzing user-facing incidents and extracting actionable insights and learnings.
  • Collaborate closely with executive leadership, engineering, and operations teams to lead significant programs and reshape workflows and metrics concerning reliability and incident operations.
  • Manage relevant TTx metrics, particularly those related to communication and escalation. Collaborate with engineering leadership to implement necessary improvements for each metric.
  • Develop user-focused metrics and data to guide Stripe's incident response, reliability strategy, and user communications (including RCAs), ensuring impactful decision-making.

Who you are

We’re looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.

Minimum requirements
  • 5+ years of management experience, including 2+ years of experience managing managers with a proven record in building, growing and transforming teams.
  • Extensive experience (4+ years) leading incident response for complex, large-scale distributed services with high SLOs/SLAs, coupled with deep expertise in crisis management.
  • Demonstrated ability to lead, influence other leaders and deliver complex strategic projects involving multiple stakeholders
  • Strong analytical skills, and the ability to use data to drive business decisions
  • Possesses proficiency in basic incident troubleshooting and a reasonable understanding of system architecture. Fluent in using SQL, Splunk, or similar query languages.
  • Exceptional communication abilities, capable of adapting incident updates for diverse audiences (executives, external users, internal teams).
  • Affinity for a fast paced work environment, crafting strategic and rapid fixes to high intensity problems with a keen eye for detail and a high bar for quality
  • Comfort navigating ambiguity, while identifying areas for process improvement and establishing best practices

Preferred qualifications
  • Experience managing geographically dispersed teams
  • Experience using infrastructure and application monitoring tools such as Prometheus, Sentry and others
  • Experience in incident response at a high-growth technology company, preferably within the payments or e-commerce sectors.
  • Proven ability to apply Agentic and Generative AI to revolutionize incident response, coupled with a strong grasp of current industry trends in the incident response domain.
  • Demonstrated history of driving engineering and process enhancements to improve incident response efficiency within a rapidly expanding technology organization.

Manager, Incident Response & Management

Singapore, Singapore Stripe

Posted 11 days ago

Job Description

Stripe is a financial infrastructure platform for businesses. Millions of companies—from the world’s largest enterprises to the most ambitious startups—use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone’s reach while doing the most important work of your career.

About the team

The Incident Response team is a global 24/7 team responsible for driving incident response and management from detection to resolution. Stripe is proud of its five 9s API reliability and this team is at the forefront of ensuring we keep it that way, working hand-in-hand with Reliability Eng and across the Tech Org. This team of incident response managers (IRM) is defined by our sense of ownership and how we drive incidents to resolution: marshaling the necessary cross-functional resources to respond to and resolve service outages, critical bugs, security attacks and anything that significantly impacts the users of our products. The team is user-first and ensures appropriate external communications from Stripe and senior management to keep our users informed of disruption to their experience of Stripe. The team is technically adept and highly skilled in incident troubleshooting, program management, incident classification, incident communications, and incident escalation, as incidents can arise from anywhere and cut across products and orgs in Stripe.

What you’ll do

This position entails leading and optimizing Stripe's incident management processes and automation, ensuring efficiency and adherence to stringent incident response metrics. As the head of the incident response team, you will establish and maintain a best-in-class incident response framework, upholding the reliability standards expected of Stripe. Responsibilities include but are not limited to incident classification, escalation, and notification management, along with accountability for key incident response metrics (TTx). You will generate actionable insights to drive continuous improvement, collaborating with engineering leadership to refine incident detection, response, user communication, and tooling efficacy. Leadership and development of a highly effective 24/7 global incident response management team, characterized by urgency, programmatic ownership of incidents and communications, and the capacity to engage engineering teams, are crucial. Additionally, you will manage incident communications across multiple channels for executive and end-user audiences, and identify automation opportunities to streamline incident response workflows, thereby safeguarding users and minimizing disruption to their operations.

Responsibilities
  • Lead the global 24/7 team of regional managers and incident response managers, with the ability to be hands-on and support frontline on-call with speed, cross-functional collaboration, and escalation.
  • Develop and own Stripe's incident response and management strategy and cross-functional roadmap, ensuring it aligns with the company's reputation for reliability.
  • Spearhead and manage Stripe's AI-First strategy for automation of incident response workflows, partnering with the engineering team to implement required tooling enhancements.
  • Enhance Stripe's incident response by leading and implementing improvements derived from analyzing user-facing incidents and extracting actionable insights and learnings.
  • Collaborate closely with executive leadership, engineering, and operations teams to lead significant programs and reshape workflows and metrics concerning reliability and incident operations.
  • Manage relevant TTx metrics, particularly those related to communication and escalation. Collaborate with engineering leadership to implement necessary improvements for each metric.
  • Develop user-focused metrics and data to guide Stripe's incident response, reliability strategy, and user communications (including RCAs), ensuring impactful decision-making.

Who you are

We’re looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.

Minimum requirements
  • 5+ years of management experience, including 2+ years of experience managing managers with a proven record in building, growing and transforming teams.
  • Extensive experience (4+ years) leading incident response for complex, large-scale distributed services with high SLOs/SLAs, coupled with deep expertise in crisis management.
  • Demonstrated ability to lead, influence other leaders and deliver complex strategic projects involving multiple stakeholders
  • Strong analytical skills, and the ability to use data to drive business decisions
  • Possesses proficiency in basic incident troubleshooting and a reasonable understanding of system architecture. Fluent in using SQL, Splunk, or similar query languages.
  • Exceptional communication abilities, capable of adapting incident updates for diverse audiences (executives, external users, internal teams).
  • Affinity for a fast paced work environment, crafting strategic and rapid fixes to high intensity problems with a keen eye for detail and a high bar for quality
  • Comfort navigating ambiguity, while identifying areas for process improvement and establishing best practices

Preferred qualifications
  • Experience managing geographically dispersed teams
  • Experience using infrastructure and application monitoring tools such as Prometheus, Sentry and others
  • Experience in incident response at a high-growth technology company, preferably within the payments or e-commerce sectors.
  • Proven ability to apply Agentic and Generative AI to revolutionize incident response, coupled with a strong grasp of current industry trends in the incident response domain.
  • Demonstrated history of driving engineering and process enhancements to improve incident response efficiency within a rapidly expanding technology organization.

Office-assigned Stripes spend at least 50% of the time in a given month in their local office or with users. This strikes a balance between bringing people together for in-person collaboration and learning from each other, while supporting flexibility about how to do this in a way that makes sense for individuals and their teams.

The annual salary range for this role in the primary location is S$208,000 - S$312,000. This range may change if you are hired in another location. For sales roles, the range provided is the role’s On Target Earnings (“OTE”) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role. This salary range may be inclusive of several career levels at Stripe and will be narrowed during the interview process based on a number of factors, including the candidate’s experience, qualifications, and specific location. Applicants interested in this role and who are not located in the primary location may request the annual salary range for their location during the interview process.

Specific benefits and details about what compensation is included in the salary range listed above will vary depending on the applicant’s location and can be discussed in more detail during the interview process. Benefits/additional compensation for this role may include: equity, company bonus or sales commissions/bonuses; retirement plans; health benefits; and wellness stipends.

At Stripe, we're looking for people with passion, grit, and integrity. You're encouraged to apply even if your experience doesn't precisely match the job description. Your skills and passion will stand out—and set you apart—especially if your career has taken some extraordinary twists and turns. At Stripe, we welcome diverse perspectives and people who think rigorously and aren't afraid to challenge assumptions. Join us.
