1,217 Software Reliability jobs in Singapore

Senior Software Reliability Engineer

Singapore, Singapore beBeeDevSecops

Posted today

Job Description

Job Overview

We are seeking a seasoned professional to enhance our software development lifecycle through automation, security integration, and infrastructure reliability.

In this role, you will collaborate with cross-functional teams to streamline CI/CD workflows, embed security controls into every stage of the software delivery process, and ensure compliance, especially for solutions tailored to the public sector.

  • Design and manage automated CI/CD pipelines that include security, quality assurance, and compliance gates throughout the delivery process.
  • Automate the provisioning and configuration of infrastructure and applications using modern tools and practices.
  • Integrate code analysis tools such as SAST, DAST, and SCA into the development pipeline to identify and remediate vulnerabilities early.
  • Monitor and fine-tune deployment processes, system performance, and security posture across environments.
  • Promote secure coding standards and support the adoption of policies to improve code integrity and system resilience.
  • Work closely with developers, QA, and system admins to ensure efficient and secure software deployment.
  • Operate and support container platforms and cloud-native technologies (e.g., Docker, Kubernetes, public cloud services).
  • Maintain technical documentation and continuously improve DevSecOps toolchains and practices.
Requirements
  • Min. 2-3 years in DevOps or DevSecOps roles, with a strong focus on automation and security integration.
  • Proficiency with CI/CD tools (e.g., GitLab CI/CD, Jenkins) and security tools such as Sonatype or AquaSec.
  • Solid hands-on experience with container technologies and orchestration platforms (e.g., Docker, Kubernetes).
  • Familiarity with cloud environments (AWS or Azure) and infrastructure automation using tools like Terraform or Ansible.
  • Strong understanding of DevSecOps principles and best practices for secure software delivery.
  • Experience working in Agile or Scrum-based development teams.
Technical Skills
  • Kubernetes
  • Azure
  • Pipelines
  • IT Infrastructure Design
  • Scripting
  • Administration
  • Cloud Infrastructure
  • Networking
  • Python
  • Containerization
  • Cloud Services
  • Network Infrastructure
  • Docker
  • Infrastructure
  • Ansible
  • System Architecture
  • Linux

Senior Software Engineer, Site Reliability Engineering

Singapore, Singapore Crypto.com

Posted 7 days ago

Job Description

We are a team that designs, develops, maintains, and improves software for various ventures projects, i.e., projects that are adjacent to our core businesses and are bootstrapped quickly by a lean team. You will be actively involved in the design of various components behind scalable applications, from frontend UI to backend infrastructure.

What you’ll be doing
  • Ensure entire stack is healthy: hardware, software, application and network are operating at optimal performance
  • Perform deep dives into both systemic and latent reliability issues; partnering with other software and DevOps engineers across the organization to design, implement and roll out fixes
  • Continuously improve availability, reliability, and observability and reduce the burden of human toil with tooling and automation
  • Lead and drive SRE initiatives to improve operation efficiencies
  • Represent the SRE team in system design reviews and operational readiness exercises for new and existing services
What you need
  • Experience coding in Ruby and/or Go
  • Familiar with GitOps principles and tools (GitHub Actions, Docker, Kubernetes)
  • Experience in designing, analyzing, and troubleshooting large-scale distributed systems
  • Curiosity about finding root causes in incidents and outages
  • Ability to build alignment, cultivate relationships, and drive impact
  • A mindset for designing fault-tolerant system architecture
  • Comfort with being uncomfortable in ambiguous situations
  • Involvement with incident management and response
  • Desire to grow expertise, inform, and educate others
  • A fast learner who can pick up various technologies and has a “get things done” mentality
  • Humble enough to embrace better ideas from others, eager to make things better, and open to challenges and possibilities
Desirable
  • Familiar with cloud platforms and microservice-based architecture (AWS is a big plus)
  • Familiar with monitoring tools (e.g. Datadog, OpenTelemetry)
  • Familiar with CI/CD tools (e.g. GitHub Actions)
  • Familiar with IaC tools (e.g. Terraform, Spacelift)
  • Experience in designing resilient system architecture
  • Experience in optimizing the performance of large-scale production systems
Life @ Crypto.com

Empowered to think big. Try new opportunities while working with a talented, ambitious and supportive team.

Transformational and proactive working environment. Empower employees to find thoughtful and innovative solutions.

Growth from within. We help to develop new skill-sets that would impact the shaping of your personal and professional growth.

Work Culture. Our colleagues are some of the best in the industry; we are all here to help and support one another.

One cohesive team. Engage stakeholders to achieve our ultimate goal - Cryptocurrency in every wallet.

Work Flexibility Adoption. Flexi-work hour and hybrid or remote set-up.

Aspire career alternatives through us - our internal mobility program offers employees a new scope.

Work Perks: crypto.com visa card provided upon joining.

Benefits

Competitive salary.

Attractive annual leave entitlement including: birthday, work anniversary.

Our Crypto.com benefits packages vary depending on regional requirements; you can learn more from our talent acquisition team.

About Crypto.com:

Founded in 2016, Crypto.com serves more than 80 million customers and is the world's fastest growing global cryptocurrency platform. Our vision is simple: Cryptocurrency in Every Wallet. Built on a foundation of security, privacy, and compliance, Crypto.com is committed to accelerating the adoption of cryptocurrency through innovation and empowering the next generation of builders, creators, and entrepreneurs to develop a fairer and more equitable digital ecosystem.

Crypto.com is an equal opportunities employer and we are committed to creating an environment where opportunities are presented to everyone in a fair and transparent way. Crypto.com values diversity and inclusion, seeking candidates with a variety of backgrounds, perspectives, and skills that complement and strengthen our team.

Personal data provided by applicants will be used for recruitment purposes only.

Please note that only shortlisted candidates will be contacted.


Senior Software Engineer, Site Reliability Engineering

Singapore, Singapore Crypto.com

Posted today

Job Description

full-time

We are a team that designs, develops, maintains, and improves software for various ventures projects, i.e., projects that are adjacent to our core businesses and are bootstrapped quickly by a lean team. You will be actively involved in the design of various components behind scalable applications, from frontend UI to backend infrastructure.

What you’ll be doing

  • Ensure entire stack is healthy: hardware, software, application and network are operating at optimal performance
  • Perform deep dives into both systemic and latent reliability issues; partnering with other software and DevOps engineers across the organization to design, implement and roll out fixes
  • Continuously improve availability, reliability, and observability and reduce the burden of human toil with tooling and automation
  • Lead and drive SRE initiatives to improve operation efficiencies
  • Represent the SRE team in system design reviews and operational readiness exercises for new and existing services

What you need

  • Experience coding in Ruby and/or Go
  • Familiar with GitOps principles and tools (GitHub Actions, Docker, Kubernetes)
  • Experience in designing, analyzing, and troubleshooting large-scale distributed systems
  • Curiosity about finding root causes in incidents and outages
  • Ability to build alignment, cultivate relationships, and drive impact
  • A mindset for designing fault-tolerant system architecture
  • Comfort with being uncomfortable in ambiguous situations
  • Involvement with incident management and response
  • Desire to grow expertise, inform, and educate others
  • A fast learner who can pick up various technologies and has a “get things done” mentality
  • Humble enough to embrace better ideas from others, eager to make things better, and open to challenges and possibilities

Desirable

  • Familiar with cloud platforms and microservice-based architecture (AWS is a big plus)
  • Familiar with monitoring tools (e.g. Datadog, OpenTelemetry)
  • Familiar with CI/CD tools (e.g. GitHub Actions)
  • Familiar with IaC tools (e.g. Terraform, Spacelift)
  • Experience in designing resilient system architecture
  • Experience in optimizing the performance of large-scale production systems

Life @ Crypto.com

Empowered to think big. Try new opportunities while working with a talented, ambitious and supportive team.

Transformational and proactive working environment. Empower employees to find thoughtful and innovative solutions.

Growth from within. We help to develop new skill-sets that would impact the shaping of your personal and professional growth.

Work Culture. Our colleagues are some of the best in the industry; we are all here to help and support one another.

One cohesive team. Engage stakeholders to achieve our ultimate goal - Cryptocurrency in every wallet.

Work Flexibility Adoption. Flexi-work hour and hybrid or remote set-up.

Aspire career alternatives through us - our internal mobility program offers employees a new scope.

Work Perks: crypto.com visa card provided upon joining.

Benefits

Competitive salary.

Attractive annual leave entitlement including: birthday, work anniversary.

Our Crypto.com benefits packages vary depending on regional requirements; you can learn more from our talent acquisition team.

About Crypto.com:

Founded in 2016, Crypto.com serves more than 80 million customers and is the world's fastest growing global cryptocurrency platform. Our vision is simple: Cryptocurrency in Every Wallet. Built on a foundation of security, privacy, and compliance, Crypto.com is committed to accelerating the adoption of cryptocurrency through innovation and empowering the next generation of builders, creators, and entrepreneurs to develop a fairer and more equitable digital ecosystem.

Crypto.com is an equal opportunities employer and we are committed to creating an environment where opportunities are presented to everyone in a fair and transparent way. Crypto.com values diversity and inclusion, seeking candidates with a variety of backgrounds, perspectives, and skills that complement and strengthen our team.

Personal data provided by applicants will be used for recruitment purposes only.

Please note that only shortlisted candidates will be contacted.

Reliability Engineering Specialist

Singapore, Singapore AMD

Posted today

Job Description

Role

Join a dynamic global team dedicated to advanced reliability testing of module and system boards of AMD's cutting-edge products. Collaborate closely with cross-functional teams across AMD Global Operations & Quality, and Data Center organizations on accelerator-product system setup and reliability testing.

Key Responsibilities
  • System-level setup and testing
    • Plan, execute, and optimize system-level setups for accelerator products, including server rack and system configurations.
    • Ensure seamless integration and functionality of server systems with advanced cooling solutions and environmental management systems.
    • Validate and maintain reliability test scripts for automated and manual testing processes.
  • Reliability assessment and testing
    • Conduct comprehensive reliability assessments of accelerator systems, focusing on mechanical, thermal, and electrical stress factors.
    • Design and implement environmental stress tests to simulate data center conditions, including operational stress, thermal cycling, signal, and power integrity.
    • Evaluate material interactions and their impact on product reliability, ensuring robustness in diverse operating environments.
    • Analyze results to identify potential reliability risks and areas for design improvement.
  • Functional testing and fault isolation
    • Perform detailed functional testing to evaluate system performance under various operational conditions.
    • Identify, isolate, and troubleshoot faults using advanced diagnostic tools and methodologies.
  • Failure analysis and reporting
    • Perform root cause analysis for identified reliability failures and develop corrective actions for design and process enhancement.
    • Collaborate with cross-functional teams to conduct root cause analysis of reliability testing failures.
  • Collaboration and documentation
    • Work closely with design, manufacturing, and quality teams to align reliability goals with overall product requirements.
    • Generate comprehensive reports detailing reliability test results, analysis, and recommendations.
    • Maintain meticulous records of testing methodologies and outcomes for future reference and continuous improvement initiatives.
  • Mentorship
    • Effectively mentor junior engineers, providing guidance in both technical domains and professional skill development to foster growth and team success.
Preferred Experience
  • Knowledge of reliability engineering principles, product lifecycle, and standards in high-performance computing environments.
  • Proven experience in system-level setup and testing for accelerator products or similar technologies.
  • Proficiency in developing and executing reliability test scripts and protocols.
  • Familiarity with reliability standards and best practices in high-performance computing environments.
  • Familiarity with data center environmental management, server rack/system configurations, and integrated cooling solutions.
  • Strong understanding of environmental stress factors, including thermal, mechanical, and electrical stresses, in server systems (L6-L10).
  • Expertise in failure analysis techniques, including root cause analysis and fault isolation methodologies.
  • Excellent written and verbal communication skills for clear reporting and collaboration.
  • Strong analytical, problem-solving, and communication skills.
  • Experience with reliability testing tools, simulation software and statistical tools is an added advantage.
  • Knowledge in project and risk management is an added advantage.
  • Self-starter and able to independently drive tasks to completion.
  • Ability to structure and execute complex analysis, draw insights, and communicate summary conclusions/recommendations to senior management and AMD customers/partners.
  • Ability to network, build relationships, and collaborate to drive effective decision-making across multiple functions and levels within AMD.
Academic Credentials
  • Bachelor’s or Master’s degree in Electrical/Electronics Engineering (EE) or a related field.
Location

Singapore

Benefits offered are described in AMD benefits at a glance. AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

Seniority level
  • Not Applicable
Employment type
  • Full-time

Reliability Engineering MTS

Singapore, Singapore Systems on Silicon Manufacturing Company Pte Ltd (SSMC)

Posted 1 day ago

Job Description

Overview

SSMC (Systems on Silicon Manufacturing Company Pte. Ltd.), a joint venture between NXP and TSMC, offers flexible and cost-effective semiconductor fabrication solutions by maintaining a fully equipped SMIF cleanroom environment, 100% equipment automation, and proven wafer-manufacturing processes. We are looking for innovative, passionate, and talented people to join our team.

We’re searching for a Principal Engineer/MTS to join the diverse team in our QRE Department, to support Reliability Laboratory Operations and manage PLR and WLR Reliability Test Equipment (Preventive Maintenance, Calibration). The role also leads High Voltage (HV) Process Technologies Reliability Tests and supports Fab Monitoring / Qualification / Customer Issues / Engineering Change Evaluations.

What you will be working on
  • Lead and Setup New Process Technology Reliability Qualification
  • Define and Execute New Process Technology Reliability Qualification Plan Requirements to meet Technology Milestones requirements
  • Lead and Setup New Process Technology Reliability Monitoring
  • Conduct Process/Wafer Level Reliability (WLR) Tests and Analysis
  • Conduct Product Level Reliability (PLR) Tests and Analysis
  • Support Fab Monitoring / Qualification / Customer Issues / Engineering Change Evaluations and Perform Reliability Risk Assessments
  • Develop and Setup New or Enhanced Process and Product Reliability Tests / Analysis / Methodologies / Capabilities / Techniques
  • Schedule & Prioritize Reliability Tests Requests (Manpower, Skills, Tool resources)
  • Keep in line with Industry and Mother-fabs’ Reliability Tests & Requirement Trends / Development
  • Support Reliability Laboratory Operations and Manage PLR and WLR Reliability Test Equipment (Preventive Maintenance, Calibration). Maintain Day-to-Day Reliability Laboratory Operations, Equipment Uptime
  • Drive Continuous Improvement in Safety, Quality, Productivity of work processes and environment to achieve assigned department targets
  • Training, Coaching and Development of Reliability Engineers
More about you
  • Master / Degree in Science or Engineering in Mechanical, Chemical Engineering or equivalent
  • Extensive Experience: >10 years in Wafer Fab / Semiconductor Environment and Leading Role in WLR / PLR Reliability.
  • In-depth understanding of Technologies, Trends and Needs
  • Experience with major Process Technologies like Automotive, Logic, High Voltage, FLASH / EE / Non-Volatile-Memory (NVM), General Purpose Processes
  • In-depth Knowledge of Front-End / Back-End Reliability Mechanisms, Test Methodology (GOI, TDDB, HCI, NBTI, BTS, JS, PID, ESD, LU, EM, SV, Low-K IMD) (HTOL, EFR, IFR, THB, HAST, TMCL, TH, HTS, Pre-Con, Reflow)
  • Good knowledge of International Standards & Requirements on Process & Product Reliability (AEC-Q100, JEDEC, JEP001)

SSMC is firmly committed to upholding equal employment opportunities for all individuals. We strictly adhere to the Tripartite Guidelines on Fair Employment Practices (TGFEP), the Singapore Workplace Fairness Act 2025 (WFA 2025), and the Singapore Code of Advertising Practice. All qualified applicants will receive non-discriminatory consideration for employment on the basis of merit and regardless of age, race, gender, religion, marital status and family responsibilities, or disability, or any other attributes as protected by the relevant laws.

Seniority level
  • Mid-Senior level
Employment type
  • Full-time
Job function
  • Manufacturing, Project Management, and Engineering
Industries
  • Semiconductor Manufacturing and Industrial Machinery Manufacturing


Reliability Engineering Lead

Singapore, Singapore beBeeLeadership

Posted today

Job Description

We are seeking a Reliability Engineering Lead to drive initiatives within the Quality and Reliability Engineering team.

The successful candidate will oversee laboratory operations, guide reliability testing for advanced process technologies, and ensure equipment and methods meet the highest standards. This leadership role is at the intersection of technology, operations, and mentoring.

Key Responsibilities:

  • Oversight of laboratory operations to ensure efficiency and effectiveness
  • Guidance of reliability testing for advanced process technologies
  • Ensuring equipment and methods meet the highest standards

Requirements:

  • Demonstrated experience in reliability engineering and leadership
  • Strong understanding of laboratory operations and testing protocols
  • Ability to mentor and guide cross-functional teams

Benefits:

  • Opportunity to work on cutting-edge technologies
  • Chance to develop leadership skills and mentor others
  • Collaborative work environment with experienced professionals

About Us:

This is an exciting opportunity to join a dynamic team and contribute to the development of innovative solutions. If you are a motivated individual with a passion for reliability engineering, we encourage you to apply.

Reliability Engineering Expert

Singapore, Singapore beBeeEngineering

Posted today

Job Description

Job Summary:

We are seeking a highly skilled Reliability Engineer to join our team. The successful candidate will be responsible for leading the development and implementation of reliability qualification plans, conducting process and product reliability tests, and providing technical support to ensure the highest quality standards.

Main Responsibilities:

  • Lead and develop new process technology reliability qualification plans to meet technology milestones.
  • Define and execute new process technology reliability monitoring plans.
  • Conduct process and wafer level reliability tests and analysis.
  • Support fab monitoring, qualification, customer issues, and engineering change evaluations.
  • Develop and implement new or enhanced process and product reliability tests, analysis, methodologies, capabilities, and techniques.
  • Schedule and prioritize reliability test requests.
  • Stay up to date with industry and mother-fab reliability trends and requirements.

Requirements:

  • Master's or bachelor's degree in science or engineering (mechanical, chemical engineering, or equivalent).
  • More than 10 years of experience in a wafer fab/semiconductor environment, including a leading role in WLR/PLR reliability.
  • In-depth understanding of technologies, trends, and needs.
  • Experience with major process technologies like automotive, logic, high voltage, Flash/EE/non-volatile memory (NVM), and general purpose processes.
  • In-depth knowledge of front-end/back-end reliability mechanisms and test methodology (GOI, TDDB, HCI, NBTI, BTS, JS, PID, ESD, LU, EM, SV, low-k IMD) (HTOL, EFR, IFR, THB, HAST, TMCL, TH, HTS, Pre-Con, Reflow).
  • Good knowledge of international standards and requirements on process and product reliability (AEC-Q100, JEDEC, JEP001).

About Us:

We are an equal opportunities employer and welcome applications from all qualified candidates. We are committed to providing a diverse and inclusive work environment and strive to create a workplace where everyone feels valued and respected.

Reliability Engineering Specialist

Singapore, Singapore ADVANCED MICRO DEVICES (SINGAPORE) PTE LTD

Posted today

Job Description

Roles & Responsibilities

THE ROLE:

Join a dynamic global team dedicated to advanced reliability testing of module and system boards of AMD's cutting-edge products. Collaborate closely with cross-functional teams across AMD Global Operations & Quality, and Data Center organizations on accelerator-product system setup and reliability testing.

KEY RESPONSIBILITIES:

  • System-level setup and testing:
    • Plan, execute, and optimize system-level setups for accelerator products, including server rack and system configurations.
    • Ensure seamless integration and functionality of server systems with advanced cooling solutions and environmental management systems.
    • Validate and maintain reliability test scripts for automated and manual testing processes.
  • Reliability assessment and testing:
    • Conduct comprehensive reliability assessments of accelerator systems, focusing on mechanical, thermal, and electrical stress factors.
    • Design and implement environmental stress tests to simulate data center conditions, including operational stress, thermal cycling, signal, and power integrity.
    • Evaluate material interactions and their impact on product reliability, ensuring robustness in diverse operating environments.
    • Analyze results to identify potential reliability risks and areas for design improvement.
  • Functional testing and fault isolation:
    • Perform detailed functional testing to evaluate system performance under various operational conditions.
    • Identify, isolate, and troubleshoot faults using advanced diagnostic tools and methodologies.
  • Failure analysis and reporting:
    • Perform root cause analysis for identified reliability failures and develop corrective actions for design and process enhancement.
    • Collaborate with cross-functional teams to conduct root cause analysis of reliability testing failures.
  • Collaboration and documentation:
    • Work closely with design, manufacturing, and quality teams to align reliability goals with overall product requirements.
    • Generate comprehensive reports detailing reliability test results, analysis, and recommendations.
    • Maintain meticulous records of testing methodologies and outcomes for future reference and continuous improvement initiatives.
  • Mentorship:
    • Effectively mentor junior engineers, providing guidance in both technical domains and professional skill development to foster growth and team success.

PREFERRED EXPERIENCE:

  • Knowledge of reliability engineering principles, product lifecycle, and standards in high-performance computing environments.
  • Proven experience in system-level setup and testing for accelerator products or similar technologies.
  • Proficiency in developing and executing reliability test scripts and protocols.
  • Familiarity with reliability standards and best practices in high-performance computing environments.
  • Familiarity with data center environmental management, server rack/system configurations, and integrated cooling solutions.
  • Strong understanding of environmental stress factors, including thermal, mechanical, and electrical stresses, in server systems (L6–L10).
  • Expertise in failure analysis techniques, including root cause analysis and fault isolation methodologies.
  • Excellent written and verbal communication skills for clear reporting and collaboration.
  • Strong analytical, problem-solving, and communication skills.
  • Experience with reliability testing tools, simulation software and statistical tools is an added advantage.
  • Knowledge in project and risk management is an added advantage.
  • Self-starter and able to independently drive tasks to completion.
  • Ability to structure and execute complex analysis, draw insights, and communicate summary conclusions/recommendations to senior management and AMD customers/partners.
  • Ability to network, build relationships, and collaborate to drive effective decision-making across multiple functions and levels within AMD.

ACADEMIC CREDENTIALS:

  • Bachelor's or Master's degree in Electrical/Electronics Engineering (EE) or a related field.

LOCATION:

Singapore


Reliability Engineering Leadership

Singapore, Singapore beBeeReliability

Posted today

Job Description

Lead Reliability Expert:

  • Develop and oversee technology qualification programs.
  • Direct wafer-level and product-level reliability testing and lab operations.
  • Implement test method improvements and optimize lab capabilities to meet evolving needs.
  • Support fab monitoring, customer requests, and technical evaluations.
  • Evaluate day-to-day lab performance and maintain high equipment uptime.
  • Promote a safe working environment with focus on quality and continuous improvement.
  • Mentor team members and develop technical skills within the group.

Requirements:

  • Degree in Engineering or Science (Mechanical, Chemical, or related).
  • At least 10 years of experience in semiconductor/wafer fab, with proven leadership in reliability engineering.
  • Deep knowledge of WLR/PLR methods, reliability mechanisms, and industry standards.
  • Experience across various process technologies such as automotive, logic, HV, Flash/NVM.
  • Familiarity with global standards including AEC-Q100, JEDEC, JEP001.

Reliability Engineering Specialist

189767 $13000 Monthly ADVANCED MICRO DEVICES (SINGAPORE) PTE LTD

Posted 11 days ago

Job Description

THE ROLE:

Join a dynamic global team dedicated to advanced reliability testing of module and system boards for AMD's cutting-edge products. Collaborate closely with cross-functional teams across AMD's Global Operations & Quality and Data Center organizations on accelerator-product system setup and reliability testing.

KEY RESPONSIBILITIES:

  • System-level setup and testing:
    • Plan, execute, and optimize system-level setups for accelerator products, including server rack and system configurations.
    • Ensure seamless integration and functionality of server systems with advanced cooling solutions and environmental management systems.
    • Validate and maintain reliability test scripts for automated and manual testing processes.
  • Reliability assessment and testing:
    • Conduct comprehensive reliability assessments of accelerator systems, focusing on mechanical, thermal, and electrical stress factors.
    • Design and implement environmental stress tests to simulate data center conditions, including operational stress, thermal cycling, and signal and power integrity.
    • Evaluate material interactions and their impact on product reliability, ensuring robustness in diverse operating environments.
    • Analyze results to identify potential reliability risks and areas for design improvement.
  • Functional testing and fault isolation:
    • Perform detailed functional testing to evaluate system performance under various operational conditions.
    • Identify, isolate, and troubleshoot faults using advanced diagnostic tools and methodologies.
  • Failure analysis and reporting:
    • Perform root cause analysis for identified reliability failures and develop corrective actions for design and process enhancement.
    • Collaborate with cross-functional teams to conduct root cause analysis of reliability testing failures.
  • Collaboration and documentation:
    • Work closely with design, manufacturing, and quality teams to align reliability goals with overall product requirements.
    • Generate comprehensive reports detailing reliability test results, analysis, and recommendations.
    • Maintain meticulous records of testing methodologies and outcomes for future reference and continuous improvement initiatives.
  • Mentorship:
    • Effectively mentor junior engineers, providing guidance in both technical domains and professional skill development to foster growth and team success.

PREFERRED EXPERIENCE:

  • Knowledge of reliability engineering principles, product lifecycle, and standards in high-performance computing environments.
  • Proven experience in system-level setup and testing for accelerator products or similar technologies.
  • Proficiency in developing and executing reliability test scripts and protocols.
  • Familiarity with reliability standards and best practices in high-performance computing environments.
  • Familiarity with data center environmental management, server rack/system configurations, and integrated cooling solutions.
  • Strong understanding of environmental stress factors, including thermal, mechanical, and electrical stresses, in server systems (L6–L10).
  • Expertise in failure analysis techniques, including root cause analysis and fault isolation methodologies.
  • Excellent written and verbal communication skills for clear reporting and collaboration.
  • Strong analytical and problem-solving skills.
  • Experience with reliability testing tools, simulation software and statistical tools is an added advantage.
  • Knowledge in project and risk management is an added advantage.
  • Self-starter and able to independently drive tasks to completion.
  • Ability to structure and execute complex analysis, draw insights, and communicate summary conclusions/recommendations to senior management and AMD customers/partners.
  • Ability to network, build relationships, and collaborate to drive effective decision-making across multiple functions and levels within AMD.

ACADEMIC CREDENTIALS:

  • Bachelor’s or Master’s degree in Electrical/Electronics Engineering (EE) or a related field.

LOCATION:

Singapore


