177 Site Reliability Engineer jobs in Singapore

Site Reliability Engineer

Singapore, Singapore DT One

Posted today

Job Description

Site Reliability Engineer role at DT One

Keeping more people, more connected, more often

DT One was founded with the aim to provide mobile carriers with the infrastructure and services they need to help migrant workers stay in touch with their family and friends back home.

Today, we operate a leading global network for mobile top-up solutions, innovative mobile rewards, and Phone-to-Phone solutions.

Our global network delivers better infrastructure and access to digital communications for over five billion people across emerging economies, enabling them to stay better connected and, as a result, participate more actively in the global economy.

As a company, we’re forward-thinking, adaptable, and solutions-focused. We work closely with our network partners to provide them with valuable market insights and intelligent mobile technology that delivers more value to their business, and that ultimately benefits the end-consumer.

For more information, visit our website:

Context of the role

At DT One, we count on our Site Reliability Engineers (SREs) to empower our users with a rich feature set, high availability, and extreme performance level. As we expand our platform infrastructure and applications, we are currently seeking talented Site Reliability Engineers to maintain, improve, and flawlessly operate our environments. We are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a globally distributed team to develop real-world solutions and positive user experiences at every interaction.

Key Responsibilities
  • Run the production environment by monitoring availability and taking a holistic view of system health
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
  • Establish and guarantee service level objectives for platform infrastructure and applications
  • Provide primary operational support and engineering for multiple large distributed software applications including on-call shifts
  • Build software and systems to manage network infrastructure, platform infrastructure, and applications
  • Improve reliability, quality, security, and time-to-market of our suite of software solutions
  • Partner with development teams to improve services through rigorous testing and release procedures
  • Participate in system design consulting, platform management, and capacity planning
  • Document every action, turning findings into repeatable actions and then into future automation
Professional Skills/Qualifications
  • Bachelor’s degree in computer science or other highly technical, scientific discipline
  • Experience with AWS cloud infrastructure management and related services
  • Experience with Infrastructure as Code and Configuration Management concepts and related tools and technologies, such as Terraform and Ansible
  • Hands-on experience with Linux administration, command-line interface, and shell scripting
  • Experience with dynamic resource management frameworks and technologies, such as Kubernetes and Nomad
  • Experience with source code management tools and related workflows
  • Experience with continuous integration and continuous deployment concepts and related tools and technologies, such as Jenkins, GitLab CI, and Bitbucket Pipelines
  • A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
  • Good communication skills in English
Preferred Qualifications
  • Previous success in technical engineering
  • Previous experience with multiple large distributed software applications operations
  • Previous experience defining and implementing deployment and release standards
  • Experience with database administration and performance tuning for systems such as PostgreSQL, MySQL, Elasticsearch, and Redis
  • Experience with monitoring tools, such as Prometheus, Datadog, and New Relic
  • Experience with VPN configuration and administration
  • Coding experience beyond simple scripts
  • A mindset strongly oriented toward Site Reliability principles
  • Sharing and mentoring mindset

Sound like you? Apply now!

Seniority level
  • Mid-Senior level
Employment type
  • Full-time
Job function
  • Information Technology
Industries
  • Financial Services and Technology, Information and Media


Site Reliability Engineer

Singapore, Singapore Viasat

Posted 5 days ago

Job Description

About us

One team. Global challenges. Infinite opportunities. At Viasat, we’re on a mission to deliver connections with the capacity to change the world. For more than 35 years, Viasat has helped shape how consumers, businesses, governments and militaries around the globe communicate. We’re looking for people who think big, act fearlessly, and create an inclusive environment that drives positive impact to join our team.


What you'll do

The Customer Engineering team is a group of highly technical engineers who are tasked with maintaining and developing the reliability, scalability, and performance of the Service for different Enterprise Customers. The Customer Engineering Team is empowered to drive technical resolutions across the technology stack from hardware through to application and all stops in between. The team is also responsible for building and maintaining alerts to proactively monitor the service, and acts as the technical liaison between customer-facing teams and the Engineering teams.


The day-to-day

As a Site Reliability Engineer, you will:

  • Identify and investigate potential and actual customer performance problems, recommend, and prioritize remediation, and assess effectiveness of remediation actions
  • Participate in and provide feedback on product design, especially regarding reliability and availability
  • Drive initiatives with partner teams to improve the reliability and performance of the Service through improved system design
  • Drive a culture of intolerance to manual activity, resulting in a highly automated environment that delivers scalable solutions
  • Work closely with customer-facing teams (Technical Account Managers and Program Teams) to understand and prioritize customer issues
  • Drive monitoring and automation initiatives
  • Create and present Performance reports for technical and management stakeholders
  • Work closely with Engineering teams to communicate and prioritize the service impacting issues
  • Reproduce and test the Customer issues in the Lab
  • Develop automated scripts and tools to enable monitoring of the Service
  • Be part of on-call rotations

What you'll need

Requirements

  • 5+ years of experience in troubleshooting and triaging technical issues in a fast-paced environment to support customers
  • 5+ years of experience in Network Operations or Product Support
  • Advanced knowledge of modern programming languages, especially Python
  • An ability to understand large complex systems and a passion to constantly improve environments
  • Strong networking knowledge: TCP/IP, IPSEC, VPN, NAT, Routing Protocols, AAA
  • Set priorities and work efficiently in a fast-paced environment
  • Demonstrated ability to deliver results on time with high quality and attention to detail
  • Demonstrated ability to work with ambiguous requirements, adapt, and learn
  • Experience with data analytics tools (Splunk, Kibana)
  • Keen (data-driven) decision making skills under incomplete information
  • Excellent face-to-face and remote customer rapport
  • Bachelor’s degree in electrical engineering, Computer Science, or Computer Engineering
  • Up to 10% travel

What will help you on the job

  • Experience analyzing data and trending to gain operational efficiencies
  • Telecom or related operational service experience, especially wireless networks
  • Previous technical role in a DevOps/SRE workflow
  • Experience with Satcom technology
  • Experience with or knowledge of GCP, AWS, and BigQuery

EEO Statement

Viasat is proud to be an equal opportunity employer, seeking to create a welcoming and diverse environment. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, ancestry, physical or mental disability, medical condition, marital status, genetics, age, or veteran status or any other applicable legally protected status or characteristic.


Site Reliability Engineer

Singapore, Singapore Manpower Singapore

Posted 6 days ago

Job Description

This range is provided by Manpower Singapore. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more.

Base pay range

Join a global leader in gaming to manage the reliability of game-related platforms and infrastructure across both cloud and on-premise environments.

Responsibilities:

  • Responsible for deployment, change management, issue triage, and infrastructure management of overseas games and their related components and systems, e.g. the game monitoring system and login services.
  • Responsible for monitoring and dashboarding for game observability, ensuring the game is reliable, scalable, and secure.
  • Understand the game architecture; analyze, evaluate, and respond to potential risks, such as latent issues and performance bottlenecks.
  • Responsible for daily communication and coordination between various teams.

Requirements:

  • Bachelor's Degree or above in Computer Science or comparable field.
  • More than 3 years of operations experience with Linux and Windows operating systems.
  • Have a high sense of responsibility and teamwork spirit.
  • Proficiency in scripting languages such as Bash, Python, and SQL.
  • Good understanding of cloud environment, such as AWS or Azure.
  • Experience with containerization technologies such as Docker and orchestration platforms like Kubernetes is a plus.
  • Experience with worldwide online game live operations is a plus.

Please note that your response to this advertisement and communications with us pursuant to this advertisement will constitute informed consent to the collection, use and/or disclosure of personal data by ManpowerGroup Singapore for the purpose of carrying out its business, in compliance with the relevant provisions of the Personal Data Protection Act 2012. To learn more about ManpowerGroup's Global Privacy Policy, please visit

Details
  • Seniority level: Entry level
  • Employment type: Contract
  • Job function: Analyst
  • Industries: Information Services


Site Reliability Engineer

Singapore, Singapore EC1 Partners

Posted 8 days ago

Job Description

Overview

Site Reliability Engineer – Global eFX Platform | Singapore

EC1 Partners is working with a leading global eFX trading platform that is expanding its technology presence in Singapore. This is a full-time, permanent role offering the opportunity to work in a fast-paced environment where scale, performance, and reliability are critical.

Responsibilities

The role focuses on reliability, automation, and performance of critical systems for a global eFX trading platform. The candidate will contribute to automation, scalability, and resilience of key services.

Key Requirements
  • Proficiency in Python and Bash scripting
  • Experience with automation tools such as Ansible, Puppet, or Chef
  • Strong background in cloud platforms (GCP or AWS)
  • Experience with containerization and orchestration (ideally Kubernetes)
  • Knowledge of CI/CD tools such as Jenkins or similar
  • Familiarity with Infrastructure as Code (Terraform or similar)
  • Programming knowledge in Go, C, C++, or Java
  • Experience building applications using Makefiles (C/C++) and Maven (Java)
Job Details
  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: Financial Services, Information Services, and Capital Markets

To discuss this role further, please contact EC1 Partners or apply directly.


Site Reliability Engineer

Singapore, Singapore CAREER INTERNATIONAL - FOS PTE. LTD.

Posted 8 days ago

Job Description

Job Responsibilities

  • Ensure the stability, reliability, and efficient operation of the Company's global business, maintaining high availability of services at all times.
  • Responsible for core operational tasks such as resource provisioning and management, incident response, capacity management, monitoring, and reliability improvements.
  • Review technical architecture design, assess soundness of the design, and proactively identify and resolve reliability risks.
  • Conduct in-depth analysis of systemic deficiencies, identify bottlenecks, and develop optimization strategies; plan and execute projects to improve system reliability and ensure the cost-effectiveness and high availability of the systems.
  • Participate in 24/7 on-call rotation, promptly respond to and resolve production incidents to ensure service availability.
  • Analyze and improve processes to build stable, highly available systems; drive continuous automation improvements, and minimize manual intervention.
Job Requirements
  • Bachelor’s degree in Computer Science or a related field.
  • Proficiency in one of the following programming languages: Python, Go, or shell scripting, with demonstrated ability to independently develop modules or platforms.
  • Familiar with cloud computing; experience in managing multi-cloud or hybrid cloud platforms (e.g., Alibaba Cloud, Azure, AWS) is preferred.
  • Strong foundation in computer science, with hands-on experience in Linux, networking, load balancing, and designing high-availability and disaster recovery architectures.
  • A good team player with a strong sense of responsibility, self-driven and highly motivated.
  • Fluency in English and spoken Mandarin is a plus for communicating effectively with Mandarin-speaking clients.
How to Apply

Interested applicants, kindly send your resume in MS WORD format to or please click on Apply Now and provide the below details in your resume.

  1. Reasons for leaving ALL your employment
  2. Current and/or last drawn monthly salary (please provide breakdown)
  3. Expected monthly salary
  4. Availability

We regret only shortlisted candidates will be notified.

Important Note: Career International - FOS Pte Ltd is committed to safeguarding your personal data in accordance with the Personal Data Protection Act (PDPA). Please read our privacy statement on our corporate website

Career International - FOS Pte Ltd

EA License No: 14C6926

EA Personnel: Foong Mei Qi

EA Personnel Reg No: R


Site Reliability Engineer

Singapore, Singapore Thales Group

Posted 11 days ago

Job Description

Location: Singapore, Singapore

Thales is a global technology leader trusted by governments, institutions, and enterprises to tackle their most demanding challenges. From quantum applications and artificial intelligence to cybersecurity and 6G innovation, our solutions empower critical decisions rooted in human intelligence. Operating at the forefront of aerospace and space, cybersecurity and digital identity, we’re driven by a mission to build a future we can all trust.

In Singapore, Thales has been a trusted partner since 1973, originally focused on aerospace activities in the Asia-Pacific region. With 2,000 employees across three local sites, we deliver cutting-edge solutions across aerospace (including air traffic management), defence and security, and digital identity and cybersecurity sectors. Together, we’re shaping the future by enabling customers to make pivotal decisions that safeguard communities and power progress.

You will be working in an Agile team within the Thales Digital Identity & Security (DIS) Business Line, for the On-Demand Connectivity (ODC) products.

For more details, visit

As a Site Reliability Engineer in the Thales ODC team, you will apply software engineering skills to deploy, operate, maintain and improve the Thales platform according to agreed service levels, and exceed customer expectations.

Responsibilities:

This role will be part of the Digital Engineering & Services group:

  • You will work in a DevOps team to deploy, operate, maintain and improve ODC products in GCP Cloud, following the SRE approach.
  • You will be responsible for deployment of Thales products in the cloud.
  • You will perform on-boarding tests, communicate technical risk concerns and help prepare mitigation plans.
  • You will be responsible for system monitoring with real-time monitoring tools.
  • You will extend and acknowledge completion of handover milestones to Tiers I, II to comply with contractual SLAs.
  • You will be responsible for support operations tasks to shape the product roadmap and establish strong operational readiness across teams.
  • You will provide technical guidance for new or evolving services and for consolidated technical analyses.
  • You will participate in the preparation and review of technical product and customer-specific documentation.
  • You will be responsible for providing technical direction when the CAB requires Tier II input, expertise or changes with high-risk impacts on customer SLAs.
  • You will ensure the integrity of the solution functional baseline and architecture.
  • You will develop and maintain IaC code and automation tools.
  • You will be responsible for performing regular performance tuning, technology watch and updates on the service platform.
  • You will provide 24/7 on-call support in shifts.

Knowledge, Skills and Experience:

  • Degree in Computer Science or any related discipline.
  • 4+ years of experience in a relevant field.
  • Hands-on experience in deployment with Kubernetes and GCP (preferred)/AWS/Azure administration and support in a production-grade environment.
  • Hands-on experience with Continuous Integration and Continuous Delivery (CI/CD) tools like GitLab, Terraform, Ansible, Helm, HashiCorp (any).
  • Strong knowledge of System Integration, Operation, Maintenance and proven experience with automation tools including GitLab.
  • Strong working experience with one of the scripting languages (Shell/Python) is required.
  • Knowledge of Agile methodology and Service Delivery best practices.
  • Knowledge of cloud service providers (i.e. GCP/AWS/Azure), monitoring tools, networking, infrastructure and Linux.
  • Experience in the Telecom domain will be highly preferred.

Other information:

  • Working Location: One North
  • Working Hours: Monday - Friday, 9am - 6pm
  • 24/7 on-call support in shift rotation (average one shift per team member every 2 months)

At Thales, we’re committed to fostering a workplace where respect, trust, collaboration, and passion drive everything we do. Here, you’ll feel empowered to bring your best self, thrive in a supportive culture, and love the work you do. Join us, and be part of a team reimagining technology to create solutions that truly make a difference – for a safer, greener, and more inclusive world.

Site Reliability Engineer

Singapore, Singapore INMARSAT SOLUTIONS PTE. LTD.

Posted 12 days ago

Job Description

About us

One team. Global challenges. Infinite opportunities. At Viasat, we’re on a mission to deliver connections with the capacity to change the world. For more than 35 years, Viasat has helped shape how consumers, businesses, governments and militaries around the globe communicate. We’re looking for people who think big, act fearlessly, and create an inclusive environment that drives positive impact to join our team.

What you'll do

The Customer Engineering team is a group of highly technical engineers who are tasked with maintaining and developing the reliability, scalability, and performance of the Service for different Enterprise Customers. The Customer Engineering Team is empowered to drive technical resolutions across the technology stack from hardware through to application and all stops in between. The team is also responsible for building and maintaining alerts to proactively monitor the service, and acts as the technical liaison between customer-facing teams and the Engineering teams.

The day-to-day
  • Identify and investigate potential and actual customer performance problems, recommend, and prioritize remediation, and assess effectiveness of remediation actions
  • Participate in and provide feedback on product design, especially regarding reliability and availability
  • Drive initiatives with partner teams to improve the reliability and performance of the Service through improved system design
  • Drive a culture of intolerance to manual activity, resulting in a highly automated environment that delivers scalable solutions
  • Work closely with customer-facing teams (Technical Account Managers and Program Teams) to understand and prioritize customer issues
  • Drive monitoring and automation initiatives
  • Create and present Performance reports for technical and management stakeholders
  • Work closely with Engineering teams to communicate and prioritize the service impacting issues
  • Reproduce and test the Customer issues in the Lab
  • Develop automated scripts and tools to enable monitoring of the Service
  • Be part of on-call rotations
What you'll need

Requirements
  • 5+ years of experience in troubleshooting and triaging technical issues in a fast-paced environment to support customers
  • 5+ years of experience in Network Operations or Product Support
  • Advanced knowledge of modern programming languages, especially Python
  • An ability to understand large complex systems and a passion to constantly improve environments
  • Strong networking knowledge: TCP/IP, IPSEC, VPN, NAT, Routing Protocols, AAA
  • Set priorities and work efficiently in a fast-paced environment
  • Demonstrated ability to deliver results on time with high quality and attention to detail
  • Demonstrated ability to work with ambiguous requirements, adapt, and learn
  • Experience with data analytics tools (Splunk, Kibana)
  • Keen (data-driven) decision making skills under incomplete information
  • Excellent face-to-face and remote customer rapport
  • Bachelor’s degree in electrical engineering, Computer Science, or Computer Engineering
  • Up to 10% travel
What will help you on the job
  • Experience analyzing data and trending to gain operational efficiencies
  • Telecom or related operational service experience, especially wireless networks
  • Previous technical role in a DevOps/SRE workflow
  • Experience with Satcom technology
  • Experience with or knowledge of GCP, AWS, and BigQuery

Site Reliability Engineer

Singapore, Singapore OPPO

Posted 14 days ago

Job Description

Overview

Here, you will participate in building OPPO's global infrastructure automation operations platform. You will leverage operational knowledge of Kubernetes (K8s), RDBMS, Linux, and related technologies as tools to enhance productivity within the organization, ensuring production systems run stably, efficiently, and cost-effectively. Specific work areas include but are not limited to:

  • Building a globally unified technology risk control platform.
  • Building intelligent operations capabilities (monitoring, alerting, change management, incident response, capacity planning, disaster recovery, etc.).
Qualifications
  1. Bachelor's degree or higher in Computer Science, Telecommunications, Electronic Information, Communication Engineering, Software Engineering, or a related field.
  2. Proficient in at least one programming language: Java, Python, or Go.
  3. Solid understanding of fundamental data structures and algorithms.
  4. Experience with common frontend/backend development frameworks (e.g., Django, Vue.js, etc.); proficiency is a plus.
  5. Proficient in using MySQL or PostgreSQL; experience with performance optimization is a plus.
  6. Experience in microservices development is a plus.
  7. Familiarity with the Linux operating system; hands-on experience is a plus.
  8. Highly motivated, willing to share knowledge, strong self-drive, results-oriented, and good ability to work under pressure.
  9. Proficiency in Chinese that is at least sufficient for daily communication.
Seniority level
  • Mid-Senior level
Employment type
  • Full-time
Job function
  • Information Technology and Engineering
Industries
  • Telecommunications


Site Reliability Engineer

Singapore, Singapore NetEase Games

Posted 15 days ago

Job Description

As a leading internet technology company based in China, NetEase, Inc. (NASDAQ: NTES and HKEX:999, “NetEase”) provides premium online services centered around content creation. With extensive offerings across its expanding gaming ecosystem, the Company develops and operates some of China’s most popular and longest-running mobile and PC games. Powered by industry-leading in-house R&D capabilities in China and globally, NetEase creates superior gaming experiences, inspires players, and passionately delivers value for its thriving community worldwide. By infusing play with culture and education with technology, NetEase transforms gaming into a meaningful vehicle to build a more entertaining and enlightened world.

NetEase’s ESG initiatives are among the best in the global media and entertainment industry, earning it a distinction as one of the S&P Global Industry Movers and an “A” rating from MSCI. For more information, please visit:

Job Description
  • Site Reliability Engineering (SRE) refers to using software engineering methods to manage systems, solve problems, and achieve operational automation to reduce trivial tasks and improve service availability. Responsibilities include but are not limited to:
  • Manage the operational work of NetEase Interactive Entertainment services, such as Eggy Party, Marvel Rivals, UU Accelerator, Ace Racer, and other online services, as well as internal research projects.
  • Design and select basic runtime environments (including servers, virtualization, cloud services, networks, databases, etc.) for game servers based on different games' service architecture, performance requirements, and business conditions, providing high-quality and efficient operational services at controllable costs.
  • Establish and monitor various operational metrics and customize data analysis standards.
  • Collaborate with product departments to identify issues, optimize technical architecture, and enhance user experience based on game and infrastructure conditions.
  • Participate in in-depth research on cutting-edge open-source software, virtualization, databases, and web services, and develop technical solutions for business implementation.
Job Requirements
  • Bachelor's degree or above; majors in computer science, networking, communications, automation, or related fields are preferred.
  • Familiar with the Linux operating system; knowledgeable about computer network architectures and common network protocols such as TCP/IP and HTTP.
  • Proficient in at least one programming language, including but not limited to C/C++, Shell, Python, Golang, Rust, or Java.
  • Passionate about open-source; experience or knowledge in open-source software such as Linux, Nginx, MySQL, K8S, and Istio is preferred.
  • Strong logical thinking, communication, and learning abilities; adept at research and problem-solving.
  • Skilled at teamwork, with a strong sense of collective honor, responsibility, and service awareness.
  • Open to trying new things, with excellent problem-solving skills and strong technical sensitivity; experience in contributing to open-source communities is a plus.
  • Proficiency in Chinese is required for this role, as daily communication and collaboration with key stakeholders and team members based in China are essential to the responsibilities of the position.
Apply for this job

Apply on the NetEase Careers page.


Site Reliability Engineer

Singapore, Singapore Crystal Equation Corporation

Posted 17 days ago

Job Description

Work from home

Overview

We are seeking a skilled Site Reliability Engineer (SRE) to join our team. SRE will be responsible for keeping all internal user-facing applications and other production systems running smoothly. This hybrid role involves a combination of both development and operations skills to build and manage systems that are both efficient and reliable.

The Enterprise Platforms Integration team (EPI) handles onboarding, deployment, ongoing support, automation and integrations of tools used by internal customer teams (engineering and non-engineering). We own over 30 third party applications with a combined user base of tens of thousands. EPI builds and supports the backend and infrastructure that the applications run on, supports users and collaborates with other teams to continue to refine, improve, and expand the products and their user bases. EPI has a rotating on-call assignment to help triage and resolve ad-hoc issues. In addition to this on-call rotation, our SREs are typically involved in any number of projects (to varying degrees) at any given time. These projects can often involve working with other internal teams as well as external vendors. Being able to effectively manage and maintain these cross-functional relationships while driving projects forward in a timely manner is critical. Thus, good interpersonal and organizational skills as well as an attention to detail are essential traits for success.

While our environment is dynamic and our workload is high-demand, we foster the kind of collaboration and teamwork that has earned us a reputation for consistent success. If you want the next step of your career to include challenging work and supportive teammates, get in touch with us.

Responsibilities:

  • Develop and maintain internal tooling that automates the provisioning, configuration and monitoring of the infrastructure and services.
  • Collaborate with software engineers to make applications resilient and scalable.
  • Participate in on-call rotations to ensure system uptime and performance.
  • Troubleshoot and resolve issues related to application development, deployment, and operations.
  • Reduce operational toil by automating repetitive tasks.
  • Develop and maintain documentation and diagrams detailing the operational architecture and flow of web traffic in multi-tiered application environments.
  • Conduct post-mortem reviews of incidents and implement preventive measures.

Qualifications:

  • Bachelor's degree in Computer Science or Information Systems.
  • 2+ years of proven experience as a Site Reliability Engineer or in similar hybrid or software engineering roles.
  • Proficiency in at least one programming language such as Python, Go, Java or Rust.
  • Good understanding and practical knowledge of the SDLC, design patterns, architecture patterns, SOLID principles, API maintenance, and CI/CD.
  • Experience with automation and configuration management tools such as Chef, Puppet, or Ansible.
  • Experience supporting modern services and web applications in Linux and Windows environments.
  • Experience with cloud platforms (AWS preferred).
  • Hands-on experience with containerization technologies (Docker, Kubernetes).
  • Strong problem-solving skills, with an ability to troubleshoot complex system issues and keen attention to detail.
  • Excellent communication skills, with the ability to collaborate effectively with a team and vendors.

THE PROMISES WE MAKE:

At Crystal Equation, we empower people and advance technology initiatives by building trust. Your recruiter will prep you for the interview, obtain feedback, guide you through any necessary paperwork and provide everything you need for a successful start. We will serve to empower you along the way and provide the path for your professional journey.

Pay Range: 5500 to 5800K SGD per month

For more information regarding our Privacy Policy, please visit crystalequation.com/privacy.
