974 SRE jobs in Singapore

Site Reliability Engineer (SRE)

Singapore, Singapore Sea

Posted 1 day ago

Job Description

Responsibilities
  • Develop and maintain scripts to retrieve and process data from Google Workspace and Zoom, including users, groups, meeting rooms, licenses, activity logs, and configurations.
  • Normalize and structure data for analysis, reporting, and alerting.
  • Build automated alerting systems to identify anomalies, policy violations, or operational issues in Workspace or Zoom environments.
  • Design workflows to automate tasks such as account cleanup, license management, and configuration enforcement.
  • Build a secure internal web platform to standardize administrative actions, including reporting, dashboards, and visualizations.
  • Collaborate with IT Services and Support teams to prioritize features and gather automation requirements.
  • Implement Git workflows for collaboration and maintain well-documented code.
  • Deploy tools in containerized environments like Docker and support infrastructure such as databases and authentication mechanisms.
Requirements
  • 3–5 years of experience in software development, automation, or internal systems.
  • Proficiency in scripting/backend languages like Python, Node.js, or Go.
  • Experience with APIs and tools such as the Google Workspace Admin SDK, Zoom API, and GAM.
  • Familiarity with Git and collaborative workflows.
  • Strong problem-solving skills and ability to work independently.
Additional Details
  • Seniority level: Entry level
  • Employment type: Full-time
  • Job function: Information Technology
  • Industries: Technology, Internet

Site Reliability Engineer (SRE)

Singapore, Singapore Percept Solutions

Posted 1 day ago

Job Description

Design and implementation of new solutions, as well as enhancement and integration of existing ones, to ensure proactive monitoring

Collaboration with internal teams and vendors to identify, monitor and improve Service Level Objectives and Indicators

Support for incident management, investigation, resolution and post-mortem

Performance monitoring and capacity management

Automate manual operational tasks for self-healing

Administration to provide operational support for monitoring tools

Deployment and patching

System configuration

User access management

Incident management and investigation

Report and Dashboard generation

Job Requirements

SRE and automation tools like Ansible, Jenkins

Monitoring solutions such as Zabbix, Dynatrace, CloudWatch, eG, SolarWinds

Dashboard visualization such as Grafana

Proficient in SQL Scripting for data analytics

Familiar with database technology such as Oracle, MySQL, MS SQL

Familiar with Windows, Unix, Linux OS environments

EA Licence No.:18S9405 / EA Reg. No.:R1330864

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: IT Services and IT Consulting


Site Reliability Engineer (SRE)

Singapore, Singapore COFFEE MEETS BAGEL PTE. LTD.

Posted 18 days ago

Job Description

We are a global dating app created to give everyone a chance at love. The sense of belonging and connectedness we get from relationships helps us survive and thrive, and we’re working to make it a little easier for people to find that. We’re inspired by the stories we hear from employees, friends, and family who have used our app to transform their lives, and you, too, can make a difference by joining us!

We are looking for a talented Senior Site Reliability Engineer to help design the future of dating. This individual will bring extensive experience in running large-scale data sources in the cloud and will be responsible for modernizing our data source handling and maintaining our core infrastructure and services on AWS.

This role will be based in Singapore and report directly to the CTO.

Responsibilities:
  • Architect, develop, and maintain our core infrastructure and services on AWS, focusing on high availability, performance, and scalability.
  • Specific AWS services of interest include EC2, RDS, S3, ElastiCache, CloudWatch, Redshift, OpenSearch, and VPC.
  • Implement and manage continuous deployment processes to achieve seamless deployment of services with minimal downtime.
  • Monitor system performance, identify bottlenecks, and apply necessary optimizations to ensure the smooth operation of our services.
  • Develop and maintain automated tools for infrastructure provisioning, configuration, and deployment.
  • Work closely with development teams to integrate infrastructure builds and operational best practices into the software development lifecycle.
  • Conduct root cause analysis for production errors and implement strategies to prevent future occurrences.
  • Manage and optimize network configurations to ensure secure and efficient data flow and access.
  • Administer and maintain databases, ensuring their reliability, performance, and security.
  • Lead capacity planning efforts to ensure that our infrastructure scales in line with demand while optimizing costs and maintaining performance.
  • Modernize data source handling (Redshift, Postgres, RDS, etc.).
  • Manage Kubernetes workloads.
Qualifications:
  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • 5+ years of industry experience.
  • Proven experience as an SRE, DevOps Engineer, or similar role in a cloud-based environment.
  • Strong expertise in AWS services and tools.
  • Proficient understanding of networking principles, transport, and application protocols, especially TCP/IP, BGP, DNS, TLS, and HTTP/S.
  • Experience with database administration, including performance tuning, backup and recovery processes, and security management.
  • Proficiency in scripting languages (e.g., Python, Bash) and automation tools (e.g., Terraform).
  • Excellent problem-solving skills and the ability to work independently or as part of a team.
  • Strong Written and Verbal Communication: Fluent in English (both written and verbal); proficiency in Chinese is an advantage to work with Chinese stakeholders.
  • Significant experience in capacity planning and cost management within cloud environments.
  • Experience with Kubernetes.
  • Familiarity with Terraform for general systems maintenance.
  • Experience with data sources like Redshift, Postgres (Citus, Patroni), and RDS.
Preferred Qualifications:
  • AWS SysOps Administrator Associate or AWS Solutions Architect Professional (SAP) certification.
  • Experience with Spotinst for cost optimization.
  • Familiarity with additional scripting languages such as Go or JavaScript.

If you're passionate about tackling big challenges and have the skills to help us shape the future of online dating, we want to hear from you!


Site Reliability Engineer (SRE) (GovTech)

Singapore, Singapore AvePoint

Posted 2 days ago

Job Description

We are seeking a skilled and passionate Engineer to join our team to build and operate a Whole-of-Government (WoG) runtime platform.

As a Site Reliability Engineer, you will be responsible for designing and operating GitLab, AWS and Kubernetes-based infrastructure and solutions that power our platform, to ensure the stability, scalability, and performance of our runtime platform.

Responsibilities:

As a Site Reliability Engineer, you will be responsible for:
Toil Reduction & Automation
• Identify repetitive tasks and develop automation via CI/CD pipelines, ensuring integration with cross-functional teams to reduce manual intervention and improve operational efficiency.
Observability & System Health
• Implement comprehensive observability solutions (logs, metrics, traces, alerts) around the four Golden Signals (latency, traffic, errors, saturation), and build automation for proactive system health assessments and self-remediation.
Production Support & Incident Management
• Participate in on-call rotations, promptly respond to incidents to minimize MTTR, and conduct thorough post-incident reviews to implement preventive measures and improve system resilience.
Security & Compliance
• Design and implement solutions that are secure and compliant by collaborating with dedicated security teams, conducting regular audits, and integrating advanced vulnerability scanning tools.

Maintenance, Optimisation & Performance
• Identify and resolve performance bottlenecks and operational issues, define and track KPIs (e.g., MTTR, system uptime, cost efficiency), and drive ongoing optimisation efforts.
Strategic Customer Engagement
• Act as a technical advisor for tenants, guiding them on containerization and best practices for cloud-native deployments, and participating in strategic initiatives to enhance platform scalability and performance.
Knowledge Sharing & Documentation
• Develop and maintain detailed playbooks, runbooks, and documentation to facilitate team-wide knowledge sharing, streamline incident response, and ensure that critical processes are well understood across the team.
Continuous Learning & Innovation
• Stay current with the latest AWS, Kubernetes, and industry developments, and proactively recommend improvements and innovative solutions to maintain a competitive and reliable platform.

Requirements:

• Bachelor's degree or Diploma in Computer Science, Engineering, or a related field (or equivalent experience).
• Proven experience as a Site Reliability Engineer or similar role, with a strong background in containerization, orchestration, and cloud-native technologies.
• Proven ability to troubleshoot and resolve complex technical issues in containerized applications.
• Demonstrated experience with incident management, including post-incident reviews and continuous improvement.
• Strong documentation skills and experience in knowledge sharing across teams.
• Deep understanding of AWS, Kubernetes (including AWS EKS), and operational best practices, with familiarity in multi-cloud or hybrid environments.
• Solid grasp of networking, security, and storage in both AWS and Kubernetes contexts.
• Experience integrating Kubernetes with AWS cloud technologies (e.g., Secrets Manager, Load Balancers) and using infrastructure-as-code (Terraform or similar).
• Hands-on experience with containerization tools (Kubernetes, Kustomize, Helm) and automation scripting (Go, Python, Bash, or equivalent).
• Ability to write and maintain automated tests or conduct thorough manual testing for automation scripts, ensuring the reliability and effectiveness of automated solutions.
• Familiarity with CI/CD tools (GitLab CI/CD, ArgoCD) and version control systems (Git).
• Experience with observability/monitoring tools (Prometheus, Grafana, ELK Stack) and defining SLOs and Error Budgets.
• Certifications such as Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD) are a plus.
• Experience with developing Kubernetes operators using Go, service mesh technologies, and Chaos Engineering is a plus.

Soft skills:

• Proactive in identifying problems and recommending strategic solutions.
• Excellent problem-solving skills with a robust analytical mindset.
• Clear, concise, and effective communication skills; adept at collaborating across cross-functional teams, including development, security, and customer-facing groups.
• Ability to remain calm and effective under pressure, especially during incident response.
• Adaptability to rapid change with a continuous learning mindset, sharing knowledge to foster team growth.
• Customer-focused with the ability to translate technical insights into understandable, actionable guidance.
• Leadership and mentoring capabilities that contribute to the development of a resilient and collaborative team environment are a plus.

Any personal data you share with us during the application process will be processed strictly in compliance with applicable data protection laws and our Privacy Notice.


Site Reliability Engineer (SRE) (GovTech)

Singapore, Singapore AvePoint

Posted 21 days ago

Job Description

We are seeking a skilled and passionate Engineer to join our team to build and operate a Whole-of-Government (WoG) runtime platform.

As a Site Reliability Engineer, you will be responsible for designing and operating GitLab, AWS and Kubernetes-based infrastructure and solutions that power our platform, to ensure the stability, scalability, and performance of our runtime platform.

Responsibilities

As a Site Reliability Engineer, you will be responsible for:

Toil Reduction & Automation

  • Identify repetitive tasks and develop automation via CI/CD pipelines, ensuring integration with cross-functional teams to reduce manual intervention and improve operational efficiency.
Observability & System Health
  • Implement comprehensive observability solutions (logs, metrics, traces, alerts) around the four Golden Signals (latency, traffic, errors, saturation), and build automation for proactive system health assessments and self-remediation.
Production Support & Incident Management
  • Participate in on-call rotations, promptly respond to incidents to minimize MTTR, and conduct thorough post-incident reviews to implement preventive measures and improve system resilience.
Security & Compliance
  • Design and implement solutions that are secure and compliant by collaborating with dedicated security teams, conducting regular audits, and integrating advanced vulnerability scanning tools.
Maintenance, Optimisation & Performance
  • Identify and resolve performance bottlenecks and operational issues, define and track KPIs (e.g., MTTR, system uptime, cost efficiency), and drive ongoing optimisation efforts.
Strategic Customer Engagement
  • Act as a technical advisor for tenants, guiding them on containerization and best practices for cloud-native deployments, and participating in strategic initiatives to enhance platform scalability and performance.
Knowledge Sharing & Documentation
  • Develop and maintain detailed playbooks, runbooks, and documentation to facilitate team-wide knowledge sharing, streamline incident response, and ensure that critical processes are well understood across the team.
Continuous Learning & Innovation
  • Stay current with the latest AWS, Kubernetes, and industry developments, and proactively recommend improvements and innovative solutions to maintain a competitive and reliable platform.
Requirements
  • Bachelor's degree or Diploma in Computer Science, Engineering, or a related field (or equivalent experience).
  • Proven experience as a Site Reliability Engineer or similar role, with a strong background in containerization, orchestration, and cloud-native technologies.
  • Proven ability to troubleshoot and resolve complex technical issues in containerized applications.
  • Demonstrated experience with incident management, including post-incident reviews and continuous improvement.
  • Strong documentation skills and experience in knowledge sharing across teams.
  • Deep understanding of AWS, Kubernetes (including AWS EKS), and operational best practices, with familiarity in multi-cloud or hybrid environments.
  • Solid grasp of networking, security, and storage in both AWS and Kubernetes contexts.
  • Experience integrating Kubernetes with AWS cloud technologies (e.g., Secrets Manager, Load Balancers) and using infrastructure-as-code (Terraform or similar).
  • Hands-on experience with containerization tools (Kubernetes, Kustomize, Helm) and automation scripting (Go, Python, Bash, or equivalent).
  • Ability to write and maintain automated tests or conduct thorough manual testing for automation scripts, ensuring the reliability and effectiveness of automated solutions.
  • Familiarity with CI/CD tools (GitLab CI/CD, ArgoCD) and version control systems (Git).
  • Experience with observability/monitoring tools (Prometheus, Grafana, ELK Stack) and defining SLOs and Error Budgets.
  • Certifications such as Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD) are a plus.
  • Experience with developing Kubernetes operators using Go, service mesh technologies, and Chaos Engineering is a plus.
Soft Skills
  • Proactive in identifying problems and recommending strategic solutions.
  • Excellent problem-solving skills with a robust analytical mindset.
  • Clear, concise, and effective communication skills; adept at collaborating across cross-functional teams, including development, security, and customer-facing groups.
  • Ability to remain calm and effective under pressure, especially during incident response.
  • Adaptability to rapid change with a continuous learning mindset, sharing knowledge to foster team growth.
  • Customer-focused with the ability to translate technical insights into understandable, actionable guidance.
  • Leadership and mentoring capabilities that contribute to the development of a resilient and collaborative team environment are a plus.

Any personal data you share with us during the application process will be processed strictly in compliance with applicable data protection laws and our Privacy Notice.

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: Data Security Software Products


Site Reliability Engineer, Traffic Platform - Traffic SRE - 2025 Start

Singapore, Singapore ByteDance

Posted 1 day ago

Job Description

Responsibilities

Team Introduction

The Traffic Infrastructure Global Engineering (TIGE)-Traffic Platform team at ByteDance builds and operates large-scale, multi-cloud network services around the world. These services accelerate and optimize network traffic for TikTok and a variety of application services for ByteDance internal customers, including but not limited to layer 4 load balancing, layer 4/7 acceleration, global ingress, CMAF, FaaS, and WAF. By joining us, you can work within a brilliant team and learn how to build a TikTok-scale network traffic platform that serves billions of users globally.

Responsibilities

  • Design and develop features of traffic software (DNS Server, L4 and L7 Proxy, Web Caching, and FaaS Runtime), and integrate them with our traffic platform to process terabyte-scale data in real time.
  • Build data pipelines and develop telemetry systems to support data-driven traffic control.
  • Develop API acceleration and other networking services that run on top of our multi-cloud based traffic platform.
  • Problem solving and performance tuning for online traffic.
  • Research new technologies for more efficient and scalable traffic processing.

Qualifications

Minimum Qualifications

  • Experience in developing network systems in Rust, C, C++, and/or Go, with development skills in a Linux environment.
  • Bachelor's degree in Computer Science, Electrical Engineering, Computer Engineering or related majors.
  • Familiarity with network protocols such as TCP/IP, HTTP/HTTPS, and DNS.
  • Familiarity with microservice architecture.
  • Familiarity with container and orchestration technologies such as Docker and Kubernetes.

Preferred Qualifications

  • Experience in building large scale network services on cloud (AWS, GCP, OCI).
  • Experience in designing and developing high-performance network load balancer products.
  • Experience in developing proxy software such as Nginx and Envoy is a big plus.
  • Experience in system programming using low-level system calls such as epoll, io_uring, etc., is a big plus.
  • Experience in developing Web Caching software such as Apache Traffic Server and Varnish, etc.
  • Experience in Edge Computing and FaaS Runtime development.
  • Experience in building distributed or cloud service based management system.
  • Proficiency in newer networking protocols such as HTTP/2, HTTP/3 (QUIC), TLS 1.3, etc.

By submitting an application for this role, you accept and agree to our global applicant privacy policy, which may be accessed here:

If you have any questions, please reach out to us at

Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Lemon8, CapCut and Pico as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.

  • Seniority level: Entry level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: Software Development


Site Reliability Engineer, Traffic Platform - Traffic SRE - 2025 Start

Singapore, Singapore BYTEDANCE PTE. LTD.

Posted 1 day ago

Job Description

About Us

Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Lemon8, CapCut and Pico as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.

Why Join ByteDance

Inspiring creativity is at the core of ByteDance's mission. Our innovative products are built to help people authentically express themselves, discover and connect – and our global, diverse teams make that possible. Together, we create value for our communities, inspire creativity and enrich life - a mission we work towards every day.

As ByteDancers, we strive to do great things with great people. We lead with curiosity, humility, and a desire to make impact in a rapidly growing tech company. By constantly iterating and fostering an "Always Day 1" mindset, we achieve meaningful breakthroughs for ourselves, our Company, and our users. When we create and grow together, the possibilities are limitless. Join us.

Diversity & Inclusion

ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

Job highlights

Positive team atmosphere, Career growth opportunity, Paid leave, Meals provided, Competitive compensation

Responsibilities

Team Introduction

The Traffic Infrastructure Global Engineering (TIGE)-Traffic Platform team at ByteDance builds and operates large-scale, multi-cloud network services around the world. These services accelerate and optimize network traffic for TikTok and a variety of application services for ByteDance internal customers, including but not limited to layer 4 load balancing, layer 4/7 acceleration, global ingress, CMAF, FaaS, and WAF. By joining us, you can work within a brilliant team and learn how to build a TikTok-scale network traffic platform that serves billions of users globally.

Responsibilities

- Design and develop features of traffic software (DNS Server, L4 and L7 Proxy, Web Caching, and FaaS Runtime), and integrate them with our traffic platform to process terabyte-scale data in real time.

- Build data pipelines and develop telemetry systems to support data-driven traffic control.

- Develop API acceleration and other networking services that run on top of our multi-cloud based traffic platform.

- Problem solving and performance tuning for online traffic.

- Research new technologies for more efficient and scalable traffic processing.

Qualifications

Minimum Qualifications

- Experience in developing network systems in Rust, C, C++, and/or Go, with development skills in a Linux environment.

- Bachelor's degree in Computer Science, Electrical Engineering, Computer Engineering or related majors.

- Familiarity with network protocols such as TCP/IP, HTTP/HTTPS, and DNS.

- Familiarity with microservice architecture.

- Familiarity with container and orchestration technologies such as Docker and Kubernetes.

Preferred Qualifications

- Experience in building large scale network services on cloud (AWS, GCP, OCI).

- Experience in designing and developing high-performance network load balancer products.

- Experience in developing proxy software such as Nginx and Envoy is a big plus.

- Experience in system programming using low-level system calls such as epoll, io_uring, etc., is a big plus.

- Experience in developing Web Caching software such as Apache Traffic Server and Varnish, etc.

- Experience in Edge Computing and FaaS Runtime development.

- Experience in building distributed or cloud service based management system.

- Proficiency in newer networking protocols such as HTTP/2, HTTP/3 (QUIC), TLS 1.3, etc.

By submitting an application for this role, you accept and agree to our global applicant privacy policy, which may be accessed here:

If you have any questions, please reach out to us at

Site Reliability Engineer, Traffic Platform - Traffic SRE - 2025 Start

Singapore, Singapore BYTEDANCE PTE. LTD.

Posted today

Job Description

About Us

Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Lemon8, CapCut and Pico as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.

Why Join ByteDance

Inspiring creativity is at the core of ByteDance's mission. Our innovative products are built to help people authentically express themselves, discover and connect - and our global, diverse teams make that possible. Together, we create value for our communities, inspire creativity and enrich life - a mission we work towards every day.

As ByteDancers, we strive to do great things with great people. We lead with curiosity, humility, and a desire to make impact in a rapidly growing tech company. By constantly iterating and fostering an "Always Day 1" mindset, we achieve meaningful breakthroughs for ourselves, our Company, and our users. When we create and grow together, the possibilities are limitless. Join us.

Diversity & Inclusion

ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

Job highlights

Positive team atmosphere, Career growth opportunity, Paid leave, Meals provided, Competitive compensation

Responsibilities

Team Introduction

The Traffic Infrastructure Global Engineering (TIGE)-Traffic Platform team at ByteDance builds and operates large-scale, multi-cloud network services around the world. These services accelerate and optimize network traffic for TikTok and a variety of application services for ByteDance internal customers, including but not limited to layer 4 load balancing, layer 4/7 acceleration, global ingress, CMAF, FaaS, and WAF. By joining us, you can work within a brilliant team and learn how to build a TikTok-scale network traffic platform that serves billions of users globally.

Responsibilities

- Design and develop features of traffic software (DNS Server, L4 and L7 Proxy, Web Caching, and FaaS Runtime), and integrate them with our traffic platform to process terabyte-scale data in real time.

- Build data pipelines and develop telemetry systems to support data-driven traffic control.

- Develop API acceleration and other networking services that run on top of our multi-cloud based traffic platform.

- Problem solving and performance tuning for online traffic.

- Research new technologies for more efficient and scalable traffic processing.

Qualifications

Minimum Qualifications

- Experience in developing network systems in Rust, C, C++, and/or Go, with development skills in a Linux environment.

- Bachelor's degree in Computer Science, Electrical Engineering, Computer Engineering or related majors.

- Familiarity with network protocols such as TCP/IP, HTTP/HTTPS, and DNS.

- Familiarity with microservice architecture.

- Familiarity with container and orchestration technologies such as Docker and Kubernetes.

Preferred Qualifications

- Experience building large-scale network services on cloud (AWS, GCP, OCI).

- Experience designing and developing high-performance network load balancer products.

- Experience developing proxy software such as Nginx and Envoy is a big plus.

- Experience in systems programming using low-level system calls such as epoll and io_uring is a big plus.

- Experience developing web caching software such as Apache Traffic Server and Varnish.

- Experience in edge computing and FaaS runtime development.

- Experience building distributed or cloud-service-based management systems.

- Proficiency in newer networking protocols such as HTTP/2, HTTP/3 (QUIC), and TLS 1.3.

By submitting an application for this role, you accept and agree to our global applicant privacy policy, which may be accessed here:

If you have any questions, please reach out to us at

Site Reliability Engineer, Traffic Platform - Traffic SRE - 2025 Start

Singapore, Singapore BYTEDANCE PTE. LTD.

Posted today

Job Description

About Us
Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Lemon8, CapCut and Pico as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.
Why Join ByteDance
Inspiring creativity is at the core of ByteDance's mission. Our innovative products are built to help people authentically express themselves, discover and connect - and our global, diverse teams make that possible. Together, we create value for our communities, inspire creativity and enrich life - a mission we work towards every day.
As ByteDancers, we strive to do great things with great people. We lead with curiosity, humility, and a desire to make impact in a rapidly growing tech company. By constantly iterating and fostering an "Always Day 1" mindset, we achieve meaningful breakthroughs for ourselves, our Company, and our users. When we create and grow together, the possibilities are limitless. Join us.
Diversity & Inclusion
ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.
Job highlights
Positive team atmosphere, Career growth opportunity, Paid leave, Meals provided, Competitive compensation
Responsibilities
Team Introduction
The Traffic Infrastructure Global Engineering (TIGE)-Traffic Platform team at ByteDance builds and operates large-scale, multi-cloud network services around the world. These services accelerate and optimize network traffic for TikTok and a variety of application services for ByteDance internal customers, including but not limited to layer 4 load balancing, layer 4/7 acceleration, global ingress, CMAF, FaaS, and WAF. By joining us, you can work within a brilliant team and learn how to build a TikTok-scale network traffic platform that serves billions of users globally.
Responsibilities
- Design and develop features of traffic software (DNS server, L4 and L7 proxy, web caching, and FaaS runtime), and integrate them with our traffic platform to process terabyte-scale data in real time.
- Build data pipelines and develop telemetry systems to support data-driven traffic control.
- Develop API acceleration and other networking services that run on top of our multi-cloud traffic platform.
- Troubleshoot problems and tune performance for online traffic.
- Research new technologies for more efficient and scalable traffic processing.
Qualifications
Minimum Qualifications
- Experience developing network systems in Rust, C, C++, and/or Go, with development skills in a Linux environment.
- Bachelor's degree in Computer Science, Electrical Engineering, Computer Engineering or related majors.
- Familiarity with network protocols such as TCP/IP, HTTP/HTTPS, and DNS.
- Familiarity with microservice architecture.
- Familiarity with container and orchestration technologies such as Docker and Kubernetes.
Preferred Qualifications
- Experience building large-scale network services on cloud (AWS, GCP, OCI).
- Experience designing and developing high-performance network load balancer products.
- Experience developing proxy software such as Nginx and Envoy is a big plus.
- Experience in systems programming using low-level system calls such as epoll and io_uring is a big plus.
- Experience developing web caching software such as Apache Traffic Server and Varnish.
- Experience in edge computing and FaaS runtime development.
- Experience building distributed or cloud-service-based management systems.
- Proficiency in newer networking protocols such as HTTP/2, HTTP/3 (QUIC), and TLS 1.3.
By submitting an application for this role, you accept and agree to our global applicant privacy policy, which may be accessed here:
If you have any questions, please reach out to us

Junior/Senior SRE

Singapore, Singapore DADACONSULTANTS PTE. LTD.

Posted today

Job Description

Roles & Responsibilities

Site Reliability Engineer (SRE)

Responsibilities:

Assist in deploying and managing microservices on Kubernetes cloud platforms.

Work with Cloud and DevOps teams to deploy services across multiple cloud providers (AWS, OCI, Azure, GCP).

Conduct load and chaos testing to ensure system scalability and reliability.

Support disaster recovery planning and troubleshoot production issues.

Automate processes using Python, Go, or Bash.

Define and maintain KPIs (SLA/SLO/SLI) for cloud microservices.

Maintain technical documentation and ensure compliance with security standards (ISO27001, SOC2, GDPR).

Participate in incident response and post-incident analysis.

Assist in technology selection and proof-of-concept implementation.

Provide on-call support as needed.

Requirements:

Bachelor's degree in Computer Science, IT, or related field.

Minimum 3 years of experience in SRE, DevOps, or cloud operations.

Proficiency in a backend language.

Understanding of cloud security and best practices.

Strong problem-solving and teamwork skills.

Cloud certifications (AWS, Azure, GCP).

Experience with Kubernetes and container orchestration.

About Us

Dada Consultants was established in 2017 with a commitment to providing the best recruitment services in Singapore. We are a dynamic head-hunting team dedicated to sourcing highly competent professionals in the IT industry. We provide enterprises with customized talent solutions and help candidates advance their careers.

EA Registration Number: R25128548

Skills:

Scalability, Kubernetes, Azure, AWS, Private Cloud, Microservices, Reliability, Compliance, Python, Cloud-based, Cloud, Java, Orchestration, Teamwork Skills, Disaster Recovery, Hybrid Cloud