CoDev

SRE Platform Engineer

ID 2025-2866
Office Location
PH--Negros Occidental
Job Locations
PH-Bacolod-Negros Occidental | PH-Cebu-Cebu City | PH-Davao del Sur-Davao City | PH-Metro Manila-Makati
Shift Schedule
4 pm - 1 am PHT, 9 pm - 6 am PHT
Work Setup
Remote

Responsibilities

  • Design, implement, and support scalable, reliable infrastructure to power
    production and development environments.
  • Manage and enhance our container orchestration systems, with a focus on
    Kubernetes (EKS), while maintaining a balanced view of other critical AWS
    services such as EC2, ALB, IAM, and VPC networking.
  • Build and maintain automation for application and infrastructure deployment,
    scaling, and lifecycle management.
  • Partner with software engineering teams to improve build, release, and
    deployment processes across CI/CD pipelines.
  • Monitor and improve system availability, latency, and performance across the full
    stack—from cloud infrastructure to backend services.
  • Develop internal tools and scripts to enhance operational efficiency, resilience,
    and security.
  • Play a key role in incident response efforts, including root cause analysis and
    long-term remediation.
  • Participate in architecture reviews and help guide decisions on infrastructure
    design, resilience, and observability.
  • Stay informed on industry trends in reliability engineering, cloud-native tooling,
    and DevOps practices, and integrate improvements into our operational
    playbook.
  • Champion security, scalability, and cost-efficiency in all infrastructure decisions.

Qualifications

  • 5+ years of experience in a DevOps, SRE, or infrastructure engineering role
    supporting production systems at scale.
  • Strong knowledge of AWS services and how they integrate to support modern
    cloud architectures.
  • Proficiency with Infrastructure as Code (IaC) tools such as Terraform, as well
    as configuration management tools.
  • Experience designing and supporting CI/CD pipelines (e.g., Jenkins, GitHub
    Actions, ArgoCD).
  • Scripting or programming skills in Python, Go, or similar languages, used for
    automation and tooling.
  • Deep understanding of systems observability, including logging, metrics, and
    tracing (e.g., Prometheus, Grafana, CloudWatch).
  • Ability to diagnose and troubleshoot complex issues across distributed systems,
    including performance bottlenecks and availability challenges.
  • Familiarity with security best practices for cloud and containerized environments.
  • Clear and proactive communicator, comfortable working cross-functionally in a
    fast-paced environment.
