Site Reliability Engineer

Location: Toronto

Job ID: 2124

Job Description

Requirements:

  • Proven experience optimizing batch workloads for performance, reliability, and cost.

  • Proficiency with CI/CD pipelines (GitHub Actions, Azure DevOps, Jenkins) and Infrastructure as Code (Terraform, Ansible).

  • Proven experience with containers and orchestration (Docker, Kubernetes).

  • Excellent incident management and root cause analysis skills.

  • Linux Systems Expertise: Kernel/OS tuning, networking, filesystem optimization, process management, and troubleshooting.

  • Dynatrace Mastery: Custom dashboards, KPIs, anomaly detection, tagging strategy, and alerting configuration.

  • Experience with modern development languages (Python, Java, etc.).

  • Airflow Expertise: DAG design best practices, SLA management, scheduler/executor tuning, and scaling strategies.

Contact Us

Tel. (+1) 647-865-2985

© 2022 by MetiSign. 
