DevOps Tech Lead

 

Description:

How you will make an impact:

  • Design, implement, and operate Kubernetes clusters at scale. You will lead the deployment and management of Kubernetes clusters in production environments, ensuring reliability and performance at scale. You will develop and maintain custom Kubernetes Operators and CSI drivers to extend cluster functionality and meet specific operational needs.
  • Engineer and automate on-premises infrastructure. You will design, maintain, and automate bare-metal and co-location (Colo) environments without reliance on public cloud providers. This role requires a deep understanding of physical infrastructure, data center operations, and custom hardware integrations to optimize performance and reliability.
  • Develop automation solutions for enterprise networking. You will create and maintain automation workflows for enterprise networking environments, particularly those using Cisco technologies. Your work will ensure seamless integration of network changes and configurations into CI/CD and Infrastructure as Code (IaC) workflows, improving operational efficiency and reducing manual effort.
  • Build and maintain production-grade software and tools. You will develop infrastructure automation and management tools using Go, Python, Bash, TypeScript, and Rust. Through high-quality, maintainable code, your work will improve system monitoring, operational efficiency, and platform reliability.
  • Design and automate VMware environments. You will architect and manage VMware infrastructure, including vSphere, vCenter, and vSAN, ensuring seamless integration with Kubernetes and CI/CD workflows. Your work will enhance virtualization efficiency, automation, and scalability across environments.
  • Administer Linux and Windows systems. You will manage Linux environments (RHEL, Debian, Ubuntu) at an advanced level, ensuring stability, security, and performance. Additionally, you will support Windows Server environments, including Active Directory integration, to maintain interoperability across platforms.
  • Lead Infrastructure as Code (IaC) and configuration management. You will drive the adoption and implementation of Terraform and Ansible to enable version-controlled, repeatable, and automated infrastructure deployment. Your expertise will ensure consistency, scalability, and efficiency in infrastructure provisioning.
  • Architect and optimize CI/CD pipelines. You will design and maintain CI/CD pipelines and processes using GitLab CI, Jenkins, ArgoCD, and other automation tools. Your contributions will enhance deployment velocity, reliability, and security, supporting continuous delivery and operational excellence.
  • Mentor and support engineering team members. You will provide guidance, mentorship, and code reviews for junior and intermediate engineers, sharing best practices and fostering a collaborative learning environment. Your leadership will help the team overcome challenges and improve technical skills.
  • Implement and manage monitoring and observability tools. You will deploy and maintain real-time monitoring and observability solutions such as Netdata, New Relic, Prometheus, and Grafana, ensuring proactive system health monitoring and performance optimization.
  • Work within structured ITIL processes. You will operate within an ITIL-based framework, contributing to incident management, change management, and problem resolution. Your involvement will support continual service improvements and operational efficiency.
  • Apply and advocate for DevOps methodologies. You will promote and implement DevOps principles, ensuring seamless alignment between development, operations, and business objectives. Your work will foster a culture of automation, collaboration, and continuous improvement.
  • Lead technical design discussions and innovation. You will actively participate in architectural discussions and strategic planning, challenging existing approaches and introducing innovative solutions to improve scalability, security, and performance.
  • Identify and deliver innovative solutions. You will proactively identify, propose, and implement solutions to complex infrastructure challenges, often requiring custom-built tools and creative problem-solving approaches.
  • Participate in on-call rotations and incident response. As part of an on-call rotation, you will respond to production incidents and critical issues as needed, ensuring minimal downtime and rapid resolution.

What you bring:

  • The technical expertise. You have worked as a Senior DevOps Engineer, Systems Engineer, or in similar roles, ideally in high-availability production environments. You have extensive hands-on Kubernetes experience in production, including custom controllers/operators, CSI drivers, and multi-cluster management. You have a deep understanding of co-location and bare-metal environments, including rack/stack, PXE booting, provisioning, and physical hardware management. You have networking experience, including Cisco enterprise networking (routing, switching, VLANs, firewalls), and have automated network configurations using Ansible or similar tools.
  • The software development expertise. You have expert-level coding skills in one or more of the following: Go, Python, Bash, TypeScript, or Rust, including building automation tools, operators, and CLIs. You have deep knowledge of VMware (vSphere, vCenter, ESXi, vSAN), with experience scripting and automating tasks using PowerCLI or similar tools. You have expert-level Linux administration skills (Red Hat, Debian/Ubuntu) and a solid working knowledge of Windows Server, including Active Directory integrations and DNS.
  • The infrastructure and automation knowledge. You have strong experience with Infrastructure as Code (Terraform, Ansible) and configuration management principles. You have advanced CI/CD expertise, including pipeline design, artifact management, security scanning, and deployment strategies (GitOps, ArgoCD, FluxCD). You have experience working with observability stacks, such as Netdata, New Relic, Prometheus, Grafana, Loki, and ELK. You have strong knowledge of ITIL processes, with experience operating within structured incident, problem, and change management environments. You have exposure to ultra-low-latency or real-time environments.
  • The critical thinking skills. You can innovate and find creative solutions to complex problems without relying on cloud-native offerings. You are highly analytical, able to assess trade-offs, and committed to optimizing performance, security, and reliability. You are comfortable with ambiguity and willing to figure things out when no clear path or process is outlined.
  • The interpersonal skills. You have applied DevOps principles (CI/CD, IaC, GitOps, immutable infrastructure) in enterprise environments and can clearly communicate their value to both technical and non-technical stakeholders. You can build trusting relationships with in-person and remote teams. You can lead, mentor, and collaborate with cross-functional teams, including development, operations, and security teams. You quickly identify when priorities need to shift and take feedback from leaders and peers.

Organization: illumin
Industry: IT / Telecom / Software Jobs
Occupational Category: DevOps Tech Lead
Job Location: Toronto, Canada
Shift Type: Morning
Job Type: Full Time
Gender: No Preference
Career Level: Intermediate
Experience: 2 Years
Posted at: 2025-04-10 7:07 am
Expires on: 2026-01-19