Description:
OpenText, a global leader in information management, is seeking a Senior Site Reliability Administrator (Intermediate/Senior Level) to join its dynamic Cloud Application Engineering team. This role is pivotal in ensuring the availability, performance, and stability of OpenText’s cloud-based services while automating repetitive tasks within a high-scale cloud DevOps environment.
As part of OpenText’s mission to drive innovation and digital transformation, this position offers the opportunity to work with cutting-edge cloud infrastructure, microservices, and automation tools, while collaborating with cross-functional teams including Agile developers, sustaining teams, and business partners.
Key Responsibilities:
Develop and implement solutions to enhance operational readiness through logging, monitoring, and metrics.
Create proactive monitoring and alerting systems to reduce incidents.
Lead incident resolution processes, including root cause analysis (RCA) and SWAT investigations.
Collaborate with application owners to design effective risk mitigation and audit remediation strategies.
Develop and maintain runbooks, patterns, and automation pipelines using GitOps practices and tools such as Ansible, Rundeck, or Argo CD.
Support production environments, middleware technologies (Apache, Tomcat, Spring), and Java applications.
Work with RDBMS and NoSQL databases (Oracle, Postgres, MariaDB, Cassandra).
Engage in on-call rotation (24/7/365 support) and provide high-level technical troubleshooting.
Partner with IT, business, and development teams to implement KPIs and enhance system visibility.
Key Skills & Expertise:
Deep knowledge of Linux systems and scripting languages (shell, Perl, Python, JavaScript).
Hands-on experience with cloud platforms (Google Cloud, AWS, or Azure).
Strong understanding of containerization and container platforms (Docker, Kubernetes, Cloud Foundry).
Familiarity with microservices, RESTful architecture, and message brokers (Kafka, RabbitMQ).
Experience with APM tools (New Relic, Dynatrace, AppDynamics) and monitoring solutions (Zabbix, check_mk).
Knowledge of centralized logging systems (Graylog, Kibana), API gateways (Apigee), and API authorization (OAuth 2.0).
Strong problem-solving, analytical, and organizational skills with the ability to manage multiple tasks.
Expertise in security best practices and knowledge of ITIL principles (certification is a plus).
About the Team:
The OpenText Cloud Application Engineering team focuses on improving service performance, stability, and scalability while fostering a collaborative and innovative work environment. This team operates with a strong sense of urgency, customer focus, and data-driven insights to ensure operational excellence.
Diversity & Inclusion:
OpenText is committed to creating an inclusive work environment that respects all backgrounds, identities, and perspectives. Reasonable accommodations are available during the application process.
| Organization | OpenText |
| Industry | Engineering Jobs |
| Occupational Category | Site Reliability Engineer |
| Job Location | Ontario, Canada |
| Shift Type | Morning |
| Job Type | Full Time |
| Gender | No Preference |
| Career Level | Intermediate |
| Experience | 2 Years |
| Posted at | 2025-08-22 1:49 pm |
| Expires on | 2026-01-07 |