Job Description
A Cloud Ops Engineer is responsible for maintaining, monitoring, troubleshooting, and supporting cloud-based applications and infrastructure.
This role focuses on:
- Production support
- Cloud operations
- Incident management
- Infrastructure monitoring
- Automation
- System reliability
The engineer ensures applications and services remain:
- Stable
- Secure
- Highly available
- Scalable
This role also involves:
- DevOps practices
- Kubernetes environments
- Cloud deployments
- Application release support
- 24×7 operational support
Responsibilities
Incident & Production Support
- Troubleshoot production and non-production issues
- Resolve incidents within SLA timelines
- Analyze logs and debug system issues
- Perform Root Cause Analysis (RCA)
- Minimize customer impact during outages
- Communicate incident updates to stakeholders
Cloud Operations
- Monitor cloud infrastructure and applications
- Support deployments across multiple environments
- Maintain high availability and system reliability
- Support cloud platforms such as AWS and Oracle Cloud Infrastructure (OCI)
Automation & DevOps
- Automate repetitive operational tasks
- Improve operational efficiency using scripts and DevOps tools
- Support CI/CD pipelines and infrastructure automation
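As an illustration of the kind of task automation meant here, the following is a minimal Python sketch that counts ERROR entries per service in a batch of log lines and lists the services that reach a threshold. The log line layout (`<timestamp> <service> <LEVEL> <message>`), the service names, the function names, and the threshold are all assumptions made for the example, not details of any specific environment or tool.

```python
from collections import Counter


def count_errors(log_lines):
    """Count ERROR entries per service.

    Assumes each line looks like '<timestamp> <service> <LEVEL> <message>'
    (a hypothetical format chosen for this sketch).
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split(maxsplit=3)
        # Skip malformed lines that do not have at least three fields.
        if len(parts) >= 3 and parts[2] == "ERROR":
            counts[parts[1]] += 1
    return counts


def services_over_threshold(counts, threshold):
    """Return the sorted names of services with at least `threshold` errors."""
    return sorted(name for name, n in counts.items() if n >= threshold)


# Hypothetical sample data; real logs would be read from files or a log store.
SAMPLE_LOG = [
    "2024-01-01T00:00:00Z checkout ERROR timeout calling payment gateway",
    "2024-01-01T00:00:05Z checkout ERROR timeout calling payment gateway",
    "2024-01-01T00:00:07Z search INFO request served",
    "2024-01-01T00:00:09Z search ERROR index unavailable",
]

if __name__ == "__main__":
    error_counts = count_errors(SAMPLE_LOG)
    print(services_over_threshold(error_counts, threshold=2))
```

A script along these lines could be scheduled or wired into an alerting workflow; in practice the parsing would need to match the actual log format in use.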
Monitoring & Observability
Use monitoring tools such as Prometheus, Grafana, Zabbix, and the ELK Stack to track:
- System health
- Performance
- Logs
- Alerts
Infrastructure & Middleware Support
Support technologies such as:
- Linux servers
- Windows servers
- JBoss
- Apache
- Tomcat
- NGINX
Container & Kubernetes Support
- Manage containerized applications
- Support Kubernetes clusters
- Work with Helm Charts and ArgoCD
Collaboration & Documentation
- Work with DevOps, developers, and infrastructure teams
- Use JIRA and Confluence for tracking and documentation
- Participate in Agile/Scrum processes
Required Skills
Operating Systems
- Linux (Ubuntu, CentOS, Amazon Linux, OEL)
- Windows Server
Cloud & DevOps Skills
- AWS
- OCI
- Jenkins
- Terraform
- Git
- Ansible
Scripting Skills
- Bash/Shell Scripting
- Python
- Perl
- Groovy
Container & Orchestration Skills
- Docker
- Kubernetes
- Helm Charts
- ArgoCD
Networking & Systems Skills
- DNS
- Firewalls
- LDAP
- SFTP
- File Systems
Monitoring Tools
- Prometheus
- Grafana
- ELK Stack
- Zabbix
Middleware Technologies
- JBoss
- Apache
- Tomcat
- NGINX
Preferred Skills
- Agile/Scrum experience
- Oracle/RDBMS knowledge
- Hadoop/Spark exposure
- Elasticsearch/Kibana
- Informatica
- Looker
- WebSphere/WebLogic
- AWS Certification