Job Description
Role Overview
We are looking for a hands-on Machine Learning Engineer (5+ years of experience) who can design, develop, deploy, and optimize machine learning models and pipelines in a cloud-native AWS environment.
The ideal candidate has strong programming fundamentals, deep experience with Python and ML frameworks, and solid exposure to MLOps, containerization, and production-grade ML systems. You will work independently within an agile product engineering team, focusing on scalability, performance, reliability, and maintainability of ML solutions.
Key Responsibilities
Machine Learning Development
- Design, develop, train, and optimize machine learning models for production use.
- Perform feature engineering, hyperparameter tuning, and model evaluation.
- Monitor and evaluate model performance using validation, drift, and accuracy metrics.
- Detect data drift using metrics such as PSI and the KS test, and track model drift through performance metrics such as F1 score, ROC-AUC, and RMSE.
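For illustration only, the kind of drift check referred to above can be sketched as follows. This is a minimal, standard-library example of the Population Stability Index (PSI), assuming quantile bins taken from a reference sample; the function name, the bin count, and the `eps` floor are hypothetical choices for this sketch, not a prescribed implementation.

```python
import math
from bisect import bisect_right
from statistics import quantiles


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a reference sample (expected) and a recent sample (actual)."""
    # Interior bin edges are quantiles of the reference sample (assumption:
    # quantile binning; fixed-width bins are another common choice).
    edges = quantiles(expected, n=bins)

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor at eps so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref = bin_fractions(expected)
    cur = bin_fractions(actual)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

A PSI near zero means the two samples are distributed similarly across the bins. Thresholds such as 0.1 and 0.25 are often quoted as rules of thumb, but the right cut-off depends on the feature and the use case.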
MLOps & Model Deployment
- Build and deploy ML models as containerized APIs/services using AWS SageMaker and FastAPI / Flask (or similar frameworks).
- Manage the full ML lifecycle: training, validation, deployment, monitoring.
- Implement CI/CD pipelines for ML workflows (GitLab CI, MLflow integration).
- Maintain model and dataset versioning.
- Implement observability (logging, metrics, monitoring) for ML services.
- Work with Docker and orchestration tools for scalable deployments.
- Good to have: Experience with Databricks for large-scale ML workflows.
Databases & Data Storage
- Work with relational databases such as PostgreSQL or MySQL.
- Write optimized SQL queries with proper joins, indexing, and transactions.
- Work with vector databases.
- Exposure to Redis or NoSQL databases is a plus.
API Development & Integration
- Design and develop RESTful APIs for ML model serving.
- Document APIs using Swagger / OpenAPI.
- Implement request/response validation, API versioning, and error handling.
- Integrate ML services with production systems.
Engineering Best Practices (Mandatory)
- Write clean, maintainable, and well-structured code.
- Follow Git-based workflows and participate in code reviews.
- Implement proper logging, exception handling, and unit tests.
- Apply software engineering fundamentals (modular design, reusable components).
Performance, Reliability & Security
Performance
- Optimize database queries and API performance.
- Understand caching, pagination, and async processing basics.
Reliability
- Implement retry mechanisms, timeouts, and fallback strategies.
- Ensure resilience of production ML systems.
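As a rough illustration of the reliability practices listed above, the sketch below retries a failing call with exponential backoff and returns a fallback value when all attempts fail. The helper name `call_with_retry` and its parameters are hypothetical, and the sketch assumes the wrapped callable enforces its own timeout (for example by raising `TimeoutError`).

```python
import time


def call_with_retry(func, retries=3, base_delay=0.1, fallback=None, sleep=time.sleep):
    """Call func(); retry on failure with exponential backoff, then fall back."""
    for attempt in range(retries):
        try:
            return func()
        except Exception:
            # Assumption: every exception is treated as retryable. A real
            # service would usually retry only on timeouts or transient errors.
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    # All attempts failed: return the fallback instead of raising.
    return fallback
```

Passing `sleep` as a parameter keeps the helper easy to unit test, since a test can replace it with a no-op.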
Security
- Implement API security practices (OAuth2, authentication, authorization).
- Enforce rate limiting and input validation.
- Follow secure coding practices to prevent SQL injection and other common vulnerabilities.
- Ensure compliance with data privacy regulations such as GDPR, and apply encryption to protect sensitive data.
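To make the SQL injection point concrete, here is a minimal sketch of a parameterized query using Python's built-in `sqlite3` module. The `models` table and the `find_model_by_name` helper are hypothetical and exist only for this example; other drivers follow the same idea with their own placeholder syntax.

```python
import sqlite3


def find_model_by_name(conn, name):
    # The "?" placeholder makes the driver treat `name` strictly as data,
    # so quotes or SQL keywords in the input cannot alter the query.
    cur = conn.execute("SELECT id, name FROM models WHERE name = ?", (name,))
    return cur.fetchall()


# Hypothetical in-memory table used only to demonstrate the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE models (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO models (name) VALUES (?)", [("churn-v1",), ("fraud-v2",)])
```

With this approach, an input such as `' OR '1'='1` is compared literally against the `name` column and matches nothing, whereas building the query by string concatenation would return every row.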
Collaboration & Soft Skills
- Strong English communication skills.
- Ability to explain technical decisions clearly to stakeholders.
- Comfortable working in Agile (Scrum/Kanban) teams.
- Participate actively in stand-ups, sprint planning, reviews, and retrospectives.
- Proactive, dependable, and able to work independently once requirements are defined.
- Comfortable collaborating across time zones.
Expected Deliverables
- Production-ready ML models deployed via APIs and containers.
- CI/CD pipelines for ML training and deployment.
- High-quality, standards-compliant code.
- Bug fixes, enhancements, and production support.
- Accurate sprint updates and timely delivery.
- Clear technical documentation for APIs and ML services.
Experience & Qualifications
Experience
- 5+ years of overall experience as a Machine Learning Engineer or MLOps Engineer.
Education
- UG: Any Graduate
- PG: Any Postgraduate
Key Skills
Python, Machine Learning, TensorFlow, PyTorch, scikit-learn, Pandas, NumPy, SQL, PostgreSQL, Docker, AWS, SageMaker, MLflow, Git, CI/CD, REST APIs, FastAPI, Flask, Secure Coding, Model Deployment, Data Drift, MLOps, Vector Databases
Good-to-Have Certifications
- AWS Certified Machine Learning – Specialty (Optional)