Job Description
We are looking for a Lead Data Engineer to architect and scale modern Lakehouse platforms across multi-cloud environments. The role involves leading Snowflake and Databricks implementations, building robust DataOps pipelines, and mentoring high-performing engineering teams.
Key Responsibilities
Architecture & Platform Design
- Architect scalable Data Lake / Lakehouse platforms across AWS, Microsoft Azure, and Google Cloud Platform
- Design multi-cloud data architectures aligned with Data Mesh / Data Fabric principles
- Ensure high availability (HA), disaster recovery (DR), security, and governance
Data Engineering & Pipelines
- Lead development of batch and real-time data pipelines
- Build and optimize DataOps workflows using orchestration tools
- Implement streaming architectures using Kafka and Spark Structured Streaming
- Develop SQL-based transformations using dbt with testing and documentation standards
Snowflake & Databricks Leadership
- Drive enterprise-grade Snowflake implementations:
  - Snowpipe, Streams & Tasks
  - Zero-Copy Cloning
  - Clustering & partitioning
  - Cost optimization & FinOps
- Lead Databricks solutions:
  - Delta Lake
  - Unity Catalog
  - Photon Engine performance tuning
DevOps & DataOps
- Implement Infrastructure as Code (IaC) using Terraform / Pulumi / CloudFormation
- Build containerized workloads using Docker & Kubernetes (EKS / AKS / GKE)
- Design and maintain CI/CD pipelines using GitHub Actions, Azure DevOps, or Jenkins
Leadership & Collaboration
- Lead code reviews and enforce engineering best practices
- Mentor data engineers and contribute to technical roadmaps
- Collaborate with analytics, ML, and business teams
Required Skills (Must-Have)
Experience
- 8+ years of experience in Data Engineering
- 2+ years in a Lead / Architect role
Programming
- Strong proficiency in Python (Pandas, PySpark, APIs)
- Advanced SQL
- Scala or Java is a plus
Cloud Platforms
Hands-on experience with at least two of the following cloud platforms:
- AWS: Glue, Lambda, S3
- Azure: Synapse, ADLS Gen2
- GCP: BigQuery
Data Platforms & Tools
- Snowflake: Expert-level knowledge
- Databricks: Strong hands-on experience
- Orchestration: Airflow / Prefect / Dagster / Azure Data Factory
- Streaming: Kafka / Kinesis / Event Hubs
- Transformations: dbt
- Ingestion: Fivetran / Airbyte
- Governance: Alation / Collibra
BI & Analytics
- Integration with Power BI, Tableau, or Looker
- Semantic layer design and performance optimization
Domain & Architecture Knowledge
- Multi-cloud data lake and lakehouse architectures
- Performance tuning for Snowflake and Databricks
- Cost optimization and FinOps strategies
- MLOps concepts: feature stores, ML pipelines
- Real-time data processing architectures
Preferred (Nice-to-Have)
- Certifications:
  - AWS Data Analytics
  - Azure DP-203
  - SnowPro Core / Advanced
- AI / GenAI exposure:
  - Vector Databases
  - RAG pipelines
  - GenAI workflows
Education
- UG: Any Graduate
Key Skills
Lead Data Engineer, Data Engineering, Databricks, Snowflake, Azure, AWS, GCP, Python, PySpark, SQL, Kafka, Airflow, CI/CD, DevOps, DataOps, MLOps, Terraform, Kubernetes, Spark, ETL, Data Lake
📩 Interested candidates can share their resumes at:
soumi.das@nzminds.com