Job Description
We are looking for a Data Engineer to design, build, and maintain scalable data pipelines and data infrastructure. The role focuses on handling large datasets, ensuring data quality, and enabling efficient data processing for analytics and business insights.
The ideal candidate will have strong experience with AWS technologies, data processing frameworks, and modern programming languages, along with a solid understanding of data modeling and database systems.
Responsibilities
Data Pipeline Development
- Design, build, and maintain scalable data pipelines
- Develop ETL/ELT processes for efficient data transformation
Cloud & Data Infrastructure
- Work with AWS services such as S3, Redshift, Glue, and EMR
- Optimize data storage and processing in cloud environments
Data Modeling & Quality
- Implement data models and ensure data consistency
- Apply data quality best practices and validation techniques
Data Processing & Tools
- Use Airflow for workflow orchestration and Spark for large-scale data processing
- Write efficient SQL queries for data extraction and transformation
Collaboration & Analysis
- Work with cross-functional teams to understand data needs
- Provide reliable datasets for analytics and reporting
Requirements
- Bachelor’s degree in Computer Science or a related field
- 3–5 years of experience in data engineering
- Strong knowledge of AWS services (S3, Redshift, Glue, EMR)
- Experience with pipeline orchestration and processing tools (Airflow, Spark)
- Strong SQL and database skills
- Proficiency in at least one programming language (Python, Java, or Scala)
- Understanding of data modeling and data quality practices
- Strong analytical and problem-solving skills
- Good communication and teamwork abilities
Skills
- Data Engineering
- AWS (S3, Redshift, Glue, EMR)
- SQL (Advanced)
- Python / Java / Scala
- Apache Airflow
- Apache Spark
- Data Modeling
- ETL/ELT Pipelines
- Data Quality & Validation
- Cloud Data Architecture