Description
We are seeking a skilled and motivated Data Engineer to play a key role in building scalable data pipelines and preparing data for advanced AI use cases. This role focuses on using Azure Databricks and implementing the medallion architecture to support robust data engineering workflows.
Key Responsibilities
Design, develop, and maintain scalable data pipelines using Azure Databricks.
Implement and optimize the medallion architecture (bronze, silver, gold layers) to support AI/ML use cases.
Collaborate with data scientists, analysts, and business stakeholders to understand data requirements.
Ensure data quality, integrity, and governance across all layers of the architecture.
Automate data ingestion, transformation, and enrichment processes.
Monitor and troubleshoot data workflows and performance issues.
Document data engineering processes and contribute to best practices.
Required Skills & Qualifications
Proven experience with Azure Databricks and the medallion architecture.
Strong proficiency in Python, SQL, and Spark.
Experience preparing data for AI/ML models, including feature engineering and data normalization.
Familiarity with Delta Lake, Azure Synapse, and Azure Data Factory.
Understanding of data governance, security, and compliance principles.
Excellent problem-solving and communication skills.
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
Preferred Qualifications
Experience with MLflow, Unity Catalog, or other MLOps and data governance tools.
Knowledge of CI/CD pipelines for data engineering.
Familiarity with real-time data processing and streaming architectures.