Data Engineer (Azure & Databricks)

Hybrid

Published 14 hours ago

Role Overview

We are seeking a highly skilled Data Engineer with strong experience in Azure and Databricks, who will play a critical role in designing, transforming, and operationalizing data pipelines within a modern Lakehouse architecture.

The role primarily focuses on transforming data from the Bronze layer into curated analytics-ready datasets, building automated CI/CD pipelines, and developing high-quality Python and PySpark-based data solutions. The engineer will also collaborate closely with Data Scientists and Software Engineers and should be open to contributing to data-driven UI/UX initiatives.

Data Engineering & Transformation

Design, develop, and maintain scalable data transformation pipelines in Azure Databricks using Python, PySpark, and SQL, alongside tools such as Azure Data Factory (ADF)
Implement transformation logic to move data from Bronze to Silver/Gold layers following data engineering best practices
Apply strong data engineering principles to ensure data reliability, quality, performance, and reusability
Work with structured and semi-structured data at scale

Databricks, Azure & Cloud ETL

Build and manage Databricks notebooks, jobs, Delta Lake tables, and orchestrated workflows
Hands-on experience with cloud-based ETL platforms (preferred: Microsoft Azure Databricks, Synapse, Azure Functions; otherwise AWS or Google Cloud)
Optimize data pipelines for performance, scalability, and cost efficiency

Python Applications, APIs & Automation

Design, develop, and maintain Python applications, scripts, and APIs for data processing and automation
Write production-grade Python code with a strong focus on readability, maintainability, and testing
Leverage Python for orchestration, validation, and integration with downstream systems

Collaboration with Data Science & Engineering Teams

Collaborate closely with Data Scientists and Data Analysts to understand data, analytical models, and consumption requirements
Enable and support advanced analytics and data science workflows by preparing high-quality feature datasets
Translate analytical needs into scalable data engineering solutions

CI/CD, DevOps & Platform Engineering

Build and maintain automated CI/CD pipelines for data and Databricks workloads
Hands-on experience with DevOps tools and practices, including Git-based version control
Exposure to containerization and orchestration platforms such as Kubernetes / OpenShift
Ensure smooth promotion of code and pipelines across environments (Dev/Test/Prod)

Data Modeling & Querying

Design and implement robust data models optimized for analytics and reporting
Strong hands-on knowledge of SQL and exposure to KQL or other query languages
Apply best practices in data structures, indexing, and performance tuning

UI / UX & Data Applications (Additional Advantage)

Open to contributing to data-driven UI/UX components, dashboards, or lightweight data applications
Work with analytics and business teams to improve data usability and customer experience

Required Skills & Qualifications

Must-Have

Strong hands-on expertise in Python (with frameworks like PySpark)
Solid foundation in Data Engineering principles and large-scale data processing
Experience with Azure Databricks and cloud-based ETL platforms
Strong knowledge of SQL and data querying techniques
Experience with CI/CD pipelines and DevOps practices
Experience in pipeline monitoring and alerting
Ability to design efficient, scalable solutions to complex data problems

Good-to-Have

Experience with Azure Synapse and Azure Functions
Exposure to AWS or Google Cloud data platforms
Hands-on experience with OpenShift
Knowledge of data science concepts and workflows
Familiarity with analytics platforms, dashboards, and UI/UX considerations

Full time

Mid-Senior Level

Data Science & Analytics
