Data Engineer (Azure & Databricks)

Hybrid

Published 14 hours ago

Role Overview

We are seeking a highly skilled Data Engineer with strong experience in Azure and Databricks to play a critical role in designing, building, and operationalizing data pipelines within a modern Lakehouse architecture.

The role primarily focuses on transforming data from the Bronze layer into curated analytics-ready datasets, building automated CI/CD pipelines, and developing high-quality Python and PySpark-based data solutions. The engineer will also collaborate closely with Data Scientists and Software Engineers and should be open to contributing to data-driven UI/UX initiatives.

Data Engineering & Transformation

  • Design, develop, and maintain scalable data transformation pipelines in Azure Databricks using Python, PySpark, and SQL, with tools such as Azure Data Factory (ADF)
  • Implement transformation logic to move data from Bronze to Silver/Gold layers following data engineering best practices
  • Apply strong data engineering principles to ensure data reliability, quality, performance, and reusability
  • Work with structured and semi-structured data at scale

Databricks, Azure & Cloud ETL

  • Build and manage Databricks notebooks, jobs, Delta Lake tables, and orchestrated workflows
  • Hands-on experience with cloud-based ETL platforms (preferred: Microsoft Azure Databricks, Synapse, Azure Functions; otherwise AWS or Google Cloud)
  • Optimize data pipelines for performance, scalability, and cost efficiency

Python Applications, APIs & Automation

  • Design, develop, and maintain Python applications, scripts, and APIs for data processing and automation
  • Write production-grade Python code with a strong focus on readability, maintainability, and testing
  • Leverage Python for orchestration, validation, and integration with downstream systems

Collaboration with Data Science & Engineering Teams

  • Collaborate closely with Data Scientists and Data Analysts to understand data, analytical models, and consumption requirements
  • Enable and support advanced analytics and data science workflows by preparing high-quality feature datasets
  • Translate analytical needs into scalable data engineering solutions

CI/CD, DevOps & Platform Engineering

  • Build and maintain automated CI/CD pipelines for data and Databricks workloads
  • Hands-on experience with DevOps tools and practices, including Git-based version control
  • Exposure to containerization and orchestration platforms such as Kubernetes / OpenShift
  • Ensure smooth promotion of code and pipelines across environments (Dev/Test/Prod)

Data Modeling & Querying

  • Design and implement robust data models optimized for analytics and reporting
  • Strong hands-on knowledge of SQL and exposure to KQL or other query languages
  • Apply best practices in data structures, indexing, and performance tuning

UI / UX & Data Applications (Additional Advantage)

  • Open to contributing to data-driven UI/UX components, dashboards, or lightweight data applications
  • Work with analytics and business teams to improve data usability and customer experience

Required Skills & Qualifications

Must-Have

  • Strong hands-on expertise in Python (with frameworks like PySpark)
  • Solid foundation in Data Engineering principles and large-scale data processing
  • Experience with Azure Databricks and cloud-based ETL platforms
  • Strong knowledge of SQL and data querying techniques
  • Experience with CI/CD pipelines and DevOps practices
  • Experience in pipeline monitoring and alerting
  • Ability to design efficient, scalable solutions to complex data problems

Good-to-Have

  • Experience with Azure Synapse and Azure Functions
  • Exposure to AWS or Google Cloud data platforms
  • Hands-on experience with OpenShift
  • Knowledge of data science concepts and workflows
  • Familiarity with analytics platforms, dashboards, and UI/UX considerations

Full time

Mid-Senior Level

Data Science & Analytics