Allata

Data Engineer (Databricks + Python + Azure)

🕐 28 days ago 📍 Dallas, TX

Role & Responsibilities:

  • Design, develop, and maintain scalable data pipelines using Databricks (PySpark) and Python.
  • Build and optimize ETL/ELT processes within Azure cloud environments.
  • Implement data models following modern Data Lakehouse principles (e.g., Medallion architecture).
  • Ensure data quality, consistency, and performance across ingestion, staging, and curated layers.
  • Collaborate with data architects, analysts, and business stakeholders to translate healthcare data requirements into technical solutions.
  • Develop reusable data transformation logic and modular processing components.
  • Support deployment processes following CI/CD and DevOps best practices.
  • Monitor and optimize data workflows for performance, scalability, and reliability.
  • Contribute to data governance, security, and compliance practices relevant to healthcare environments.

Hard Skills - Must have:

  • Current knowledge of, and hands-on experience with, modern data tools such as Databricks, Fivetran, Data Fabric, and others.
  • Core experience with data architecture, data integrations, data warehousing, and ETL/ELT processes.
  • Applied experience developing and deploying custom .whl packages and/or in-session notebook scripts for custom execution across parallel executor and worker nodes.
  • Applied experience in SQL, Stored Procedures, and PySpark based on area of data platform specialization.
  • Strong knowledge of cloud and hybrid relational database systems, such as MS SQL Server, PostgreSQL, Oracle, Azure SQL, AWS RDS, Aurora or a comparable engine.
  • Strong experience with batch and streaming data processing techniques and file compaction strategies.

Hard Skills - Nice to have:

  • Strong hands-on experience with Databricks in Azure environments.
  • Advanced proficiency in Python and PySpark for distributed data processing.
  • Experience building and optimizing data pipelines in Azure (Azure Data Factory, Azure SQL, Data Lake Storage, etc.).
  • Solid understanding of data warehousing, data lakehouse concepts, and ETL/ELT frameworks.
  • Experience working with relational databases such as SQL Server, PostgreSQL, Oracle, or similar.
  • Knowledge of batch and streaming data processing patterns.
  • Experience working with large, complex datasets in cloud-based distributed environments.

Soft Skills / Business Specific Skills:

  • Strong analytical and problem-solving skills.
  • Ability to work effectively in cross-functional and distributed teams.
  • Clear communication skills, with the ability to explain technical concepts to non-technical stakeholders.
  • Proactive mindset with a strong sense of ownership.
  • Commitment to delivering high-quality, reliable data solutions.
