Role & Responsibilities:
- Design, develop, and maintain scalable data pipelines using Databricks (PySpark) and Python.
- Build and optimize ETL/ELT processes within Azure cloud environments.
- Implement data models following modern Data Lakehouse principles (e.g., Medallion architecture).
- Ensure data quality, consistency, and performance across ingestion, staging, and curated layers.
- Collaborate with data architects, analysts, and business stakeholders to translate healthcare data requirements into technical solutions.
- Develop reusable data transformation logic and modular processing components.
- Support deployment processes following CI/CD and DevOps best practices.
- Monitor and optimize data workflows for performance, scalability, and reliability.
- Contribute to data governance, security, and compliance practices relevant to healthcare environments.
Hard Skills - Must have:
- Current knowledge of, and hands-on experience with, modern data tools such as Databricks, Fivetran, Data Fabric, and others.
- Core experience with data architecture, data integrations, data warehousing, and ETL/ELT processes.
- Applied experience developing and deploying custom wheel (.whl) packages and/or in-session notebook scripts for execution across parallel executor and worker nodes.
- Applied experience with SQL, stored procedures, and PySpark, depending on area of data platform specialization.
- Strong knowledge of cloud and hybrid relational database systems, such as MS SQL Server, PostgreSQL, Oracle, Azure SQL, AWS RDS, Aurora, or a comparable engine.
- Strong experience with batch and streaming data processing techniques and file compaction strategies.
Hard Skills - Nice to have:
- Strong hands-on experience with Databricks in Azure environments.
- Advanced proficiency in Python and PySpark for distributed data processing.
- Experience building and optimizing data pipelines in Azure (Azure Data Factory, Azure SQL, Data Lake Storage, etc.).
- Solid understanding of data warehousing, data lakehouse concepts, and ETL/ELT frameworks.
- Experience working with relational databases such as SQL Server, PostgreSQL, Oracle, or similar.
- Knowledge of batch and streaming data processing patterns.
- Experience working with large, complex datasets in cloud-based distributed environments.
Soft Skills / Business Specific Skills:
- Strong analytical and problem-solving skills.
- Ability to work effectively in cross-functional and distributed teams.
- Clear communication skills, with the ability to explain technical concepts to non-technical stakeholders.
- Proactive mindset with a strong sense of ownership.
- Commitment to delivering high-quality, reliable data solutions.