Accountabilities:
- Design and implement advanced reinforcement learning algorithms to improve decision-making, policy optimization, and system performance across simulated and real-world environments.
- Run controlled experiments, track performance metrics, evaluate outcomes against benchmarks, and iterate on model improvements through empirical analysis.
- Develop and curate high-quality simulation environments and training datasets aligned with domain-specific requirements and learning objectives.
- Debug and optimize RL pipelines, addressing challenges such as exploration strategy, reward stability, sample efficiency, and training convergence.
- Collaborate with engineering and research teams to integrate RL agents into production systems and ensure measurable real-world performance gains.
- Define evaluation frameworks and continuously monitor deployed systems to support robustness, scalability, and domain adaptation.
Requirements:
- Advanced degree in Computer Science, Machine Learning, or a related field; PhD preferred, with a strong academic research background and publications at top-tier conferences.
- Proven experience running large-scale reinforcement learning projects, including modern online RL techniques such as policy optimization methods and actor-critic frameworks.
- Deep understanding of reinforcement learning theory and practice, including policy gradients, exploration-exploitation trade-offs, and optimization strategies for stability and efficiency.
- Strong hands-on expertise with PyTorch and RL frameworks, including building full pipelines from simulation to training and deployment.
- Demonstrated ability to solve complex RL challenges such as sample inefficiency, reward noise, and training instability through empirical and algorithmic innovation.
- Strong analytical mindset, with the ability to design robust experiments, interpret results, and continuously improve model performance.
Benefits:
- Fully remote work environment with global team collaboration.
- Opportunity to work on cutting-edge AI and reinforcement learning research at scale.
- High-impact role influencing production-level AI systems and real-world applications.
- Competitive compensation aligned with experience and expertise.
- Exposure to advanced research, multimodal AI systems, and state-of-the-art infrastructure.
- Flexible working culture supporting autonomy and innovation.