Accountabilities
In this role, you will be responsible for designing, optimizing, and maintaining the inference layer that enables high-performance AI execution on edge devices. You will ensure systems are robust, efficient, and scalable across diverse hardware environments.
- Develop and optimize C++-based inference systems for deploying AI models on edge devices.
- Enhance and adapt inference engines such as llama.cpp, ggml, and ONNX Runtime for improved performance and compatibility.
- Improve runtime efficiency, focusing on memory usage, latency, throughput, and long-session stability.
- Collaborate with research teams to transition models from experimentation to production-ready deployments.
- Define and maintain core abstractions that support scalable and maintainable inference capabilities.
- Integrate AI-driven features into existing products, ensuring seamless performance and reliability.
- Continuously evaluate and implement new technologies to improve system capabilities and efficiency.
Requirements
You are a highly skilled engineer with a strong foundation in systems programming and machine learning, capable of working on complex, performance-critical AI infrastructure.
- Strong programming expertise in C++; experience with JavaScript is a plus.
- Proven experience with inference frameworks such as llama.cpp, ggml, ONNX Runtime, or similar technologies.
- Solid understanding of deep learning concepts, including transformers, LLMs, and diffusion models.
- Experience deploying and optimizing machine learning models on edge devices or constrained environments.
- Ability to quickly learn and apply new technologies in a fast-evolving AI landscape.
- Strong problem-solving skills with attention to performance, scalability, and reliability.
- Degree in Computer Science, AI, Machine Learning, or a related field, or equivalent practical experience.
Benefits
- Fully remote, globally distributed work environment
- Opportunity to work on cutting-edge AI and decentralized technologies
- High ownership and impact on core product infrastructure
- Collaboration with top talent in AI, systems engineering, and fintech
- Dynamic, fast-paced environment focused on innovation and experimentation
- Exposure to advanced AI frameworks and next-generation product development
- Competitive compensation aligned with experience and expertise