AI Inference Engineer QVAC @ Jobgether on behalf of a partner company (Remote Position)

Accountabilities

In this role, you will be responsible for designing, optimizing, and maintaining the inference layer that enables high-performance AI execution on edge devices. You will ensure systems are robust, efficient, and scalable across diverse hardware environments.

Develop and optimize C++-based inference systems for deploying AI models on edge devices.
Enhance and adapt inference engines such as llama.cpp, ggml, and ONNX for improved performance and compatibility.
Improve runtime efficiency, focusing on memory usage, latency, throughput, and long-session stability.
Collaborate with research teams to transition models from experimentation to production-ready deployments.
Define and maintain core abstractions that support scalable and maintainable inference capabilities.
Integrate AI-driven features into existing products, ensuring seamless performance and reliability.
Continuously evaluate and implement new technologies to improve system capabilities and efficiency.

Requirements

You are a highly skilled engineer with a strong foundation in systems programming and machine learning, capable of working on complex, performance-critical AI infrastructure.

Strong programming expertise in C++, with additional experience in JavaScript considered a plus.
Proven experience with inference frameworks such as llama.cpp, ggml, ONNX, or similar technologies.
Solid understanding of deep learning concepts, including transformers, LLMs, and diffusion models.
Experience deploying and optimizing machine learning models on edge devices or constrained environments.
Ability to quickly learn and apply new technologies in a fast-evolving AI landscape.
Strong problem-solving skills with attention to performance, scalability, and reliability.
Degree in Computer Science, AI, Machine Learning, or a related field, or equivalent practical experience.

Benefits

Fully remote, globally distributed work environment
Opportunity to work on cutting-edge AI and decentralized technologies
High ownership and impact on core product infrastructure
Collaboration with top talent in AI, systems engineering, and fintech
Dynamic, fast-paced environment focused on innovation and experimentation
Exposure to advanced AI frameworks and next-generation product development
Competitive compensation aligned with experience and expertise
