Jobgether on behalf of a partner company

AI Inference Engineer QVAC

🕐 27 days ago 📍 Sweden 🌍 Remote

Accountabilities

In this role, you will be responsible for designing, optimizing, and maintaining the inference layer that enables high-performance AI execution on edge devices. You will ensure systems are robust, efficient, and scalable across diverse hardware environments.

  • Develop and optimize C++-based inference systems for deploying AI models on edge devices.
  • Enhance and adapt inference engines such as llama.cpp, ggml, and ONNX for improved performance and compatibility.
  • Improve runtime efficiency, focusing on memory usage, latency, throughput, and long-session stability.
  • Collaborate with research teams to transition models from experimentation to production-ready deployments.
  • Define and maintain core abstractions that support scalable and maintainable inference capabilities.
  • Integrate AI-driven features into existing products, ensuring seamless performance and reliability.
  • Continuously evaluate and implement new technologies to improve system capabilities and efficiency.

Requirements

You are a highly skilled engineer with a strong foundation in systems programming and machine learning, capable of working on complex, performance-critical AI infrastructure.

  • Strong programming expertise in C++; experience with JavaScript is a plus.
  • Proven experience with inference frameworks such as llama.cpp, ggml, ONNX, or similar technologies.
  • Solid understanding of deep learning concepts, including transformers, LLMs, and diffusion models.
  • Experience deploying and optimizing machine learning models on edge devices or constrained environments.
  • Ability to quickly learn and apply new technologies in a fast-evolving AI landscape.
  • Strong problem-solving skills with attention to performance, scalability, and reliability.
  • Degree in Computer Science, AI, Machine Learning, or a related field, or equivalent practical experience.

Benefits

  • Fully remote, globally distributed work environment
  • Opportunity to work on cutting-edge AI and decentralized technologies
  • High ownership and impact on core product infrastructure
  • Collaboration with top talent in AI, systems engineering, and fintech
  • Dynamic, fast-paced environment focused on innovation and experimentation
  • Exposure to advanced AI frameworks and next-generation product development
  • Competitive compensation aligned with experience and expertise
