AI Research Engineer (Kernel & Inference Optimization) @ Jobgether (on behalf of a partner company) - Remote Position

Accountabilities:

Design, develop, and optimize advanced model serving architectures focused on high throughput, low latency, and efficient memory utilization.
Build scalable inference pipelines capable of running across cloud, edge, and resource-constrained environments.
Conduct controlled inference experiments in simulated and production environments to evaluate system performance and reliability.
Monitor and analyze key performance metrics such as latency, throughput, memory consumption, token response time, and error rates.
Develop and maintain benchmarking methodologies and performance validation frameworks for AI inference systems.
Identify bottlenecks in serving pipelines, including batch processing inefficiencies, network overhead, and excessive memory usage.
Optimize inference frameworks and deployment strategies for scalability, resilience, and operational efficiency.
Collaborate with cross-functional engineering and research teams to integrate optimized inference solutions into production environments.
Create high-quality testing datasets and deployment scenarios that reflect real-world operational challenges.
Continuously improve inference infrastructure through experimentation, iteration, and adoption of cutting-edge AI serving techniques.

Requirements:

Strong experience in AI/ML engineering with a focus on inference optimization, model serving, or AI systems performance.
Deep understanding of model deployment architectures and inference frameworks for large-scale AI applications.
Expertise in optimizing latency, throughput, scalability, and memory footprint in production AI systems.
Hands-on experience with performance monitoring, benchmarking, profiling, and bottleneck analysis.
Strong knowledge of advanced AI model architectures, including multi-modal systems and resource-efficient models.
Experience building and deploying AI systems across cloud, edge, or low-resource hardware environments.
Proficiency in programming languages commonly used in AI infrastructure and optimization workflows.
Strong analytical and problem-solving abilities with a research-oriented mindset.
Ability to work independently in a highly distributed and fast-moving global environment.
Excellent English communication skills and ability to collaborate across technical and non-technical teams.
Passion for innovation, experimentation, and scalable AI infrastructure development.

Benefits:

Fully remote global work environment with flexible location options.
Opportunity to work on cutting-edge AI, blockchain, and fintech technologies.
Collaborative international team of highly skilled engineers and researchers.
Exposure to innovative projects involving AI infrastructure, digital finance, and decentralized technologies.
High-impact role with significant technical ownership and influence on product direction.
Fast-paced and innovation-driven culture focused on experimentation and growth.
Opportunities for continuous learning and professional development.
Work environment that values autonomy, creativity, and technical excellence.
Participation in projects with global reach and real-world scalability challenges.