Machine Learning Engineer - Model Performance - Associate
Inference.net is building a distributed LLM inference network and is seeking a Machine Learning Engineer to optimize the performance of its AI inference systems. The role involves deploying and maintaining large language models, implementing optimization techniques, and collaborating with the engineering team on new features.
Responsibilities:
• Design and implement optimization techniques to increase model throughput and reduce latency across our suite of models
• Deploy and maintain large language models at scale in production environments
• Deploy new models as they are released by frontier labs
• Contribute regularly to open source projects such as SGLang and vLLM
• Deep dive into underlying codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to debug ML performance issues
• Collaborate with the engineering team to bring new features and capabilities to our inference platform
• Develop robust and scalable infrastructure for AI model serving
• Create and maintain technical documentation for inference systems
Qualifications:
Required:
• 3+ years of experience writing high-performance, production-quality code
• Strong proficiency with Python and deep learning frameworks, particularly PyTorch
• Demonstrated experience with LLM inference optimization techniques
• Hands-on experience with SGLang and vLLM, with contributions to these projects strongly preferred
• Familiarity with Docker and Kubernetes for containerized deployments
• Experience with CUDA programming and GPU optimization
• Strong understanding of distributed systems and scalability challenges
• Proven track record of optimizing AI models for production environments
Preferred:
• Familiarity with TensorRT and TensorRT-LLM
• Knowledge of vision models and multimodal AI systems
• Experience implementing techniques like quantization and speculative decoding
• Contributions to open source machine learning projects
• Experience with large-scale distributed computing
Company:
Inference.net empowers developers and startups to integrate advanced AI capabilities into their applications. Founded in 2023, the company is headquartered in Bozeman, Montana, USA, and has a team of 2-10 employees. It is currently early stage and has a track record of offering H-1B sponsorships.
Seniority level: Associate
Employment type: Full-time
Industries: Software Development
Benefits (inferred from the job description):
• Medical insurance
• Vision insurance
• 401(k)
- Location: San Francisco, CA
- Salary: $250
- Category: Engineering