Senior Distributed ML Systems Engineer
Inference.net is seeking a Senior Distributed ML Systems Engineer to join our team. This role involves developing large-scale, fault-tolerant distributed systems that handle millions of large language model inference requests per day. If you are passionate about building next-generation ML systems that operate at scale, we want to hear from you. You will be responsible for designing and implementing the core systems that power our globally distributed LLM inference network, working on problems at the intersection of distributed systems, machine learning, and resource optimization.

Inference.net is building a distributed LLM inference network that combines idle GPU capacity from around the world into a single cohesive plane of compute for running large language models like DeepSeek and Llama 4. At any given moment, we have over 5,000 GPUs and hundreds of terabytes of VRAM connected to the network. We are a small, well-funded team working on difficult, high-impact problems at the intersection of AI and distributed systems. We primarily work in person from our office in downtown San Francisco. Our investors include A16z CSX and Multicoin. We are high-agency, adaptable, and collaborative. We value creativity alongside technical prowess and humility. We work hard, and we deeply enjoy the work that we do.
Key Responsibilities

- Design and implement scalable distributed systems for our inference network
- Develop models for efficient resource allocation across a network of heterogeneous hardware and a rapidly changing topology
- Optimize network latency, throughput, and availability
- Build robust logging and metrics systems to monitor network health and performance
- Conduct architecture and system design reviews to ensure the use of best practices
- Collaborate with founders, engineers, and other stakeholders to improve our infrastructure and product offerings

What We're Looking For

- Very strong problem-solving skills and the ability to work in a startup environment
- 5+ years of experience building high-performance systems
- Strong programming skills in TypeScript, Python, and one of Go, Rust, or C++
- Solid understanding of distributed systems concepts
- Knowledge of orchestrators and schedulers like Kubernetes and Nomad
- Use of AI tooling in your development workflow (ChatGPT, Claude, Cursor, etc.)
- Experience with LLM inference engines like vLLM or TensorRT-LLM is a plus
- Experience with GPU programming and optimization (CUDA experience is a plus)
- Experience with Postgres or NATS.io is a bonus

Compensation

We offer competitive compensation, equity in a high-growth startup, and comprehensive benefits. The base salary range for this role is $180,000 - $250,000, plus equity and benefits, depending on experience.

Equal Opportunity

Inference.net is an equal opportunity employer. We welcome applicants from all backgrounds and do not discriminate based on race, color, religion, gender, sexual orientation, national origin, genetics, disability, age, or veteran status.
Location:
San Francisco, CA
Salary:
$180,000 - $250,000
Category:
Engineering
