Founding Engineer, Machine Learning, San Francisco

Job Description

About Us

We’re an early-stage stealth startup building a new kind of platform for generative media. Our mission is to enable the future of real-time generative applications: we’re building the foundational tools and infrastructure that make entirely new categories of generative experiences and applications finally possible.

We’re a small, focused team of ex-YC and unicorn founders and senior engineers with deep experience across 3D, generative video, developer platforms, and creative tools. We’re backed by top-tier investors and angels, and we’re building a new technical foundation purpose-built for the next era of generative media.

We’re operating at the edge of what’s technically possible: high-performance inference and real-time orchestration of multimodal models. As one of our founding engineers, you’ll play a key role in architecting the core platform, shaping system design decisions, and owning critical infrastructure from day one.

If you’re excited about architecting and building high-performance infrastructure that empowers the next generation of developers and unlocks entirely new product categories, we’d love to talk.

About the Role

We're looking for a Founding Machine Learning Engineer to build the core infrastructure powering high-performance inference for generative media models, including diffusion and transformer architectures. You’ll be instrumental in designing low-latency, high-throughput systems that serve state-of-the-art models in real time. As an early technical leader, you'll shape both our systems and culture from day one.

What You’ll Do

Architect and implement the inference engine for diffusion transformer-based generative models
Optimize model execution across the stack — memory, compute, and networking
Drive performance engineering to minimize latency and maximize throughput
Work closely with research to productionize new generative techniques and model variants
Build the tools, services, and monitoring that make these systems robust and scalable
Set the technical bar and help define engineering culture as an early team member

Requirements

About You

3+ years of experience building high-performance ML or systems infrastructure
Deep fluency with PyTorch and production-grade Python
Strong understanding of GPU systems (CUDA, memory hierarchies, scheduling, etc.)
Experience optimizing inference for generative models (e.g., diffusion, transformers)
Bonus: Familiarity with Triton, CUDA, TensorRT, or model parallelism techniques
Startup-ready: you take ownership, move quickly, and solve hard problems end to end

Minimum Qualifications

Strong Python + PyTorch skills
Proven experience optimizing inference for generative models
Deep systems knowledge, especially GPU performance tuning
High agency and eagerness to build from scratch

Benefits

Competitive SF salary and founding-team equity

Location: San Francisco
Category: Engineering