Founding Engineer, Machine Learning

Job Description

About Us

We’re an early-stage stealth startup building a new kind of platform for generative media. Our mission is to enable the future of real-time generative applications: we’re building the foundational tools and infrastructure that make entirely new categories of generative experiences and applications finally possible.

We’re a small, focused team of ex-YC and unicorn founders and senior engineers with deep experience across 3D, generative video, developer platforms, and creative tools. We’re backed by top-tier investors and angels, and we’re building a new technical foundation purpose-built for the next era of generative media.

We’re operating at the edge of what’s technically possible: high-performance inference and real-time orchestration of multimodal models. As one of our founding engineers, you’ll play a key role in architecting the core platform, shaping system design decisions, and owning critical infrastructure from day one.

If you're excited about architecting and building high-performance infrastructure that empowers the next generation of developers and unlocks entirely new product categories, we’d love to talk.

About the Role

We're looking for a Founding Machine Learning Engineer to build the core infrastructure powering high-performance inference for generative media models, including diffusion and transformer architectures. You’ll be instrumental in designing low-latency, high-throughput systems that serve state-of-the-art models in real time. As an early technical leader, you'll shape both our systems and culture from day one.

What You’ll Do
  • Architect and implement the inference engine for diffusion transformer-based generative models
  • Optimize model execution across the stack — memory, compute, and networking
  • Drive performance engineering to minimize latency and maximize throughput
  • Work closely with research to productionize new generative techniques and model variants
  • Build the tools, services, and monitoring that make these systems robust and scalable
  • Set the technical bar and help define engineering culture as an early team member

Requirements

About You
    • 3+ years of experience building high-performance ML or systems infrastructure
    • Deep fluency with PyTorch and production-grade Python
    • Strong understanding of GPU systems (CUDA, memory hierarchies, scheduling, etc.)
    • Experience optimizing inference for generative models (e.g., diffusion, transformers)
    • Bonus: Familiarity with Triton, CUDA, TensorRT, or model parallelism techniques
    • Startup-ready: you take ownership, move quickly, and solve hard problems end to end

Minimum Qualifications
    • Strong Python + PyTorch skills
    • Proven experience optimizing inference for generative models
    • Deep systems knowledge, especially GPU performance tuning
    • High agency and eagerness to build from scratch

Benefits

  • Competitive SF salary and foundational team equity

Location: San Francisco
Category: Engineering
