AI Infrastructure / LLM Back-End Engineer
This is a remote position.
Job Title: AI Infrastructure / LLM Back-End Engineer
Location: Remote
Team: AI Infrastructure & Engineering
Employment Type: Full-Time
*Superstaffed.ai is part of Remote Workmate PTY LTD
About the Role:
We’re hiring a Back-End Software Engineer to lead the design and development of high-performance infrastructure that powers AI-first applications. In your first 90 days, you'll take ownership of one or more back-end product verticals—from LLM integration to API delivery—working directly with our product and ML teams to deploy real-time intelligent features used by global clients.
You'll be building scalable APIs, vector-based search pipelines, and inference systems using tools like OpenAI, Hugging Face, and LangChain. Your goal: reduce latency, optimize infrastructure for cost, and continuously ship measurable improvements.
This role is ideal for engineers who thrive in autonomous environments, think in systems, and deliver fast. You’ll work with a lean, high-output team where your decisions directly impact feature performance, reliability, and user experience.
Ready to Apply?
If this opportunity excites you and your skills align with the role, we'd love to learn more about you.
You can begin the application process right away by completing a short, self-paced video interview with “Alex,” our AI interviewer. This helps us fairly assess your experience, communication style, and fit for the role.
Start the interview here:
*Note: Applications without a video interview will not be processed.
Responsibilities:
Design and maintain APIs for AI-powered features (FastAPI, Flask)
Integrate and fine-tune LLMs (OpenAI, Hugging Face, LangChain)
Build pipelines for vector embeddings, semantic search, and RAG
Optimize back-end systems for latency, scalability, and cost
Collaborate with ML engineers to deploy and monitor inference systems
Implement observability (Sentry, Prometheus, Grafana) for debugging
Manage CI/CD and infrastructure-as-code (Docker, GitHub Actions, Terraform)
Own full product verticals from API to deployment
Requirements:
3+ years in back-end/API engineering (Python, FastAPI/Flask)
Experience with PostgreSQL, Docker, and containerized development
Proven use of OpenAI APIs, Hugging Face, LangChain, or Transformers
Familiar with vector databases like Pinecone, Qdrant, or Weaviate
Experience in CI/CD, observability, and monitoring systems
Bonus: Knowledge of asyncio, aiohttp, Kubernetes (k8s), or serverless environments
Strong communication, async-first documentation, and remote collaboration skills
Performance Milestones:
First 30 Days
Set up staging and dev environments
Review codebase and system architecture
Deploy test API integrating a basic OpenAI or HF model
By Day 60
Launch a production-ready AI feature (e.g., vector store or RAG endpoint)
Improve model response latency by 30–50%
Implement >80% test coverage
By Day 90
Own back-end infrastructure for a product line
Reduce compute costs through caching/async strategies
Contribute to LLM scaling roadmap
Success Metrics (KPOs):
API latency within target on core services
Uptime ≥ 99.5% on core services
Test coverage > 85%
1–2 production deployments per week
LLM inference ≤ 3s with retries/failure handling
Tech Stack:
AI Platforms: OpenAI, Hugging Face, LangChain
Frameworks: FastAPI, Flask, SQLAlchemy
Databases: PostgreSQL, Redis, Pinecone, Qdrant
DevOps: Docker, GitHub Actions, Terraform
Monitoring: Prometheus, Grafana, Sentry
Collaboration: Slack, Notion, ChatGPT