Software Engineer, Cloud Infrastructure (Multiple Seniority Levels)


About Beacon AI
Beacon AI is developing AI pilot assistant technology to transform aviation, improving flight safety, operational efficiency, and pilot capabilities. We are on a mission to leverage artificial intelligence and advanced data analytics to revolutionize the aviation industry. Join us to be at the cutting edge of technological innovation for the second century of aviation.

Role Overview
We are seeking skilled Cloud and ML Infrastructure Engineers to lead the buildout of our AWS foundation and our LLM platform. You will design, implement, and operate services that are scalable, reliable, and secure.
Given the role's broad scope, prior focus in LLM/ML infrastructure or IoT infrastructure is a strong plus. On the ML side, you will build the stack that powers retrieval-augmented generation (RAG) and application workflows built with frameworks like LangChain. Experience with AWS IoT services is also valuable.
You will work closely with other engineers and product management. The ideal candidate is hands-on, comfortable with ambiguity, and excited to build from first principles.

Key Responsibilities

Cloud Infrastructure Setup and Maintenance
Design, provision, and maintain AWS infrastructure using IaC tools such as AWS CDK or Terraform. Build CI/CD and testing for apps, infra, and ML pipelines using GitHub Actions, CodeBuild, and CodePipeline. Operate secure networking with VPCs, PrivateLink, and VPC endpoints. Manage IAM, KMS, Secrets Manager, and audit logging.

LLM Platform and Runtime
Stand up and operate model endpoints using Amazon Bedrock and/or SageMaker; evaluate when to use ECS/EKS, Lambda, or Batch for inference jobs. Build and maintain application services that call LLMs through clean APIs, with streaming, batching, and backoff strategies. Implement prompt and tool execution flows with LangChain or similar, including agent tools and function calling.
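To give a flavor of the kind of work involved, here is a minimal, illustrative sketch of a retry-with-backoff wrapper around an LLM call; the function names are hypothetical and a production version would retry only on throttling/transient errors rather than on all exceptions:

```python
import random
import time


def call_with_backoff(invoke, max_retries=5, base_delay=0.5, max_delay=30.0):
    """Call `invoke` (a zero-argument function wrapping one LLM request),
    retrying failures with exponential backoff plus jitter.

    Delay before retry N is min(max_delay, base_delay * 2**N), scaled by
    a random factor in [0.5, 1.0] to avoid synchronized retry storms.
    """
    for attempt in range(max_retries):
        try:
            return invoke()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay * random.uniform(0.5, 1.0))
```

In practice `invoke` would wrap a provider SDK call (for example, a Bedrock runtime invocation), and the caught exception types would be narrowed to the provider's throttling errors.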

RAG Data Systems and Vector Search
Design chunking and embedding pipelines for documents, time series, and multimedia. Orchestrate with Step Functions or Airflow. Operate vector search using OpenSearch Serverless, Aurora PostgreSQL with pgvector, or Pinecone. Tune recall, latency, and cost. Build and maintain knowledge bases and data syncs from S3, Aurora, DynamoDB, and external sources.
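As an illustrative sketch of one piece of such a pipeline, the helper below does fixed-size chunking with overlap, the simplest strategy for preparing documents for embedding; real pipelines often chunk on sentence or section boundaries instead, and the parameter values here are arbitrary:

```python
def chunk_text(text, chunk_size=800, overlap=100):
    """Split `text` into overlapping character windows for embedding.

    Each chunk is `chunk_size` characters; consecutive chunks share
    `overlap` characters so retrieval does not lose context that
    straddles a chunk boundary.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

Each chunk would then be embedded and upserted into the vector store alongside its source metadata.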

Evaluation, Observability, and Cost Governance
Create offline and online eval harnesses for prompts, retrievers, and chains. Track quality, latency, and regression risk. Instrument model and app telemetry with CloudWatch and OpenTelemetry. Build token usage and cost dashboards with budgets and alerts. Add guardrails, rate limits, fallbacks, and provider routing for resilience.
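The cost-dashboard piece reduces to simple per-request accounting. A minimal sketch, with a hypothetical model name and made-up per-1K-token prices (real numbers come from the provider's price list):

```python
# Hypothetical price table: USD per 1K tokens, split by input vs output.
PRICES = {
    "model-a": {"input": 0.003, "output": 0.015},
}


def request_cost(model, input_tokens, output_tokens, prices=PRICES):
    """Return the USD cost of one LLM request.

    Input and output tokens are priced separately, per 1K tokens.
    """
    p = prices[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]
```

Aggregating these per-request costs by team or feature is what feeds budgets and alerting.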

Safety, Privacy, and Compliance
Implement PII detection and redaction, access controls, content filters, and human-in-the-loop review where needed. Use Bedrock Guardrails or policy services to enforce safety standards. Maintain audit trails for regulated environments.

Data Pipeline Construction
Build ingestion and processing pipelines for structured, unstructured, and multimedia data. Ensure integrity, lineage, and cataloging with Glue and Lake Formation. Optimize bulk data movement and storage in S3, Glacier, and tiered storage. Use Athena for ad-hoc analysis.

IoT Deployment Management
Manage infrastructure that deploys to and communicates with edge devices. Support secure messaging, identity, and over-the-air updates.

Analytics and Application Support
Partner with product and application teams to integrate retrieval services, embeddings, and LLM chains into user-facing features. Provide expert troubleshooting for cloud and ML services with an emphasis on uptime and performance.

Performance Optimization
Tune retrieval quality, context window use, and caching with Redis or Bedrock Knowledge Bases. Optimize inference with model selection, quantization where applicable, GPU/CPU instance choices, and autoscaling strategies.

What Will Make You Successful
End-to-End Ownership: Drives work from design through production, including on-call and continuous improvement.
LLM Systems Experience: Shipped or operated LLM-powered applications in production. Familiar with RAG design, prompt versioning, and chain orchestration using LangChain or similar.
AWS Depth: Strong with core AWS services such as VPC, IAM, KMS, CloudWatch, S3, ECS/EKS, Lambda, Step Functions, Bedrock, and SageMaker.
Data Engineering Skills: Comfortable building ingestion and transformation pipelines in Python. Familiar with Glue, Athena, and event-driven patterns using EventBridge and SQS.
Security Mindset: Applies least privilege, secrets management, network isolation, and compliance practices appropriate to sensitive data.
Evaluation and Metrics: Uses quantitative evals, A/B testing, and live metrics to guide improvements.
Clear Communication: Explains tradeoffs and aligns partners across product, security, and application engineering.

Bonus Points
4+ years working with serverless or container platforms on AWS.
Experience with vector databases, OpenSearch, or pgvector at scale.
Hands-on with Bedrock Guardrails, Knowledge Bases, or custom policy engines.
Familiarity with GPU workloads, Triton Inference Server, or TensorRT-LLM.
Experience with big data tools for large-scale processing and search.
Background in aviation data or other safety-critical domains.
DevOps or DevSecOps experience automating CI/CD for ML and app services.

This is a hybrid role and requires working from our San Carlos, CA office at least three days a week, with the option to work remotely the remaining days.

Perks & Benefits for Full-Time Employees:
Comprehensive Healthcare Coverage: Enjoy peace of mind with our generous health benefits, with 80% of medical costs covered by the company for the employee and 25% for their dependents.
Paid Time Off: Recharge and relax with 3 weeks of paid vacation, in addition to 14 company-paid holidays each year.
Connectivity Stipend: Stay connected with our cell phone benefit, ensuring you have the tools you need to excel in your role.
Health and Wellness Allowance: Use this toward a gym membership or a subscription to a meditation app, empowering you to prioritize self-care and maintain a healthy lifestyle.
Financial Planning: Prepare for the future with our 401(k) program. While we currently do not offer matching, we are committed to enhancing this benefit in the future.
For nearly all roles, the following may apply:
At this time, due to United States Department of State regulations, we are only able to hire U.S. Persons. A U.S. Person is a U.S. citizen, a lawful permanent resident (a 'Green Card' holder), or a protected individual who has been granted permanent asylum or refugee status. These aerospace restrictions mean that we are unable to provide visa sponsorship or consider candidates who require visa transfers. Applicants must be authorized to work in the United States without the need for visa sponsorship now or in the future. All work must be performed in the United States.
Beacon AI provides equal employment opportunities to all employees and applicants for employment without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. In addition to federal law requirements, the company complies with applicable state and local laws governing nondiscrimination in employment. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training. The company expressly prohibits any form of workplace harassment based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. Improper interference with the ability of employees to perform their job duties may result in discipline up to and including discharge.
Location:
San Carlos, CA, United States
Job Type:
Full-Time
Category:
Computer And Mathematical Occupations