Machine Learning Engineer – Speech Processing – EMEA
This position is for the curious, the experimental, the brave. Are you someone who builds just to see how far an idea can go? Do you enjoy breaking conventions, simplifying the overengineered, and questioning the status quo?
At CloudWalk R&D, we are creating space to rethink the fundamentals of digital experiences - so we can uncover what is truly worth building, and help shape what reaches production.
Our team is for people who chase ideas that will not let them sleep. Who prototype to explore, test the edges of what is possible, and bring bold concepts to life. If that sounds like you - we would love to meet you.
About the company:
CloudWalk is a fintech company reimagining the future of financial services. We are building intelligent infrastructure powered by AI, blockchain, and thoughtful design. Our products serve millions of entrepreneurs across Brazil and the US every day, helping them grow with tools that are fast, fair, and built for how business actually works. Learn more at .
About the team:
We are a small team within CloudWalk’s larger R&D effort, focused on bold ideas and new product directions. Our mission is to explore new concepts early, build fast, and uncover what is worth turning into real products. Sometimes that means prototyping features for existing experiences; other times, it means testing entirely new directions that could shape the future.
We learn through prototyping and fast iteration. Endless curiosity drives our exploration, and collaboration is at the core of how we work - whether we are building together or challenging each other’s thinking.
Your role:
We are looking for an ML engineer with deep expertise in automatic speech recognition (ASR) and audio processing to help us bring real-time voice-driven interactions to life. You will be at the forefront of prototyping voice-to-action features - exploring how voice can drive more intuitive customer experiences.
You will:
Design and implement the speech processing core of our prototypes - from ASR model selection to inference pipelines and tuning.
Collaborate with a mobile developer to integrate speech input into working prototypes.
Work closely with a performance-focused ML engineer to ensure low-latency, robust streaming pipelines under real-world constraints like noise and spotty connectivity.
This role combines deep knowledge of speech and ML systems with a builder’s mindset, and is ideal for someone eager to turn research into working, user-facing prototypes.
What we expect:
Our projects often start with ambiguity and evolve rapidly - so we look for people who are deeply curious, adaptable, and driven by the desire to make ideas real.
This role offers the opportunity to explore broadly - but it starts with a clear focus: building the foundations for voice-driven experiences. We are looking for someone with deep, hands-on expertise in speech and audio processing who is excited to apply that knowledge to shape new, intuitive ways for people to interact with technology.
Required skills:
Speech & Audio Processing: Deep, hands-on experience with ASR and real-time audio processing. You know how to build streaming pipelines and how to handle noise and interruptions.
Machine Learning Expertise: Solid foundation in machine learning and deep learning - particularly for sequential data, with working knowledge of architectures like RNNs, CNNs, or Transformers in audio contexts.
Python & ML Engineering: Strong proficiency in Python, with the ability to build, debug, and tune ML pipelines using frameworks like PyTorch or TensorFlow.
MLOps Tooling: Experience with tools like MLflow, Weights & Biases, or other systems for model tracking, versioning, and lifecycle management.
Experimental Mindset: Ability to design and run meaningful experiments, interpret results carefully, and iterate based on what the data says - not assumptions.
Curiosity & Adaptability: A deep drive to explore new technologies, ask better questions, and stay flexible in the face of evolving requirements.
Nice-to-have skills:
Conversational AI: Experience designing or experimenting with dialogue systems, intent workflows, or agent-like architectures.
Frontend Development: Understanding of how user interfaces are built (especially in Flutter) and interest in connecting backend intelligence to user-facing experiences.
Cloud Deployment Experience: Familiarity with deploying and managing ML models on Google Cloud.
Systems Efficiency: Awareness of strategies for performance tuning, edge deployment, or handling constrained environments (connectivity, device limits, etc.).
Our recruitment process:
Our process is designed to help us understand how you think, build, and collaborate. If you have a relevant GitHub repository, portfolio, or past project that reflects your work and interests, please share the link when you apply.
The process consists of three stages:
Take-home challenge: This is a short exercise designed to showcase your technical skills, development process, and problem-solving approach.
Technical interview: This in-depth discussion will assess your technical knowledge, problem-solving skills, and experience relevant to the role.
Cultural interview: We will explore your work style, values, and how you might contribute to and thrive within our team culture.
Our goal is to ensure a good mutual fit and set the foundation for a successful collaboration. We appreciate your time and commitment throughout this process and aim to provide a clear and timely response after each stage.
Diversity and inclusion:
We believe in social inclusion, respect, and appreciation of all people. We promote a welcoming work environment, where each CloudWalker can be authentic, regardless of gender, ethnicity, race, religion, sexuality, mobility, disability, or education.
Location: Remote