The Efficiency Engineering team builds innovative tools and applications that help IT operations and DevOps teams reach new levels of efficiency. We're a tight-knit crew of experienced developers, engineers, and problem solvers fueled by a shared vision: streamlining operations, reducing manual workload, and empowering teams to do their best work.
To enhance collaboration and cross-functional partnerships, among other things, our organization currently follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department. We regularly review our hybrid work model, and the specific requirements may change at any time.
Responsibilities:
- Design and develop large-scale ML system architecture, solving technical system problems in areas such as high concurrency, reliability, and scalability.
- Develop end-to-end deep model inference solutions for internal business units such as Search and for related Large Language Model (LLM) based systems.
- Provide highly automated, high-performance model optimization solutions for frameworks such as PyTorch and TensorFlow. Technical solutions include subgraph matching, compilation optimization, model quantization, and heterogeneous hardware.
- Manage the large-scale GPU compute cluster for our global businesses, improving compute utilization through methods such as elastic scheduling, GPU overselling, and task orchestration.
- Collaborate cross-functionally with the algorithm department on joint optimization of algorithms and systems.
Minimum Qualifications:
- B.Sc. or higher degree in Computer Science or a related field from an accredited and reputable institution.
- Proficient in C/C++/Python, with solid programming skills.
- Familiar with deep learning frameworks (TensorFlow/PyTorch).
- Experience in developing and deploying large-scale systems.
- Good communication and teamwork skills, with the ability to explain technical concepts clearly to teammates.
- Experience improving core machine learning infrastructure (TensorFlow, PyTorch, and JAX).
- 4+ years of industry experience, with a solid theoretical foundation in machine learning.
Preferred Qualifications:
- Experience designing large-scale LLM-powered applications.
- Agile, quick self-learner who is highly self-motivated, with a strong sense of product ownership and creative problem-solving skills.
- Deeply passionate about coding, software development, and building great web applications.
- Ability to perform independent research to solve complex technical problems.
- Good collaborator and team player, comfortable working in a fast-moving, culturally diverse, and globally distributed team environment.
- Passionate about technology and solving challenging problems.
- Experience driving collaboration across cross-functional teams to deliver shared goals.
- Strong communication and teamwork skills.
Candidates for this position must be legally authorized to work in the United States. This position is not eligible for visa sponsorship or support.