Site Reliability Engineer, Trust & Safety - USDS

Team Intro: The Trust and Safety (TnS) engineering team of the US Tech Service department at TikTok is fast growing and responsible for building machine learning models and systems to identify and defend against internet abuse and fraud on our platform. Our mission is to protect billions of users and publishers across the globe every day. We embrace state-of-the-art machine learning technologies and scale them to detect abuse and fraud and to improve our trust and safety systems, using the tremendous amount of data generated on the platform. Through our team's continuous efforts, TikTok is able to provide the best user experience and bring joy to everyone in the world. In our team, you’ll have the opportunity to manage the complex challenges of scale while using your expertise in coding, algorithms, complexity analysis, and large-scale system design. We embrace a culture of diversity, intellectual curiosity, openness, and problem-solving. We encourage close collaboration while promoting self-direction. To enhance collaboration and cross-functional partnerships, among other things, our organization currently follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department. We regularly review our hybrid work model, and the specific requirements may change at any time.
Responsibilities:
- Manage day-to-day operations of data services and real-time/batch data pipelines, including SLA management, system deployment, performance tuning, and troubleshooting
- Create tools and automation to improve system administration and operational efficiency
- Participate in regular on-call duties
- Engage in and improve the whole lifecycle of services, from inception and design, through development, capacity planning, and launch reviews, to deployment, operation, and refinement
- Scale systems sustainably through mechanisms such as automation; evolve systems by pushing for changes that improve reliability, efficiency, and velocity
- Practice sustainable user support, incident response, and postmortems
Minimum Qualifications:
- Bachelor's degree or higher in computer science or a related technical discipline
- 2+ years of industry experience
- Experience programming in at least one of the following languages: Python, Go, C, C++, Java, or Rust
- Familiarity with backend systems such as MySQL/Redis/Nginx/Kafka/Kubernetes/Docker and big data technologies such as Hadoop/Spark/Flink/Hive/OLAP/ClickHouse, etc.
- Familiarity with Unix/Linux system internals, networking, and distributed systems

Preferred Qualifications:
- Good communication and coordination skills
- Demonstrated independent thinking and troubleshooting skills

Candidates for this position must be legally authorized to work in the United States. This position is not eligible for visa sponsorship or support.
Location:
San Jose
