Senior Data Engineer

1 Day Old

Summary:
We are seeking a highly experienced Senior Data Engineer with a deep understanding of PySpark (on Databricks, AWS Glue, or AWS EMR) and cloud-based databases such as Snowflake. Proficiency in workflow management tools like Airflow is essential. Healthcare industry experience is a significant advantage. The ideal candidate will design, implement, and maintain data pipelines while ensuring the highest levels of performance, security, and data quality.

Responsibilities:
- Design, develop, and maintain scalable, reliable, and secure data pipelines to process large volumes of structured and unstructured healthcare data using PySpark and cloud-based databases.
- Collaborate with data architects, data scientists, and analysts to understand data requirements and implement solutions that meet business and technical objectives.
- Leverage AWS or Azure cloud services for data storage, processing, and analytics, optimizing cost and performance.
- Use tools like Airflow for workflow management and Kubernetes for container orchestration to ensure seamless deployment, scaling, and management of data processing applications.
- Develop and implement data ingestion, transformation, and validation processes to ensure data quality, consistency, and reliability across healthcare datasets.
- Monitor and troubleshoot data pipelines, proactively identifying and resolving issues to minimize downtime and ensure optimal performance.
- Establish and enforce data engineering best practices, ensuring compliance with data privacy and security regulations specific to the healthcare industry.
- Continuously evaluate and adopt new tools, technologies, and frameworks to improve the data infrastructure and drive innovation.
- Mentor and guide junior data engineers, fostering a culture of collaboration, learning, and growth within the team.
- Collaborate with cross-functional teams to align data engineering efforts with broader organizational goals and strategies.
Requirements:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in data engineering, with a strong background in Apache Spark and cloud-based databases such as Snowflake.
- Strong knowledge of big data technologies and PySpark; proficiency in one or more programming languages such as Python.
- Proven experience with AWS or Azure cloud services for data storage, processing, and analytics.
- Expertise in workflow management tools like Airflow and container orchestration systems such as Kubernetes.
- Strong knowledge of SQL and NoSQL databases, as well as data modeling and schema design principles.
- Familiarity with healthcare data standards, terminologies, and regulations, such as HIPAA and GDPR, is highly desirable.
- Excellent problem-solving, communication, and collaboration skills, with the ability to work effectively in cross-functional teams.
- Demonstrated ability to manage multiple projects, prioritize tasks, and meet deadlines in a fast-paced environment.
- A strong desire to learn, adapt, and contribute to a rapidly evolving data landscape.
- Maintain the controls, reporting, and transparency necessary to comply with customer and BCBSA confidentiality, HIPAA, PHI, and PII standards, as well as SOC 2 and HITRUST certifications.

Compensation:
The actual salary an employee can expect to receive, plus any bonus pursuant to the terms of an applicable bonus plan, will depend on experience, seniority, geographic location, and other factors permitted by law. To review benefits, please visit
Base salary range: $140k - $170k base
We offer competitive compensation and benefits packages, along with opportunities for career growth and development. We offer visa sponsorship for this role. Join our team of passionate and talented data professionals as we drive innovation in the healthcare industry.
Location:
US