Data Engineer

Mid-Level Data Engineer

We are seeking a mid-level data engineer to support healthcare data transformation and integration work involving legacy clinical systems. You'll work closely with JSON and Parquet extracts from the RPMS/VistA ecosystem and help translate these into structured HL7v2 and FHIR-compliant representations. Your work will directly enable semantic interoperability across tribal and federal health systems, supporting public health delivery for some of the most underserved populations in the country.

We are a company committed to creating inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity employer that believes everyone matters. Qualified candidates will receive consideration for employment without regard to race, religion, sex, age, marital status, national origin, sexual orientation, citizenship status, disability, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please submit a request via the Human Resources Request Form.

Skills and requirements:

  • 2–5 years of experience in data engineering or healthcare data roles.
  • Strong Python skills, especially with pandas, json, and transformation of semi-structured data.
  • Experience with, or strong interest in, healthcare data standards such as HL7v2 and FHIR.
  • Familiarity with reading and transforming Parquet and JSON datasets.
  • Ability to reason through and normalize undocumented, legacy healthcare data.
  • Comfortable using Git and collaborating via GitLab or GitHub.
  • Strong communication and documentation habits.
  • Comfortable working in Docker-based environments for development or testing.
  • Familiarity with Git-based CI workflows (e.g., GitHub Actions, GitLab CI).
  • Ability to write modular, testable code and debug pipelines in containerized setups.
  • Experience managing environments via requirements.txt, pyproject.toml, or similar.
  • Familiarity with HL7v2 segments (e.g., PID, OBR, OBX) and FHIR bundles/resources.
  • Experience transforming clinical data to meet interoperability or public health reporting standards.
  • Exposure to Azure Synapse Pipelines, Spark, or other big data frameworks.
  • Experience with Nix, container-based dev environments, or CI/CD workflows.
  • Prior experience with federal, tribal, or public health systems.
  • Experience with AWS services (e.g., S3, Lambda, Glue) or container orchestration tools like Kubernetes.
  • Bonus points for Linux-first workflows and familiarity with Neovim or terminal-based tooling.
  • Experience using Nix or other reproducible development environments is a huge plus.
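To give a concrete sense of the transformation work described above, here is a minimal, illustrative Python sketch mapping a single legacy record to a FHIR Patient resource and an HL7v2 PID segment. The input field names are hypothetical placeholders, not an actual RPMS/VistA schema:

```python
import json

# Hypothetical legacy extract record (field names are illustrative only,
# not an actual RPMS/VistA schema).
legacy = {
    "patient_id": "12345",
    "last": "DOE",
    "first": "JANE",
    "dob": "1980-04-12",
    "sex": "F",
}

def to_fhir_patient(rec):
    """Map a legacy record to a minimal FHIR R4 Patient resource."""
    return {
        "resourceType": "Patient",
        "identifier": [{"value": rec["patient_id"]}],
        "name": [{"family": rec["last"], "given": [rec["first"]]}],
        "birthDate": rec["dob"],
        "gender": {"F": "female", "M": "male"}.get(rec["sex"], "unknown"),
    }

def to_hl7v2_pid(rec):
    """Build a minimal pipe-delimited HL7v2 PID segment."""
    dob = rec["dob"].replace("-", "")  # HL7v2 dates use YYYYMMDD
    return f"PID|1||{rec['patient_id']}||{rec['last']}^{rec['first']}||{dob}|{rec['sex']}"

print(json.dumps(to_fhir_patient(legacy), indent=2))
print(to_hl7v2_pid(legacy))  # → PID|1||12345||DOE^JANE||19800412|F
```

Real pipelines would, of course, handle many more fields, validation, and undocumented edge cases; this only sketches the shape of the JSON-to-standards mapping the role involves.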
Location:
Herndon
