DevJobs

Head of Data Intelligence

Overview
Skills
  • SQL
  • Python
  • Linux
  • Azure
  • AWS
  • GCP
  • Docker
  • CLI tools
  • data lakehouse
  • data modeling
  • large-scale data processing
  • orchestration
  • OCI
  • tool integrations
  • OMOP
  • prompt engineering
  • RAG
  • SNOMED
  • object storage
  • LOINC
  • LLM tooling
  • lineage tools
  • ICD
  • HL7
  • FHIR
  • evaluation
  • data catalogs
  • CDISC

About the Position

Lead and build the Data Intelligence function that ingests, validates, harmonizes and serves collaborator and internal data, including data for a production clinical component, to power ML/LLM research and product workflows. This is a hands-on technical leadership role that owns the data platform, LLM/tooling orchestration, clinical/biological/chemical fidelity, and regulatory/QMS-ready governance.


Core responsibilities

  • Own the internal multimodal data platform end to end (collaborators’ EHR, imaging, omics, assays).
  • Design, build and validate the clinical data environment (generation, augmentation, fidelity metrics and privacy safeguards) for model training and experiments.
  • Architect, deploy, and operate robust LLM systems and tools to structure data and reach the company's KPIs. Execute rigorous evaluation pipelines (hallucination guardrails, accuracy, and reliability tracking).
  • Hire and lead a cross-functional team (data engineering, clinical annotators, bioinformatics) and own standards and tech choices.
  • Partner with clinical, product, compliance and bizdev teams to translate domain requirements into systematically defined data products.
  • Lead secure, auditable onboarding of collaborator data: schema mapping, transfer, de-identification, provenance (data lineage), and monitoring.


Must-have qualifications

  • Proven leader of cross-functional technical teams.
  • 5+ years building and operating large data platforms/pipelines; demonstrable experience with clinical or biomedical datasets and data lakehouses.
  • 3+ years building multimodal data pipelines, schemas, and data harmonization.
  • MS/PhD in CS, Bioinformatics, Computational Biology, Biomedical Engineering or a clinical/chemical discipline.
  • Strong software & data engineering skills: Python, SQL, orchestration, Docker, data modeling, modern data lakehouses, and large-scale data processing.
  • Experience with Linux environments and CLI tools for data transfer.
  • Familiarity with object storage and cloud providers (AWS/GCP/Azure/OCI).


A Significant Advantage

  • Hands-on production experience with LLMs and LLM tooling (RAG, orchestration, tool integrations, evaluation, prompt engineering).
  • Domain experience in oncology, immunology, assays or pharma/CRO partnerships.
  • Experience working with clinical data models and standards (e.g., FHIR/HL7, OMOP, ICD, SNOMED, LOINC, CDISC) in real-world datasets.
  • Experience implementing data catalogs and lineage tools.

Imagene AI