The Department of Medicine, Division of Nephrology Quantitative Health is seeking a full-time Data Scientist III. This position supports a multi-institutional, NIH-funded research initiative within the Computational Microscopy Imaging Lab (CMIL) focused on integrating digital pathology, spatial omics, and clinical datasets into an AI-enabled modeling platform. The Data Scientist III – Imaging & Omics Lead is responsible for developing analytic workflows that extract features from histology images and spatial molecular assays. This includes leading data processing, quality control, and harmonization efforts for tissue and omics data streams and ensuring their reproducible integration into research pipelines. The position reports to Dr. Pinaki Sarder, Principal Investigator.
Essential functions:
Imaging and Omics Pipeline Development –
- Design and implement data pipelines for histopathology and spatial omics sources (e.g., spatial transcriptomics, CODEX).
- Apply image processing, segmentation, and normalization techniques using Python-based libraries.
- Ensure pipelines are reproducible and version-controlled to meet analytic standards.
Feature Extraction and Model Input Generation –
- Use deep learning and statistical approaches to extract meaningful features from tissue images and molecular assays.
- Generate structured representations suitable for integration with AI/ML models.
- Collaborate with model developers to align feature formats with input requirements.
Data Harmonization and Quality Assurance –
- Coordinate with institutional collaborators to harmonize data formats, metadata, and preprocessing standards.
- Conduct QC reviews, troubleshoot data artifacts, and document all analytic transformations.
- Ensure compatibility between imaging/omics data and other project modalities.
Documentation and Cross-Team Collaboration –
- Maintain technical documentation for workflows and codebases.
- Participate in project meetings, share updates, and contribute to team deliverables.
- Support communication of imaging and omics data workflows to non-technical collaborators.
Mentorship and Innovation Support –
- Provide informal guidance to student researchers or junior analysts.
- Recommend new tools or analytic methods to improve pipeline performance.