Gradient Biotech

Data upload

Oncology studies combine molecular datasets with clinical cohort metadata. Register samples and endpoints in the metadata service before dispatching analysis jobs.

Samples

Register samples via the study Data page or the metadata API:

curl -X POST http://localhost:8000/oncology/studies/{study_id}/samples \
  -H "Content-Type: application/json" \
  -d '{
    "external_id": "S1",
    "patient_id": "P1",
    "timepoint": "baseline",
    "tumor_site": "lung",
    "treatment": "anti-pd1",
    "response_status": "partial_response",
    "group_label": "responder"
  }'

Key fields:

Field            Purpose
external_id      Sample identifier used across pipelines
patient_id       Patient key for clinical endpoint joins
timepoint        Longitudinal position (baseline, on-treatment, relapse)
group_label      Cohort comparison group (responder, non-responder, arm A/B)
response_status  Clinical response category
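The same registration can be scripted. The following is a minimal sketch in Python using only the standard library. It assumes the base URL and field names from the curl example above. The helper `build_sample_request`, the study ID `my-study`, and the choice to treat `external_id` and `patient_id` as mandatory are illustrative assumptions, not documented behavior of the metadata API.

```python
import json
import urllib.request

# Assumption: base URL copied from the curl example above.
BASE_URL = "http://localhost:8000"


def build_sample_request(study_id: str, sample: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that registers one sample."""
    # Assumption: these two identifiers are treated as mandatory here;
    # the API's actual validation rules may differ.
    missing = [key for key in ("external_id", "patient_id") if not sample.get(key)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return urllib.request.Request(
        url=f"{BASE_URL}/oncology/studies/{study_id}/samples",
        data=json.dumps(sample).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


sample = {
    "external_id": "S1",
    "patient_id": "P1",
    "timepoint": "baseline",
    "tumor_site": "lung",
    "treatment": "anti-pd1",
    "response_status": "partial_response",
    "group_label": "responder",
}
# "my-study" is a placeholder study ID.
request = build_sample_request("my-study", sample)
# To send the request: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the payload easy to inspect or log before anything reaches the metadata service.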

Clinical endpoints

Add survival and outcome records per patient:

curl -X POST http://localhost:8000/oncology/studies/{study_id}/endpoints \
  -H "Content-Type: application/json" \
  -d '{
    "patient_id": "P1",
    "endpoint_type": "overall_survival",
    "time_to_event_days": 900,
    "event_observed": false,
    "response_status": "partial_response"
  }'

Survival analysis reads clinical CSV files with patient_id, time_to_event_days, and event_observed columns. Endpoint records in the metadata service provide the structured source for export.
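As an illustration of that export step, the sketch below turns endpoint records into CSV text with the three columns named above. It assumes the records are already available as Python dictionaries shaped like the POST payload. The function name `endpoints_to_survival_csv` is hypothetical, and writing `event_observed` as 1/0 is an assumption, since the encoding the pipeline expects is not stated here.

```python
import csv
import io

# Columns the survival analysis expects, per the text above.
SURVIVAL_COLUMNS = ["patient_id", "time_to_event_days", "event_observed"]


def endpoints_to_survival_csv(endpoints, endpoint_type="overall_survival"):
    """Render endpoint records of one endpoint type as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(SURVIVAL_COLUMNS)
    for record in endpoints:
        if record["endpoint_type"] != endpoint_type:
            continue
        writer.writerow([
            record["patient_id"],
            record["time_to_event_days"],
            # Assumption: events are written as 1 (observed) or 0 (censored).
            int(record["event_observed"]),
        ])
    return buffer.getvalue()


endpoints = [
    {
        "patient_id": "P1",
        "endpoint_type": "overall_survival",
        "time_to_event_days": 900,
        "event_observed": False,
        "response_status": "partial_response",
    }
]
csv_text = endpoints_to_survival_csv(endpoints)
print(csv_text)
```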

Mutation datasets

Register MAF or VCF files linked to the study. The mutation landscape pipeline requires a file path in job parameters:

Parameter         Required  Purpose
mutation_path     Yes       Path to MAF-style CSV on the analysis server
copy_number_path  No        Copy number alteration table
panel_size_mb     No        Sequencing panel size for TMB calculation (default 38.0)
top_genes         No        Number of genes in oncoprint (default 20)
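The parameters above can be assembled with their defaults filled in, as in the sketch below. The helper names and the example file path are hypothetical. The TMB function uses the common definition of mutations per megabase of sequenced panel; which mutations the pipeline actually counts (for example, whether synonymous variants are excluded) is not specified here, so treat it as an approximation.

```python
# Defaults taken from the parameter table above.
DEFAULTS = {"copy_number_path": None, "panel_size_mb": 38.0, "top_genes": 20}


def build_mutation_params(mutation_path, **overrides):
    """Return job parameters with documented defaults filled in."""
    if not mutation_path:
        raise ValueError("mutation_path is required")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {"mutation_path": mutation_path, **DEFAULTS, **overrides}


def tumor_mutational_burden(mutation_count, panel_size_mb=38.0):
    """Mutations per megabase: mutation count divided by panel size in Mb."""
    if panel_size_mb <= 0:
        raise ValueError("panel_size_mb must be positive")
    return mutation_count / panel_size_mb


# Hypothetical path, for illustration only.
params = build_mutation_params(
    "data/oncology/example_study/mutations.csv", top_genes=50
)
tmb = tumor_mutational_burden(76, params["panel_size_mb"])
```

Rejecting unknown keys early catches misspelled parameter names before a job is dispatched.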

Repertoire datasets

Register TCR/BCR sequencing files for immune profile analysis. The immune profile pipeline takes the expression matrix path and, optionally, the repertoire path in job parameters:

Parameter        Required  Purpose
expression_path  Yes       Bulk or pseudo-bulk expression matrix
repertoire_path  No        TCR/BCR clonotype table

Expression data for communication and immune pipelines

Cell-cell communication requires per-cell expression and metadata CSV files:

File            Required columns
Expression CSV  cell_id plus gene columns
Metadata CSV    cell_id, cell_type; optional condition for comparison
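A quick pre-flight check of the two files can catch column problems before a job is dispatched. The sketch below uses only the standard library and reads from any open text file or file-like object. The function name is hypothetical, and the rule that every cell in the expression file must also appear in the metadata file is an assumption rather than a documented requirement.

```python
import csv
import io


def check_communication_inputs(expression_csv, metadata_csv):
    """Return a list of problems found in the two CSV inputs (empty if none)."""
    problems = []
    expr_reader = csv.DictReader(expression_csv)
    meta_reader = csv.DictReader(metadata_csv)
    expr_cols = expr_reader.fieldnames or []
    meta_cols = meta_reader.fieldnames or []

    if "cell_id" not in expr_cols:
        problems.append("expression CSV is missing column: cell_id")
    elif len(expr_cols) < 2:
        problems.append("expression CSV has no gene columns")
    for column in ("cell_id", "cell_type"):
        if column not in meta_cols:
            problems.append(f"metadata CSV is missing column: {column}")

    if not problems:
        # Assumption: each expression row needs a matching metadata row.
        expr_ids = {row["cell_id"] for row in expr_reader}
        meta_ids = {row["cell_id"] for row in meta_reader}
        unmatched = expr_ids - meta_ids
        if unmatched:
            problems.append(f"{len(unmatched)} cell(s) have no metadata row")
    return problems


# Tiny in-memory example; real inputs would be opened with open(path, newline="").
expression = io.StringIO("cell_id,CD3E,CD19\nc1,5,0\nc2,0,7\n")
metadata = io.StringIO("cell_id,cell_type,condition\nc1,T cell,treated\nc2,B cell,control\n")
problems = check_communication_inputs(expression, metadata)
```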

The immune profile pipeline requires a bulk expression matrix CSV with one row per sample and one column per gene.

File storage

Uploads and registered dataset paths live under data/oncology/. The metadata API records file locations in biochem_onco_* tables in Supabase. Files under data/oncology/ are gitignored, so they are not committed to the repository.
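Before registering a dataset path, it can help to confirm that it actually resolves to a location inside data/oncology/. The sketch below is one way to do that with pathlib. The function name and example paths are hypothetical, relative paths are resolved against the current working directory, and the metadata API may or may not enforce a similar check itself.

```python
from pathlib import Path

# Storage root named in the text above, relative to the working directory.
DATA_ROOT = Path("data/oncology")


def is_under_data_root(candidate, root=DATA_ROOT):
    """Return True if candidate resolves to root or to a location inside it."""
    root_resolved = Path(root).resolve()
    resolved = Path(candidate).resolve()
    return resolved == root_resolved or root_resolved in resolved.parents


# Hypothetical paths, for illustration only.
inside = is_under_data_root("data/oncology/example_study/mutations.csv")
outside = is_under_data_root("data/oncology/../secrets.csv")
```

Resolving both paths first means that `..` segments cannot be used to step outside the storage root.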

Next steps