Gradient Biotech

File formats

MAF / somatic mutation tables

Mutation landscape accepts MAF-style CSV files. Required columns:

| Column | Purpose |
| --- | --- |
| Hugo_Symbol | Gene name |
| Tumor_Sample_Barcode | Sample identifier |
| Variant_Classification | Missense, nonsense, frameshift, splice site, silent, etc. |
| Chromosome | Chromosome label |
| Start_Position | Variant start coordinate |
| Reference_Allele | Reference base |
| Tumor_Seq_Allele2 | Alternate allele |

VCF files can be registered as mutation datasets, but the analysis pipeline itself reads MAF-style tabular exports.
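
As a minimal sketch of how a file could be checked against the required MAF columns listed above before upload, the snippet below compares a CSV header with that list. The helper name `missing_maf_columns` and the sample row are illustrative assumptions, not part of the pipeline.

```python
import csv
import io

# Required columns, copied from the list above.
REQUIRED_MAF_COLUMNS = [
    "Hugo_Symbol",
    "Tumor_Sample_Barcode",
    "Variant_Classification",
    "Chromosome",
    "Start_Position",
    "Reference_Allele",
    "Tumor_Seq_Allele2",
]


def missing_maf_columns(handle):
    """Return the required columns absent from the CSV header (hypothetical helper)."""
    header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [col for col in REQUIRED_MAF_COLUMNS if col not in present]


# Illustrative input: the header lacks the alternate-allele column.
example = io.StringIO(
    "Hugo_Symbol,Tumor_Sample_Barcode,Variant_Classification,"
    "Chromosome,Start_Position,Reference_Allele\n"
    "GENE1,SAMPLE1,Missense,1,100,C\n"
)
print(missing_maf_columns(example))  # ['Tumor_Seq_Allele2']
```

An empty result means every required column is present. The check looks only at column names, not at the values in each row.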

Copy number alterations

Optional CSV for mutation landscape copy number summary:

| Column | Required | Purpose |
| --- | --- | --- |
| sample_id | Yes | Sample identifier |
| gene | Yes | Gene symbol |
| chromosome | No | Chromosome label |
| alteration | Yes | amplification or deletion |
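
The copy number columns above lend themselves to a simple per-gene tally. The sketch below is one way to read such a file and reject unexpected `alteration` values. The helper name `summarize_copy_number` and the choice to lowercase values before comparing are assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# The two alteration values named in the column description.
ALLOWED_ALTERATIONS = {"amplification", "deletion"}


def summarize_copy_number(handle):
    """Count rows per (gene, alteration) pair (hypothetical helper)."""
    counts = Counter()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(csv.DictReader(handle), start=2):
        # Assumption: values are compared case-insensitively.
        alteration = (row.get("alteration") or "").strip().lower()
        if alteration not in ALLOWED_ALTERATIONS:
            raise ValueError(f"line {line_no}: unexpected alteration {alteration!r}")
        counts[(row["gene"], alteration)] += 1
    return counts


# Illustrative rows only.
example = io.StringIO(
    "sample_id,gene,alteration\n"
    "S1,EGFR,amplification\n"
    "S2,EGFR,amplification\n"
    "S2,CDKN2A,deletion\n"
)
print(summarize_copy_number(example)[("EGFR", "amplification")])  # 2
```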

Clinical endpoints

Survival analysis reads clinical CSV files:

| Column | Required | Purpose |
| --- | --- | --- |
| patient_id | Yes | Patient identifier |
| time_to_event_days | Yes | Survival or PFS time in days |
| event_observed | Yes | 1 = event, 0 = censored |
| treatment_arm | No | Treatment group for stratification |
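
To make the encoding of the clinical columns concrete, here is a small sketch that parses a clinical CSV and enforces the `event_observed` convention described above (1 = event, 0 = censored). The helper `read_clinical` is hypothetical, and the non-negative time check is an added assumption rather than a documented rule.

```python
import csv
import io


def read_clinical(handle):
    """Parse clinical rows into dicts with typed fields (hypothetical helper)."""
    records = []
    for line_no, row in enumerate(csv.DictReader(handle), start=2):
        time_days = float(row["time_to_event_days"])
        if time_days < 0:
            # Assumption: negative follow-up times are treated as errors.
            raise ValueError(f"line {line_no}: negative time_to_event_days")
        if row["event_observed"] not in ("0", "1"):
            raise ValueError(f"line {line_no}: event_observed must be 0 or 1")
        records.append({
            "patient_id": row["patient_id"],
            "time_to_event_days": time_days,
            "event_observed": int(row["event_observed"]),
            # treatment_arm is optional: None when the column is absent or empty.
            "treatment_arm": row.get("treatment_arm") or None,
        })
    return records


# Illustrative rows only.
example = io.StringIO(
    "patient_id,time_to_event_days,event_observed,treatment_arm\n"
    "P1,120,1,A\n"
    "P2,365.5,0,\n"
)
for record in read_clinical(example):
    print(record)
```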

Molecular feature matrix

Optional CSV for survival stratification and Cox regression:

| Column | Required | Purpose |
| --- | --- | --- |
| patient_id | Yes | Join key to clinical table |
| Feature columns | No | TMB, immune scores, gene signatures, etc. |
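
Because `patient_id` is the join key between the feature matrix and the clinical table, it can help to confirm that the two files line up before running an analysis. The sketch below performs a left join from clinical rows to feature rows and reports patients with no features. The helper `join_features` is hypothetical, and how the pipeline itself treats unmatched patients is not stated here.

```python
import csv
import io


def join_features(clinical_handle, feature_handle):
    """Left-join feature columns onto clinical rows by patient_id (hypothetical helper)."""
    features = {}
    for row in csv.DictReader(feature_handle):
        patient_id = row.pop("patient_id")
        features[patient_id] = row  # the remaining keys are the feature columns

    joined, unmatched = [], []
    for row in csv.DictReader(clinical_handle):
        extra = features.get(row["patient_id"])
        if extra is None:
            unmatched.append(row["patient_id"])
            extra = {}
        joined.append({**row, **extra})
    return joined, unmatched


# Illustrative rows only; TMB stands in for any feature column.
clinical = io.StringIO(
    "patient_id,time_to_event_days,event_observed\nP1,120,1\nP2,300,0\n"
)
features = io.StringIO("patient_id,TMB\nP1,12.5\n")
rows, unmatched = join_features(clinical, features)
print(unmatched)  # ['P2']
```

Values stay as strings in this sketch. Converting feature columns to numbers is left to the caller.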

Longitudinal trajectories

Optional CSV for treatment timepoint analysis:

| Column | Required | Purpose |
| --- | --- | --- |
| patient_id | Yes | Patient identifier |
| timepoint | Yes | baseline, on_treatment, response, relapse |
| feature | Yes | Feature name |
| value | Yes | Numeric measurement |
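
The longitudinal table is in long format, with one row per patient, timepoint, and feature. As a sketch of how such rows can be grouped into ordered trajectories, the snippet below sorts each series by timepoint. The ordering used (the order in which the timepoints are listed above) and the helper name `build_trajectories` are assumptions.

```python
import csv
import io
from collections import defaultdict

# Assumed ordering: the sequence in which the timepoint labels are listed above.
TIMEPOINT_ORDER = ["baseline", "on_treatment", "response", "relapse"]


def build_trajectories(handle):
    """Group long-format rows into ordered (timepoint, value) series (hypothetical helper)."""
    series = defaultdict(dict)
    for row in csv.DictReader(handle):
        if row["timepoint"] not in TIMEPOINT_ORDER:
            raise ValueError(f"unknown timepoint {row['timepoint']!r}")
        key = (row["patient_id"], row["feature"])
        series[key][row["timepoint"]] = float(row["value"])
    # Emit each series in timepoint order, skipping timepoints with no measurement.
    return {
        key: [(tp, values[tp]) for tp in TIMEPOINT_ORDER if tp in values]
        for key, values in series.items()
    }


# Illustrative rows only, deliberately out of order.
example = io.StringIO(
    "patient_id,timepoint,feature,value\n"
    "P1,relapse,TMB,9\n"
    "P1,baseline,TMB,4\n"
)
print(build_trajectories(example)[("P1", "TMB")])
# [('baseline', 4.0), ('relapse', 9.0)]
```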

Per-cell expression (communication)

| Column | Required | Purpose |
| --- | --- | --- |
| cell_id | Yes | Cell identifier |
| Gene columns | Yes | Numeric expression values |

Cell metadata (communication)

| Column | Required | Purpose |
| --- | --- | --- |
| cell_id | Yes | Cell identifier |
| cell_type | Yes | Annotated cell type label |
| condition | No | Treatment or response group for comparison |
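
The per-cell expression table and the cell metadata table share `cell_id`, so the two can be combined cell by cell. The sketch below illustrates that relationship by averaging each gene column per `cell_type`. This is an illustration of the join only, not a description of how the communication analysis computes its scores, and the helper name is hypothetical.

```python
import csv
import io
from collections import defaultdict


def mean_expression_by_cell_type(expression_handle, metadata_handle):
    """Average each gene column per cell_type (hypothetical helper)."""
    cell_types = {
        row["cell_id"]: row["cell_type"] for row in csv.DictReader(metadata_handle)
    }
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in csv.DictReader(expression_handle):
        cell_type = cell_types.get(row.pop("cell_id"))
        if cell_type is None:
            continue  # assumption: cells without metadata are skipped
        counts[cell_type] += 1
        for gene, value in row.items():
            sums[cell_type][gene] += float(value)
    return {
        cell_type: {gene: total / counts[cell_type] for gene, total in genes.items()}
        for cell_type, genes in sums.items()
    }


# Illustrative rows only.
expression = io.StringIO("cell_id,CD8A\nc1,2\nc2,4\nc3,10\n")
metadata = io.StringIO("cell_id,cell_type\nc1,T cell\nc2,T cell\nc3,B cell\n")
print(mean_expression_by_cell_type(expression, metadata))
# {'T cell': {'CD8A': 3.0}, 'B cell': {'CD8A': 10.0}}
```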

Ligand-receptor database (optional)

Custom CSV for cell-cell communication:

| Column | Required | Purpose |
| --- | --- | --- |
| ligand | Yes | Ligand gene symbol |
| receptor | Yes | Receptor gene symbol |
| pathway | No | Signaling pathway label |
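
A custom ligand-receptor file with the columns above can be read into simple tuples, as in this sketch. The helper `load_ligand_receptor_pairs` is hypothetical, and the pathway labels in the example are placeholders.

```python
import csv
import io


def load_ligand_receptor_pairs(handle):
    """Return (ligand, receptor, pathway) tuples; pathway is None when not given."""
    pairs = []
    for line_no, row in enumerate(csv.DictReader(handle), start=2):
        ligand = (row.get("ligand") or "").strip()
        receptor = (row.get("receptor") or "").strip()
        if not ligand or not receptor:
            raise ValueError(f"line {line_no}: ligand and receptor are required")
        # pathway is optional, so an absent or empty value becomes None.
        pathway = (row.get("pathway") or "").strip() or None
        pairs.append((ligand, receptor, pathway))
    return pairs


# Illustrative rows only.
example = io.StringIO(
    "ligand,receptor,pathway\n"
    "CXCL12,CXCR4,CXCL\n"
    "CD274,PDCD1,\n"
)
print(load_ligand_receptor_pairs(example))
# [('CXCL12', 'CXCR4', 'CXCL'), ('CD274', 'PDCD1', None)]
```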

Bulk expression (immune profile)

Samples-as-rows, genes-as-columns CSV. First column should identify samples.

Repertoire (immune profile)

TCR/BCR clonotype table with clonotype identifiers, CDR3 sequences, and frequency counts. Format follows scirpy-compatible tabular exports.

Artifact outputs

Pipeline results under data/oncology/artifacts/{run_id}/:

| Artifact | Contents |
| --- | --- |
| result.json | Full pipeline output with summary, tables, and visualization data |
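
Given the artifact path pattern above, a run's output can be loaded with the standard library alone. In this sketch the helper `load_result` is hypothetical and the artifact root is assumed to be relative to the current working directory. The structure inside `result.json` is not assumed, so the usage example only inspects the top-level keys.

```python
import json
from pathlib import Path

# Path pattern taken from the text above; assumed relative to the working directory.
ARTIFACT_ROOT = Path("data/oncology/artifacts")


def load_result(run_id, root=ARTIFACT_ROOT):
    """Load result.json for one pipeline run (hypothetical helper)."""
    path = Path(root) / run_id / "result.json"
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


# Usage, with a placeholder run identifier:
#   result = load_result("<run_id>")
#   print(sorted(result))  # lists the top-level keys when result.json is a JSON object
```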

Single-cell and spatial omics

10x Genomics, Visium, Xenium, and bulk RNA-seq formats are handled by the Computational Biology area. Oncology links Computational Biology datasets to study samples via shared patient/sample keys.

Whole-slide images

H&E and IHC whole-slide images (WSI) are handled by the Pathology area. Oncology uses pathology outputs for spatial tumor microenvironment (TME) context without duplicating WSI ingestion.