Sample data
Test fixtures support manual testing, demos, and automated regression checks. All examples below mirror the oncology test suite under backend-analysis/tests/oncology/.
Mutation landscape fixture
import pandas as pd

# Minimal MAF-style variant table: 7 variants across 3 samples.
maf = pd.DataFrame({
    "Hugo_Symbol": ["TP53", "KRAS", "TP53", "PIK3CA", "BRCA1", "EGFR", "CDKN2A"],
    "Tumor_Sample_Barcode": ["S1", "S1", "S2", "S2", "S3", "S3", "S3"],
    "Variant_Classification": [
        "Missense_Mutation", "Missense_Mutation", "Nonsense_Mutation",
        "Frame_Shift_Del", "Missense_Mutation", "Silent", "Splice_Site",
    ],
    "Chromosome": ["17", "12", "17", "3", "17", "7", "9"],
    "Start_Position": [7579472, 25398284, 7578406, 178936091, 43071077, 55249071, 21971120],
    "Reference_Allele": ["C", "G", "C", "T", "C", "A", "G"],
    "Tumor_Seq_Allele2": ["T", "A", "A", "TA", "G", "G", "T"],
})

# Gene-level copy-number calls, one row per sample.
cna = pd.DataFrame({
    "sample_id": ["S1", "S2", "S3"],
    "gene": ["EGFR", "PTEN", "CDKN2A"],
    "chromosome": ["7", "10", "9"],
    "alteration": ["amplification", "deletion", "deletion"],
})

# to_csv defaults to comma-separated output. Standard MAF files are
# tab-delimited, so pass sep="\t" if the reader expects that.
maf.to_csv("/tmp/cohort.maf", index=False)
cna.to_csv("/tmp/copy_number.csv", index=False)
Expected results (from test_mutation_landscape.py):
- 3 samples, 7 variants
- TMB per megabase > 0 for all samples
- TP53 ranked as top mutated gene
- Oncoprint matrix includes TP53
- Dominant mutational signature assigned
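The first three expectations can be sanity-checked with pandas before submitting a job. This is a minimal sketch, not the platform's implementation: it rebuilds only the two columns it needs from the fixture above, and ranking genes by the number of distinct mutated samples (with the Silent variant counted like any other) is an assumption about how the pipeline ranks them.

```python
import pandas as pd

# Only the two columns needed for the counts, copied from the MAF fixture.
maf = pd.DataFrame({
    "Hugo_Symbol": ["TP53", "KRAS", "TP53", "PIK3CA", "BRCA1", "EGFR", "CDKN2A"],
    "Tumor_Sample_Barcode": ["S1", "S1", "S2", "S2", "S3", "S3", "S3"],
})

n_samples = maf["Tumor_Sample_Barcode"].nunique()
n_variants = len(maf)

# Number of distinct samples in which each gene is mutated (assumed ranking rule).
samples_per_gene = (
    maf.groupby("Hugo_Symbol")["Tumor_Sample_Barcode"]
    .nunique()
    .sort_values(ascending=False)
)
top_gene = samples_per_gene.index[0]

print(n_samples, n_variants, top_gene)
```

TMB and signature assignment depend on the pipeline's covered-region size and reference signatures, so they are not reproduced here.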
Cell-cell communication fixture
import pandas as pd

# Per-cell expression of three ligand/receptor gene pairs.
expression = pd.DataFrame({
    "cell_id": ["tumor_1", "tumor_2", "t_1", "t_2", "myeloid_1", "myeloid_2"],
    "TGFB1": [5.0, 4.0, 0.0, 0.0, 2.0, 1.0],
    "TGFBR1": [1.0, 1.0, 4.0, 4.0, 2.0, 2.0],
    "CXCL9": [0.0, 0.0, 3.0, 4.0, 1.0, 1.0],
    "CXCR3": [1.0, 1.0, 5.0, 5.0, 0.0, 0.0],
    "CD274": [6.0, 5.0, 0.0, 0.0, 1.0, 1.0],
    "PDCD1": [0.0, 0.0, 5.0, 4.0, 0.0, 0.0],
})

# Cell annotations: one baseline and one treated cell per cell type.
metadata = pd.DataFrame({
    "cell_id": ["tumor_1", "tumor_2", "t_1", "t_2", "myeloid_1", "myeloid_2"],
    "cell_type": ["Tumor", "Tumor", "T cell", "T cell", "Myeloid", "Myeloid"],
    "condition": ["baseline", "treated", "baseline", "treated", "baseline", "treated"],
})

expression.to_csv("/tmp/expression.csv", index=False)
metadata.to_csv("/tmp/metadata.csv", index=False)
Expected: 3 ligand-receptor (LR) pairs tested, 3 cell types, pathway scores > 0, condition comparison deltas present.
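To see why this fixture should yield non-zero scores, the sketch below scores each sender/receiver combination as the product of mean ligand expression in the sender and mean receptor expression in the receiver. Both the scoring rule and the `lr_pairs` list (inferred from the fixture's column names) are assumptions for illustration; the platform's actual method and pair database may differ.

```python
import pandas as pd

expression = pd.DataFrame({
    "cell_id": ["tumor_1", "tumor_2", "t_1", "t_2", "myeloid_1", "myeloid_2"],
    "TGFB1": [5.0, 4.0, 0.0, 0.0, 2.0, 1.0],
    "TGFBR1": [1.0, 1.0, 4.0, 4.0, 2.0, 2.0],
    "CXCL9": [0.0, 0.0, 3.0, 4.0, 1.0, 1.0],
    "CXCR3": [1.0, 1.0, 5.0, 5.0, 0.0, 0.0],
    "CD274": [6.0, 5.0, 0.0, 0.0, 1.0, 1.0],
    "PDCD1": [0.0, 0.0, 5.0, 4.0, 0.0, 0.0],
})
metadata = pd.DataFrame({
    "cell_id": ["tumor_1", "tumor_2", "t_1", "t_2", "myeloid_1", "myeloid_2"],
    "cell_type": ["Tumor", "Tumor", "T cell", "T cell", "Myeloid", "Myeloid"],
})

# Hypothetical (ligand, receptor) list inferred from the fixture columns.
lr_pairs = [("TGFB1", "TGFBR1"), ("CXCL9", "CXCR3"), ("CD274", "PDCD1")]

# Mean expression of every ligand/receptor gene per cell type.
merged = expression.merge(metadata, on="cell_id")
genes = [gene for pair in lr_pairs for gene in pair]
mean_expr = merged.groupby("cell_type")[genes].mean()

# Score = mean ligand in sender * mean receptor in receiver (assumed rule).
rows = []
for ligand, receptor in lr_pairs:
    for sender in mean_expr.index:
        for receiver in mean_expr.index:
            rows.append({
                "ligand": ligand,
                "receptor": receptor,
                "sender": sender,
                "receiver": receiver,
                "score": mean_expr.loc[sender, ligand] * mean_expr.loc[receiver, receptor],
            })
scores = pd.DataFrame(rows)

# Strongest interaction under this scoring rule.
best = scores.sort_values("score", ascending=False).iloc[0]
print(best)
```

Under this rule the strongest signal is CD274 on tumor cells paired with PDCD1 on T cells, which is the pattern the fixture values appear designed to produce.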
Survival analysis fixture
import pandas as pd

# Outcome table: event_observed is 1 for an observed event, 0 for censored.
clinical = pd.DataFrame({
    "patient_id": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "time_to_event_days": [900, 820, 760, 320, 280, 240],
    "event_observed": [0, 0, 1, 1, 1, 1],
    "treatment_arm": ["anti-pd1", "anti-pd1", "anti-pd1", "chemo", "chemo", "chemo"],
})

# Per-patient covariates, keyed by patient_id.
features = pd.DataFrame({
    "patient_id": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "tmb": [9.0, 8.0, 7.5, 2.0, 1.0, 1.5],
    "cd8_score": [2.0, 1.8, 1.6, -0.8, -1.0, -0.9],
})

clinical.to_csv("/tmp/clinical.csv", index=False)
features.to_csv("/tmp/features.csv", index=False)
Expected: 6 patients, 2 Kaplan-Meier (KM) groups when stratified by treatment_arm, log-rank and Cox outputs present.
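The two KM groups can be reproduced by hand with the product-limit estimator. The sketch below is a plain pandas illustration, assuming `event_observed` uses 1 for an event and 0 for censoring; `kaplan_meier` is a hypothetical helper, not the survival job's code, and the log-rank and Cox outputs are not reproduced.

```python
import pandas as pd

clinical = pd.DataFrame({
    "patient_id": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "time_to_event_days": [900, 820, 760, 320, 280, 240],
    "event_observed": [0, 0, 1, 1, 1, 1],
    "treatment_arm": ["anti-pd1", "anti-pd1", "anti-pd1", "chemo", "chemo", "chemo"],
})


def kaplan_meier(times, events):
    """Return {event_time: survival probability} via the product-limit estimator."""
    df = pd.DataFrame({"time": times, "event": events})
    at_risk = len(df)
    survival = 1.0
    curve = {}
    # groupby iterates over distinct times in ascending order.
    for t, group in df.groupby("time"):
        deaths = int(group["event"].sum())
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve[int(t)] = survival
        # Events and censored patients both leave the risk set after time t.
        at_risk -= len(group)
    return curve


# One curve per treatment arm, matching stratify_by: treatment_arm.
curves = {
    arm: kaplan_meier(group["time_to_event_days"], group["event_observed"])
    for arm, group in clinical.groupby("treatment_arm")
}
print(curves)
```

With only three patients per arm the curves are coarse step functions, which is enough to confirm that stratification produces two groups with different shapes.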
Test fixtures location
Automated tests live under:
backend-analysis/tests/oncology/
Tests create temporary fixtures in pytest's tmp_path directory, so no committed binary fixtures are required.
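A test following this pattern might look like the sketch below. `tmp_path` is pytest's built-in per-test temporary directory; the helper `write_clinical_fixture` and the test name are hypothetical and are not taken from the actual suite.

```python
import pandas as pd


def write_clinical_fixture(directory):
    """Write the survival fixture into `directory` and return the CSV path."""
    clinical = pd.DataFrame({
        "patient_id": ["P1", "P2", "P3", "P4", "P5", "P6"],
        "time_to_event_days": [900, 820, 760, 320, 280, 240],
        "event_observed": [0, 0, 1, 1, 1, 1],
        "treatment_arm": ["anti-pd1", "anti-pd1", "anti-pd1", "chemo", "chemo", "chemo"],
    })
    path = directory / "clinical.csv"
    clinical.to_csv(path, index=False)
    return path


def test_clinical_fixture_round_trip(tmp_path):
    # pytest injects tmp_path as a pathlib.Path unique to this test.
    path = write_clinical_fixture(tmp_path)
    loaded = pd.read_csv(path)
    assert len(loaded) == 6
    assert loaded["treatment_arm"].nunique() == 2
```

Because each test gets its own directory, fixtures written this way cannot collide between tests the way fixed paths under /tmp can.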
Suggested demo workflow
- Create oncology study from /areas/oncology/studies/new
- Write MAF fixture to /tmp/cohort.maf
- Run mutation landscape job via API or Mutations page
- Write clinical fixture to /tmp/clinical.csv
- Run survival job with stratify_by: treatment_arm
- Launch interpret job with both run IDs
- Review results on workflow pages and in run history
Full product details: products/oncology/product.md and products/oncology/todo.md.