Key concepts

Oncology in Gradient Biotech is organized around tumor samples, molecular alterations, immune context, signaling networks, clinical endpoints, and reproducible analysis runs. This page explains the oncology concepts behind the workflows and the app concepts used to manage them.

Tumor cohorts and samples

Most Oncology workflows begin with a study cohort: a set of patients, samples, and clinical metadata organized around a tumor type, treatment, response group, trial arm, or translational research question.

Common cohort concepts:

Patient: person or participant represented in the study.
Sample: molecular or tissue specimen linked to a patient.
Timepoint: collection stage, such as baseline, on-treatment, response, relapse, or recurrence.
Tumor site: anatomical location or lesion source.
Treatment arm: therapy or experimental group.
Response status: outcome label such as responder, non-responder, stable disease, progression, or pathological response.
Group label: category used for comparison.

Good oncology analysis depends on consistent patient and sample identifiers. These keys connect expression, mutation, repertoire, pathology, and clinical endpoint tables.
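A quick way to see why consistent keys matter is to check which identifiers in one table have no match in the sample sheet. The sketch below is illustrative only: the column names patient_id and sample_id, the helper unmatched_keys, and the rows are assumptions, not the platform's schema.

```python
def unmatched_keys(reference_rows, other_rows, key):
    """Return the sorted values of `key` in other_rows that are absent from reference_rows."""
    reference = {row[key] for row in reference_rows}
    return sorted({row[key] for row in other_rows} - reference)


# Hypothetical sample sheet and two tables that should join to it.
samples = [
    {"patient_id": "P001", "sample_id": "P001-T1"},
    {"patient_id": "P002", "sample_id": "P002-T1"},
]
mutations = [
    {"sample_id": "P001-T1", "gene": "TP53"},
    {"sample_id": "P003-T1", "gene": "KRAS"},  # no such sample in the sheet
]
endpoints = [{"patient_id": "P002", "os_months": 14.2}]

print(unmatched_keys(samples, mutations, "sample_id"))   # mutation rows with unknown samples
print(unmatched_keys(samples, endpoints, "patient_id"))  # endpoint rows with unknown patients
```

Running a check like this before analysis surfaces rows that would otherwise be dropped silently in a join.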

Clinical endpoints

Clinical endpoints describe outcomes that can be linked to molecular features.

Common endpoint concepts:

Overall survival: time from a defined start point to death from any cause.
Progression-free survival: time from a defined start point to disease progression or death, whichever comes first; the exact rules depend on the study definition.
Censoring: subject has not had the event by the last follow-up time.
Event observed: indicator that the endpoint event occurred.
Response Evaluation Criteria in Solid Tumors (RECIST): radiographic response framework used in many solid tumor studies.
Pathological response: tissue-based response assessment, often after neoadjuvant therapy.

Endpoints feed survival analysis and responder-versus-non-responder comparisons. Their meaning depends on study design, follow-up time, treatment context, and censoring rules.
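Censoring and the event indicator can be made concrete with a small encoding step. The sketch below assumes overall survival measured in days from a study-defined start date; the function name time_and_event and its arguments are hypothetical.

```python
from datetime import date


def time_and_event(start, death, last_follow_up):
    """Return (time_in_days, event_observed) for one patient.

    event_observed is 1 when death was recorded and 0 when the patient is
    censored at the last follow-up date.
    """
    if death is not None:
        return (death - start).days, 1
    return (last_follow_up - start).days, 0


# Death observed 100 days after the start date.
print(time_and_event(date(2023, 1, 1), date(2023, 4, 11), date(2023, 4, 11)))
# No death recorded: censored at the last follow-up.
print(time_and_event(date(2023, 1, 1), None, date(2023, 12, 31)))
```

The choice of start date (for example, enrollment or treatment start) comes from the study design and changes every downstream survival estimate.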

Tumor microenvironment

The tumor microenvironment (TME) includes tumor cells, immune cells, stromal cells, vasculature, extracellular matrix, and signaling factors surrounding the tumor.

Oncology uses Computational Biology infrastructure for single-cell clustering, spatial transcriptomics, differential expression, and biomarker workflows, then adds oncology-specific interpretation such as tumor microenvironment composition, immune infiltration, communication networks, mutation context, and clinical outcomes.

Interpret tumor microenvironment outputs in light of tissue source, sampling bias, tumor purity, spatial context, and annotation quality.

Cell-cell communication

Cell-cell communication estimates signaling between cell populations using per-cell expression data, cell metadata, ligand-receptor pairs, pathway aggregation, and condition comparison.

Core concepts:

Ligand: signaling molecule expressed or secreted by a source cell population.
Receptor: target molecule expressed by a receiving cell population.
Ligand-receptor interaction: candidate signaling relationship between source and receiver cell types.
Interaction score: computed evidence or strength for an interaction.
Sender and receiver network: directed view of which cell types may send or receive signals.
Pathway aggregation: grouping interactions into signaling pathways.
Condition comparison: change in signaling between response groups, treatments, or disease states.

Communication outputs are computational evidence. They are strongest when supported by expression quality, expected biology, spatial proximity, and experimental validation.
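To illustrate what an interaction score can look like, the sketch below uses one simple convention: the mean ligand expression in the sender population multiplied by the mean receptor expression in the receiver population. This is an assumption chosen for illustration, not a description of how the communication pipeline computes its scores; the function name and the toy expression values are hypothetical.

```python
from statistics import mean


def interaction_score(cells, cell_types, ligand, receptor, sender, receiver):
    """Mean ligand expression in sender cells times mean receptor expression in receiver cells."""
    ligand_mean = mean(c[ligand] for c, t in zip(cells, cell_types) if t == sender)
    receptor_mean = mean(c[receptor] for c, t in zip(cells, cell_types) if t == receiver)
    return ligand_mean * receptor_mean


# Toy per-cell expression for one ligand-receptor pair (CXCL9 and CXCR3).
cells = [
    {"CXCL9": 2.0, "CXCR3": 0.0},
    {"CXCL9": 4.0, "CXCR3": 0.0},
    {"CXCL9": 0.0, "CXCR3": 1.0},
    {"CXCL9": 0.0, "CXCR3": 3.0},
]
cell_types = ["macrophage", "macrophage", "T cell", "T cell"]

forward = interaction_score(cells, cell_types, "CXCL9", "CXCR3", "macrophage", "T cell")
reverse = interaction_score(cells, cell_types, "CXCL9", "CXCR3", "T cell", "macrophage")
print(forward, reverse)
```

The score is directed: swapping sender and receiver gives a different value, which is why results are presented as a sender and receiver network.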

Immuno-oncology profiling

Immuno-oncology profiling summarizes immune context in a tumor cohort. It can use bulk expression, immune deconvolution, exhaustion programs, response phenotypes, and optional receptor repertoire data.

Important concepts:

Immune deconvolution: estimation of immune cell fractions from bulk expression data.
Tumor Immune Dysfunction and Exclusion (TIDE): framework for immune escape and checkpoint response hypotheses.
Exhaustion score: gene program score associated with T cell dysfunction or chronic stimulation.
Checkpoint target: immune regulatory molecule such as programmed cell death protein 1 or programmed death-ligand 1.
Immune phenotype: summary label describing immune-inflamed, excluded, exhausted, or related states.
Immuno-oncology (IO) response prediction: research-only score or label intended for exploratory stratification.

These outputs support hypothesis generation about immune infiltration, exclusion, exhaustion, and therapy response. They are not treatment recommendations.
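A gene program score such as the exhaustion score can be sketched as the average of per-gene z-scores across a cohort. The gene list, the z-score averaging, and the function name program_scores below are assumptions for illustration; the immune profile pipeline may define its score differently.

```python
from statistics import mean, pstdev


def program_scores(expression, genes):
    """Average per-gene z-scores for each sample.

    expression maps sample_id -> {gene: value}; genes with zero variance
    contribute 0 to every sample.
    """
    samples = list(expression)
    scores = {s: 0.0 for s in samples}
    for gene in genes:
        values = [expression[s][gene] for s in samples]
        mu, sigma = mean(values), pstdev(values)
        for s in samples:
            z = (expression[s][gene] - mu) / sigma if sigma > 0 else 0.0
            scores[s] += z / len(genes)
    return scores


# Illustrative gene set and toy bulk expression values.
exhaustion_genes = ["PDCD1", "LAG3", "HAVCR2"]
expression = {
    "S1": {"PDCD1": 1.0, "LAG3": 2.0, "HAVCR2": 1.0},
    "S2": {"PDCD1": 2.0, "LAG3": 3.0, "HAVCR2": 2.0},
    "S3": {"PDCD1": 6.0, "LAG3": 7.0, "HAVCR2": 9.0},
}
scores = program_scores(expression, exhaustion_genes)
print(max(scores, key=scores.get))  # sample with the highest program score
```

Because the score is relative to the cohort, the same sample can score differently when the cohort composition changes.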

Repertoire analysis

Repertoire analysis summarizes T cell receptor and B cell receptor sequencing data when it is registered with the study.

Key concepts:

T cell receptor (TCR): receptor that T cells use to recognize antigen, represented here by its sequence.
B cell receptor (BCR): receptor sequence linked to antibody-producing lineages.
Complementarity-determining region 3 (CDR3): highly variable receptor region often used for clonotype definitions.
Clonotype: group of cells or sequences inferred to share receptor identity.
Clonal expansion: enrichment of a clonotype across cells, samples, or timepoints.
Diversity: distribution of clonotypes in a sample or cohort.

Expanded clonotypes can be biologically important, but they do not identify antigen targets by themselves. Interpret repertoire outputs in light of assay depth, chain pairing, tissue source, and immune state.
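The clonotype, expansion, and diversity concepts can be sketched with a few lines of counting. The example assumes a clonotype is defined by the CDR3 amino acid sequence alone and summarizes diversity with Shannon entropy and a normalized clonality value; both choices, the function name, and the toy sequences are assumptions rather than the pipeline's definitions.

```python
from collections import Counter
from math import log


def repertoire_summary(cdr3_sequences):
    """Summarize clonotype counts for one sample."""
    counts = Counter(cdr3_sequences)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sequences provided")
    freqs = [n / total for n in counts.values()]
    shannon = -sum(p * log(p) for p in freqs)
    if len(counts) == 1:
        clonality = 1.0  # a single clonotype is fully clonal
    else:
        clonality = 1.0 - shannon / log(len(counts))
    return {
        "clonotypes": len(counts),
        "top_clone_fraction": max(freqs),
        "shannon": shannon,
        "clonality": clonality,
    }


# Toy CDR3 strings: one expanded clonotype and two singletons.
expanded = repertoire_summary(["CASSLGQ"] * 6 + ["CASSPDR", "CASRTGE"])
even = repertoire_summary(["CASSLGQ", "CASSLGQ", "CASSPDR", "CASSPDR"])
print(expanded["top_clone_fraction"], even["clonality"])
```

A perfectly even repertoire has clonality 0, and the value rises toward 1 as a few clonotypes dominate.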

Mutation landscape

The mutation landscape summarizes somatic alterations across tumor samples.

Common mutation concepts:

Somatic variant: alteration found in tumor deoxyribonucleic acid (DNA) relative to a reference or matched normal context.
Mutation Annotation Format (MAF): tabular format commonly used for somatic mutation records.
Variant Call Format (VCF): file format for genomic variant records.
Variant classification: mutation type, such as missense, nonsense, frameshift, splice site, or silent.
Driver alteration: mutation or copy number event believed to contribute to tumor biology.
Tumor mutational burden (TMB): number of somatic mutations normalized by sequenced genomic territory, commonly reported as mutations per megabase.
Oncoprint: compact matrix visualization of alterations across samples and genes.
Mutational signature: pattern of mutation types associated with biological or technical processes.

Mutation results depend on panel size, variant filtering, tumor purity, sequencing assay, and whether germline or matched-normal filtering was performed.
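The dependence on panel size and filtering is easiest to see in the tumor mutational burden arithmetic itself. The sketch below counts mutations per sample from MAF-style rows and divides by the covered territory in megabases. Which variant classes are counted and the 1.5 Mb panel size are assumptions for illustration; the mutation landscape pipeline may apply different rules.

```python
from collections import Counter

# Assumed set of counted classes; silent variants are excluded here.
COUNTED_CLASSES = {
    "Missense_Mutation",
    "Nonsense_Mutation",
    "Frame_Shift_Del",
    "Frame_Shift_Ins",
    "Splice_Site",
}


def tumor_mutational_burden(maf_rows, covered_megabases, counted=COUNTED_CLASSES):
    """Return mutations per megabase for every sample present in maf_rows."""
    if covered_megabases <= 0:
        raise ValueError("covered_megabases must be positive")
    counts = Counter()
    for row in maf_rows:
        sample = row["Tumor_Sample_Barcode"]
        counts[sample] += 1 if row["Variant_Classification"] in counted else 0
    return {sample: n / covered_megabases for sample, n in counts.items()}


maf_rows = [
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "TP53", "Variant_Classification": "Missense_Mutation"},
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "KRAS", "Variant_Classification": "Missense_Mutation"},
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "PIK3CA", "Variant_Classification": "Missense_Mutation"},
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "TTN", "Variant_Classification": "Silent"},
    {"Tumor_Sample_Barcode": "S2", "Hugo_Symbol": "TTN", "Variant_Classification": "Silent"},
]
tmb = tumor_mutational_burden(maf_rows, covered_megabases=1.5)
print(tmb)
```

Changing either the counted classes or the covered territory changes the result, so values are only comparable across samples processed with the same assay and filters.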

Copy number and pathway context

Copy number alterations summarize amplifications and deletions in genomic regions or genes. They can be interpreted alongside point mutations and pathway enrichment.

Examples:

Amplification: increased copy number of a region or gene.
Deletion: reduced copy number of a region or gene.
Co-occurrence: alterations that appear together more often than expected.
Mutual exclusivity: alterations that appear together less often than expected.
Driver pathway enrichment: mapping altered genes to cancer-relevant pathways.

Copy number and pathway summaries help connect individual alterations to larger biological programs, but they should be reviewed against the assay type and cohort size.
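"More often than expected" and "less often than expected" can be made precise with a one-sided Fisher's exact test on the 2x2 table of altered and unaltered samples. The sketch below computes both tails from the hypergeometric distribution using only the standard library. The choice of test, the function name, and the toy cohort are assumptions for illustration.

```python
from math import comb


def cooccurrence_pvalues(altered_a, altered_b, all_samples):
    """One-sided Fisher's exact p-values for co-occurrence and mutual exclusivity."""
    samples = set(all_samples)
    a = set(altered_a) & samples
    b = set(altered_b) & samples
    n, n_a, n_b, both = len(samples), len(a), len(b), len(a & b)

    def pmf(k):
        # Probability that exactly k samples carry both alterations by chance.
        return comb(n_a, k) * comb(n - n_a, n_b - k) / comb(n, n_b)

    lowest = max(0, n_a + n_b - n)
    highest = min(n_a, n_b)
    return {
        "both_altered": both,
        "p_cooccurrence": sum(pmf(k) for k in range(both, highest + 1)),
        "p_mutual_exclusivity": sum(pmf(k) for k in range(lowest, both + 1)),
    }


# Toy cohort of 10 samples where two hypothetical alterations never overlap.
cohort = [f"S{i}" for i in range(10)]
result = cooccurrence_pvalues(cohort[:5], cohort[5:], cohort)
print(result)
```

With only 10 samples even a perfectly exclusive pattern gives a modest p-value, which is one reason these summaries should be read against cohort size.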

Survival analysis

Survival analysis connects clinical endpoint timing to groups or molecular features.

Common survival concepts:

Kaplan-Meier curve: estimate of survival probability over time.
Log-rank test: statistical comparison between survival curves.
Cox proportional hazards regression: model estimating associations between covariates and event hazard.
Hazard ratio: relative event hazard between groups or feature levels.
Longitudinal trajectory: feature values tracked across timepoints.
Stratification: splitting subjects by treatment arm, mutation status, immune phenotype, or another feature.

Survival outputs depend on endpoint definitions, follow-up length, censoring, sample size, and covariate selection. They are research statistics, not clinical predictions.
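The Kaplan-Meier estimate multiplies, at each event time, the fraction of at-risk subjects who survive that time. The following minimal sketch implements the product-limit estimator in plain Python to show how censoring enters the calculation. It is not the survival pipeline's implementation, and the toy times and events are made up.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events[i] is 1 if the event was observed at times[i] and 0 if the
    subject was censored at that time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        at_risk -= leaving
    return curve


# Five subjects; the subjects at 8 (second entry) and 15 are censored.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Censored subjects never lower the curve directly, but they shrink the risk set, so each later event produces a larger drop.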

Artificial intelligence interpretation

Artificial intelligence interpretation summarizes completed oncology pipeline outputs with metric citations.

Interpretation contexts include:

Context	Available when	Output
Cell-cell communication	Communication run complete	Signaling axis summary with cited interaction scores and pathway rankings
Immune phenotype	Immune profile run complete	Deconvolution, Tumor Immune Dysfunction and Exclusion phenotype, exhaustion, and repertoire narrative
Mutation landscape	Mutation run complete	Tumor mutational burden, driver alterations, signature, and oncoprint interpretation
Survival	Survival run complete	Kaplan-Meier and Cox findings with cited hazard ratios and P values
Multi-modal summary	Multiple source runs	Integrated narrative connecting tumor microenvironment, immune, mutation, and outcome findings

Interpretations do not compute new metrics, diagnose cancer, or recommend treatment. They explain completed outputs and include a research disclaimer.

Study

A study is the top-level container for one oncology project. It holds samples, clinical endpoints, mutation and repertoire datasets, pipeline runs, artifacts, and interpretation outputs. Create one study per trial, tumor cohort, or translational analysis project.

Optional metadata includes tumor type, cancer stage, treatment arm, and response status.

Sample

A sample links molecular data to cohort metadata. Each sample records:

Patient and sample identifiers
Timepoint
Tumor site
Treatment
Response status
Group label
Flexible JavaScript Object Notation (JSON) metadata for study-specific fields

Samples align keys across single-cell, bulk expression, mutation, repertoire, pathology, and clinical tables.
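As a rough picture of the fields listed above, a sample record could be modeled as follows. The class and field names are assumptions that mirror the list; they are not the application's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Sample:
    """Hypothetical sample record mirroring the fields described above."""
    patient_id: str
    sample_id: str
    timepoint: str
    tumor_site: str
    treatment: str
    response_status: str
    group_label: str
    metadata: dict = field(default_factory=dict)  # study-specific JSON fields


sample = Sample(
    patient_id="P001",
    sample_id="P001-T1",
    timepoint="baseline",
    tumor_site="liver",
    treatment="arm A",
    response_status="responder",
    group_label="responder",
    metadata={"biopsy_method": "core needle"},
)
record = json.loads(json.dumps(asdict(sample)))
print(record["sample_id"], record["metadata"])
```

Keeping study-specific fields in the metadata dictionary leaves the fixed fields stable for joins across tables.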

Mutation dataset

A mutation dataset is a registered Mutation Annotation Format or Variant Call Format file linked to the study. The mutation landscape pipeline reads the file path and produces tumor mutational burden, variant classification summaries, oncoprint data, signature decomposition, and optional copy number summaries.

Repertoire dataset

A repertoire dataset is a registered T cell receptor or B cell receptor sequencing file. The immune profile pipeline uses repertoire paths for clonotype detection, diversity metrics, and clonal expansion analysis.

Pipeline run

A run is one execution of an oncology analysis pipeline.

Each run stores:

Unique run identifier, prefixed with onco-run-
Pipeline type and version
JSON parameter record
Status: queued -> running -> complete or failed
Output artifacts, such as JSON summaries, oncoprint data, Kaplan-Meier curves, and interpretations
Timestamps and summary statistics

Runs chain through provenance. Interpretation references completed source runs, and survival analysis can stratify by features from immune or mutation pipelines.
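The status lifecycle above can be read as a small state machine. The sketch below encodes the allowed transitions and a run identifier with the onco-run- prefix. The suffix format produced by new_run_id and the helper names are assumptions, since only the prefix is documented here.

```python
import uuid

RUN_ID_PREFIX = "onco-run-"

# queued -> running -> complete or failed; terminal states allow no further moves.
TRANSITIONS = {
    "queued": {"running"},
    "running": {"complete", "failed"},
    "complete": set(),
    "failed": set(),
}


def new_run_id():
    """Build an identifier with the documented prefix (suffix format is hypothetical)."""
    return RUN_ID_PREFIX + uuid.uuid4().hex[:12]


def advance(current, new):
    """Return the new status, or raise if the transition is not allowed."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal status transition: {current} -> {new}")
    return new


status = "queued"
for step in ("running", "complete"):
    status = advance(status, step)
print(new_run_id().startswith(RUN_ID_PREFIX), status)
```

Treating complete and failed as terminal means a re-run gets a new identifier rather than reusing the old record.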

Job polling

All analysis runs asynchronously in backend-analysis. The frontend polls GET /oncology/jobs/{run_id} until the job completes. Status badges update on workflow pages.
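A client-side polling loop against this endpoint might look like the sketch below. Only the path GET /oncology/jobs/{run_id} comes from this page; the base URL, the JSON response with a status field, the polling interval, and the timeout are assumptions.

```python
import json
import time
import urllib.request

TERMINAL_STATUSES = {"complete", "failed"}


def fetch_job(base_url, run_id):
    """GET {base_url}/oncology/jobs/{run_id}; assumes a JSON body with a 'status' field."""
    with urllib.request.urlopen(f"{base_url}/oncology/jobs/{run_id}") as response:
        return json.load(response)


def wait_for_job(run_id, fetch, interval_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Call fetch(run_id) until the job reaches a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch(run_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{run_id} still {job['status']} after {timeout_s} s")
        sleep(interval_s)


# Offline demonstration with a stubbed fetch function instead of a live server.
statuses = iter(["queued", "running", "complete"])
job = wait_for_job(
    "onco-run-example",  # hypothetical run identifier
    lambda run_id: {"run_id": run_id, "status": next(statuses)},
    sleep=lambda seconds: None,
)
print(job["status"])
```

Against a live deployment, the stub would be replaced with something like lambda run_id: fetch_job(base_url, run_id), where base_url is whatever host serves the API.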

Artifacts

Pipeline outputs are written to data/oncology/artifacts/{run_id}/ as JSON files.

Examples:

result.json: full pipeline output with summary statistics, tables, and visualization data
Communication artifacts: interaction tables, pathway scores, and sender/receiver network
Immune profile artifacts: deconvolution fractions, Tumor Immune Dysfunction and Exclusion phenotype, exhaustion scores, and repertoire diversity
Mutation landscape artifacts: tumor mutational burden table, oncoprint matrix, signature decomposition, and copy number summary
Survival artifacts: Kaplan-Meier curve data, log-rank P value, and Cox regression coefficients
Interpretation artifacts: narrative JSON with metric citations
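Given that layout, reading a run's main output is a matter of building the path and parsing the file. The sketch below assumes the artifact root is reachable from the working directory; the function name load_result is hypothetical, and the structure inside result.json is not specified here.

```python
import json
from pathlib import Path

ARTIFACT_ROOT = "data/oncology/artifacts"


def load_result(run_id, root=ARTIFACT_ROOT):
    """Load {root}/{run_id}/result.json and return the parsed JSON."""
    path = Path(root) / run_id / "result.json"
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

For example, load_result("onco-run-example") would read data/oncology/artifacts/onco-run-example/result.json, where the run identifier shown is a placeholder.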

Relationship to other research areas

Single-cell quality control, clustering, spatial transcriptomics, differential expression, and biomarker machine learning are owned by the Computational Biology area. Oncology calls those pipelines for tumor microenvironment profiling and adds the oncology-specific layer: communication networks, immune deconvolution, mutation analysis, and survival statistics.

Whole-slide imaging, tissue segmentation, and image-based spatial quantification are owned by the Pathology area. Oncology can use pathology outputs for pathology-integrated tumor microenvironment context without duplicating image processing.

Provenance and reproducibility

Every analysis should be reproducible. The study Run history panel lists all pipeline executions with status, timestamps, pipeline type, parameters, and artifact links. Re-run a job from its workflow page with identical parameters to reproduce a result, or with updated inputs when upstream data or settings change.

Interpretation scope

The Oncology area supports exploratory and translational oncology research. It is not a regulated clinical diagnostic, clinical decision support tool, or treatment selection system. Treat outputs as structured research evidence for expert review.
