Single-cell atlas

Build a single-cell RNA-seq atlas from raw or processed data: QC, clustering, marker identification, differential expression, and pathway enrichment, with publication-ready figures.

Research question

What cell populations exist in this dataset, what genes define each cluster, and how does expression differ between experimental conditions?

Who this is for

Wet-lab biologists running their first single-cell analysis without a local Seurat/Scanpy setup
Computational biologists who want transparent parameters and checkpoint provenance
Core facilities delivering standard QC → clustering → marker → enrichment packages to client labs

Data requirements

Data	Required	Purpose
`.h5ad` or convertible 10x/CSV	Yes	AnnData input for all pipeline steps
Gene symbols in `var`	Recommended	GO enrichment compatibility
Metadata with condition labels	No (required for contrast DE)	Group comparison on Data page
Saved contrasts	No (required for contrast DE)	Treated vs. control comparisons

Workflow

Data → QC → Normalization → Clustering → Explore UMAP → DE → Enrichment → Figures → Snapshot

Step 1 — Upload and prepare

Create a single-cell study and upload an `.h5ad` file (or convert 10x/CSV input with Convert on the Data page). Save metadata on Data → Metadata so that its columns are merged into `obs` before any analysis runs.
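As a rough illustration of what merging metadata columns into obs amounts to, the sketch below left-joins a metadata table onto an obs-like table. It uses plain dictionaries rather than AnnData and assumes that metadata rows are keyed by cell barcode. The function name `merge_metadata`, the barcodes, and the column names are hypothetical, and the application's own merge logic may differ.

```python
def merge_metadata(obs, metadata):
    """Left-join metadata columns onto obs rows, keyed by cell barcode.

    Cells missing from the metadata keep their obs row and get None
    for every metadata column, so the number of cells never changes.
    """
    # Collect every metadata column name, preserving first-seen order.
    columns = []
    for row in metadata.values():
        for name in row:
            if name not in columns:
                columns.append(name)

    merged = {}
    for barcode, obs_row in obs.items():
        extra = metadata.get(barcode, {})
        new_row = dict(obs_row)  # copy so the input is not mutated
        for name in columns:
            new_row[name] = extra.get(name)
        merged[barcode] = new_row
    return merged


# Hypothetical example data.
obs = {
    "AAAC-1": {"n_counts": 5200},
    "AAAG-1": {"n_counts": 3100},
    "AACT-1": {"n_counts": 4700},
}
metadata = {
    "AAAC-1": {"condition": "treated"},
    "AAAG-1": {"condition": "control"},
}
merged = merge_metadata(obs, metadata)
```

Cells without a metadata row end up with an empty condition label, which is worth checking before defining contrasts.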

Define contrasts on the Data page when condition-level DE is needed.

Step 2 — Find structure

Open Analyze → Find Structure and run in order:

QC — filter by genes/cells detected, counts, and mitochondrial fraction
Normalization — log-normalize and select highly variable genes
Clustering — Leiden clustering and UMAP embedding
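To make the QC and normalization steps concrete, here is a minimal sketch in plain Python of per-cell QC metrics, threshold filtering, and log-normalization. It is an illustration under stated assumptions, not the application's implementation: the `MT-` prefix for mitochondrial genes (a human gene-symbol convention), the threshold values, the 10,000-count scaling target, and all function names are assumptions. Highly variable gene selection, Leiden clustering, and UMAP are not shown.

```python
import math


def qc_metrics(counts, genes, mito_prefix="MT-"):
    """Compute per-cell QC metrics from one cell's raw counts."""
    total = sum(counts)
    n_genes = sum(1 for c in counts if c > 0)
    mito = sum(c for c, g in zip(counts, genes) if g.startswith(mito_prefix))
    pct_mito = 100.0 * mito / total if total else 0.0
    return {"total_counts": total, "n_genes": n_genes, "pct_mito": pct_mito}


def passes_qc(metrics, min_genes, min_counts, max_pct_mito):
    """Keep a cell only if it clears all three thresholds."""
    return (
        metrics["n_genes"] >= min_genes
        and metrics["total_counts"] >= min_counts
        and metrics["pct_mito"] <= max_pct_mito
    )


def log_normalize(counts, target_sum=1e4):
    """Scale a cell to target_sum total counts, then apply log(1 + x).

    Assumes the cell has at least one count, which QC filtering ensures.
    """
    total = sum(counts)
    return [math.log1p(c / total * target_sum) for c in counts]


# Hypothetical toy data: rows are cells, columns follow `genes`.
genes = ["CD3E", "MS4A1", "LYZ", "MT-CO1"]
cells = {
    "cell_a": [10, 0, 5, 1],  # healthy-looking cell
    "cell_b": [0, 0, 1, 9],   # mostly mitochondrial reads
    "cell_c": [0, 2, 0, 0],   # too few genes and counts
}

# Illustrative thresholds only; sensible values depend on tissue and protocol.
kept = {
    name: counts
    for name, counts in cells.items()
    if passes_qc(qc_metrics(counts, genes), min_genes=2, min_counts=5, max_pct_mito=20.0)
}
normalized = {name: log_normalize(counts) for name, counts in kept.items()}
```

In this toy example only `cell_a` survives filtering, because `cell_b` exceeds the mitochondrial threshold and `cell_c` has too few detected genes.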

Re-running an upstream step invalidates downstream checkpoints and marks dependent results stale.
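The invalidation rule can be pictured as a walk over a dependency graph. The sketch below assumes a simple linear chain taken from the workflow line (QC → Normalization → Clustering → DE → Enrichment). The real dependency graph, the step identifiers, and the function name `stale_after_rerun` are assumptions made for illustration.

```python
from collections import deque

# Assumed dependency order: each step lists the steps that consume its output.
DOWNSTREAM = {
    "qc": ["normalization"],
    "normalization": ["clustering"],
    "clustering": ["de"],
    "de": ["enrichment"],
    "enrichment": [],
}


def stale_after_rerun(step, downstream=DOWNSTREAM):
    """Return every step that depends, directly or indirectly, on `step`."""
    stale = []
    seen = set()
    queue = deque(downstream[step])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already marked via another path
        seen.add(current)
        stale.append(current)
        queue.extend(downstream[current])
    return stale


# Re-running normalization leaves QC valid but marks everything after it stale.
stale = stale_after_rerun("normalization")
```

The step that is re-run is not itself listed as stale, since it has just produced a fresh checkpoint.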

Step 3 — Explore

Review outputs before formal comparison:

Explore → Sample QC — retention histograms and per-sample metrics
Explore → Embedding — interactive UMAP colored by cluster or metadata column

Step 4 — Compare groups

Under Analyze → Compare Groups:

Mode	When to use
Cluster markers	Genes enriched in each cluster vs. all others
Contrast	Groups defined on the Data page (e.g. treated vs. control)
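The cluster markers mode compares each cluster against all other cells. As a minimal sketch of that one-vs-rest idea, the code below computes a log2 fold change of mean expression with a pseudocount and ranks genes by it. The statistical test the application uses is not specified here, so this shows only the effect-size half of a marker table. The function name, the pseudocount, and the toy values are hypothetical.

```python
import math


def one_vs_rest_log2fc(expression, labels, cluster, pseudocount=1.0):
    """log2 fold change of mean expression: `cluster` vs. all other cells.

    expression maps gene -> list of per-cell values, in the same order as labels.
    """
    inside = [i for i, lab in enumerate(labels) if lab == cluster]
    outside = [i for i, lab in enumerate(labels) if lab != cluster]
    if not inside or not outside:
        raise ValueError("cluster must contain some, but not all, cells")

    result = {}
    for gene, values in expression.items():
        mean_in = sum(values[i] for i in inside) / len(inside)
        mean_out = sum(values[i] for i in outside) / len(outside)
        # The pseudocount avoids division by zero for unexpressed genes.
        result[gene] = math.log2((mean_in + pseudocount) / (mean_out + pseudocount))
    return result


# Hypothetical toy data: four cells in two clusters.
labels = ["0", "0", "1", "1"]
expression = {
    "CD3E": [7.0, 7.0, 1.0, 1.0],
    "LYZ": [0.0, 0.0, 3.0, 3.0],
    "ACTB": [5.0, 5.0, 5.0, 5.0],
}
lfc = one_vs_rest_log2fc(expression, labels, cluster="0")
ranked = sorted(lfc, key=lfc.get, reverse=True)  # strongest candidate markers first
```

A contrast comparison follows the same pattern, with the two groups taken from a metadata column instead of cluster labels.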

Run Pathway enrichment (over-representation analysis, ORA, against GO biological process terms) on the DE results.
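Over-representation analysis is commonly computed as an upper-tail hypergeometric test: given a background of tested genes, how surprising is the overlap between the significant DE genes and the genes annotated to one GO term? The sketch below shows that calculation with the standard library only. Whether the application uses exactly this test, and how it chooses the background, is an assumption; the function and parameter names are hypothetical.

```python
from math import comb


def ora_p_value(n_universe, n_term, n_hits, n_overlap):
    """Upper-tail hypergeometric p-value, P(overlap >= n_overlap).

    n_universe: genes tested (the background)
    n_term:     background genes annotated to the GO term
    n_hits:     significant DE genes
    n_overlap:  significant DE genes annotated to the term
    """
    upper = min(n_term, n_hits)
    favourable = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_universe, n_hits)


# Toy numbers: 20 background genes, 5 in the term, 4 DE genes, 3 of them in the term.
p = ora_p_value(n_universe=20, n_term=5, n_hits=4, n_overlap=3)
```

For the toy numbers the p-value is 155/4845, about 0.032. In a real run one p-value is computed per GO term and then corrected for multiple testing.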

Step 5 — Interpret and publish

Interpret → Enrichment — GO term tree and pathway context
Figures — drag UMAP, volcano, and heatmap panels onto the canvas; export PDF
Runs → Snapshots — freeze parameter set and run IDs for reproducibility
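One way to think about a snapshot is as a frozen record of the parameters and run IDs plus a fingerprint that changes whenever either changes. The sketch below builds such a record with a SHA-256 hash over a canonical JSON encoding. The snapshot format, the parameter names, and the run IDs shown are hypothetical and are not the application's actual schema.

```python
import hashlib
import json


def make_snapshot(parameters, run_ids):
    """Bundle parameters and run IDs with a content hash."""
    payload = {"parameters": parameters, "run_ids": sorted(run_ids)}
    # Sorted keys and fixed separators give the same bytes for the same content.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {**payload, "sha256": digest}


# Hypothetical parameter set and run IDs.
snapshot = make_snapshot(
    parameters={"qc": {"max_pct_mito": 20.0}, "clustering": {"resolution": 1.0}},
    run_ids=["run-002", "run-001"],
)

# Same content in a different order yields the same fingerprint.
same = make_snapshot(
    parameters={"clustering": {"resolution": 1.0}, "qc": {"max_pct_mito": 20.0}},
    run_ids=["run-001", "run-002"],
)

# Changing any parameter yields a different fingerprint.
changed = make_snapshot(
    parameters={"qc": {"max_pct_mito": 10.0}, "clustering": {"resolution": 1.0}},
    run_ids=["run-001", "run-002"],
)
```

Comparing fingerprints is a quick way to confirm that two deliverables were produced with identical settings.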

Step 6 — AI interpretation (optional)

Run analytical interpretation on completed DE or enrichment runs to produce cluster annotations and pathway narratives. This requires Ollama to be configured; see ai.md in the repository root.

Expected outputs

Filtered AnnData checkpoints under data/processed/{study_id}/{dataset_id}/
UMAP embedding with Leiden cluster labels
Ranked DE table with log fold change and adjusted p-values
GO enrichment results with term hierarchy
Multi-panel figure PDF and analysis snapshot
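The adjusted p-values in the DE table correct for testing many genes at once. The adjustment method is not stated here; the sketch below assumes the Benjamini-Hochberg procedure, a common choice for DE tables, and shows how raw p-values map to adjusted ones. The function name and the example values are illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never decrease as the raw p-value increases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
```

Here the raw values 0.03 and 0.04 both adjust to about 0.053, because the running minimum caps the value for 0.03 (which would otherwise be 0.06) at the value computed for 0.04.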

Typical analyses

Analysis	Comparison	Question
Cell type discovery	Cluster markers	What genes define each population?
Treatment response	Contrast DE	Which genes change after perturbation?
Immune infiltration	Condition on metadata	Are T cell clusters expanded in responders?
Core client delivery	Full pipeline + snapshot	Reproducible deliverable for requesting lab
