rectanglepy.pp.build_rectangle_signatures
- rectanglepy.pp.build_rectangle_signatures(adata, cell_type_col='cell_type', bulks=None, *, optimize_cutoffs=True, layer=None, raw=False, p=0.015, lfc=1.5, n_cpus=None, gene_expression_threshold=0.5)
Builds rectangle signatures based on single-cell count data and annotations.
- Parameters:
adata (AnnData) – The single-cell count data as an AnnData object. Each entry should be in raw counts.
bulks (Optional[DataFrame] (default: None)) – The bulk data as a DataFrame. The DataFrame must have the bulk identifiers as index and the genes as columns. Each entry should be in transcripts per million (TPM).
cell_type_col (str (default: 'cell_type')) – The name of the column in adata.obs that holds the cell-type annotations for the single-cell count data, with one annotation per cell identifier.
layer (Optional[str] (default: None)) – The AnnData layer to use for the single-cell data.
raw (bool (default: False)) – A flag indicating whether to use the raw AnnData data.
optimize_cutoffs (default: True) – Indicates whether to optimize the p-value and log fold change cutoffs using a grid search.
p (default: 0.015) – The p-value threshold for the differential expression (DE) analysis (only used if optimize_cutoffs is False).
lfc (default: 1.5) – The log fold change threshold for the DE analysis (only used if optimize_cutoffs is False).
n_cpus (Optional[int] (default: None)) – The number of CPUs to use for the DE analysis. Defaults to the number of CPUs available.
gene_expression_threshold (default: 0.5) – The gene expression threshold for the DE analysis. Determines how many cells need to express a gene for it to be considered in the differential gene expression (DGE) analysis.
- Return type:
RectangleSignatureResult
- Returns:
The result of the rectangle signature analysis.
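The interaction between optimize_cutoffs, p and lfc can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the library: cutoff_kwargs and build_signatures are hypothetical helper names, and the call only uses the parameters listed in the signature above. With no fixed cutoffs the grid search stays enabled. Passing p or lfc switches it off, since those values are only used when optimize_cutoffs is False.

```python
def cutoff_kwargs(p=None, lfc=None):
    """Hypothetical helper: map optional fixed cutoffs to keyword arguments.

    If neither cutoff is given, the grid search stays on (the documented
    default). If at least one is given, the grid search is disabled so the
    fixed value is actually used; an omitted cutoff falls back to the
    function's own default.
    """
    if p is None and lfc is None:
        return {"optimize_cutoffs": True}
    kwargs = {"optimize_cutoffs": False}
    if p is not None:
        kwargs["p"] = p
    if lfc is not None:
        kwargs["lfc"] = lfc
    return kwargs


def build_signatures(adata, bulks=None, *, cell_type_col="cell_type", p=None, lfc=None):
    """Hypothetical wrapper around rectanglepy.pp.build_rectangle_signatures.

    adata is expected to be an AnnData object with raw counts, and bulks an
    optional DataFrame in TPM with bulk identifiers as index and genes as
    columns, as described in the parameter list.
    """
    # Imported lazily so the helper above can be used without rectanglepy installed.
    from rectanglepy.pp import build_rectangle_signatures

    return build_rectangle_signatures(
        adata,
        cell_type_col=cell_type_col,
        bulks=bulks,
        **cutoff_kwargs(p, lfc),
    )
```

Calling build_signatures(adata, bulks) would run with optimized cutoffs, while build_signatures(adata, bulks, p=0.01, lfc=1.0) would use the fixed thresholds instead.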