GRN Inference
GRN inference using integrated methods
To infer a GRN for a given dataset using an integrated method, first download the test datasets:
aws s3 sync s3://openproblems-data/resources_test/grn/grn_benchmark \
resources_test/grn_benchmark --no-sign-request
Use resources_test/ paths for development and testing. Switch to resources/grn_benchmark/ for full-scale runs (requires downloading the main datasets — see Datasets).
With Docker (Viash)
viash run src/methods/pearson_corr/config.vsh.yaml -- \
--rna resources_test/grn_benchmark/inference_data/op_rna.h5ad \
--prediction output/net.h5ad \
--tf_all resources_test/grn_benchmark/prior/tf_all.csv
Without Docker (conda or Singularity)
bash src/methods/pearson_corr/run_local.sh \
--rna resources_test/grn_benchmark/inference_data/op_rna.h5ad \
--tf_all resources_test/grn_benchmark/prior/tf_all.csv \
--prediction output/net.h5ad
Replace pearson_corr with any method under src/methods/. Methods that require Singularity (e.g. GRNBoost2, scPRINT, Scenic/Scenic+, CellOracle) will invoke it automatically from their run_local.sh script.
GRN inference without method integration
Download the inference datasets from the Datasets page. For each dataset, perform GRN inference using your own algorithm. Consider the following:
We evaluate only the top TF-gene pairs, currently limited to 50,000 edges, ranked by their assigned weight.
The inferred network should follow this format:
Columns:
source: Transcription factor (TF)target: Target geneweight: Regulatory importance/likelihood score
Since geneRNIB works with AnnData, your inferred network should be saved in this format.
If your network is a pandas DataFrame with three columns (source, target, weight), you can save it as follows:
Python
import anndata as ad
net['weight'] = net['weight'].astype(str) # weight must be stored as a string
output = ad.AnnData(
X=None,
uns={
"method_id": "your_method_name", # e.g. "grnboost2"
"dataset_id": "dataset_name", # e.g. "replogle"
"prediction": net[["source", "target", "weight"]]
}
)
output.write_h5ad("save_to_file.h5ad")
R
library(zellkonverter)
net$weight <- as.character(net$weight)
output <- AnnData(
X = matrix(nrow = 0, ncol = 0),
uns = list(
method_id = "your_method_name", # e.g. "grnboost2"
dataset_id = "dataset_name", # e.g. "replogle"
prediction = net[, c("source", "target", "weight")]
)
)
output$write_h5ad("save_to_file.h5ad", compression = "gzip")
Once you have inferred GRNs for one or more datasets, proceed to the GRN evaluation page to run the evaluation metrics.