This function sets up training datasets from a query and a reference cell data set for use in machine learning models.

Usage

setup_training(
  query_cds,
  ref_cds,
  ref_celldata_col,
  norm_method = c("log", "binary", "size_only", "none"),
  selected_genes = NULL,
  train_frac = 0.8,
  tf_idf = F,
  scale = F,
  LSImethod = 1,
  verbose = T,
  addbias = F,
  return_type = c("list", "matrix", "S4obj"),
  debug = F,
  ...
)
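
The vector-valued defaults for norm_method and return_type match a common R convention in which the first listed choice is used when the argument is not supplied. The sketch below shows that convention with base R's match.arg(). It assumes, without confirmation from this page, that setup_training resolves these arguments the same way; resolve_choice is a hypothetical helper and not part of the package.

```r
# Hypothetical helper illustrating match.arg(); not part of the package.
resolve_choice <- function(norm_method = c("log", "binary", "size_only", "none")) {
  # When the argument is not supplied, match.arg() returns the first choice.
  # A value outside the listed choices raises an error.
  match.arg(norm_method)
}

default_choice <- resolve_choice()           # "log"
explicit_choice <- resolve_choice("binary")  # "binary"
```

If the function does follow this convention, passing a single string such as norm_method = "binary" selects that option explicitly.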

Arguments

query_cds

A cell data set (cds) to query.

ref_cds

A reference cell data set (cds).

ref_celldata_col

The column in the reference cell data set containing cell data.

norm_method

The normalization method to use. Options are "log", "binary", "size_only", or "none".

selected_genes

A vector of pre-selected genes for analysis.

train_frac

The fraction of data to use for training. Defaults to 0.8.

tf_idf

Boolean indicating whether to perform TF-IDF transformation.

scale

Boolean indicating whether to scale the data.

LSImethod

The method for Latent Semantic Indexing.

verbose

Boolean indicating whether to display verbose output.

addbias

Boolean indicating whether to add bias.

return_type

The type of output to return. Options are "list", "matrix", or "S4obj".

...

Additional arguments.

Value

A list, matrix, or S4 object containing the training datasets.
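
To make the train_frac argument concrete, the following base R sketch splits a set of cell indices so that a train_frac share goes to training. It is only an illustration under stated assumptions: the page does not describe how setup_training performs the split or what it does with the remaining cells, and n_cells, train_idx, and holdout_idx are hypothetical names.

```r
# Hypothetical stand-in for the number of cells in a reference data set.
set.seed(1)
n_cells <- 100
train_frac <- 0.8

# Randomly pick a train_frac share of the cell indices for training.
train_idx <- sample(n_cells, size = round(train_frac * n_cells))

# Assumption: the remaining cells are held out (not confirmed by this page).
holdout_idx <- setdiff(seq_len(n_cells), train_idx)

length(train_idx)    # 80
length(holdout_idx)  # 20
```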