Skip to content

Usage

batch_correct_counts(
  moo,
  count_type = "norm",
  sub_count_type = "voom",
  sample_id_colname = NULL,
  feature_id_colname = NULL,
  samples_to_include = NULL,
  covariates_colnames = "Group",
  batch_colname = "Batch",
  label_colname = NULL,
  colors_for_plots = NULL,
  print_plots = options::opt("print_plots"),
  save_plots = options::opt("save_plots"),
  plots_subdir = "batch"
)

Arguments

moo

multiOmicDataSet object (see create_multiOmicDataSet_from_dataframes())

count_type

the type of counts to use – must be a name in the counts slot (moo@counts)

sub_count_type

if count_type is a list, specify the sub count type within the list. (Default: "voom")

sample_id_colname

The column from the sample metadata containing the sample names. The names in this column must exactly match the names used as the sample column names of your input Counts Matrix. (Default: NULL - first column in the sample metadata will be used.)

feature_id_colname

The column from the counts dataa containing the Feature IDs (Usually Gene or Protein ID). This is usually the first column of your input Counts Matrix. Only columns of Text type from your input Counts Matrix will be available to select for this parameter. (Default: NULL - first column in the counts matrix will be used.)

samples_to_include

Which samples would you like to include? Usually, you will choose all sample columns, or you could choose to remove certain samples. Samples excluded here will be removed in this step and from further analysis downstream of this step. (Default: NULL - all sample IDs in moo@sample_meta will be used.)

covariates_colnames

The column name(s) from the sample metadata containing variable(s) of interest, such as phenotype. Most commonly this will be the same column selected for your Groups Column. Some experimental designs may require that you add additional covariate columns here. Do not include the batch_colname here.

batch_colname

The column from the sample metadata containing the batch information. Samples extracted, prepared, or sequenced at separate times or using separate materials/staff/equipment may belong to different batches. Not all data sets have batches, in which case you do not need batch correction. If your data set has no batches, you can provide a batch column with the same value in every row to skip batch correction (alternatively, simply do not run this function).

label_colname

The column from the sample metadata containing the sample labels as you wish them to appear in the plots produced by this template. This can be the same Sample Names Column. However, you may desire different labels to display on your figure (e.g. shorter labels are sometimes preferred on plots). In that case, select the column with your preferred Labels here. The selected column should contain unique names for each sample. (Default: NULLsample_id_colname will be used.)

colors_for_plots

Colors for the PCA and histogram will be picked, in order, from this list. If you have >12 samples or groups, program will choose from a wide range of random colors

print_plots

Whether to print plots during analysis (Defaults to FALSE, overwritable using option 'moo_print_plots' or environment variable 'MOO_PRINT_PLOTS')

save_plots

Whether to save plots to files during analysis (Defaults to FALSE, overwritable using option 'moo_save_plots' or environment variable 'MOO_SAVE_PLOTS')

plots_subdir

subdirectory in where plots will be saved if save_plots is TRUE

Value

multiOmicDataSet with batch-corrected counts

Examples

moo <- multiOmicDataSet(
  sample_metadata = as.data.frame(nidap_sample_metadata),
  anno_dat = data.frame(),
  counts_lst = list(
    "raw" = as.data.frame(nidap_raw_counts),
    "clean" = as.data.frame(nidap_clean_raw_counts),
    "filt" = as.data.frame(nidap_filtered_counts),
    "norm" = list(
      "voom" = as.data.frame(nidap_norm_counts)
    )
  )
) %>%
  batch_correct_counts(
    count_type = "norm",
    sub_count_type = "voom",
    covariates_colnames = "Group",
    batch_colname = "Batch",
    label_colname = "Label"
  )
#> * batch-correcting norm-voom counts
#> Found2batches
#> Adjusting for2covariate(s) or covariate level(s)
#> Standardizing Data across genes
#> Fitting L/S model and finding priors
#> Finding parametric adjustments
#> Adjusting the Data
#> The total number of features in output: 7943
#> Number of samples after batch correction: 10

head(moo@counts[["batch"]])
#>            Gene       A1       A2       A3       B1       B2       B3       C1
#> 1 0610007P14Rik 6.437738 6.251229 6.048600 6.284429 6.188062 6.180803 6.333751
#> 2 0610009B22Rik 4.904608 5.100317 4.960486 4.037742 4.843373 5.098318 4.013808
#> 3 0610010F05Rik 4.921026 5.701279 6.485933 6.140332 5.847360 5.560233 3.737422
#> 4 0610011F06Rik 5.309874 5.288411 5.069086 5.261067 5.269024 5.551350 5.548404
#> 5 0610012G03Rik 5.426686 5.406358 5.415468 4.625768 5.333482 5.529869 5.845995
#> 6 0610037L13Rik 5.413417 5.293344 5.144240 5.421276 3.945936 4.831507 4.443280
#>         C2       C3
#> 1 6.253867 6.530433
#> 2 4.391701 5.050022
#> 3 2.756696 2.865261
#> 4 5.919472 5.455400
#> 5 6.086350 4.769502
#> 6 4.651311 5.063511