Generate approximate pseudo-bulk interaction data by random projections

Usage

asap_random_bulk_interacting_columns(
  mtx_file,
  row_file,
  col_file,
  idx_file,
  num_factors,
  W_nm_list,
  mtx_file_rhs = NULL,
  row_file_rhs = NULL,
  col_file_rhs = NULL,
  idx_file_rhs = NULL,
  A_dd_list = NULL,
  rseed = 42L,
  do_product = FALSE,
  do_log1p = FALSE,
  do_down_sample = FALSE,
  save_rand_proj = FALSE,
  weighted_rand_proj = FALSE,
  NUM_THREADS = 0L,
  CELL_NORM = 10000,
  BLOCK_SIZE = 1000L,
  EDGE_PER_SAMPLE = 100L,
  a0 = 1,
  b0 = 1,
  MAX_ROW_WORD = 2L,
  ROW_WORD_SEP = "_",
  MAX_COL_WORD = 100L,
  COL_WORD_SEP = "@",
  verbose = FALSE
)

Arguments

mtx_file

matrix-market-formatted data file (bgzip)

row_file

row names (gene/feature names)

col_file

column names (cell/column names)

idx_file

matrix-market column index file

num_factors

a desired number of random factors

W_nm_list

list(src.index, tgt.index, weights) for columns

mtx_file_rhs

right-hand-side matrix-market-formatted data file (bgzip)

row_file_rhs

right-hand-side row names (gene/feature names)

col_file_rhs

right-hand-side column names (cell/column names)

idx_file_rhs

right-hand-side matrix-market column index file

A_dd_list

list(src.index, tgt.index, weights) for features

rseed

random seed

do_product

use the product yi * yj as the interaction of columns i and j (default: FALSE)

do_log1p

log(x + 1) transformation (default: FALSE)

do_down_sample

down-sampling (default: FALSE)

save_rand_proj

save random projection (default: FALSE)

weighted_rand_proj

use weighted random projection (default: FALSE)

NUM_THREADS

number of threads for data reading

CELL_NORM

normalization constant per data point (default: 10000)

BLOCK_SIZE

disk I/O block size (number of columns)

EDGE_PER_SAMPLE

number of edges (cell pairs) kept per sample when down-sampling (default: 100)

a0

hyperparameter a0 of gamma(a0, b0) (default: 1)

b0

hyperparameter b0 of gamma(a0, b0) (default: 1)

MAX_ROW_WORD

maximum words per line in row_file

ROW_WORD_SEP

word separation character to replace white space

MAX_COL_WORD

maximum words per line in col_file

COL_WORD_SEP

word separation character to replace white space

verbose

verbosity
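
Example (sketch)

The W_nm_list argument is described above only as list(src.index, tgt.index, weights). The following is a minimal sketch of how such a list might be assembled from a weighted edge table and passed to the function. Several things here are assumptions rather than documented behavior: that the list elements are named, that indices are 1-based, that the function is exported by a package installed as asapR, and all file names, which are hypothetical placeholders.

```r
# Hypothetical weighted edges between columns (cells).
# Assumption: 1-based column indices, one weight per edge.
edges <- data.frame(
  src    = c(1L, 1L, 2L, 3L),
  tgt    = c(2L, 3L, 3L, 4L),
  weight = c(0.9, 0.4, 0.7, 0.5)
)

# Assemble list(src.index, tgt.index, weights); naming the
# elements is an assumption, not a documented requirement.
W_nm_list <- list(
  src.index = edges$src,
  tgt.index = edges$tgt,
  weights   = edges$weight
)

# Hypothetical input files; replace with real paths.
files <- c(
  mtx = "data.mtx.gz",
  row = "data.rows.gz",
  col = "data.cols.gz",
  idx = "data.mtx.gz.index"
)

# Only call the function when the package and files are available.
if (requireNamespace("asapR", quietly = TRUE) && all(file.exists(files))) {
  out <- asapR::asap_random_bulk_interacting_columns(
    mtx_file    = files[["mtx"]],
    row_file    = files[["row"]],
    col_file    = files[["col"]],
    idx_file    = files[["idx"]],
    num_factors = 10,
    W_nm_list   = W_nm_list
  )
  str(out$PB)
}
```

The guard keeps the snippet harmless on a machine without the package or the data; only the list construction runs unconditionally.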

Value

a list with the following elements:

  • PB: pseudobulk (average) data (feature x sample)

  • sum: pseudobulk (sum) data (feature x sample)

  • size: size per sample (sample x 1)

  • positions: pseudobulk sample positions (cell pair x 1)

  • rand.dict: random dictionary (proj factor x feature)

  • rand.proj: random projection results (sample x proj factor)

  • colnames: column (cell) names

  • rownames: feature (gene) names
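
Illustration (sketch)

To make the relationship between the returned elements more concrete, here is a small base-R sketch of pseudo-bulk construction by random projection over cell pairs. It is not the package's implementation. The sign-based binning of projections into samples, the centering step, and the toy data are all assumptions made for illustration; only the product interaction and the log(x + 1) transform correspond to options named in the arguments (do_product, do_log1p).

```r
set.seed(42)

n_feat <- 20   # features (genes)
n_cell <- 30   # columns (cells)
n_edge <- 50   # cell pairs (edges)
K      <- 3    # number of random factors

# Toy count matrix (feature x cell) and hypothetical cell pairs.
Y   <- matrix(rpois(n_feat * n_cell, lambda = 2), n_feat, n_cell)
src <- sample.int(n_cell, n_edge, replace = TRUE)
tgt <- sample.int(n_cell, n_edge, replace = TRUE)

# Interaction per pair as the product yi * yj (feature x pair).
Y_pair <- Y[, src, drop = FALSE] * Y[, tgt, drop = FALSE]

# Random dictionary (proj factor x feature).
rand_dict <- matrix(rnorm(K * n_feat), K, n_feat)

# Random projection of log1p-transformed pairs (pair x proj factor),
# centered per factor so that signs split the pairs.
rand_proj <- t(rand_dict %*% log1p(Y_pair))
rand_proj <- sweep(rand_proj, 2, colMeans(rand_proj))

# Assumption: the sign pattern across K factors is read as a binary
# code, giving each pair one of 2^K pseudo-bulk sample positions.
bits      <- (rand_proj > 0) * 1L
positions <- as.integer(bits %*% 2^(0:(K - 1))) + 1L
n_sample  <- 2^K

# Sum and size per pseudo-bulk sample.
size    <- tabulate(positions, nbins = n_sample)
sum_mat <- matrix(0, n_feat, n_sample)
for (s in seq_len(n_sample)) {
  idx <- which(positions == s)
  if (length(idx) > 0) {
    sum_mat[, s] <- rowSums(Y_pair[, idx, drop = FALSE])
  }
}

# Average per sample; empty samples stay at zero.
PB <- sweep(sum_mat, 2, pmax(size, 1), "/")
```

In this sketch, sum_mat, PB, size, positions, rand_dict and rand_proj play the roles of the sum, PB, size, positions, rand.dict and rand.proj elements, with one row of rand_proj per cell pair.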