Topic statistics to estimate factor loading

Usage

asap_pmf_stat_rbind(
  mtx_files,
  row_files,
  col_files,
  idx_files,
  log_beta_vec,
  beta_row_names_vec,
  do_stdize_beta = FALSE,
  do_log1p = FALSE,
  verbose = FALSE,
  NUM_THREADS = 1L,
  BLOCK_SIZE = 100L,
  MAX_ROW_WORD = 2L,
  ROW_WORD_SEP = "_",
  MAX_COL_WORD = 100L,
  COL_WORD_SEP = "@"
)

Arguments

do_stdize_beta

use standardized log_beta (default: FALSE)

do_log1p

do log(1+y) transformation (default: FALSE)

verbose

print verbose output (default: FALSE)

NUM_THREADS

number of threads used for data reading (default: 1)

BLOCK_SIZE

disk I/O block size (number of columns)

MAX_ROW_WORD

maximum words per line in row_files[i]

ROW_WORD_SEP

word separation character to replace white space

MAX_COL_WORD

maximum words per line in col_files[i]

COL_WORD_SEP

word separation character to replace white space

mtx_files

matrix-market-formatted data files (D x N, bgzip)

row_files

row names files (D x 1)

col_files

column names files (N x 1)

idx_files

matrix-market column index files

log_beta_vec

D x K log dictionary/design matrix

beta_row_names_vec

row names of log_beta_vec (D vector)

Value

a list that contains:

  • beta.list a list of dictionary matrices (row x factor)

  • corr.list a list of empirical correlation matrices (column x factor)

  • colsum.list a list of column sum vectors (column x 1)
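
Examples

The following is a minimal sketch of how a call might be assembled, based only on the signature shown under Usage. The file names, the toy matrices, and the helper make_log_beta are hypothetical. It also assumes that log_beta_vec and beta_row_names_vec are lists with one entry per input file, which the argument descriptions above do not confirm. The call itself runs only if the function is loaded and the files exist.

```r
# Hypothetical file names for two data sets; replace with real paths.
tags <- c("data1", "data2")
mtx_files <- paste0(tags, ".mtx.gz")
row_files <- paste0(tags, ".rows.gz")
col_files <- paste0(tags, ".cols.gz")
idx_files <- paste0(mtx_files, ".index")

# Assumption: one D x K log-dictionary matrix and one vector of
# row names per input file (toy values, K = 3 factors).
make_log_beta <- function(D, K) matrix(log(seq_len(D * K) / (D * K)), D, K)
log_beta_vec <- list(make_log_beta(5, 3), make_log_beta(4, 3))
beta_row_names_vec <- list(paste0("a", 1:5), paste0("b", 1:4))

# Run only when the function is available and every input file exists.
ready <- exists("asap_pmf_stat_rbind") &&
  all(file.exists(c(mtx_files, row_files, col_files, idx_files)))

if (ready) {
  out <- asap_pmf_stat_rbind(
    mtx_files, row_files, col_files, idx_files,
    log_beta_vec, beta_row_names_vec,
    do_stdize_beta = TRUE,  # standardize log_beta before use
    do_log1p = TRUE,        # apply log(1 + y) to the data
    NUM_THREADS = 2L
  )
  str(out$beta.list)    # dictionary matrices (row x factor)
  str(out$corr.list)    # empirical correlation matrices (column x factor)
  str(out$colsum.list)  # column sum vectors (column x 1)
}
```

Arguments left out of the call keep the defaults listed under Usage.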