We generate Y ~ Poisson(mu * rho), where mu = exp(log.mu / smudge) and rho ~ Gamma(a, b) with a = rho.a and b = rho.b.
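
As a rough illustration of the sampling model above, the following R sketch draws counts for a few genes and cells. The names log.mu, n.genes.toy and n.cells.toy, the use of one rho per cell, and the reading of rho.b as a rate parameter are all assumptions made for illustration; they are not taken from the function's internals.

```r
set.seed(13)

n.genes.toy <- 5    # hypothetical number of genes
n.cells.toy <- 10   # hypothetical number of cells
smudge <- 1
rho.a <- 2
rho.b <- 2

# Hypothetical log-scale mean expression (genes x cells)
log.mu <- matrix(rnorm(n.genes.toy * n.cells.toy), n.genes.toy, n.cells.toy)
mu <- exp(log.mu / smudge)

# One scaling factor per cell; rho.b is treated as a rate here (assumption)
rho <- rgamma(n.cells.toy, shape = rho.a, rate = rho.b)

# Y[g, j] ~ Poisson(mu[g, j] * rho[j])
lambda <- sweep(mu, 2, rho, `*`)
Y <- matrix(rpois(length(lambda), lambda), n.genes.toy, n.cells.toy)
```

A larger smudge shrinks log.mu toward zero before exponentiating, so the Poisson means vary less across genes and cells.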

make.sc.eqtl.mosaic(
  file.header,
  X,
  h2,
  n.causal.snps = 1,
  n.causal.genes = 5,
  pve.y.by.u0 = 0.3,
  n.u0 = 3,
  pve.u1.by.x = 0.8,
  pve.y.by.u1 = 0.3,
  n.u1 = 3,
  pve.interaction = 0.5,
  n.interaction = 0,
  n.genes = 50,
  n.covar.genes = n.genes,
  num.mixtures = 1,
  num.mosaic = 1,
  smudge = 1,
  rho.a = 2,
  rho.b = 2,
  ncell.ind = 10,
  rseed = 13
)

Arguments

X

genotype matrix (individual x SNPs)

h2

heritability (proportion of variance of Y explained by genetic X)

n.causal.snps

number of X variables (SNPs) directly affecting Y

n.causal.genes

number of Y variables (genes) directly regulated by X

pve.y.by.u0

proportion of variance of Y explained by U0

n.u0

number of U0 covariates affecting Y

pve.u1.by.x

proportion of variance of U1 explained by X

pve.y.by.u1

proportion of variance of Y explained by U1

n.u1

number of U1 covariates affecting Y

pve.interaction

proportion of variance of Y explained by interaction

n.interaction

number of genes interacting with the causal genes

n.genes

total number of genes (Y variables)

num.mixtures

number of cell mixtures

smudge

a scaling factor for the GLM, used as mu = exp(log.mu / smudge) (default: 1)

rho.a

parameter a of rho ~ Gamma(a, b)

rho.b

parameter b of rho ~ Gamma(a, b)

ncell.ind

number of cells per individual

rseed

random seed

num.batches

number of single-cell data batches
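
The h2 and pve.* arguments are all proportions of variance explained. As a minimal sketch of what such a target means, the R code below mixes a standardized genetic component with standardized noise so that the genetic share of the variance is close to h2. The names n.ind, g and e, the toy genotype matrix, and the mixing scheme itself are assumptions for illustration and may differ from how the function combines its components.

```r
set.seed(13)

n.ind <- 500   # hypothetical number of individuals
n.snps <- 20   # hypothetical number of SNPs
h2 <- 0.3      # target proportion of variance explained by genetics

# Hypothetical genotype matrix (individual x SNPs), minor allele counts 0/1/2
X.toy <- matrix(rbinom(n.ind * n.snps, size = 2, prob = 0.3), n.ind, n.snps)

# Standardized genetic component from one causal SNP, and standardized noise
g <- as.numeric(scale(X.toy[, 1]))
e <- as.numeric(scale(rnorm(n.ind)))

# Weight the two parts so the genetic share of variance is about h2
y <- sqrt(h2) * g + sqrt(1 - h2) * e

# Empirical proportion of variance explained; differs slightly from h2
# because g and e are not exactly uncorrelated in a finite sample
pve <- var(sqrt(h2) * g) / var(y)
```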

Value

a list of simulation results (see Details)

Details

The simulation result is a list with two components, each of which is itself a list:

data:

  • data$mtx: a matrix market data file

  • data$row: a file with row names

  • data$col: a file with column names

  • data$idx: an indexing file for the columns

  • data$indv: a mapping file between column and individual names

indv:

  • indv$y: observed (noisy) individual x gene matrix

  • indv$x: observed individual x variant genotype matrix

  • indv$causal.snps: causal variants (X variables)

  • indv$causal.genes: causal genes (Y variables)

  • indv$causal.label: true labels
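
Since data$mtx is described as a matrix market data file, one plausible way to load it is Matrix::readMM. The sketch below is self-contained: it writes a small stand-in matrix to a temporary file and reads it back. The variable names and the temporary path are hypothetical, and it is an assumption that the real data$mtx can be passed to readMM in the same way.

```r
library(Matrix)

# Stand-in for the simulated output: a small sparse count matrix
# written to a temporary Matrix Market file (hypothetical path)
counts <- Matrix(c(0, 2, 0, 1, 0, 3), nrow = 2, sparse = TRUE)
mtx.file <- tempfile(fileext = ".mtx")
writeMM(counts, mtx.file)

# With a real result, mtx.file would be replaced by the path stored
# in data$mtx (assumption)
Y.read <- readMM(mtx.file)
dim(Y.read)
```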