Extract barcodes, UMIs, clipping information, and poly(A)/poly(T) lengths from read pairs, writing the results to a large Parquet file.

Usage

ingest_read_pairs(
  out_prefix,
  reads1,
  reads2,
  samples = NULL,
  max_mismatch = 1,
  clip_quality_char = "I",
  clip_penalty = 4,
  poly_penalty = 1000,
  suffix_penalty = 4,
  limit = Inf,
  subsample = 1,
  seed = 563
)

Details

For subsampling, the seqtk command-line tool must be installed.
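
Because subsampling depends on an external tool, it can help to check for seqtk before requesting it. The following is a minimal sketch, not part of the package: the file names and output prefix are hypothetical, and it assumes that `subsample` is the fraction of read pairs to keep, with the default of 1 meaning no subsampling.

```r
# Locate seqtk on the PATH; Sys.which() returns "" when it is not found.
seqtk_available <- unname(nzchar(Sys.which("seqtk")))

# Hypothetical input files, for illustration only.
reads1 <- "sample_R1.fastq.gz"
reads2 <- "sample_R2.fastq.gz"

# Assumption: subsample is a fraction of read pairs to keep.
# Fall back to 1 (assumed to mean no subsampling) when seqtk is missing.
subsample <- if (seqtk_available) 0.1 else 1

# Only attempt the call when the function is loaded and the inputs exist.
if (exists("ingest_read_pairs") && all(file.exists(reads1, reads2))) {
  ingest_read_pairs(
    "ingest_out",          # hypothetical output prefix
    reads1,
    reads2,
    subsample = subsample,
    seed = 563             # the documented default
  )
}
```

Passing the seed explicitly is a reasonable way to keep a subsampled run repeatable, under the assumption that `seed` controls the subsampling.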