Wrapper for bedtools maskfasta.

mask_regions_in_fasta(bed_file, fasta_file, out_fasta_file, ...)

Arguments

bed_file

Path to bed file with locations of regions to mask.

fasta_file

Path to unmasked fasta file.

out_fasta_file

Path to write masked fasta file.

...

Other arguments. Not used by this function, but meant to be used by drake_plan for tracking during workflows.

Value

List; output of processx::run(). As a side effect, the masked fasta file is written to the path specified by `out_fasta_file`.
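
As a rough sketch of where this return value comes from, a wrapper of this kind presumably assembles a `bedtools maskfasta` command line and hands it to processx::run(). The helper names below (`maskfasta_args`, `run_maskfasta`) are hypothetical and are not part of baitfindR; `-fi`, `-bed`, and `-fo` are the standard bedtools maskfasta options for the input fasta, the bed file, and the output fasta.

```r
# Hypothetical helper: assemble the arguments for `bedtools maskfasta`.
maskfasta_args <- function(bed_file, fasta_file, out_fasta_file) {
  c("maskfasta",
    "-fi", fasta_file,      # input (unmasked) fasta
    "-bed", bed_file,       # regions to mask
    "-fo", out_fasta_file)  # output (masked) fasta
}

# Hypothetical wrapper: needs the processx package and bedtools on the PATH.
# It is only defined here, not called.
run_maskfasta <- function(bed_file, fasta_file, out_fasta_file) {
  processx::run("bedtools", maskfasta_args(bed_file, fasta_file, out_fasta_file))
}

cmd_args <- maskfasta_args("regions.bed", "genome.fasta", "genome_masked.fasta")
print(cmd_args)
```

The list returned by processx::run() has the elements `status`, `stdout`, `stderr`, and `timeout`; a `status` of 0 indicates that the command exited successfully.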

Details

All regions of the `fasta_file` specified by the `bed_file` will be replaced ("hard-masked") with 'N's.
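
To make the effect of hard-masking concrete, here is a minimal base R illustration on a single short sequence. It does not call bedtools, and `hard_mask` is a hypothetical helper written only for this example; it assumes BED-style coordinates (0-based start, exclusive end).

```r
# Illustration only: hard-mask one region of a sequence held as a string.
# `start` is 0-based and `end` is exclusive, following the BED convention.
hard_mask <- function(dna, start, end) {
  stopifnot(start >= 0, end > start, end <= nchar(dna))
  # Convert to R's 1-based, inclusive indexing and overwrite with N's.
  substr(dna, start + 1, end) <- strrep("N", end - start)
  dna
}

masked <- hard_mask("ACGTACGTAC", start = 2, end = 6)
print(masked)
# "ACNNNNGTAC"
```

The masked sequence keeps its original length, so coordinates downstream of a masked region are unchanged.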

The bed file is a tab-separated file with columns for chromosome (e.g., chr1), start position (e.g., 1), and end position (e.g., 10), in that order. No column headers are used. Following the BED convention, start positions are 0-based and end positions are exclusive.
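
A bed file in this layout can be written from a data frame with base R, as in the following sketch. The region coordinates and the `bed_path` variable are made up for illustration.

```r
# Made-up regions; coordinates are stored as integers so that large
# positions are not written in scientific notation.
regions <- data.frame(
  chrom = c("chr1", "chr1"),
  start = c(1L, 20L),
  end   = c(10L, 35L)
)

bed_path <- tempfile(fileext = ".bed")

# Tab-separated, no quotes, no row names, no column headers.
write.table(regions, bed_path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

bed_lines <- readLines(bed_path)
print(bed_lines)
```

The resulting path can then be passed as `bed_file`.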

Examples

# NOT RUN {
# First write genes, introns, and exons out as tsv files

temp_dir <- tempdir()
find_bed_regions(
  gff3_file = system.file("extdata", "Arabidopsis_thaliana_TAIR10_40_small.gff3", package = "baitfindR", mustWork = TRUE),
  source_select = "araport11",
  out_type = "write_all",
  out_dir = temp_dir,
  prefix = "arabidopsis"
)

# Now mask the genome, using the bed file and genome fasta file.
# Adjust the bed file name to match the file written by find_bed_regions().
mask_regions_in_fasta(
  bed_file = file.path(temp_dir, "test_introns"),
  fasta_file = "data_raw/Arabidopsis_thaliana.TAIR10.dna.toplevel.renamed.fasta",
  out_fasta_file = file.path(temp_dir, "test_masked")
)
# }