When `out_type` is "write_all", the tsv files written to `out_dir` overwrite any existing files with the same name.

find_bed_regions(gff3_file, source_select = NULL, gene_label = "gene",
  exon_label = "exon", verbose = FALSE, prefix = NULL,
  out_dir = NULL, out_type = c("genes", "introns", "exons",
  "write_all"), ...)

Arguments

gff3_file

Path to input file in `gff3` format.

source_select

Character vector; only use regions from these sources. Must match values in `source` column of gff3 file. Optional.

gene_label

String; value used to indicate genes in gff3 file. Must match at least one value in `type` column of gff3 file. Default "gene".

exon_label

String; value used to indicate exons in gff3 file. Must match at least one value in `type` column of gff3 file. Default "exon".

verbose

Logical; should `bedr` functions output all messages?

prefix

String; prefix to attach to tsv files if `out_type` is "write_all".

out_dir

Directory in which to write tsv files if `out_type` is "write_all".

out_type

Type of output to return:
"genes": dataframe in "bed" format of genes.
"introns": dataframe in "bed" format of introns.
"exons": dataframe in "bed" format of exons.
"write_all": write tab-separated files for each of `genes`, `introns`, and `exons` to `out_dir`. The hash digest of the combined genes, introns, and exons will be returned.

...

Other arguments. Not used by this function, but meant to be used by drake_plan for tracking during workflows.
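
Because `source_select`, `gene_label`, and `exon_label` must match values in the `source` and `type` columns of the gff3 file, it can help to inspect those columns first. The following is a minimal base R sketch, not part of `baitfindR`. It assumes a standard nine-column, tab-separated gff3 file; the tiny file and its `ID`/`Parent` attributes are hypothetical and written to a temporary path purely for illustration.

```r
# Hypothetical miniature gff3 file (illustration only, not package data).
gff3_lines <- c(
  "##gff-version 3",
  "1\taraport11\tgene\t3631\t5899\t.\t+\t.\tID=gene1",
  "1\taraport11\texon\t3631\t3913\t.\t+\t.\tParent=transcript1",
  "1\taraport11\texon\t3996\t4276\t.\t+\t.\tParent=transcript1"
)
tmp_gff3 <- tempfile(fileext = ".gff3")
writeLines(gff3_lines, tmp_gff3)

# Read the nine standard gff3 columns, skipping "#" comment lines.
gff3 <- read.delim(
  tmp_gff3,
  header = FALSE,
  comment.char = "#",
  quote = "",
  stringsAsFactors = FALSE,
  col.names = c("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")
)

# Candidate values for source_select
unique(gff3$source)

# Candidate values for gene_label and exon_label, with counts
table(gff3$type)
```

Replacing `tmp_gff3` with the path to a real gff3 file should show which values are available before calling `find_bed_regions()`.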

Value

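Dataframe or character. A dataframe in "bed" format if `out_type` is "genes", "introns", or "exons"; the hash digest of the combined genes, introns, and exons if `out_type` is "write_all".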

Examples

# Find genes
arabidopsis_gff_file <- system.file(
  "extdata", "Arabidopsis_thaliana_TAIR10_40_small.gff3",
  package = "baitfindR", mustWork = TRUE
)
genes <- find_bed_regions(
  gff3_file = arabidopsis_gff_file,
  source_select = "araport11",
  out_type = "genes"
)
head(genes)
#>               chr start   end
#> 1:3631-5899     1  3631  5899
#> 1:6788-9130     1  6788  9130
#> 1:11649-13714   1 11649 13714
#> 1:23121-33171   1 23121 33171
#> 1:33365-37871   1 33365 37871
#> 1:38444-41017   1 38444 41017
# Find introns
introns <- find_bed_regions(
  gff3_file = arabidopsis_gff_file,
  source_select = "araport11",
  out_type = "introns"
)
head(introns)
#>             chr start  end
#> 1:3913-3996   1  3913 3996
#> 1:4276-4486   1  4276 4486
#> 1:4605-4706   1  4605 4706
#> 1:5095-5174   1  5095 5174
#> 1:5326-5439   1  5326 5439
#> 1:7069-7157   1  7069 7157
# Find exons
exons <- find_bed_regions(
  gff3_file = arabidopsis_gff_file,
  source_select = "araport11",
  out_type = "exons"
)
head(exons)
#>             chr start  end
#> 1:3631-3913   1  3631 3913
#> 1:3996-4276   1  3996 4276
#> 1:4486-4605   1  4486 4605
#> 1:4706-5095   1  4706 5095
#> 1:5174-5326   1  5174 5326
#> 1:5439-5899   1  5439 5899
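
In the output above, the introns of the first gene are the gaps between its consecutive exons. The base R sketch below illustrates that relationship using the six exon rows shown. It is only an illustration and is not necessarily how `find_bed_regions()` computes introns internally. The names `gene_exons` and `introns_sketch` are hypothetical, and the sketch assumes that all exons belong to a single gene and do not overlap.

```r
# Exon coordinates copied from the head(exons) output above.
gene_exons <- data.frame(
  chr = "1",
  start = c(3631, 3996, 4486, 4706, 5174, 5439),
  end = c(3913, 4276, 4605, 5095, 5326, 5899),
  stringsAsFactors = FALSE
)

# Sort by start so that consecutive rows are neighbouring exons.
gene_exons <- gene_exons[order(gene_exons$start), ]
n <- nrow(gene_exons)

# Each intron runs from the end of one exon to the start of the next.
introns_sketch <- data.frame(
  chr = gene_exons$chr[-n],
  start = gene_exons$end[-n],
  end = gene_exons$start[-1],
  stringsAsFactors = FALSE
)
introns_sketch
```

The five resulting intervals match the first five rows of the introns output shown earlier (the sixth row there belongs to the next gene).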
# NOT RUN {
# Write genes, introns, and exons out as tsv files
temp_dir <- tempdir()
find_bed_regions(
  gff3_file = arabidopsis_gff_file,
  source_select = "araport11",
  out_type = "write_all",
  out_dir = temp_dir,
  prefix = "arabidopsis"
)
# }
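
Since "write_all" overwrites existing files with the same name in `out_dir`, one cautious approach is to check the directory before calling the function. This page does not state the exact file names that are written, so the base R sketch below simply lists every file in `out_dir` whose name starts with `prefix`. The directory name and the leftover file are hypothetical and created only for illustration.

```r
# Hypothetical output directory and prefix for illustration.
out_dir <- file.path(tempdir(), "bed_regions_demo")
dir.create(out_dir, showWarnings = FALSE)
prefix <- "arabidopsis"

# Simulate a file left over from an earlier run (hypothetical name).
file.create(file.path(out_dir, paste0(prefix, "_leftover.tsv")))

# List files whose names start with the prefix; these might be overwritten.
existing <- list.files(out_dir)
at_risk <- existing[startsWith(existing, prefix)]

if (length(at_risk) > 0) {
  message("Files that may be overwritten: ", paste(at_risk, collapse = ", "))
}
```

If `at_risk` is not empty, choosing a different `prefix` or `out_dir` should avoid clobbering earlier results.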