If `out_type` is "write_all", the tsv files written to `out_dir` will overwrite any existing files with the same name.
find_bed_regions(gff3_file, source_select = NULL, gene_label = "gene", exon_label = "exon", verbose = FALSE, prefix = NULL, out_dir = NULL, out_type = c("genes", "introns", "exons", "write_all"), ...)
| Argument | Description |
|---|---|
| gff3_file | Path to input file in `gff3` format. |
| source_select | Character vector; only use regions from these sources. Must match values in `source` column of gff3 file. Optional. |
| gene_label | String; value used to indicate genes in gff3 file. Must match at least one value in `type` column of gff3 file. Default "gene". |
| exon_label | String; value used to indicate exons in gff3 file. Must match at least one value in `type` column of gff3 file. Default "exon". |
| verbose | Logical; should `bedr` functions output all messages? |
| prefix | String; prefix to attach to tsv files if `out_type` is "write_all". |
| out_dir | Directory to write tsv files if `out_type` is "write_all". |
| out_type | Type of output to return. "genes": data frame of genes in BED format. "introns": data frame of introns in BED format. "exons": data frame of exons in BED format. "write_all": write a tab-separated file for each of `genes`, `introns`, and `exons` to `out_dir`, and return the hash digest of the combined genes, introns, and exons. |
| ... | Other arguments. Not used by this function, but meant to be used by |
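A data frame in BED format if `out_type` is "genes", "introns", or "exons"; a character string (the hash digest) if `out_type` is "write_all".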
# Find genes
arabidopsis_gff_file <- system.file(
  "extdata", "Arabidopsis_thaliana_TAIR10_40_small.gff3",
  package = "baitfindR", mustWork = TRUE
)
genes <- find_bed_regions(
  gff3_file = arabidopsis_gff_file,
  source_select = "araport11",
  out_type = "genes"
)
head(genes)
#>               chr start   end
#> 1:3631-5899     1  3631  5899
#> 1:6788-9130     1  6788  9130
#> 1:11649-13714   1 11649 13714
#> 1:23121-33171   1 23121 33171
#> 1:33365-37871   1 33365 37871
#> 1:38444-41017   1 38444 41017

# Find introns
introns <- find_bed_regions(
  gff3_file = arabidopsis_gff_file,
  source_select = "araport11",
  out_type = "introns"
)
head(introns)
#>             chr start  end
#> 1:3913-3996   1  3913 3996
#> 1:4276-4486   1  4276 4486
#> 1:4605-4706   1  4605 4706
#> 1:5095-5174   1  5095 5174
#> 1:5326-5439   1  5326 5439
#> 1:7069-7157   1  7069 7157

# Find exons
exons <- find_bed_regions(
  gff3_file = arabidopsis_gff_file,
  source_select = "araport11",
  out_type = "exons"
)
head(exons)
#>             chr start  end
#> 1:3631-3913   1  3631 3913
#> 1:3996-4276   1  3996 4276
#> 1:4486-4605   1  4486 4605
#> 1:4706-5095   1  4706 5095
#> 1:5174-5326   1  5174 5326
#> 1:5439-5899   1  5439 5899

# NOT RUN {
# Write genes, introns, and exons out as tsv files
temp_dir <- tempdir()
find_bed_regions(
  gff3_file = arabidopsis_gff_file,
  source_select = "araport11",
  out_type = "write_all",
  out_dir = temp_dir,
  prefix = "arabidopsis"
)
# }
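
Because "write_all" overwrites files of the same name in `out_dir`, it can be worth checking the directory before calling `find_bed_regions()`. The following is a minimal sketch in base R only. The helper `existing_outputs()` and the file name `arabidopsis_old.tsv` are hypothetical and not part of baitfindR, and the sketch assumes that the names of the written tsv files begin with `prefix`, which this page does not confirm.

```r
# Hypothetical helper (not part of baitfindR): list files in `out_dir`
# whose names begin with `prefix`, i.e. candidates for being overwritten.
# Assumption: the tsv files written by "write_all" start with `prefix`.
existing_outputs <- function(out_dir, prefix) {
  files <- list.files(out_dir, full.names = TRUE)
  files[startsWith(basename(files), prefix)]
}

# Demonstration in a scratch directory containing one made-up file
demo_dir <- file.path(tempdir(), "bed_regions_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "arabidopsis_old.tsv"))

# Report any files that share the prefix before writing new output
at_risk <- existing_outputs(demo_dir, "arabidopsis")
if (length(at_risk) > 0) {
  message(
    "Files that may be overwritten: ",
    paste(basename(at_risk), collapse = ", ")
  )
}
```

If the check reports files that should be kept, choosing a different `prefix` or a fresh `out_dir` avoids the clash.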