Given a directory containing aligned fasta files, clean the alignments by removing columns below the specified occupancy cutoff.

phyutility_wrapper(path_to_ys = pkgconfig::get_config("baitfindR::path_to_ys"),
  fasta_folder, min_col_occup, seq_type = "dna", overwrite = FALSE,
  get_hash = TRUE, echo = pkgconfig::get_config("baitfindR::echo",
  fallback = FALSE), ...)

Arguments

path_to_ys

Character vector of length one; the path to the folder containing the Yang and Smith (Y&S) Python scripts, e.g., "/Users/me/apps/phylogenomic_dataset_construction/"

fasta_folder

Character vector of length one; the path to the folder containing the alignments (fasta files) to be cleaned. Alignment files must end in .aln.

min_col_occup

Numeric; the minimum column occupancy, given as a decimal (e.g., 0.3). Characters (columns of the alignment) with less than this occupancy will be removed from each alignment in the folder.
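
To illustrate what an occupancy cutoff does, here is a minimal base R sketch on a toy alignment. It is not the code the wrapped script runs. It assumes that only "-" counts as missing data and that a column exactly at the cutoff is kept; the underlying tool may treat other symbols as missing. The sequences and object names are hypothetical.

```r
# Toy alignment (hypothetical data): one string per aligned sequence, "-" is a gap.
aln <- c(
  seq1 = "ATG-C",
  seq2 = "AT--C",
  seq3 = "A---C",
  seq4 = "ATGGC"
)

# Character matrix: rows are sequences, columns are alignment positions.
aln_mat <- do.call(rbind, strsplit(aln, ""))

# Occupancy = fraction of sequences with a non-gap character in each column.
# Assumption: only "-" is treated as missing.
occupancy <- colMeans(aln_mat != "-")

# Drop columns with occupancy below the cutoff, then rebuild the sequences.
min_col_occup <- 0.5
kept <- aln_mat[, occupancy >= min_col_occup, drop = FALSE]
cleaned <- apply(kept, 1, paste, collapse = "")
```

With these toy sequences, the fourth column is present in only one of four sequences (occupancy 0.25), so it is the only column dropped at a cutoff of 0.5.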

seq_type

Character vector of length one indicating the type of sequences. Should be either "dna" for DNA or "aa" for proteins.

overwrite

Logical; should previous output of this command be erased so new output can be written? Once erased it cannot be restored, so use with caution!

get_hash

Logical; should the 32-character MD5 hash be computed for all result files concatenated together? Used by drake_plan for tracking during workflows. If TRUE, this function will return the hash.
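
As a rough picture of hashing several result files as one unit, the base R sketch below concatenates toy .aln-cln files and computes a single MD5 digest with tools::md5sum(). How this function computes its hash internally is not shown on this page, so the approach, the temporary files, and the object names here are all assumptions for illustration.

```r
# Create a temporary folder with two toy result files (hypothetical content).
out_dir <- tempfile("aln_")
dir.create(out_dir)
writeLines(c(">seq1", "ATGC"), file.path(out_dir, "a.aln-cln"))
writeLines(c(">seq1", "AT-C"), file.path(out_dir, "b.aln-cln"))

# Collect result files in a fixed (sorted) order so the hash is reproducible.
result_files <- sort(list.files(out_dir, pattern = "\\.aln-cln$", full.names = TRUE))

# Concatenate all result files into one temporary file.
combined <- tempfile()
file.create(combined)
for (f in result_files) file.append(combined, f)

# MD5 digest of the concatenated content, as a hexadecimal string.
hash <- unname(tools::md5sum(combined))
```

The digest changes whenever the content of any result file changes, which is what makes a single returned value usable for tracking output in a workflow.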

echo

Logical; should the standard output and error be printed to the screen?

...

Other arguments. Not used by this function, but meant to be used by drake_plan for tracking during workflows.

Value

Cleaned alignments will be written to fasta_folder with the file ending .aln-cln. If get_hash is TRUE, the 32-character MD5 hash computed for all .aln-cln files concatenated together will be returned.

Details

Wrapper for the phyutility_wrapper.py script of Yang and Smith (2014).

References

Yang, Y. and S.A. Smith. 2014. Orthology inference in non-model organisms using transcriptomes and low-coverage genomes: improving accuracy and matrix occupancy for phylogenomics. Molecular Biology and Evolution 31:3081-3092. https://bitbucket.org/yangya/phylogenomic_dataset_construction/overview

Examples

# NOT RUN {
phyutility_wrapper(fasta_folder = "some/folder/with/alignments/", min_col_occup = 0.3, seq_type = "dna")
# }