Given a directory containing unaligned fasta files, align all fasta files in
the directory. If there are > 1000 sequences in the directory, use the
mafft --auto algorithm. If there are fewer, use the --genafpair
algorithm.
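The selection rule described above can be sketched in R as follows. This is only an illustration of the rule, not the code of mafft_wrapper.py: the helper names `count_fasta_seqs()` and `choose_mafft_algorithm()` are hypothetical, the sketch assumes that every sequence starts with a `>` header line, and it assumes that a count of exactly 1000 falls on the --genafpair side, which the description does not state explicitly.

```r
# Illustrative sketch only; both helper names are hypothetical.

# Count sequences in all fasta files of a folder by counting header lines
# (lines that start with ">"). Only files whose names end with
# `infile_ending` are included.
count_fasta_seqs <- function(fasta_folder, infile_ending = "fa") {
  files <- list.files(fasta_folder, full.names = TRUE)
  files <- files[endsWith(files, infile_ending)]
  n_per_file <- vapply(
    files,
    function(f) sum(startsWith(readLines(f, warn = FALSE), ">")),
    integer(1)
  )
  sum(n_per_file)
}

# Pick the mafft flag from the total sequence count.
# Assumption: exactly `threshold` sequences still selects --genafpair.
choose_mafft_algorithm <- function(n_seqs, threshold = 1000) {
  if (n_seqs > threshold) "--auto" else "--genafpair"
}

# Small demonstration with temporary files.
demo_dir <- tempfile("fasta_demo")
dir.create(demo_dir)
writeLines(c(">seq1", "ACGT", ">seq2", "ACGA"), file.path(demo_dir, "a.fa"))
writeLines(c(">seq3", "TTGA"), file.path(demo_dir, "b.fa"))
writeLines(">ignored", file.path(demo_dir, "notes.txt"))  # wrong ending, skipped

n_seqs <- count_fasta_seqs(demo_dir)
algorithm <- choose_mafft_algorithm(n_seqs)
```

Matching with `endsWith()` rather than a regular expression keeps `infile_ending` literal, so characters such as `.` in the ending are not interpreted as patterns.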
mafft_wrapper(path_to_ys = pkgconfig::get_config("baitfindR::path_to_ys"), fasta_folder, infile_ending = "fa", number_cores, seq_type = "dna", overwrite = FALSE, get_hash = TRUE, echo = pkgconfig::get_config("baitfindR::echo", fallback = FALSE), ...)
| Argument | Description |
|---|---|
| path_to_ys | Character vector of length one; the path to the folder containing Y&S python scripts, e.g., "/Users/me/apps/phylogenomic_dataset_construction/" |
| fasta_folder | Character vector of length one; the path to the folder containing the fasta files to be aligned. |
| infile_ending | Character vector of length one; only files with this ending will be included. |
| number_cores | Numeric; number of threads to use. |
| seq_type | Character vector of length one indicating the type of sequences, e.g., "dna" (the default). |
| overwrite | Logical; should previous output of this command be erased so new output can be written? Once erased it cannot be restored, so use with caution! |
| get_hash | Logical; should the 32-character MD5 hash be computed for all aligned fasta files concatenated together? |
| echo | Logical; should the standard output and error be printed to the screen? |
| ... | Other arguments. Not used by this function itself. |
Aligned fasta files will be written to fasta_folder with the file ending .aln. If get_hash is TRUE, the 32-character MD5 hash computed for all .aln files concatenated together will be returned.
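As a rough illustration of what a hash over concatenated alignments looks like, the sketch below uses only base R. It is not the package's implementation: the helper name `hash_alignments()` is hypothetical, and concatenating the files in sorted file-name order is an assumption, since the order is not stated here.

```r
# Illustrative sketch only; `hash_alignments()` is a hypothetical helper.

# Concatenate all .aln files in a folder (sorted by file name, an assumption)
# into one temporary file and return its MD5 hash as a 32-character
# hexadecimal string.
hash_alignments <- function(fasta_folder) {
  aln_files <- sort(
    list.files(fasta_folder, pattern = "\\.aln$", full.names = TRUE)
  )
  combined <- tempfile()
  file.create(combined)
  on.exit(unlink(combined))
  for (f in aln_files) {
    file.append(combined, f)  # byte-for-byte append
  }
  unname(tools::md5sum(combined))
}

# Small demonstration with temporary alignment files.
aln_dir <- tempfile("aln_demo")
dir.create(aln_dir)
writeLines(c(">seq1", "ACG-T", ">seq2", "ACGAT"), file.path(aln_dir, "a.fa.aln"))
writeLines(c(">seq3", "TTGA"), file.path(aln_dir, "b.fa.aln"))

hash_before <- hash_alignments(aln_dir)

# Changing any alignment changes the hash.
writeLines(c(">seq3", "TTGC"), file.path(aln_dir, "b.fa.aln"))
hash_after <- hash_alignments(aln_dir)
```

Because the hash covers every .aln file at once, comparing two hash values is a cheap way to check whether any alignment in the folder has changed between runs.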
Wrapper for Yang and Smith (2014) mafft_wrapper.py
Yang, Y. and S.A. Smith. 2014. Orthology inference in non-model organisms using transcriptomes and low-coverage genomes: improving accuracy and matrix occupancy for phylogenomics. Molecular Biology and Evolution 31:3081-3092. https://bitbucket.org/yangya/phylogenomic_dataset_construction/overview
## Not run:
mafft_wrapper(fasta_folder = "some/folder/with/fasta/files", number_cores = 2, seq_type = "dna")
## End(Not run)