Utilities for working with biological sequences in R.
seqUtils provides simple, efficient tools for common sequence analysis
tasks:
- Read/write FASTA files: fast I/O for single-line and multi-line FASTA formats
- Translate sequences: DNA/RNA to amino acids with deletion handling
- Align sequences: wrapper for MAFFT alignment to reference sequences
- Find substitutions: compare sequences and identify differences
- Calculate consensus: generate consensus sequences from alignments
- Clean sequences: remove or replace non-standard characters
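
To make the idea of a consensus sequence concrete, here is a minimal base R sketch of a majority-rule consensus over aligned sequences of equal length. The helper `consensus_sketch` is hypothetical and written only for illustration; it is not seqUtils' function and may differ from how the package handles gaps, ties, or ambiguous characters.

```r
# Hypothetical helper for illustration only; NOT part of seqUtils.
# Majority-rule consensus for sequences that are already aligned.
consensus_sketch <- function(aligned) {
  # All sequences must have the same length to be compared column by column
  stopifnot(length(unique(nchar(aligned))) == 1)
  # Split into single characters: rows are sequences, columns are positions
  chars <- do.call(rbind, strsplit(aligned, ""))
  # Take the most frequent character in each column; on a tie, the
  # character that sorts first in table() is returned
  cons <- apply(chars, 2, function(col) names(which.max(table(col))))
  paste(cons, collapse = "")
}

aligned <- c("ACGT-A", "ACGTTA", "ACCTTA")
consensus_sketch(aligned)
```

Here gap characters are counted like any other symbol, which is a simplifying assumption of this sketch.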

You can install the development version of seqUtils from GitHub with:

# install.packages("devtools")
devtools::install_github("SamT123/seqUtils")

# Load the package
library(seqUtils)
# Read FASTA file
sequences <- fast_fasta("sequences.fasta")
# Translate to amino acids
aa_sequences <- translate(sequences)
# Find substitutions compared to a reference
substitutions <- get_substitutions(
  reference_seq,
  query_seqs,
  exclude = c("X", "N")
)

Dependencies:
- Biostrings package
- MAFFT (for alignment functions)
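
Before using the package, it can help to check that both requirements are available. The sketch below uses only base R. It assumes the MAFFT executable is named `mafft` and is expected on the `PATH`; how seqUtils itself locates MAFFT is not stated here, so treat that part as an assumption.

```r
# Biostrings is distributed through Bioconductor rather than CRAN
if (!requireNamespace("Biostrings", quietly = TRUE)) {
  message("Biostrings not found; install with BiocManager::install(\"Biostrings\")")
}

# Sys.which() returns "" when the executable is not on the PATH.
# Assumption: the MAFFT binary is called "mafft".
mafft_path <- Sys.which("mafft")
if (!nzchar(mafft_path)) {
  message("MAFFT executable not found on PATH; alignment functions may not work")
} else {
  message("Found MAFFT at ", mafft_path)
}
```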