// faq
FAQ & Troubleshooting
Quick answers to common questions and solutions to potential issues.
// common issues
Common issues & quick fixes
Docker won't start
Cannot connect to Docker daemon
Fix: Start Docker Desktop or run:
sudo systemctl start docker
Permission denied
Fix: Add user to docker group:
sudo usermod -aG docker $USER
Out of memory
CUDA out of memory
Fix: Reduce batch size:
--batch-size=512
System memory exhausted
Fix: Process smaller datasets or increase swap space.
// questions
Frequently asked questions
// installation
Installation
Which installation method should I choose?
Use Docker (recommended) in most cases: it is easier to set up and includes everything pre-configured. Choose a local installation only if you're a developer or need custom modifications.
Do I need a GPU to run AlignAIR?
No, but it's highly recommended. AlignAIR can run on CPU but will be significantly slower. For best performance, use an NVIDIA GPU with CUDA 11+ support.
The Docker image is very large. Is this normal?
Yes, the image includes PyTorch, CUDA libraries, and pre-trained models. Expect 3–5GB download size. This is normal for deep learning applications.
// usage
Usage
What input file formats are supported?
AlignAIR supports CSV, TSV, and FASTA formats. For CSV/TSV files, ensure there's a column named sequence containing your nucleotide sequences.
How do I choose the right threshold values?
Start with defaults: V=0.75, D=0.3, J=0.8. For high-quality data, increase thresholds for more stringent calls. For noisy data, decrease slightly. See the thresholding guide for details.
Should I use heavy or light chain models?
Choose based on your data:
- Heavy chain: Use IGH_S5F_576 for IGH sequences
- Light chain: Use IGL_S5F_576 for IGL/IGK sequences
My sequences are longer than 576 nucleotides. What happens?
AlignAIR automatically trims sequences to the maximum input size (default 576 nt) during preprocessing. Trimming preserves the most informative regions for V(D)J assignment.
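To see how many of your sequences exceed the default 576 nt limit before running, a small check like the one below may help. This is only a sketch: the function name and constant are hypothetical, and it counts over-length sequences without reproducing AlignAIR's own trimming.

```python
MAX_INPUT_NT = 576  # default maximum input size mentioned above


def count_over_limit(sequences, max_len=MAX_INPUT_NT):
    """Count sequences that are longer than max_len nucleotides."""
    return sum(1 for seq in sequences if len(seq) > max_len)


# Illustrative input: one sequence at the limit, one above it, one short.
example = ["A" * 576, "A" * 577, "ACGT"]
print(count_over_limit(example))
```

If the count is large relative to your dataset, it may be worth checking that the reads are oriented and cropped as you expect before alignment.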
// performance
Performance
How can I speed up processing?
Performance tips:
- Increase batch size: --batch-size=4096
- Use GPU instead of CPU
- Process sequences in similar length groups
- Ensure sufficient GPU memory
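The tip about processing sequences in similar length groups can be sketched as a simple binning step run before AlignAIR. The function name and the bin width of 64 nt are assumptions chosen for illustration, not AlignAIR settings.

```python
def group_by_length(sequences, bin_width=64):
    """Group sequences into bins of similar length.

    Keys are the lower bound of each length bin (0, 64, 128, ... for the
    default bin_width); values are the sequences falling in that bin.
    """
    groups = {}
    for seq in sequences:
        bin_start = (len(seq) // bin_width) * bin_width
        groups.setdefault(bin_start, []).append(seq)
    # Return bins ordered from shortest to longest sequences.
    return dict(sorted(groups.items()))


groups = group_by_length(["A" * 10, "A" * 70, "A" * 20])
for bin_start, seqs in groups.items():
    print(bin_start, len(seqs))
```

Each group could then be written to its own file and processed separately.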
AlignAIR is running out of memory. What can I do?
Memory optimization:
- Reduce batch size: --batch-size=512
- Split large datasets into smaller files
- Close other GPU-intensive applications
- Use CPU mode for very large datasets
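Splitting a large CSV into smaller files, as suggested above, might look like the following. This is a minimal sketch using only the Python standard library; the function names, the output naming scheme, and the chunk size are all hypothetical.

```python
import csv
import io


def split_rows(handle, rows_per_chunk):
    """Yield (header, rows) pairs with at most rows_per_chunk data rows each."""
    reader = csv.reader(handle)
    header = next(reader, None)
    if header is None:  # empty input: nothing to yield
        return
    chunk = []
    for row in reader:
        chunk.append(row)
        if len(chunk) == rows_per_chunk:
            yield header, chunk
            chunk = []
    if chunk:  # trailing partial chunk
        yield header, chunk


def write_chunks(in_path, out_prefix, rows_per_chunk):
    """Write each chunk to <out_prefix>_NNN.csv and return the paths written."""
    paths = []
    with open(in_path, newline="") as src:
        for i, (header, rows) in enumerate(split_rows(src, rows_per_chunk)):
            path = f"{out_prefix}_{i:03d}.csv"
            with open(path, "w", newline="") as dst:
                writer = csv.writer(dst)
                writer.writerow(header)  # repeat the header in every file
                writer.writerows(rows)
            paths.append(path)
    return paths


# In-memory demonstration of the chunking logic.
demo = io.StringIO("sequence\nACGT\nCCGT\nGGTA\n")
print([len(rows) for _, rows in split_rows(demo, 2)])
```

Repeating the header in each output file keeps the required sequence column present in every chunk.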
How long should processing take?
1K sequences (GPU): ~30 seconds
10K sequences (GPU): ~3–5 minutes
100K sequences (GPU): ~30–60 minutes
CPU processing: ~10× slower
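For a rough sense of scale, the figures above can be extrapolated linearly from the 1K baseline (about 30 seconds per 1,000 sequences on GPU, about 10× that on CPU). The function below is a back-of-the-envelope sketch with a hypothetical name; actual timings depend on hardware, batch size, and sequence length, and linear scaling lands toward the upper end of the quoted ranges.

```python
def estimate_seconds(n_sequences, use_gpu=True):
    """Rough runtime estimate: ~30 s per 1,000 sequences on GPU, ~10x on CPU."""
    seconds = n_sequences / 1000 * 30
    return seconds if use_gpu else seconds * 10


for n in (1_000, 10_000, 100_000):
    print(n, estimate_seconds(n) / 60, "min on GPU")
```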
// errors
Common error messages
File & path errors
FileNotFoundError: No such file or directory
Check your file paths and ensure volume mounting is correct.
KeyError: 'sequence'
Your CSV or TSV file must have a column named "sequence".
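A quick way to confirm the column is present before running is to inspect the header row. The helper below is a sketch with a hypothetical name; it assumes the first row of the file is the header and that the column name must match exactly.

```python
import csv
import io


def has_sequence_column(handle, delimiter=","):
    """Return True if the header row contains a column named 'sequence'."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader, [])  # empty file -> empty header
    return "sequence" in header


# In-memory examples; for a real file, pass open(path, newline="").
good = io.StringIO("id,sequence\ns1,ACGT\n")
bad = io.StringIO("id,seq\ns1,ACGT\n")
print(has_sequence_column(good))
print(has_sequence_column(bad))
```

For TSV files, pass delimiter="\t".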
CUDA & memory errors
CUDA out of memory
Reduce batch size or use CPU mode.
No CUDA-capable device detected
Install NVIDIA drivers or use CPU mode.
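To check whether an NVIDIA driver is visible on the machine at all, one option is to call the nvidia-smi utility that ships with the driver. The sketch below uses a hypothetical function name and only tests driver visibility; it does not confirm that AlignAIR or its deep learning framework can use the GPU, and inside Docker the container must also be started with GPU access.

```python
import shutil
import subprocess


def nvidia_gpu_visible():
    """Return True if nvidia-smi is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # driver utilities not installed or not on PATH
    try:
        # "-L" asks nvidia-smi to list the GPUs it can see.
        result = subprocess.run([exe, "-L"], capture_output=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("NVIDIA GPU visible:", nvidia_gpu_visible())
```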
// help
Still need help?
Can't find what you're looking for? Open an issue or start a discussion on GitHub.
© 2025 AlignAIR. All rights reserved. · Advancing computational biology through AI