FAQ & Troubleshooting
Find quick answers to common questions and solutions to issues you might encounter while using AlignAIR.
🚨 Common Issues & Quick Fixes
Docker Won't Start
Error: Cannot connect to Docker daemon
Solution: Start Docker Desktop, or on Linux start the Docker service with:
sudo systemctl start docker
Error: Permission denied
Solution: Add your user to the docker group, then log out and back in so the change takes effect:
sudo usermod -aG docker $USER
Out of Memory
CUDA out of memory error
Solution: Reduce batch size:
--batch-size=512
System memory exhausted
Solution: Process smaller datasets or increase swap space
Frequently Asked Questions
Installation: Setup, Docker, and environment issues
Usage: Parameters, commands, and workflows
Performance: Speed, memory, and optimization
Installation Questions
Q: Which installation method should I choose?
Docker (recommended) suits most users: it is easier to set up and ships with everything pre-configured. Choose a local installation only if you are a developer or need custom modifications.
Q: Do I need a GPU to run AlignAIR?
No, but a GPU is highly recommended. AlignAIR runs on CPU, but expect processing to be roughly 10x slower (see the typical processing times under Performance Questions). For best performance, use an NVIDIA GPU with CUDA 11+ support.
Q: The Docker image is very large. Is this normal?
Yes. The image includes PyTorch, CUDA libraries, and pre-trained models, so expect a 3-5 GB download. This is normal for deep learning applications.
Usage Questions
Q: What input file formats are supported?
AlignAIR supports CSV, TSV, and FASTA formats. For CSV/TSV files, ensure there is a column named "sequence" containing your nucleotide sequences.
Q: How do I choose the right threshold values?
Start with defaults: V=0.75, D=0.3, J=0.8. For high-quality data, increase thresholds for more stringent calls. For noisy data, decrease slightly. See our thresholding guide for details.
Q: Should I use heavy or light chain models?
Choose based on your data:
- Heavy chain: Use IGH_S5F_576 for IGH sequences
- Light chain: Use IGL_S5F_576 for IGL/IGK sequences
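If your pipeline already labels each dataset with its locus, the choice above can be made programmatically. The sketch below is a minimal illustration: the two model names come from this answer, while the `choose_model` helper and the locus labels it accepts are assumptions, not part of AlignAIR.

```python
# Model names are taken from the answer above; the helper itself is hypothetical.
MODEL_BY_LOCUS = {
    "IGH": "IGH_S5F_576",  # heavy chain
    "IGK": "IGL_S5F_576",  # light chain (kappa)
    "IGL": "IGL_S5F_576",  # light chain (lambda)
}


def choose_model(locus: str) -> str:
    """Return the model name for a locus label such as 'IGH', 'igk' or ' IGL '."""
    key = locus.strip().upper()
    if key not in MODEL_BY_LOCUS:
        expected = ", ".join(sorted(MODEL_BY_LOCUS))
        raise ValueError(f"Unknown locus {locus!r}; expected one of: {expected}")
    return MODEL_BY_LOCUS[key]


print(choose_model("IGH"))
print(choose_model("igk"))
```

Raising an error for an unrecognised label, rather than falling back to a default, avoids silently running the wrong model on a mislabelled file.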
Q: My sequences are longer than 576 nucleotides. What happens?
AlignAIR automatically trims sequences to the maximum input size (default 576 nt) during preprocessing. The trimming preserves the most informative regions for V(D)J assignment.
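To see ahead of time which reads would be affected, you can count the sequences that exceed the limit before running AlignAIR. The following is a rough sketch that assumes a CSV/TSV input with a "sequence" column; `find_long_sequences` is a hypothetical helper, and it only reports lengths. It does not reproduce AlignAIR's own trimming logic.

```python
import csv
import io

MAX_LEN = 576  # default maximum input size stated above


def find_long_sequences(handle, delimiter=",", max_len=MAX_LEN):
    """Return (data_row_number, length) for every sequence longer than max_len."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    too_long = []
    for row_number, row in enumerate(reader, start=1):
        length = len(row["sequence"].strip())
        if length > max_len:
            too_long.append((row_number, length))
    return too_long


# Small in-memory demo: one read at exactly 576 nt, one at 600 nt.
sample = io.StringIO("id,sequence\na," + "A" * 576 + "\nb," + "C" * 600 + "\n")
long_reads = find_long_sequences(sample)
print(long_reads)
```

For a real file, pass an open file handle (and `delimiter="\t"` for TSV) instead of the in-memory sample.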
Performance Questions
Q: How can I speed up processing?
Performance tips:
- Increase batch size: --batch-size=4096
- Use GPU instead of CPU
- Process sequences in similar length groups
- Ensure sufficient GPU memory
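One way to act on the "similar length groups" tip is to bucket sequences by length before writing them to separate input files. This is a minimal sketch under stated assumptions: `bucket_by_length` and the 50 nt bucket width are illustrative choices, and how much grouping helps depends on how AlignAIR pads its inputs.

```python
from collections import defaultdict


def bucket_by_length(sequences, bucket_width=50):
    """Group sequences into buckets keyed by the lower bound of their length range."""
    buckets = defaultdict(list)
    for seq in sequences:
        lower_bound = len(seq) // bucket_width * bucket_width
        buckets[lower_bound].append(seq)
    # Return a plain dict ordered from shortest to longest bucket.
    return dict(sorted(buckets.items()))


reads = ["A" * 10, "C" * 49, "G" * 50, "T" * 120]
groups = bucket_by_length(reads)
for lower_bound, members in groups.items():
    print(f"{lower_bound}-{lower_bound + 49} nt: {len(members)} sequence(s)")
```

Each bucket can then be written to its own file and processed as a separate run.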
Q: AlignAIR is running out of memory. What can I do?
Memory optimization:
- Reduce batch size: --batch-size=512
- Split large datasets into smaller files
- Close other GPU-intensive applications
- Use CPU mode for very large datasets
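Splitting a large CSV into smaller files can be done with the standard library alone. The sketch below is one possible approach: `split_csv`, the output naming scheme, and the chunk size are all assumptions chosen for illustration. Each part repeats the header so it remains a valid input with a "sequence" column.

```python
import csv
import tempfile
from pathlib import Path


def split_csv(src, out_dir, rows_per_file=100_000):
    """Split src into numbered CSV files, repeating the header in each part."""
    src, out_dir = Path(src), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with src.open(newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader, None)
        if header is None:  # empty input: nothing to split
            return parts
        out, writer, count = None, None, 0
        try:
            for row in reader:
                # Start a new part for the first row and whenever the current one is full.
                if out is None or count == rows_per_file:
                    if out is not None:
                        out.close()
                    part = out_dir / f"{src.stem}_part{len(parts) + 1:03d}.csv"
                    out = part.open("w", newline="")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    parts.append(part)
                    count = 0
                writer.writerow(row)
                count += 1
        finally:
            if out is not None:
                out.close()
    return parts


# Demo on a throwaway file with five data rows, two rows per part.
with tempfile.TemporaryDirectory() as tmp:
    demo_src = Path(tmp) / "reads.csv"
    demo_src.write_text("id,sequence\n" + "".join(f"s{i},ACGT\n" for i in range(5)))
    demo_parts = split_csv(demo_src, Path(tmp) / "chunks", rows_per_file=2)
    part_names = [p.name for p in demo_parts]
    last_rows = demo_parts[-1].read_text().splitlines()
print(part_names)
```

The function streams the input row by row, so memory use stays flat regardless of the size of the source file.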
Q: How long should processing take?
Typical processing times:
1K sequences (GPU): ~30 seconds
10K sequences (GPU): ~3-5 minutes
100K sequences (GPU): ~30-60 minutes
CPU processing: ~10x slower
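For dataset sizes between the rows above, a rough estimate can be obtained by scaling the 100K figure linearly. This is only a back-of-the-envelope sketch: the linear scaling is an assumption, `estimate_minutes` is a hypothetical helper, and real run times depend on hardware, batch size, and sequence length.

```python
GPU_MINUTES_PER_100K = (30, 60)  # from "100K sequences (GPU): ~30-60 minutes"
CPU_SLOWDOWN = 10                # from "CPU processing: ~10x slower"


def estimate_minutes(n_sequences, device="gpu"):
    """Return a (low, high) estimate in minutes, assuming linear scaling."""
    if device not in ("gpu", "cpu"):
        raise ValueError("device must be 'gpu' or 'cpu'")
    low, high = (n_sequences * minutes / 100_000 for minutes in GPU_MINUTES_PER_100K)
    if device == "cpu":
        low, high = low * CPU_SLOWDOWN, high * CPU_SLOWDOWN
    return low, high


low, high = estimate_minutes(50_000)
print(f"50K sequences on GPU: roughly {low:.0f}-{high:.0f} minutes")
```

Scaled down to 1K sequences, this gives about 18-36 seconds, which is in line with the ~30 seconds listed above.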
Common Error Messages
File & Path Errors
FileNotFoundError: No such file or directory
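Check your file paths. When running in Docker, confirm that the host directory is mounted into the container and that you pass the path as seen inside the container.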
KeyError: 'sequence'
Your CSV file must have a column named "sequence"
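A quick header check before launching a run can catch the missing-column case early. The sketch below assumes a comma- or tab-delimited file; `check_sequence_column` is a hypothetical helper, and it also reports near-misses such as "Sequence" or " sequence " that would still trigger the KeyError.

```python
import csv
import io


def check_sequence_column(handle, delimiter=","):
    """Return (ok, near_misses) for the header row of a CSV/TSV file."""
    header = next(csv.reader(handle, delimiter=delimiter), [])
    if "sequence" in header:
        return True, []
    # Columns that differ only by case or surrounding whitespace.
    near_misses = [name for name in header if name.strip().lower() == "sequence"]
    return False, near_misses


ok, near = check_sequence_column(io.StringIO("id,Sequence\nr1,ACGT\n"))
if not ok:
    hint = f" (found {near[0]!r}; rename it to 'sequence')" if near else ""
    print("Missing 'sequence' column" + hint)
```

For a TSV file, pass `delimiter="\t"`.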
CUDA & Memory Errors
CUDA out of memory
Reduce batch size or use CPU mode
No CUDA-capable device detected
Install NVIDIA drivers or use CPU mode
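For the CUDA out of memory case, reducing the batch size can be automated with a simple halving loop. The sketch below shows the pattern only: `run` is a hypothetical callable that you would write to launch AlignAIR with `--batch-size=<n>` and raise `MemoryError` when the run fails for lack of memory. The 4096 and 512 bounds are the values suggested elsewhere in this FAQ.

```python
def run_with_backoff(run, start=4096, minimum=512):
    """Call run(batch_size), halving batch_size after each MemoryError.

    Returns (batch_size_used, result). Re-raises MemoryError once halving
    again would drop below `minimum`.
    """
    batch_size = start
    while True:
        try:
            return batch_size, run(batch_size)
        except MemoryError:
            if batch_size // 2 < minimum:
                raise
            batch_size //= 2


def fake_run(batch_size):
    """Stand-in for a real run: pretends anything above 1024 does not fit in memory."""
    if batch_size > 1024:
        raise MemoryError(f"batch size {batch_size} does not fit")
    return "ok"


used, result = run_with_backoff(fake_run)
print(f"Succeeded with batch size {used}")
```

If the run still fails at the minimum batch size, the error propagates, at which point splitting the dataset or switching to CPU mode are the remaining options listed above.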
Still Need Help?
Can't find what you're looking for? We're here to help!
© 2025 AlignAIR. All rights reserved. • Advancing computational biology through AI