
FAQ & Troubleshooting

Find quick answers to common questions and solutions to potential issues you might encounter while using AlignAIR.

🚨 Common Issues & Quick Fixes

Docker Won't Start

Error: Cannot connect to Docker daemon
Solution: Start Docker Desktop, or on Linux run sudo systemctl start docker
Error: Permission denied
Solution: Add your user to the docker group with sudo usermod -aG docker $USER, then log out and back in so the new group membership takes effect

Out of Memory

Error: CUDA out of memory
Solution: Reduce the batch size, for example --batch-size=512
Error: System memory exhausted
Solution: Process smaller datasets or increase swap space

Frequently Asked Questions

Installation: Setup, Docker, and environment issues
Usage: Parameters, commands, and workflows
Performance: Speed, memory, and optimization

Installation Questions

Q: Which installation method should I choose?
Docker (recommended) suits most users: it is easier and includes everything pre-configured. Choose a local installation only if you are a developer or need custom modifications.
Q: Do I need a GPU to run AlignAIR?
No, but it's highly recommended. AlignAIR can run on CPU, but processing is roughly 10x slower than on a GPU. For best performance, use an NVIDIA GPU with CUDA 11+ support.
Q: The Docker image is very large. Is this normal?
Yes, the image includes PyTorch, CUDA libraries, and pre-trained models. Expect 3-5GB download size. This is normal for deep learning applications.

Usage Questions

Q: What input file formats are supported?
AlignAIR supports CSV, TSV, and FASTA formats. For CSV/TSV files, ensure there's a column named "sequence" containing your nucleotide sequences.
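Before starting a long run, it can help to confirm that a CSV/TSV file really has the required "sequence" column. The following is a minimal sketch using only the Python standard library; the function name has_sequence_column is hypothetical and not part of AlignAIR, and the sketch assumes .tsv files are tab-delimited and all other files comma-delimited.

```python
import csv
from pathlib import Path


def has_sequence_column(path, column="sequence"):
    """Return True if the header row of a CSV/TSV file contains `column`.

    Hypothetical helper, not part of AlignAIR.
    """
    path = Path(path)
    # Assumption: .tsv means tab-delimited; anything else is comma-delimited.
    delimiter = "\t" if path.suffix.lower() == ".tsv" else ","
    with path.open(newline="") as handle:
        # Read only the first row; an empty file yields an empty header.
        header = next(csv.reader(handle, delimiter=delimiter), [])
    return column in [name.strip() for name in header]
```

The check is case-sensitive on purpose, since the required column name is given in lowercase.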
Q: How do I choose the right threshold values?
Start with defaults: V=0.75, D=0.3, J=0.8. For high-quality data, increase thresholds for more stringent calls. For noisy data, decrease slightly. See our thresholding guide for details.
Q: Should I use heavy or light chain models?
Choose based on your data:
  • Heavy chain: Use IGH_S5F_576 for IGH sequences
  • Light chain: Use IGL_S5F_576 for IGL/IGK sequences
Q: My sequences are longer than 576 nucleotides. What happens?
AlignAIR automatically trims sequences to the maximum input size (default 576 nt) during preprocessing. The trimming preserves the most informative regions for V(D)J assignment.
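To see in advance which sequences will be affected by trimming, you can compare their lengths against the 576 nt default. This is a minimal sketch; find_overlength is a hypothetical helper, and it only reports over-length sequences. It does not reproduce how AlignAIR itself selects the region to keep.

```python
MAX_INPUT_LENGTH = 576  # default maximum input size quoted in the answer above


def find_overlength(sequences, max_length=MAX_INPUT_LENGTH):
    """Return (index, length) pairs for sequences longer than max_length.

    Hypothetical helper: it only flags sequences and does not trim them.
    """
    return [
        (index, len(seq))
        for index, seq in enumerate(sequences)
        if len(seq) > max_length
    ]
```

A sequence of exactly 576 nt is not flagged, because it already fits the default input size.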

Performance Questions

Q: How can I speed up processing?
Performance tips:
  • Increase batch size: --batch-size=4096
  • Use GPU instead of CPU
  • Process sequences in similar length groups
  • Ensure sufficient GPU memory
Q: AlignAIR is running out of memory. What can I do?
Memory optimization:
  • Reduce batch size: --batch-size=512
  • Split large datasets into smaller files
  • Close other GPU-intensive applications
  • Use CPU mode for very large datasets
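Splitting a large dataset into smaller files, as suggested in the list above, can be scripted. The sketch below uses only the Python standard library and assumes a comma-delimited CSV with a header row; split_csv and the _partNNN naming scheme are hypothetical choices, not AlignAIR conventions.

```python
import csv
from pathlib import Path


def split_csv(path, rows_per_file, out_dir):
    """Split a CSV into numbered parts, repeating the header in each part.

    Hypothetical helper. Returns the list of part paths in order.
    """
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    path, out_dir = Path(path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with path.open(newline="") as source:
        reader = csv.reader(source)
        header = next(reader, None)
        if header is None:
            return parts  # empty input: nothing to split
        out, writer, rows_in_part = None, None, 0
        for row in reader:
            # Open a new part for the first row and whenever the current part is full.
            if writer is None or rows_in_part == rows_per_file:
                if out is not None:
                    out.close()
                part_path = out_dir / f"{path.stem}_part{len(parts) + 1:03d}.csv"
                out = part_path.open("w", newline="")
                writer = csv.writer(out)
                writer.writerow(header)  # every part keeps the "sequence" header
                parts.append(part_path)
                rows_in_part = 0
            writer.writerow(row)
            rows_in_part += 1
        if out is not None:
            out.close()
    return parts
```

Each part can then be processed as a separate run, which keeps per-run memory use bounded by rows_per_file rather than by the full dataset.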
Q: How long should processing take?
Typical processing times:
1K sequences (GPU): ~30 seconds
10K sequences (GPU): ~3-5 minutes
100K sequences (GPU): ~30-60 minutes
CPU processing: ~10x slower than GPU
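For a rough planning estimate, the figures above can be turned into simple arithmetic. This sketch assumes run time scales linearly with sequence count at about 30 seconds per 1,000 sequences on GPU, with CPU about 10x slower; both numbers are read off the list above, the linear model is an assumption, and estimate_seconds is a hypothetical helper. Real timings depend on hardware, batch size, and sequence length.

```python
def estimate_seconds(n_sequences, seconds_per_thousand=30.0, use_gpu=True, cpu_slowdown=10.0):
    """Very rough run-time estimate in seconds.

    Assumes linear scaling with sequence count (an assumption, not a measured model).
    """
    seconds = n_sequences / 1000 * seconds_per_thousand
    return seconds if use_gpu else seconds * cpu_slowdown


# 100K sequences on GPU: 3000 seconds, i.e. 50 minutes, inside the ~30-60 minute range above.
gpu_minutes = estimate_seconds(100_000) / 60
```

Lowering seconds_per_thousand to about 18 reproduces the optimistic end of the quoted ranges (3 minutes for 10K, 30 minutes for 100K).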

Common Error Messages

File & Path Errors

FileNotFoundError: No such file or directory
Check your file paths and ensure volume mounting is correct
KeyError: 'sequence'
Your CSV file must have a column named "sequence"

CUDA & Memory Errors

CUDA out of memory
Reduce batch size or use CPU mode
No CUDA-capable device detected
Install NVIDIA drivers or use CPU mode
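One quick way to check whether NVIDIA drivers are installed on the host is to see if the nvidia-smi tool, which ships with the driver, can list a GPU. The sketch below is a heuristic under that assumption; nvidia_gpu_visible is a hypothetical helper, and a True result on the host does not by itself guarantee that a Docker container has been given GPU access.

```python
import shutil
import subprocess


def nvidia_gpu_visible(timeout=10):
    """Return True if nvidia-smi is on PATH and lists at least one GPU.

    Hypothetical helper; checks the current environment only.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # driver tools not installed or not on PATH
    try:
        # "nvidia-smi -L" prints one line per GPU, each starting with "GPU <index>:".
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU " in result.stdout
```

If this returns False, either install the NVIDIA drivers or fall back to CPU mode as described above.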

Still Need Help?

Can't find what you're looking for? We're here to help!

© 2025 AlignAIR. All rights reserved.
Advancing computational biology through AI