
FAQ & Troubleshooting

Quick answers to common questions and solutions to potential issues.


Common issues & quick fixes

Docker won't start

Cannot connect to Docker daemon
Fix: Start Docker Desktop, or on Linux run sudo systemctl start docker
Permission denied
Fix: Add your user to the docker group with sudo usermod -aG docker $USER, then log out and back in so the change takes effect.

Out of memory

CUDA out of memory
Fix: Reduce batch size: --batch-size=512
System memory exhausted
Fix: Process smaller datasets or increase swap space.

Frequently asked questions


Installation

Which installation method should I choose?
Use Docker (recommended) if you are like most users: it is easier to set up and ships with everything pre-configured. Choose a local installation only if you are a developer or need custom modifications.
Do I need a GPU to run AlignAIR?
No, but a GPU is highly recommended. AlignAIR runs on CPU, but processing is roughly 10× slower. For best performance, use an NVIDIA GPU with CUDA 11+ support.
The Docker image is very large. Is this normal?
Yes. The image includes PyTorch, CUDA libraries, and pre-trained models, so expect a 3–5 GB download. This is normal for deep learning applications.

Usage

What input file formats are supported?
AlignAIR supports CSV, TSV, and FASTA formats. For CSV/TSV files, ensure there's a column named sequence containing your nucleotide sequences.
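A quick pre-flight check can catch a missing column before a run starts. The following is a minimal sketch using only the Python standard library; the helper name has_sequence_column is hypothetical and is not part of AlignAIR.

```python
import csv
import io

def has_sequence_column(handle, delimiter=","):
    """Return True if the header row has a column named exactly 'sequence'."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader, [])  # empty input yields an empty header
    return "sequence" in header

# In-memory examples; for a real file use open(path, newline="").
csv_text = "id,sequence\nseq1,ACGTACGT\n"
tsv_text = "id\tseq\nseq1\tACGTACGT\n"
print(has_sequence_column(io.StringIO(csv_text)))                  # True
print(has_sequence_column(io.StringIO(tsv_text), delimiter="\t"))  # False
```

For TSV files, pass delimiter="\t".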
How do I choose the right threshold values?
Start with the defaults: V=0.75, D=0.3, J=0.8. For high-quality data, increase the thresholds for more stringent calls. For noisy data, decrease them slightly. See the thresholding guide for details.
Should I use heavy or light chain models?

Choose based on your data:

  • Heavy chain: Use IGH_S5F_576 for IGH sequences
  • Light chain: Use IGL_S5F_576 for IGL/IGK sequences
My sequences are longer than 576 nucleotides. What happens?
AlignAIR automatically trims sequences to the maximum input size (default 576 nt) during preprocessing. Trimming preserves the most informative regions for V(D)J assignment.
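To see in advance which sequences will be trimmed, you can compare their lengths with the limit. This sketch only reports the sequences that exceed 576 nt; it does not reproduce AlignAIR's trimming logic, and the function name is hypothetical.

```python
MAX_INPUT_NT = 576  # default maximum input size stated above

def sequences_over_limit(sequences, limit=MAX_INPUT_NT):
    """Return (index, length) pairs for sequences longer than the limit."""
    return [(i, len(s)) for i, s in enumerate(sequences) if len(s) > limit]

# Lengths are 400, 600 and 576; only the second exceeds the limit.
example = ["ACGT" * 100, "ACGT" * 150, "ACGT" * 144]
print(sequences_over_limit(example))  # [(1, 600)]
```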

Performance

How can I speed up processing?

Performance tips:

  • Increase batch size: --batch-size=4096
  • Use GPU instead of CPU
  • Process sequences in similar length groups
  • Ensure sufficient GPU memory
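One way to group sequences by similar length is to sort them by length before batching. The sketch below is an illustration under that assumption, not AlignAIR's own batching code; it returns original indices so results can be mapped back to the input order.

```python
def length_sorted_batches(sequences, batch_size):
    """Group sequence indices into batches of similar length."""
    order = sorted(range(len(sequences)), key=lambda i: len(sequences[i]))
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

seqs = ["ACGTACGT", "AC", "ACGTAC", "ACG"]
print(length_sorted_batches(seqs, batch_size=2))  # [[1, 3], [2, 0]]
```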
AlignAIR is running out of memory. What can I do?

Memory optimization:

  • Reduce batch size: --batch-size=512
  • Split large datasets into smaller files
  • Close other GPU-intensive applications
  • Use CPU mode for very large datasets
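Splitting a large CSV into smaller files can be done with the standard library. This is a minimal sketch: the function name, the output file naming scheme, and the default chunk size are all assumptions made for illustration. The header row is repeated in every part so each file keeps its sequence column.

```python
import csv
import tempfile
from pathlib import Path

def split_csv(src, out_dir, rows_per_file=100_000):
    """Split src into CSV parts of at most rows_per_file data rows each.

    Returns the list of paths written.
    """
    src, out_dir = Path(src), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with src.open(newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader, None)
        if header is None:  # empty input: nothing to split
            return written
        out, writer, count = None, None, 0
        for row in reader:
            if writer is None or count == rows_per_file:
                if out is not None:
                    out.close()
                part = out_dir / f"{src.stem}_part{len(written) + 1:03d}.csv"
                out = part.open("w", newline="")
                writer = csv.writer(out)
                writer.writerow(header)  # repeat header in every part
                written.append(part)
                count = 0
            writer.writerow(row)
            count += 1
        if out is not None:
            out.close()
    return written

# Demo on a throwaway file: 5 data rows split into parts of at most 2 rows.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "reads.csv"
    demo.write_text("id,sequence\n" + "".join(f"s{i},ACGT\n" for i in range(5)))
    parts = split_csv(demo, Path(tmp) / "parts", rows_per_file=2)
    print([p.name for p in parts])
```

Each part can then be processed as a separate run.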
How long should processing take?

Approximate processing times:

  • 1K sequences (GPU): ~30 seconds
  • 10K sequences (GPU): ~3–5 minutes
  • 100K sequences (GPU): ~30–60 minutes
  • CPU processing: ~10× slower

Common error messages

File & path errors

FileNotFoundError: No such file or directory
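Check your file paths. When running in Docker, make sure the volume mount maps the host directory that holds your files into the container, and that the paths you pass are the paths as seen inside the container.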
KeyError: 'sequence'
Your CSV or TSV file must have a column named "sequence".

CUDA & memory errors

CUDA out of memory
Reduce batch size or use CPU mode.
No CUDA-capable device detected
Install NVIDIA drivers or use CPU mode.
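Before reinstalling drivers, it can help to confirm whether the NVIDIA driver is visible at all. The sketch below checks whether the nvidia-smi tool, which ships with the NVIDIA driver, is on the PATH and exits successfully. It only reports on the environment where it is run (host or container), and the helper name is hypothetical.

```python
import shutil
import subprocess

def nvidia_driver_visible():
    """Return True if nvidia-smi is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run([exe], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0

print("NVIDIA driver visible:", nvidia_driver_visible())
```

If this prints False, fall back to CPU mode or install the NVIDIA drivers.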

Still need help?

Can't find what you're looking for? Open an issue or start a discussion on GitHub.

© 2025 AlignAIR. All rights reserved. Advancing computational biology through AI.