
AlphaFold

Info

Based on work provided by Research Computing at the University of Virginia.

AlphaFold launch command

Please refer to run_alphafold.py for all available options.

Launch script run

For your convenience, we have prepared a launch script, run, that takes care of the Singularity command and the database paths, since these are unlikely to change. If you do need to customize anything, please use the full Singularity command.

#!/bin/bash

if [ -z "$ALPHAFOLD_DATA_PATH" ]; then
   echo "\$ALPHAFOLD_DATA_PATH variable not set. Setting to default path:"
   echo "/panfs/pfs.local/scratch/all/db"
   echo ""
   export ALPHAFOLD_DATA_PATH=/panfs/pfs.local/scratch/all/db
fi

singularity run -B "$(realpath "$ALPHAFOLD_DATA_PATH")":/data \
   -B .:/etc \
   --pwd /app/alphafold \
   --nv $CONTAINERDIR/alphafold-${EBVERSIONALPHAFOLD}.sif \
   --data_dir=/data \
   --uniref90_database_path=/data/uniref90/uniref90.fasta \
   --mgnify_database_path=/data/mgnify/mgy_clusters.fa \
   --template_mmcif_dir=/data/pdb_mmcif/mmcif_files \
   --obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat \
   "$@"

Explanation of Singularity flags

  1. The database and models are stored in $ALPHAFOLD_DATA_PATH.
  2. A cache file ld.so.cache will be written to /etc, which is not allowed on the cluster. The workaround is to bind-mount e.g. the current working directory to /etc inside the container. [-B .:/etc]
  3. You must launch AlphaFold from /app/alphafold inside the container due to this issue. [--pwd /app/alphafold]
  4. The --nv flag enables GPU support.

Explanation of AlphaFold flags

  1. The default command of the container is at /app/run_alphafold.sh.
  2. As a consequence of the Singularity --pwd flag, the fasta and output paths must be full paths (e.g. /home/$USER/mydir), not relative paths (e.g. ./mydir). You may use $PWD as demonstrated.
  3. The --max_template_date value is of the form YYYY-MM-DD.
  4. Only the database paths in mark_flags_as_required of run_alphafold.py are included because the optional paths depend on db_preset (full_dbs or reduced_dbs) and model_preset.
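Since the optional paths depend on db_preset, it can help to see the mapping spelled out. A minimal sketch (the db_flags helper is hypothetical, not part of the wrapper; the paths are the ones used in the templates below, relative to the container's /data mount):

```shell
#!/bin/bash
# Hypothetical helper: print the extra database flags for a given db_preset.
# model_preset-dependent flags (pdb70, pdb_seqres, uniprot) are separate.
db_flags() {
  case "$1" in
    full_dbs)
      echo "--bfd_database_path=/data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt"
      echo "--uniclust30_database_path=/data/uniclust30/uniclust30_2018_08/uniclust30_2018_08"
      ;;
    reduced_dbs)
      echo "--small_bfd_database_path=/data/small_bfd/bfd-first_non_consensus_sequences.fasta"
      ;;
    *)
      echo "unknown db_preset: $1" >&2
      return 1
      ;;
  esac
}
```

With a helper like this, the preset and its matching paths stay in sync, e.g. run --db_preset=full_dbs $(db_flags full_dbs) followed by the remaining options.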

Slurm scripts

Below are some templates for your Slurm script.

Monomer with full_dbs

#!/bin/bash
#SBATCH --partition=sixhour # partition
#SBATCH --gres=gpu:1        # number of GPUs
#SBATCH --nodes=1           # number of nodes
#SBATCH --cpus-per-task=8   # number of cores
#SBATCH --mem=40g           # memory
#SBATCH --time=6:00:00      # time

module purge
module load singularity alphafold

run --fasta_paths=$PWD/your_fasta_file \
   --output_dir=$PWD/outdir \
   --model_preset=monomer \
   --db_preset=full_dbs \
   --bfd_database_path=/data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
   --pdb70_database_path=/data/pdb70/pdb70 \
   --uniclust30_database_path=/data/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
   --max_template_date=YYYY-MM-DD \
   --use_gpu_relax=True
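Because the container's working directory is /app/alphafold, relative fasta or output paths will not resolve. A hypothetical pre-flight check (require_absolute is not part of the wrapper) can catch this before the job queues:

```shell
#!/bin/bash
# Hypothetical pre-flight check: the wrapper needs absolute paths because the
# container's working directory is /app/alphafold, not your submit directory.
require_absolute() {
  case "$1" in
    /*) return 0 ;;                                        # absolute: fine
    *)  echo "not an absolute path: $1" >&2; return 1 ;;   # relative: reject
  esac
}

require_absolute "$PWD/your_fasta_file"   # ok: $PWD expands to an absolute path
```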

Multimer with reduced_dbs

#!/bin/bash
#SBATCH --partition=sixhour # partition
#SBATCH --gres=gpu:1        # number of GPUs
#SBATCH --nodes=1           # number of nodes
#SBATCH --cpus-per-task=8   # number of cores
#SBATCH --mem=40g           # memory
#SBATCH --time=6:00:00      # time

module purge
module load singularity alphafold

run --fasta_paths=$PWD/your_fasta_file \
   --output_dir=$PWD/outdir \
   --model_preset=multimer \
   --db_preset=reduced_dbs \
   --pdb_seqres_database_path=/data/pdb_seqres/pdb_seqres.txt \
   --uniprot_database_path=/data/uniprot/uniprot.fasta \
   --small_bfd_database_path=/data/small_bfd/bfd-first_non_consensus_sequences.fasta \
   --max_template_date=YYYY-MM-DD \
   --use_gpu_relax=True
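Both templates leave --max_template_date as a placeholder. A small guard (valid_template_date is a hypothetical helper, not part of the wrapper) can catch a forgotten or malformed date before submission:

```shell
#!/bin/bash
# Hypothetical guard: reject the literal YYYY-MM-DD placeholder or anything
# that is not a four-digit-year, two-digit-month, two-digit-day date.
valid_template_date() {
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
    *) echo "max_template_date must be YYYY-MM-DD, got: $1" >&2; return 1 ;;
  esac
}
```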

Notes

  1. You may need to request 8 CPU cores because jackhmmer is launched with --cpu 8, as shown in this line printed in the output:
    Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmpys2ocad8/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 ./seq.fasta /share/resources/data/alphafold/mgnify/mgy_clusters.fa"
    
  2. You must provide a value for --max_template_date. See https://github.com/deepmind/alphafold/blob/main/run_alphafold.py#L92-L934.
  3. The flag --use_gpu_relax is only available in version 2.1.2 and above.
  4. You are not required to use the run wrapper script; you can always provide the full Singularity command.
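For reference, a sketch of the full command the run wrapper assembles. Here it is built as a string and echoed rather than executed, and the module variables are placeholder assumptions:

```shell
#!/bin/bash
# Illustration only: build and print the full command the run wrapper issues.
# CONTAINERDIR and EBVERSIONALPHAFOLD are normally set by the alphafold module;
# the defaults below are placeholders for this sketch.
ALPHAFOLD_DATA_PATH=${ALPHAFOLD_DATA_PATH:-/panfs/pfs.local/scratch/all/db}
CONTAINERDIR=${CONTAINERDIR:-/path/to/containers}
EBVERSIONALPHAFOLD=${EBVERSIONALPHAFOLD:-2.x.y}

cmd="singularity run \
-B $ALPHAFOLD_DATA_PATH:/data \
-B .:/etc \
--pwd /app/alphafold \
--nv $CONTAINERDIR/alphafold-$EBVERSIONALPHAFOLD.sif \
--data_dir=/data"
# ...followed by the database flags and your own options, as in the wrapper.

echo "$cmd"
```

Running the echoed command directly is equivalent to using run, which is useful when you need bind mounts or flags the wrapper does not expose.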