rigdenlab · hllelli2 · May 28, 2025 · Apr 29, 2025 · May 7, 2025 · May 7, 2025
diff --git a/README.md b/README.md
@@ -18,7 +18,7 @@ We recommend installing this package in a virtual environment or conda / microma
 
 To set up a conda/micromamba environment, run:
 ```bash
-conda env create -n abcfold python=3.11
+conda create -n abcfold python=3.11
 conda activate abcfold
 ```
 
@@ -50,7 +50,7 @@ If you wish to help develop this package, you can install the development depend
 ```bash
 python -m pip install -e .
 python -m pip install -r requirements-dev.txt
-python -m pre-commit install
+python -m pre_commit install
 ```
 
 ## Usage
@@ -89,6 +89,9 @@ abcfold <input_json>  <output_dir> -abc --mmseqs2 --model_params <path_to_af3_mo
 > If you wish to run ABCFold with the AlphaFold3 JACKHMMER MSA search, you need to remove the `--mmseqs2` flag and provide the `--database` flag with the path to the directory containing the AlphaFold3 databases.
 > The `--database` path will also be stored after the first run and won't be required in subsequent ABCFold jobs.
 
+>[!WARNING]
+> When using the `--mmseqs2` flag, AlphaFold3 is run without paired MSA information. If pairing matters for your target (e.g. when modelling a complex), we recommend running the AlphaFold3 JACKHMMER MSA search instead, as it generates the paired MSA automatically.
+
 >[!WARNING]
 >`--model_params` and `--database` will need to be provided again if you do a fresh install.
 
@@ -98,7 +101,8 @@ However, there you may wish to use the following flags to add run time options s
 - `<input_json>`: Path to the input AlphaFold3 JSON file.
 - `<output_dir>`: Path to the output directory.
 - `-a`, `-b`, `-c` (`--alphafold3`, `--boltz1`,`--chai1`): Flags to run Alphafold3, Boltz-1 and Chai-1 respectively. If none of these flags are provided, Alphafold3 will be run by default.
-- `--mmseqs2`: [optional] Flag to use MMseqs2 MSAs and templates.
+- `--mmseqs2`: [optional] Flag to use MMseqs2 MSAs and templates (if specified).
+- `--mmseqs_database`: [optional] Path to the database used by a local copy of MMseqs2. If MMseqs2 is installed, providing this flag allows MMseqs2 to be run locally.
 - `--override`: [optional] Flag to override the existing output directory.
 - `--save_input`: [optional] Flag to save the input JSON file in the output directory.
 
@@ -107,10 +111,12 @@ However, there you may wish to use the following flags to add run time options s
 - `--model_params`: Path to the directory containing the AlphaFold3 model parameters.
 - `--database`: [optional] Path to the directory containing the AlphaFold3 databases #Note: This is not used if using the
 `--mmseqs2` flag.
-- `--use_af3_template_search`[optional] If providing your own custom MSA or you've ran `--mmseqs`, allow Alphafold3 to search for templates
+- `--sif_path`: [optional] Path to the SIF file if running AlphaFold3 with Singularity instead of Docker
+- `--use_af3_template_search`: [optional] If providing your own custom MSA or you've run `--mmseqs2`, allow Alphafold3 to search for templates
 
-#### Template and MSA arguments
+#### Template arguments
 
+- `--templates`: Flag to enable a template search
 - `--num_templates`: [optional] The number of templates to use (default: 20)
 
 - `--custom_template`: [optional] Path to a custom template file in mmCIF format or a list of custom templates. A more detailed decription on how to use the custom template argument can be found below Visualisation arguments.
@@ -143,11 +149,10 @@ ABCFold will output the AlphaFold, Boltz and/or Chai models in the `<output_dir>
 Unless the `--no_visuals` flag is used, you can then open the output pages by running:
 
 ```bash
-python <output_dir>/open_output.py
+cd <output_dir>
+python open_output.py
 ```
 
-
-
 ## Main Page Example
 ![main_page_example](https://raw.githubusercontent.com/rigdenlab/ABCFold/refs/heads/main/abcfold/html/static/main_page_example.png)
 
@@ -158,8 +163,6 @@ The output page will be available on `http://localhost:8000/index.html`. If you
 you will find `open_output.py` in your `<output_dir>`. This needs to be run from your `<output_dir>`.
 
 
-
-
 ## Extra Features
 
 Below are scripts for adding MMseqs2 MSAs and custom templates to AlphaFold3 input JSON files.
@@ -182,6 +185,16 @@ mmseqs2msa --input_json <input_json> --output_json <output_json> --templates --n
 - `<input_json>`: Path to the input AlphaFold3 JSON file.
 - `<output_json>`: [optional] Path to the output JSON file (default: `<input_json_stem>`_mmseqs.json).
 - `<num_templates>`: [optional] The number of templates to use (default: 20)
+- `<mmseqs_database>`: [optional] Path to the database used by a local copy of MMseqs2. If MMseqs2 is installed, providing this flag allows MMseqs2 to be run locally.
+
+> [!NOTE]
+> If you need to install the MMseqs2 databases, you can use `setup_mmseqs_databases.sh`.
+> This replicates the MMseqs2 database setup from ColabFold.
+
+```bash
+# Run from the directory containing setup_mmseqs_databases.sh
+MMSEQS_NO_INDEX=1 ./setup_mmseqs_databases.sh /path/to/db_folder
+```
 
 
 #### Without Templates
@@ -231,7 +244,7 @@ mmseqs2msa --input_json <input_json> --output_json <output_json> --templates --n
 - `<target_id>`: [conditionally required] The ID of the sequence the custom template relates to, only required if modelling a complex. If providing a list of custom templates, you can provide a single target ID if they all relate to the same target. Otherwise, you should provide a list of target IDs corresponding to the list of custom templates.
 
 
-### Common Issues
+### Possible Issues
 
 #### Using `--target_id` with homo-oligomer
 
@@ -262,6 +275,52 @@ Below is an example of a hetero-3-mer. When modelling a homo-oligomer, id is giv
 
 If you want to add a custom template to the first sequence, you can use `--target_id A`. If you wish to add a custom template to the second sequence, use `--target_id B` or `--target_id C`.
 
+#### Boltz-1 limitations
+
+If modelling multiple copies of the same sequence in Boltz-1, the input JSON must be set up as follows:
+
+```json
+{
+  "name": "7ZYH",
+  "sequences": [
+    {
+      "protein": {
+        "id": ["A", "B"],
+        "sequence": "SNAESKIKDCPWYDRGFCKHGPLCRHRHTRRVICVNYLVGFCPEGPSCKFMHPRFELPMGTTEQ"
+      }
+    }
+  ],
+  "modelSeeds": [1],
+  "dialect": "alphafold3",
+  "version": 1
+}
+```
+If the identical sequences are given as separate entities (as shown below), you will encounter an error.
+
+```json
+{
+  "name": "7ZYH",
+  "sequences": [
+    {
+      "protein": {
+        "id": "A",
+        "sequence": "SNAESKIKDCPWYDRGFCKHGPLCRHRHTRRVICVNYLVGFCPEGPSCKFMHPRFELPMGTTEQ"
+      }
+    },
+    {
+      "protein": {
+        "id": "B",
+        "sequence": "SNAESKIKDCPWYDRGFCKHGPLCRHRHTRRVICVNYLVGFCPEGPSCKFMHPRFELPMGTTEQ"
+      }
+    }
+  ],
+  "modelSeeds": [1],
+  "dialect": "alphafold3",
+  "version": 1
+}
+```
+
+Additionally, Boltz-1 currently cannot create linked ligands, so covalent bonds between a chain and a ligand will be missing.
 
 ## Contributing
 

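Review note on the "Boltz-1 limitations" section added above: the README asks users to list identical sequences as one entity with several ids. A minimal sketch of how an input could be normalised into that form is shown below. The helper name `merge_identical_proteins` is hypothetical and not part of ABCFold, and the sketch assumes that any per-entity fields other than `id` (MSAs, modifications and so on) can be taken from the first copy of each sequence.

```python
def merge_identical_proteins(af3_input: dict) -> dict:
    """Collapse protein entries that share a sequence into a single entry.

    Hypothetical helper, not part of ABCFold. Returns a new dict; the
    input is left unmodified. Non-protein entries keep their position.
    """
    first_seen = {}  # sequence string -> merged protein dict
    sequences = []
    for entry in af3_input.get("sequences", []):
        protein = entry.get("protein")
        if protein is None:
            # Ligands, RNA, DNA etc. are passed through untouched.
            sequences.append(entry)
            continue
        ids = protein["id"] if isinstance(protein["id"], list) else [protein["id"]]
        seq = protein["sequence"]
        if seq in first_seen:
            # Same sequence seen before: only add the extra chain ids.
            first_seen[seq]["id"].extend(ids)
        else:
            # Assumption: other fields are taken from the first copy.
            merged = {**protein, "id": list(ids)}
            first_seen[seq] = merged
            sequences.append({"protein": merged})
    return {**af3_input, "sequences": sequences}
```

Applied to the second (failing) JSON example in the README, this would produce one `protein` entry with `"id": ["A", "B"]`, matching the first example. Single-copy proteins end up with a one-element id list, which the AlphaFold3 input format also accepts.
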
diff --git a/abcfold/abcfold.py b/abcfold/abcfold.py
@@ -24,6 +24,7 @@
 from abcfold.output.alphafold3 import AlphafoldOutput
 from abcfold.output.boltz import BoltzOutput
 from abcfold.output.chai import ChaiOutput
+from abcfold.output.file_handlers import superpose_models
 from abcfold.output.utils import (get_gap_indicies, insert_none_by_minus_one,
                                   make_dummy_m8_file)
 from abcfold.scripts.abc_script_utils import (check_input_json, make_dir,
@@ -92,7 +93,7 @@ def run(args, config, defaults, config_file):
     if args.alphafold3:
         from abcfold.alphafold3.check_install import check_af3_install
 
-        check_af3_install(interactive=False)
+        check_af3_install(interactive=False, sif_path=args.sif_path)
 
     if args.boltz1:
         from abcfold.boltz1.check_install import check_boltz1
@@ -117,6 +118,7 @@ def run(args, config, defaults, config_file):
 
             input_params = add_msa_to_json(
                 input_json=input_json,
+                mmseqs_db=args.mmseqs_database,
                 templates=args.templates,
                 num_templates=args.num_templates,
                 chai_template_output=temp_dir.joinpath("all_chains.m8"),
@@ -149,6 +151,7 @@ def run(args, config, defaults, config_file):
                 database_dir=af3_database,
                 number_of_models=args.number_of_models,
                 num_recycles=args.num_recycles,
+                sif_path=args.sif_path,
             )
 
             if af3_success:
@@ -201,7 +204,7 @@ def run(args, config, defaults, config_file):
             )
 
             if chai_success:
-                co = ChaiOutput(chai_output_dir, input_params, name)
+                co = ChaiOutput(chai_output_dir, input_params, name, args.save_input)
                 outputs.append(co)
             successful_runs.append(chai_success)
 
@@ -234,6 +237,7 @@ def run(args, config, defaults, config_file):
                     for idx in ao.output[seed].keys():
                         model = ao.output[seed][idx]["cif"]
                         model.check_clashes()
+                        score_file = ao.output[seed][idx]["summary"]
                         plddt = model.residue_plddts
                         if len(indicies) > 0:
                             plddt = insert_none_by_minus_one(
@@ -242,7 +246,8 @@ def run(args, config, defaults, config_file):
                                 )
                         index_counter += 1
                         model_data = get_model_data(
-                            model, plot_dict, "AlphaFold3", plddt, args.output_dir
+                            model, plot_dict, "AlphaFold3",
+                            plddt, score_file, args.output_dir
                         )
                         alphafold_models["models"].append(model_data)
 
@@ -253,6 +258,7 @@ def run(args, config, defaults, config_file):
                 for idx in bo.output.keys():
                     model = bo.output[idx]["cif"]
                     model.check_clashes()
+                    score_file = bo.output[idx]["json"]
                     plddt = model.residue_plddts
                     if len(indicies) > 0:
                         plddt = insert_none_by_minus_one(
@@ -261,7 +267,8 @@ def run(args, config, defaults, config_file):
                             )
                     index_counter += 1
                     model_data = get_model_data(
-                        model, plot_dict, "Boltz-1", plddt, args.output_dir
+                        model, plot_dict, "Boltz-1",
+                        plddt, score_file, args.output_dir
                     )
                     boltz_models["models"].append(model_data)
 
@@ -273,6 +280,7 @@ def run(args, config, defaults, config_file):
                     if idx >= 0:
                         model = co.output[idx]["cif"]
                         model.check_clashes()
+                        score_file = co.output[idx]["scores"]
                         plddt = model.residue_plddts
                         if len(indicies) > 0:
                             plddt = insert_none_by_minus_one(
@@ -281,14 +289,37 @@ def run(args, config, defaults, config_file):
                                 )
                         index_counter += 1
                         model_data = get_model_data(
-                            model, plot_dict, "Chai-1", plddt, args.output_dir
+                            model, plot_dict, "Chai-1",
+                            plddt, score_file, args.output_dir
                         )
                         chai_models["models"].append(model_data)
 
         combined_models = (
             alphafold_models["models"] + boltz_models["models"] + chai_models["models"]
         )
 
+        # Make the output directory for the models
+        os.makedirs(args.output_dir.joinpath("output_models"), exist_ok=True)
+        output_models = []
+        for model in combined_models:
+            cif_file = args.output_dir.joinpath(model["model_path"])
+            if model["model_source"] == "AlphaFold3":
+                output_name = "af3_model_" + model["model_id"][-1] + ".cif"
+            elif model["model_source"] == "Boltz-1":
+                output_name = "boltz_model_" + model["model_id"][-1] + ".cif"
+            elif model["model_source"] == "Chai-1":
+                output_name = "chai_model_" + model["model_id"][-1] + ".cif"
+            shutil.copy(
+                cif_file,
+                args.output_dir.joinpath("output_models").joinpath(output_name),
+            )
+            output_models.append(
+                args.output_dir.joinpath("output_models").joinpath(output_name)
+            )
+        # Superpose the models
+        if len(output_models) > 1:
+            superpose_models(output_models)
+
         sequence_data = get_model_sequence_data(cif_models)
         sequence = ""
         for key in sequence_data.keys():

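Review note on the `output_models` copy step in `abcfold.py` above: the output name is built from `model["model_id"][-1]`, which keeps only the last character of the id, and `output_name` is never assigned if `model_source` is not one of the three known values. The sketch below shows one possible way to make the naming more defensive. The helper `output_model_name` is hypothetical, and it assumes that `model_id` ends in a decimal model index, which this diff does not confirm.

```python
import re

# Prefixes copied from the diff above.
MODEL_PREFIXES = {
    "AlphaFold3": "af3_model_",
    "Boltz-1": "boltz_model_",
    "Chai-1": "chai_model_",
}


def output_model_name(model_source: str, model_id: str) -> str:
    """Build an output file name such as 'af3_model_3.cif'.

    Hypothetical helper. Assumes model_id ends with a decimal index.
    """
    if model_source not in MODEL_PREFIXES:
        raise ValueError(f"Unknown model source: {model_source!r}")
    # Capture all trailing digits, so an index of 10 is not cut down to "0".
    match = re.search(r"(\d+)$", model_id)
    if match is None:
        raise ValueError(f"No trailing index in model id: {model_id!r}")
    return f"{MODEL_PREFIXES[model_source]}{match.group(1)}.cif"
```

With fewer than ten models per method the result is the same as the code in the diff. The difference only shows up for two-digit indices or an unexpected `model_source`.
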
diff --git a/abcfold/alphafold3/check_install.py b/abcfold/alphafold3/check_install.py
@@ -1,5 +1,7 @@
 import logging
 import subprocess
+from pathlib import Path
+from typing import Union
 
 from packaging.version import Version
 
@@ -9,7 +11,8 @@
 AF3_VERSION = "3.0.0"
 
 
-def check_af3_install(interactive: bool = True) -> None:
+def check_af3_install(interactive: bool = True,
+                      sif_path: Union[str, Path, None] = None) -> None:
     """
     Check if Alphafold3 is installed by running the help command
 
@@ -21,7 +24,7 @@ def check_af3_install(interactive: bool = True) -> None:
 
     """
     logger.debug("Checking if Alphafold3 is installed")
-    cmd = generate_test_command(interactive)
+    cmd = generate_test_command(interactive, sif_path)
     with subprocess.Popen(
         cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
     ) as p:
@@ -36,7 +39,7 @@ def check_af3_install(interactive: bool = True) -> None:
             raise subprocess.CalledProcessError(p.returncode, cmd, stderr)
     logger.info("Alphafold3 is installed")
 
-    cmd = generate_version_command()
+    cmd = generate_version_command(sif_path)
     with subprocess.Popen(
         cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE
     ) as p:
@@ -49,7 +52,8 @@ def check_af3_install(interactive: bool = True) -> None:
             )
 
 
-def generate_test_command(interactive: bool = True) -> str:
+def generate_test_command(interactive: bool = True,
+                          sif_path: Union[str, Path, None] = None) -> str:
     """
     Generate the Alphafold3 help command
 
@@ -59,20 +63,36 @@ def generate_test_command(interactive: bool = True) -> str:
     Returns:
         str: The Alphafold3 help command
     """
-    return f"""
+    if sif_path:
+        return f"""
+    singularity exec \
+    {sif_path} \
+    python /app/alphafold/run_alphafold.py \
+    --help
+"""
+    else:
+        return f"""
     docker run {'-it' if interactive else ''} \
     alphafold3 \
     python run_alphafold.py \
     --help
 """
 
 
-def generate_version_command() -> str:
+def generate_version_command(sif_path: Union[str, Path, None] = None) -> str:
     """
     Generate the Alphafold3 version command
     """
-
-    return """docker run \
+    if sif_path:
+        return f"""
+    singularity exec \
+    {sif_path} \
+    python -c \
+    'from alphafold3.version import __version__ ; print(__version__)'
+"""
+    else:
+        return """
+    docker run \
     alphafold3 \
     python -c \
     'from alphafold3.version import __version__ ; print(__version__)'