Protein Structure Prediction on O2
HMS IT has confirmed, based on user reports, that certain compute nodes cause prediction jobs to fail with an error similar to the following:
2024-07-15 08:39:57.833390: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:628] failed to get PTX kernel "shift_right_logical" from module: CUDA_ERROR_NOT_FOUND: named symbol not found
2024-07-15 08:39:57.833441: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:2153] Execution of replica 0 failed: INTERNAL: Could not find the corresponding function
If these errors occur on jobs that were dispatched to any of the following compute nodes:
compute-g-17-166
compute-g-17-167
compute-g-17-168
compute-g-17-169
compute-g-17-170
compute-g-17-171
users may attempt to exclude these nodes by adding the following line to their submission scripts:
#SBATCH -x compute-g-17-[166-171]
or by passing the same -x option inline on the sbatch command line.
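As a minimal sketch of the inline form, assuming a hypothetical submission script named predict.sh (substitute your own script name). The -x flag is the short form of sbatch's --exclude option. The snippet prints the command for review rather than submitting it:

```shell
# Hypothetical script name; replace with your own submission script.
script="predict.sh"

# Same node range as the #SBATCH line above. Keep it quoted so the shell
# does not try to expand the brackets as a filename pattern.
exclude="compute-g-17-[166-171]"

# Print the full command for review.
print_submit_cmd() {
    printf '%s\n' "sbatch -x $exclude $script"
}
print_submit_cmd

# To actually submit on the cluster, run the command directly:
#   sbatch -x "$exclude" "$script"
```

Options given on the sbatch command line take precedence over #SBATCH directives in the script, so the inline form is convenient for a one-off resubmission without editing the script.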
If you still see the above error but the job did NOT dispatch to one of the above compute nodes, please contact rchelp@hms.harvard.edu.
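To decide which of the two cases applies, you need to know which node the failed job ran on. The following is a minimal sketch, not an official HMS tool: the helper name is_affected_node is hypothetical, and it assumes a single-node job, so the node list is a single hostname.

```shell
# Return 0 if the given node name is one of the affected nodes
# (compute-g-17-166 through compute-g-17-171), 1 otherwise.
is_affected_node() {
    case "$1" in
        compute-g-17-16[6-9]|compute-g-17-17[01]) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with a literal node name:
if is_affected_node "compute-g-17-168"; then
    echo "affected node"
else
    echo "not an affected node"
fi

# On the cluster, the node of a finished job could be looked up with sacct
# (JOBID is a placeholder for your own job ID):
#   node=$(sacct -j "$JOBID" -X --noheader --format=NodeList%30 | tr -d ' ')
#   is_affected_node "$node" && echo "resubmit with -x compute-g-17-[166-171]"
```

If the function reports that the node is not in the affected range and the error persists, that is the case to report to rchelp@hms.harvard.edu.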
Research Computing now supports the use of several cutting-edge protein structure prediction options:
For more details about the individual software, please refer to their respective documentation pages above.
Because it can be difficult to determine which program is the right choice, we offer the following usage/troubleshooting flowchart, generated by collaborators at the Center for Computational Biomedicine (CCB), with whom we held a town hall regarding AlphaFold and ColabFold. More information about the town hall can be found at the CCB website.
For users interested in running large volumes of predictions with LocalColabFold, we strongly recommend becoming familiar with the local MMseqs2 alignment procedure (see Using (Local)ColabFold on O2 | Generating MSAs Using Local MMseqs2). This is crucial to avoid being bottlenecked or blacklisted by the remote MMseqs2 server.
A direct link to the AlphaFold database mentioned at the top of the flowchart below can be found here.
Please contact rchelp@hms.harvard.edu for any questions or clarification.