BioModal

A primer on AlphaFold

Utkarsh Singh — Thu, 23 Oct 2025 22:59:17 GMT

Note: This post has been written more for a wet-lab scientist who has little prior knowledge about these tools, as opposed to a computational protein engineer.

Introduction

What is AlphaFold

AlphaFold is an AI tool developed by Google DeepMind that predicts a protein's 3D structure from its sequence alone. In other words, you input the amino acid sequence of a protein and get back a computational prediction of how the protein folds (with a confidence level for each part of the protein), which you can then use for downstream analysis such as docking.

(The breakthrough: predicting 3D protein structure from sequence)

Note: all references to AlphaFold refer to AlphaFold 3, the latest version, unless specified otherwise.

How does AlphaFold achieve this?

Here is a rudimentary explanation of what AlphaFold does under the hood to predict protein structures. (This part can be skipped if you only want to understand how to use AlphaFold to the fullest.)

Step 1: Input your amino acid sequence.

Step 2: AlphaFold searches sequence databases (UniProt, BFD, etc.) to find homologous proteins and builds a Multiple Sequence Alignment (MSA). This captures evolutionary constraints—if residues co-vary across species (e.g., both change from positive to negative together), they’re likely interacting in 3D space.

Step 3: Next, AlphaFold searches structure databases such as the PDB for proteins with known 3D structures similar to your target. These serve as structural ‘templates’ while generating the 3D structure for your target sequence.

Step 4: Armed with the MSA and the structural templates, the neural network iteratively predicts 3D coordinates for each atom until it converges on a final structure.
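
To make the co-variation idea in Step 2 concrete, here is a minimal Python sketch that scores how strongly two alignment columns change together, using mutual information. This is only an illustration: AlphaFold learns such patterns with a neural network rather than computing this statistic, and the toy alignment and function names below are made up for the example.

```python
from collections import Counter
from math import log2

def column(msa, i):
    """Return the residues found at alignment position i, one per sequence."""
    return [seq[i] for seq in msa]

def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i, col_j = column(msa, i), column(msa, j)
    count_i, count_j = Counter(col_i), Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi

# Toy alignment (invented): columns 1 and 3 always swap K and E together.
toy_msa = [
    "AKLE",
    "AKLE",
    "AELK",
    "AELK",
    "GKLE",
    "GELK",
]

print("columns 1 and 3:", mutual_information(toy_msa, 1, 3))
print("columns 0 and 1:", mutual_information(toy_msa, 0, 1))
```

Columns 1 and 3 co-vary perfectly and score 1 bit, while columns 0 and 1 vary independently and score roughly 0. A high score is a hint, not proof, that two positions are in contact.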

Q: But what if the exact protein already has a structure in the PDB? Does AlphaFold essentially reinvent the wheel?

A: Yes, hence it’s always a good idea to first check whether the structure exists before spinning up AlphaFold.

Q: But what about something like orphan proteins, for which good homologues don’t exist?

A: AlphaFold struggles to predict them effectively, and this is one of its major limitations. It struggles to accurately predict structures of certain proteins, like:

Highly variable regions (CDR loops in antibodies)
De novo designed proteins

Capabilities

(From the Alphafold 3 paper.)

AlphaFold can model more than single protein structures: alongside single-chain predictions, it can predict the joint structure of complexes including proteins, nucleic acids, small molecules, ions, and modified residues.

Note: Previously, you needed to use AlphaFold-Multimer for protein complexes and interactions between chains (e.g., antibody heavy/light chains, receptor-ligand complexes); however, now you can use AF3 for everything.

Differences between AlphaFold 2 and AlphaFold 3

AlphaFold 2 is built around a novel neural network architecture, the Evoformer, which processes the MSA and pair representations and feeds them to the structure module, which directly outputs 3D coordinates for each atom.

The AlphaFold 2 network architecture

AlphaFold 3 replaces the Evoformer with a simpler module, the Pairformer, for processing the sequence alignment and templates, and uses a diffusion module for final structure generation. This essentially means it starts from random atomic coordinates and iteratively “denoises” them into the final structure. We will explore the implications in a bit.

(AlphaFold 3 architecture)

Limitations

However, like all AI models, there is no free lunch. One limitation is that the model doesn’t always respect chirality (a 4.4% chirality violation rate on the PoseBusters benchmark, to be precise). It also sometimes produces clashing (overlapping) atoms in its structure predictions. These are mostly seen when predicting protein-nucleic acid complexes with more than 100 nucleotides and more than 2,000 residues in total.

The move from AF2’s structure module to a diffusion-based generative block for outputting 3D coordinates also makes the model more prone to hallucination in disordered regions. With AlphaFold 2, these low-confidence regions had a trademark ‘ribbon-like’ appearance; AF3 instead seems to hallucinate plausible-looking structure there at times.

It is also important to note that AlphaFold has been trained on static structures as seen in PDB files, and hence does not generalize well to predicting the dynamic conformational states seen in solution. To tackle this problem, the developers recommend generating a large number of predictions with different seeds and ranking them, particularly for antibody-antigen complexes.
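
A minimal sketch of the "many seeds, then rank" workflow is below. It assumes you have already collected one confidence score per prediction (here under a hypothetical key `ranking_score`, with invented numbers), and it does not call AlphaFold itself.

```python
import random

def make_seeds(n, master_seed=0):
    """Draw n distinct positive integer seeds, reproducibly."""
    rng = random.Random(master_seed)
    return rng.sample(range(1, 2**31), n)

def rank_predictions(predictions, key="ranking_score"):
    """Sort predictions from most to least confident."""
    return sorted(predictions, key=lambda p: p[key], reverse=True)

# Hypothetical results: one entry per seed, scores invented for illustration.
predictions = [
    {"seed": 11, "ranking_score": 0.61},
    {"seed": 42, "ranking_score": 0.83},
    {"seed": 7, "ranking_score": 0.47},
]

seeds = make_seeds(20)
best = rank_predictions(predictions)[0]
print(len(seeds), "seeds generated; best prediction came from seed", best["seed"])
```

Keeping the master seed fixed makes the list of seeds reproducible, so a colleague can rerun the same set of jobs.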

AlphaFold for the motivated user

For most people, plugging in the amino acid sequence(s), downloading the PDB files, and analyzing them in visualization software like PyMOL should suffice. That is pretty easy to do, so I won’t go into it here. This part should serve as a guide for people trying to do more funky things with AlphaFold.

Post-Translational Modifications

(You can find them by clicking the three dots on the right side.)

With AlphaFold 3, you can specify post-translational modifications (PTMs) for specific amino acids. This helps you understand how certain PTMs (like glycosylation, for instance) affect the protein’s 3D conformation and reveals potential clashes or steric hindrance. To decide which sites to modify, you can use a database like PhosphoSitePlus, which covers phosphorylation, acetylation, ubiquitination, etc., or a predictor like NetPhos when experimental data is unavailable.
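
If you prepare inputs programmatically rather than through the web interface, a PTM is typically described by a residue position plus a chemical component code. The sketch below builds such entries for phosphorylation sites and checks that each site really is a serine, threonine, or tyrosine. The codes SEP, TPO, and PTR are the PDB Chemical Component Dictionary entries for the phosphorylated residues; the field names `ptmType` and `ptmPosition` are modeled on AlphaFold 3's JSON input and should be treated as an assumption to check against the documentation of the version you run. The sequence is invented.

```python
import json

# CCD codes for phosphorylated serine, threonine and tyrosine.
PHOSPHO_CCD = {"S": "SEP", "T": "TPO", "Y": "PTR"}

def phospho_modification(sequence, position):
    """Return a modification entry for a 1-based position, or raise if it is not S/T/Y."""
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    residue = sequence[position - 1]
    if residue not in PHOSPHO_CCD:
        raise ValueError(f"residue {residue}{position} is not S, T or Y")
    return {"ptmType": PHOSPHO_CCD[residue], "ptmPosition": position}

sequence = "MKTAYSAKQR"  # made-up example sequence
modifications = [phospho_modification(sequence, pos) for pos in (3, 6)]
print(json.dumps({"sequence": sequence, "modifications": modifications}, indent=2))
```

Validating the residue before submitting a job catches off-by-one errors, which are easy to make when a database numbers residues differently from your construct.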

Template Settings

AlphaFold allows you to control which PDB structures are used as templates. You have three options to choose from:

Use PDB Templates up to a custom date: Allows you to specify exactly which structures AlphaFold can “see”.
Use PDB templates with default cut-off date (29/09/2021): This matches the cut-off of the model’s training data and prevents data leakage.
Turn off templates: Force AlphaFold to predict entirely from MSA evolutionary information.

Ideally, if you’re trying to validate a prediction against a structure you have solved, set a custom date for the PDB templates, lest AlphaFold use your own structure as a template. For truly de novo antibody design, for instance, it may be a good idea to turn off templates completely, as AlphaFold may otherwise pull your structure towards more canonical structures.
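
Conceptually, a template cut-off is just a filter on release dates. The sketch below shows that idea with hypothetical template hits (the IDs and dates are invented); it treats the cut-off as inclusive, which is an assumption rather than documented behavior.

```python
from datetime import date

def allowed_templates(templates, cutoff):
    """Keep only templates released on or before the cut-off date."""
    return [t for t in templates if t["released"] <= cutoff]

# Hypothetical template hits; IDs and dates are invented for illustration.
hits = [
    {"id": "tmpl_A", "released": date(2015, 3, 4)},
    {"id": "tmpl_B", "released": date(2021, 9, 29)},
    {"id": "my_own_structure", "released": date(2024, 6, 1)},
]

cutoff = date(2021, 9, 29)
print([t["id"] for t in allowed_templates(hits, cutoff)])
```

Setting the cut-off before your own deposition date keeps `my_own_structure` out of the template pool, so the prediction is a fair test.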

Metric scores: all you need to know

pLDDT

Alongside the predicted structure, AlphaFold also gives you a score for each part of the predicted structure, which helps you figure out which parts you can trust and which likely require further work.

pLDDT stands for predicted Local Distance Difference Test. It measures the model’s confidence in the ‘local structure’ of the protein on a scale of 0 to 100. A good rule of thumb is to go off the color scheme in AlphaFold: dark blue = very high confidence, light blue = confident, yellow = low confidence, and orange = very low confidence (often disordered). While it’s a useful metric, it often falls short in the parts that really matter, like the CDRs of antibodies or linker regions.
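
If you want the numbers behind the colors, pLDDT is stored in the B-factor column of AlphaFold output files. The sketch below reads it from PDB-format text and assigns each residue to a band using the thresholds of the usual AlphaFold color legend (90, 70, 50). It assumes a PDB-format file rather than mmCIF, and the helper `toy_ca_line` only exists to generate invented example data.

```python
def toy_ca_line(serial, resname, chain, resnum, plddt):
    """Build a PDB-format CA atom line (toy data: coordinates are all zero)."""
    return "ATOM  %5d  CA  %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, resname, chain, resnum, 0.0, 0.0, 0.0, 1.0, plddt)

def plddt_per_residue(pdb_text):
    """Read pLDDT from the B-factor column (columns 61-66) of each CA atom."""
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain, resnum = line[21], int(line[22:26])
            scores[(chain, resnum)] = float(line[60:66])
    return scores

def confidence_band(plddt):
    """Map a pLDDT value to the band used in the AlphaFold color legend."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"

# Invented four-residue example.
toy = [("MET", 1, 93.4), ("LYS", 2, 78.1), ("GLY", 3, 55.0), ("SER", 4, 31.7)]
pdb_text = "\n".join(
    toy_ca_line(i + 1, name, "A", num, score) for i, (name, num, score) in enumerate(toy))

for (chain, resnum), score in plddt_per_residue(pdb_text).items():
    print(chain, resnum, score, confidence_band(score))
```

Listing the residues that fall in the lower two bands is a quick way to see whether the uncertain parts overlap the regions you actually care about.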

PAE

While pLDDT tells you about local confidence for each residue, PAE (Predicted Aligned Error) tells you about the relative positioning between different parts of your protein. The scale is typically 0-30 Ångströms. Ideally, you see:

Strong dark blue blocks along the diagonal (well-structured domains)
Dark blue off-diagonal blocks connecting domains (domains positioned correctly relative to each other)
For multimers: dark blue blocks between different chains (confidence in the interface)
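
The same reading of the plot can be done numerically by averaging blocks of the PAE matrix. The sketch below uses an invented 4x4 matrix for two chains of two residues each; the function names are made up for the example. Because PAE is not symmetric, the inter-chain value averages both off-diagonal blocks.

```python
def mean_block(pae, rows, cols):
    """Average PAE (in Ångströms) over the block defined by rows x cols."""
    values = [pae[i][j] for i in rows for j in cols]
    return sum(values) / len(values)

def interface_pae(pae, chain_a, chain_b):
    """Average the two off-diagonal blocks, since PAE[i][j] != PAE[j][i] in general."""
    return 0.5 * (mean_block(pae, chain_a, chain_b) + mean_block(pae, chain_b, chain_a))

# Invented PAE matrix: residues 0-1 are chain A, residues 2-3 are chain B.
pae = [
    [0.5, 1.0, 4.0, 5.0],
    [1.0, 0.5, 6.0, 5.0],
    [3.0, 4.0, 0.5, 1.0],
    [5.0, 4.0, 1.0, 0.5],
]
chain_a, chain_b = [0, 1], [2, 3]

print("within chain A:", mean_block(pae, chain_a, chain_a))
print("between chains:", interface_pae(pae, chain_a, chain_b))
```

Low values inside a chain with high values between chains is the numerical version of a plot with dark diagonal blocks and pale off-diagonal ones.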

(HIV Protease Homodimer, one of the most confident predictions I’ve ever seen from AlphaFold)

Limitations:

However, it is imperative to note that AlphaFold structures should be taken with a large grain of salt, especially for GPCRs. Here’s an example:

(β2-Adrenergic Receptor + Gαs complex)

Notice how the β2-adrenergic receptor complexed with its G-protein shows dramatically lower confidence scores (ipTM = 0.14) compared to the HIV protease dimer (ipTM = 0.92). As the plot shows, while AlphaFold predicts the G-protein structure reasonably well, the interface between the receptor and the G-protein (off-diagonal regions) is almost completely white, i.e. the model has little confidence in how the two are positioned relative to each other.
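
As a rough way to read ipTM values like the two above, a commonly quoted rule of thumb is that values above 0.8 indicate a confident interface, values below 0.6 suggest a failed prediction, and anything in between is a gray zone. The sketch below encodes that rule; the thresholds are a guideline assumed for illustration, not a hard cut-off.

```python
def interpret_iptm(iptm):
    """Classify an ipTM score using the 0.8 / 0.6 rule of thumb."""
    if not 0.0 <= iptm <= 1.0:
        raise ValueError("ipTM must lie between 0 and 1")
    if iptm > 0.8:
        return "confident"
    if iptm < 0.6:
        return "likely failed"
    return "gray zone"

# The two examples discussed in this post.
examples = [
    ("HIV protease homodimer", 0.92),
    ("β2-adrenergic receptor + Gαs", 0.14),
]
for name, score in examples:
    print(f"{name}: ipTM = {score} -> {interpret_iptm(score)}")
```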

Feel free to add your questions in the comments, and I’ll try my best to answer them or make a follow-up post!

100 Open questions to think about

Utkarsh Singh — Mon, 20 Feb 2023 13:49:02 GMT

Here’s a list of a few questions I came up with while filling up an application for SPARC. Most of them revolve around rationality, AI and human decision-making in general. Do let me know if you find these interesting to think about, have any comments, or know the answers to any of these questions!

Does Effective Altruism’s emphasis on future generations belittle the needs of the current one, and if so, is this morally appropriate?
Why are calls to people's emotions more ‘effective’ than rationality?
How does effective forecasting like Fermicasting provide any plausible benefit? Are people confident enough to make meaningful decisions based on the heuristics of other people?
Does crypto really have any inherent value? And what is something we can do with crypto that we’d be unable to do with regular money?
Artificial intelligence is trained on human data. Why then are we outraged when a word-predicting model outputs something outrageous?
It is not entirely impossible that artificial intelligence might be better at decision-making than humans. If so, would it be better to align AI to human values, or leave it some room for independent decision-making?
There are several opinions that AI would help create, not reduce, net jobs. For unskilled blue-collar workers, what are some of these jobs?
Is it possible to satisfy the human craving for companionship solely through artificial intelligence?
What's the best way to make a positive impact on the world?
What's the end goal of humans? Is it to optimize individual happiness?
What's the best way to form meaningful relationships with people?

What are some of the main issues plaguing AI Alignment?
Some of the current language models are owned by companies, as these are expensive to run. Will open-source models ever become as competitive as them?
How should developing countries optimize for development and progress while ensuring that they’re not accelerating climate change?
How does one manage to build depth in a specific niche of AI, while managing to stay ‘dangerous’ in other fields?
What are the best ways to solve problems of distribution like hunger and poverty?
What's the best solution to the current problem of ChatGPT making up nonexistent information and sources?
To what extent should freedom of speech be allowed?
With students using ChatGPT for almost all essay prompts, what are examples of areas where only humans will be able to make intelligible responses, if any?
What are some of the best ways to learn about opinions in philosophy, and try to answer questions?
With OpenAI constantly patching the different jailbreaks being used to bypass its content policies,
What’s more likely to make people mad: something that’s false or true?
Is there a reproducible process for making pop songs that AI can replicate?
Will AI ever be able to better understand us than we do ourselves?
Is dissociation from emotions better or worse for making decisions, particularly in a field like friendship or family?
Should powerful AI systems behave in the way users want or the way their creators intend?
What's the best way to depolarise society from its current state?
The current language models are being trained on data that don't accurately reflect all strata of society. What are the best ways to overcome this?
What are some of the best ways to form contrarian ideas that are right?
What safeguards does crypto have if, hypothetically, prop shops start trading crypto and intentionally inflate or deflate its value?
What are some of the best ways to get better at forecasting?
Would Universal Basic Income promote innovation or decelerate it?
How do we get better at noticing things overlooked by others?
Would it be possible to build a programming language that generates additional syntax to solve different needs?
What are the best ways to help solve the rising issues of loneliness?
How does one get to regularly interact and learn from successful people?
Is it ever truly possible to overcome our biases, and how can we do this?
What are some of the positive outcomes we can get through gene editing, and how can we make an impact in this field?
Is there a thing like free thought?
Do Animals Dream?
How do various religions differ in the nature and magnitude of their effects?
What influences when people act in accordance with their self-interest and when they don't?
How does mental imagery work? How do we improve its function?
Do people have different levels of self-control or do they just experience temptation differently?
What makes a good life? How do we study this?
We remember dreams almost perfectly right after waking up and then the memory rapidly recedes and disappears completely, unless we write them down. This isn’t how normal memories function. So, why the difference?
What is “personal productivity” and why does it vary from day to day so much (e.g., Weinberger et al. 2018)? And why does it not seem to correlate with environmental variables like weather or sleep quality?
Does listening to music improve or worsen memory?
What is consciousness?
What would happen if we could travel faster than the speed of light?
How much of our behavior is determined by nature versus nurture?
How does language shape the way we think?
What makes some memories more vivid than others?
What does it really mean to be ‘self-aware’?
What laws should be imposed by governments on generative AI, if any?
Is rationality a universal trait, or are there cultural differences in what is considered rational behavior?
What determines how we perceive time? Is it the same for everyone?
How do people make major decisions in their lives? When and why does it come up, and how do they go about making those decisions?
Are we all fundamentally selfish, even when we do things for others' benefit? Or are there truly settings (intrinsic and/or exogenous) where we do things that are good for others but bad for both our short-term future and long-term future selves?
Is there a way to conduct research without bias in funding? How?
Would it be feasible for prop trading shops to be owned by the government to ensure market liquidity?
What would be the best way to go about building a large language model to rival that of GPT-3?
What basically goes on in the brain when we design or think of something ‘new’ or never seen before?
Are people able to concentrate more effectively under total silence?
What are the factors that influence the speed and accuracy of learning a new language?
What is the best way to factor in risk while making uncertain decisions?
How does trusting the ‘gut’ work?
If we ever find a way to significantly extend the human health span or reverse ageing, what could that post-death society look like?
What are the neural mechanisms underlying consciousness, and how can we study and manipulate these mechanisms?
Would it be possible to prevent shrinkage of the brain?
How does neuroplasticity differ between the developing brain and the adult brain?
What is happening in the brain when a human questions?
What is the probability there is microbial-like life (other than from Earth) in our solar system?
Is string theory more closely correct than any other current theory of physics?
What's the best way to determine if someone would be a good friend for you?
How do you ask the right questions?
How do I get people to like me?
How do you tell the difference between a preference and a bias?
What is the probability that I might be sleep deprived if I wake up before my alarm goes off more than 95% of the time?
What do other people subjectively experience when they are thinking? To me it’s like talking to myself (in verbal English sentences) but I'm told that isn't universal.
When is self-denial useful in altering your desires, vs satisfying them so you can devote time to other things?
How does one define wisdom?
What happens to consciousness once you fall asleep?
Can charisma be taught?
Why is it so hard to predict success?
Why are we so fascinated by coincidences?
Is It Wrong to Enjoy Yourself While the World Is Burning?
Is it more important to help society or to help yourself?
How can we stop confusing correlation with causation?
Why Do We Want What We Can’t Have?
Which Matters More, a First or Last Impression?
How do I improve my ability to simulate/guess other people's internal states and future behaviours?
How do I work out what I want and what I should do?
Would the human race be eradicated if there is a worst-possible-scenario nuclear incident? Or would merely a lot of people die?
What could be the potential downsides of building a universal sign language?
How do people ascertain emotions in certain songs?
Do animals ever 'ask questions'?
When you forget a thought, where does this thought go?