ok this gets hairy. we don't have lots of structural data for mutations, so the backbone similarity used in this paper is maybe not the best way to extract information from the model.

however, there are lots of papers that correlate model outputs with protein stability data

Replies

  1. i wouldn't be surprised if free energy estimates are much less accurate for random sequences and de novo designed proteins than for natural proteins, since pLMs learn coevolutionary statistics, as discussed here. curious what you think!

    www.pnas.org/doi/10.1073/...

    Protein language models learn evolutionary statistics of interacting sequence motifs | PNAS

    Protein language models (pLMs) have emerged as potent tools for predicting and designing protein structure and function, and the degree to which th...