Jase Gehring @skyjase.bsky.social

5 days ago

ok this gets hairy. we don't have lots of structural data for mutations, so the backbone similarity used in this paper is maybe not the best way to extract information from the model.

however there are lots of papers that look at correlations of model outputs against protein stability data

Jase Gehring @skyjase.bsky.social

5 days ago

if instead the authors used a method like one of the below and drew the same conclusions, would that be more satisfying? for me, not really...

www.biorxiv.org/content/10.1... academic.oup.com/bioinformati... www.nature.com/articles/s41...

https://www.biorxiv.org/content/10.1101/2024.05.20.595026v1.full.pdf

Jase Gehring @skyjase.bsky.social
5 days ago

i wouldn't be surprised if free energy estimations are much less accurate for random sequences and de novo designed proteins than for natural proteins, as pLMs are learning coevolutionary statistics as discussed here. curious what you think!

www.pnas.org/doi/10.1073/...

Protein language models learn evolutionary statistics of interacting sequence motifs | PNAS

Protein language models (pLMs) have emerged as potent tools for predicting and designing protein structure and function, and the degree to which th...
