We really have not made much progress on explaining the core mystery of LLMs:
How does a model that uses matrix multiplication to predict the next word manage to simulate human thought well enough to do all the very human-like things it does? And what does that mean about us and our own thinking?
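To make the literal claim concrete, here is a toy sketch of what "matrix multiplication to predict the next word" means at a model's output layer. The dimensions, weights, and names are made up for illustration; real models add many stacked layers before this step, but the final prediction really is a matrix product followed by a softmax over the vocabulary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: a real model has thousands of hidden dimensions and ~100k words.
d_model, vocab_size = 8, 5

# The model's internal state for the current position (illustrative, random here).
hidden = rng.standard_normal(d_model)

# A learned output projection that maps the hidden state to one score per word.
W_unembed = rng.standard_normal((d_model, vocab_size))

# The matrix multiplication in question: scores (logits) for every word.
logits = hidden @ W_unembed

# Softmax turns the scores into a probability distribution over the next word.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# "Predict the next word": pick (or sample) from that distribution.
next_word = int(probs.argmax())
print(probs, next_word)
```

That is the entire mechanism the question is pointing at: arithmetic in, a probability distribution over words out.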