With the closure of the Hugging Face LLM leaderboard, and without access to powerful GPUs, I stopped running experiments. But with the flood of new open-source models (Qwen, MiniMax, GLM, and more), and finally having just enough compute at home, I have started working on the current batch of LLMs. The heatmaps keep coming back with the same general story, but every architecture has its own neuroanatomy: the brains are different, the principle is the same. Some models are looking really interesting (Qwen3.5 27B in particular). I will release the code, upload new RYS models, and publish a blog post once my Hopper system finishes grinding on MiniMax M2.5.