Tokenization and Training Improvements
Tokenization and Training Improvements [Link to full code] Introduction There have been quite a few papers published recently that I have been wanting to implement, but have not yet had the time...
Implementing ATLAS: Learning to Optimally Memorize the Context at Test Time [Link to the full code] Loss curves for ATLAS and the transformer baseline. Try to guess which is which! Introduction...
MNIST Classifier from scratch in Java [Link to the full code] Background Of all the possible programming languages to use for training a handwritten digit classifier, Java would not be my first...
H-Net Paper Quick (and probably incorrect) Implementation [Link to the full code] [Paper Link] [Github Repo] Architecture Overview The H-Net architecture consists of encoder, main, and decode...
hello, world!