<feed xmlns="http://www.w3.org/2005/Atom"> <id>https://bhoener.github.io/</id><title>brogrammer</title><subtitle>Learning Data Science and ML in PyTorch</subtitle> <updated>2026-04-26T13:02:08-07:00</updated> <author> <name>Brodie Hoener</name> <uri>https://bhoener.github.io/</uri> </author><link rel="self" type="application/atom+xml" href="https://bhoener.github.io/feed.xml"/><link rel="alternate" type="text/html" hreflang="en" href="https://bhoener.github.io/"/> <generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator> <rights> © 2026 Brodie Hoener </rights> <icon>/assets/img/favicons/favicon.ico</icon> <logo>/assets/img/favicons/favicon-96x96.png</logo> <entry><title>Tokenization and Training Improvements</title><link href="https://bhoener.github.io/posts/Tokenization-and-Training-Improvements/" rel="alternate" type="text/html" title="Tokenization and Training Improvements" /><published>2026-04-26T12:00:00-07:00</published> <updated>2026-04-26T13:01:51-07:00</updated> <id>https://bhoener.github.io/posts/Tokenization-and-Training-Improvements/</id> <content type="text/html" src="https://bhoener.github.io/posts/Tokenization-and-Training-Improvements/" /> <author> <name>Brodie Hoener</name> </author> <category term="ml" /> <category term="deep_learning" /> <summary>Tokenization and Training Improvements [Link to full code] Introduction There have been quite a few papers published recently that I have been wanting to implement, but have not yet had the time. These are mostly optimizations and enhancements to the transformer (DeepSeek mHC, Engram, Attention Residuals, XSA, etc). 
I have also been interested in tokenization and whether a model with a smal...</summary> </entry> <entry><title>Implementing ATLAS</title><link href="https://bhoener.github.io/posts/Implementing-ATLAS/" rel="alternate" type="text/html" title="Implementing ATLAS" /><published>2026-03-20T19:00:00-07:00</published> <updated>2026-03-24T14:38:31-07:00</updated> <id>https://bhoener.github.io/posts/Implementing-ATLAS/</id> <content type="text/html" src="https://bhoener.github.io/posts/Implementing-ATLAS/" /> <author> <name>Brodie Hoener</name> </author> <category term="ml" /> <category term="deep_learning" /> <category term="paper" /> <summary>Implementing ATLAS: Learning to Optimally Memorize the Context at Test Time [Link to the full code] Loss curves for ATLAS and the transformer baseline. Try to guess which is which! Introduction I saw this paper near the end of 2025 and thought it would be fun to implement. It was not, but I did learn a few new things while doing it. ATLAS is a massive paper content-wise. The authors propo...</summary> </entry> <entry><title>Java MNIST from scratch</title><link href="https://bhoener.github.io/posts/Java-MNIST-from-scratch/" rel="alternate" type="text/html" title="Java MNIST from scratch" /><published>2025-11-01T19:00:00-07:00</published> <updated>2025-11-06T18:13:00-08:00</updated> <id>https://bhoener.github.io/posts/Java-MNIST-from-scratch/</id> <content type="text/html" src="https://bhoener.github.io/posts/Java-MNIST-from-scratch/" /> <author> <name>Brodie Hoener</name> </author> <category term="ml" /> <category term="deep_learning" /> <category term="java" /> <category term="cs" /> <summary>MNIST Classifier from scratch in Java [Link to the full code] Background Of all the possible programming languages to use for training a handwritten digit classifier, Java would not be my first choice. Although, to be fair, it wouldn’t be my first choice for doing anything else, either. CS210 at my college is taught entirely in Java. 
I decided to make this as a first Java project to learn ...</summary> </entry> <entry><title>Implementing H-Net</title><link href="https://bhoener.github.io/posts/Implementing-HNet/" rel="alternate" type="text/html" title="Implementing H-Net" /><published>2025-08-21T13:00:00-07:00</published> <updated>2026-03-24T14:31:49-07:00</updated> <id>https://bhoener.github.io/posts/Implementing-HNet/</id> <content type="text/html" src="https://bhoener.github.io/posts/Implementing-HNet/" /> <author> <name>Brodie Hoener</name> </author> <category term="ml" /> <category term="deep_learning" /> <category term="paper" /> <summary>H-Net Paper Quick (and probably incorrect) Implementation [Link to the full code] [Paper Link] [Github Repo] Architecture Overview The H-Net architecture consists of encoder, main, and decoder networks joined together by chunking and dechunking modules. Encoder The encoder is a regular sequence model (e.g. transformer, mamba) that processes raw bytes (vocab size 256). Instead of a proje...</summary> </entry> <entry><title>Hello World :)</title><link href="https://bhoener.github.io/posts/Hello-World/" rel="alternate" type="text/html" title="Hello World :)" /><published>2024-07-04T21:14:00-07:00</published> <updated>2024-07-04T21:14:00-07:00</updated> <id>https://bhoener.github.io/posts/Hello-World/</id> <content type="text/html" src="https://bhoener.github.io/posts/Hello-World/" /> <author> <name>Brodie Hoener</name> </author> <category term="ml" /> <category term="deep_learning" /> <summary>hello, world!</summary> </entry> </feed>
