<feed xmlns="http://www.w3.org/2005/Atom"> <id>https://bhoener.github.io/</id><title>brogrammer</title><subtitle>Learning Data Science and ML in PyTorch</subtitle> <updated>2026-04-26T13:02:08-07:00</updated> <author> <name>Brodie Hoener</name> <uri>https://bhoener.github.io/</uri> </author><link rel="self" type="application/atom+xml" href="https://bhoener.github.io/feed.xml"/><link rel="alternate" type="text/html" hreflang="en" href="https://bhoener.github.io/"/> <generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator> <rights> © 2026 Brodie Hoener </rights> <icon>/assets/img/favicons/favicon.ico</icon> <logo>/assets/img/favicons/favicon-96x96.png</logo> <entry><title>Tokenization and Training Improvements</title><link href="https://bhoener.github.io/posts/Tokenization-and-Training-Improvements/" rel="alternate" type="text/html" title="Tokenization and Training Improvements" /><published>2026-04-26T12:00:00-07:00</published> <updated>2026-04-26T13:01:51-07:00</updated> <id>https://bhoener.github.io/posts/Tokenization-and-Training-Improvements/</id> <content type="text/html" src="https://bhoener.github.io/posts/Tokenization-and-Training-Improvements/" /> <author> <name>Brodie Hoener</name> </author> <category term="ml" /> <category term="deep_learning" /> <summary>Tokenization and Training Improvements [Link to full code] Introduction There have been quite a few papers published recently that I have been wanting to implement, but have not yet had the time. These are mostly optimizations and enhancements to the transformer (DeepSeek mHC, Engram, Attention Residuals, XSA, etc). 
I have also been interested in tokenization and whether a model with a smal...</summary> </entry> <entry><title>Implementing ATLAS</title><link href="https://bhoener.github.io/posts/Implementing-ATLAS/" rel="alternate" type="text/html" title="Implementing ATLAS" /><published>2026-03-20T19:00:00-07:00</published> <updated>2026-03-24T14:38:31-07:00</updated> <id>https://bhoener.github.io/posts/Implementing-ATLAS/</id> <content type="text/html" src="https://bhoener.github.io/posts/Implementing-ATLAS/" /> <author> <name>Brodie Hoener</name> </author> <category term="ml" /> <category term="deep_learning" /> <category term="paper" /> <summary>Implementing ATLAS: Learning to Optimally Memorize the Context at Test Time [Link to the full code] Loss curves for ATLAS and the transformer baseline. Try to guess which is which! Introduction I saw this paper near the end of 2025 and thought it would be fun to implement. It was not, but I did learn a few new things while doing it. ATLAS is a massive paper content-wise. The authors propo...</summary> </entry> <entry><title>Java MNIST from scratch</title><link href="https://bhoener.github.io/posts/Java-MNIST-from-scratch/" rel="alternate" type="text/html" title="Java MNIST from scratch" /><published>2025-11-01T19:00:00-07:00</published> <updated>2025-11-06T18:13:00-08:00</updated> <id>https://bhoener.github.io/posts/Java-MNIST-from-scratch/</id> <content type="text/html" src="https://bhoener.github.io/posts/Java-MNIST-from-scratch/" /> <author> <name>Brodie Hoener</name> </author> <category term="ml" /> <category term="deep_learning" /> <category term="java" /> <category term="cs" /> <summary>MNIST Classifier from scratch in Java [Link to the full code] Background Of all the possible programming languages to use for training a handwritten digit classifier, Java would not be my first choice. Although, to be fair, it wouldn’t be my first choice for doing anything else, either. CS210 at my college is taught entirely in Java. 
I decided to make this as a first Java project to learn ...</summary> </entry> <entry><title>Implementing H-Net</title><link href="https://bhoener.github.io/posts/Implementing-HNet/" rel="alternate" type="text/html" title="Implementing H-Net" /><published>2025-08-21T13:00:00-07:00</published> <updated>2026-03-24T14:31:49-07:00</updated> <id>https://bhoener.github.io/posts/Implementing-HNet/</id> <content type="text/html" src="https://bhoener.github.io/posts/Implementing-HNet/" /> <author> <name>Brodie Hoener</name> </author> <category term="ml" /> <category term="deep_learning" /> <category term="paper" /> <summary>H-Net Paper Quick (and probably incorrect) Implementation [Link to the full code] [Paper Link] [Github Repo] Architecture Overview The H-Net architecture consists of encoder, main, and decoder networks joined together by chunking and dechunking modules. Encoder The encoder is a regular sequence model (e.g. transformer, mamba) that processes raw bytes (vocab size 256). Instead of a proje...</summary> </entry> <entry><title>Hello World :)</title><link href="https://bhoener.github.io/posts/Hello-World/" rel="alternate" type="text/html" title="Hello World :)" /><published>2024-07-04T21:14:00-07:00</published> <updated>2024-07-04T21:14:00-07:00</updated> <id>https://bhoener.github.io/posts/Hello-World/</id> <content type="text/html" src="https://bhoener.github.io/posts/Hello-World/" /> <author> <name>Brodie Hoener</name> </author> <category term="ml" /> <category term="deep_learning" /> <summary>hello, world!</summary> </entry> </feed>
