For decades, bits and bytes were the basic elements of computing. Then transformers arrived, and in the context of LLMs tokens became the de facto unit of computation in generative AI. That approach has, without doubt, been very successful.
However, fixed tokenizers are a source of bias in our models. Meta's research on the Byte Latent Transformer introduces a new architecture that removes the need for a tokenizer and operates directly on bytes.
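To make the contrast concrete, here is a minimal sketch (with a hypothetical toy vocabulary, not any real tokenizer) of why a fixed vocabulary bakes in bias while a byte-level view does not: any word the tokenizer has never seen collapses to an `<unk>` placeholder, whereas every string maps losslessly to UTF-8 bytes.

```python
# Toy fixed vocabulary: anything outside it falls back to <unk>,
# which is one way a fixed tokenizer bakes bias into a model.
vocab = {"the": 0, "cat": 1, "sat": 2, "<unk>": 3}

def tokenize(text):
    # Whitespace split plus vocabulary lookup; unseen words collapse to <unk>.
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

def to_bytes(text):
    # Byte-level view: every string encodes losslessly to UTF-8 bytes,
    # so nothing can ever be "out of vocabulary".
    return list(text.encode("utf-8"))

print(tokenize("The cat schemed"))  # 'schemed' collapses to <unk> -> [0, 1, 3]
print(to_bytes("The cat schemed"))  # lossless byte sequence
```

A byte-level model like BLT starts from the second representation, so its input coverage is independent of whichever corpus the vocabulary was fit on.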
I'm confident that these models will not only scale better but also adapt more readily to varying contexts, free of the bias that comes from fixing a token vocabulary in advance.
The shift from tokens back to bytes feels like a return to first principles. Sometimes progress means going back to the fundamentals and rethinking what we took for granted.