
Maybe it’s time to open up a hipster bespoke RAM shop. You may only get 1 KB, but it will be the most beautiful hand-woven core memory you’ve ever seen.

To be more specific (for anyone interested), the next word predictors are usually a type of model called an LSTM (at least I think that’s the most common). This model type has been used for a long time for dealing with sequential data. In 2014 there was a famous paper introducing an attention mechanism. This was a rather brilliant, though relatively minor, extension to how LSTMs work. Essentially, at each step an LSTM generates some data (a hidden state) representing the model’s knowledge of the sequence up to that point. The attention mechanism looks back at these intermediate values, determines how relevant each one is to the current point in the sequence, and pulls in the most relevant bits. This vastly improved the memory of the LSTM over longer sequences.
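For anyone who wants to see the "look back and pull in the relevant bits" step concretely, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not the method from the 2014 paper: that paper scored relevance with a small learned network, while this sketch swaps in a plain dot product to keep things short. The names `attend`, `states`, and `query` are made up for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, states):
    """Score each earlier state against `query` (dot product, an assumption
    of this sketch), then return the weights and the weighted sum of states."""
    scores = [sum(q * s for q, s in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    context = [sum(w * state[i] for w, state in zip(weights, states))
               for i in range(dim)]
    return weights, context

# Toy data: three earlier hidden states and the current state as the query.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, context = attend(query, states)
```

The first and third states line up with the query, so they get equal weights that are larger than the second state's, and `context` is the blend the model would carry forward.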
In 2017 there was another famous paper, “Attention Is All You Need”, which said something to the effect of “the attention mechanism is doing all the work, we don’t need the rest of the LSTM, we can replace it by running attention between every pair of positions in the sequence.” It’s actually significantly slower to run as the sequence grows (the number of pairs grows quadratically), but much, much faster to train because it’s not intrinsically sequential. This is the transformer model that’s the basis of all our LLMs.
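A rough sketch of the "attention between every pair of positions" idea, again in plain Python. This is heavily simplified and assumes a lot: a real transformer uses learned query, key, and value projections, multiple heads, and positional information, none of which appear here. The function name `self_attention` and the toy `sequence` are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(xs):
    """Scaled dot-product self-attention with no learned projections:
    every position scores every position, giving an n-by-n weight table."""
    dim = len(xs[0])
    scale = math.sqrt(dim)
    weight_rows, outputs = [], []
    for q in xs:  # each row depends only on the inputs, not on other rows
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in xs]
        w = softmax(scores)
        weight_rows.append(w)
        outputs.append([sum(wj * v[i] for wj, v in zip(w, xs))
                        for i in range(dim)])
    return weight_rows, outputs

sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
weight_rows, outputs = self_attention(sequence)
```

For a sequence of length n the weight table has n × n entries, which is where the extra cost on long inputs comes from. On the other hand, no row of the loop needs the result of another row, so all of them can be computed at once, unlike the step-by-step LSTM.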
Obviously some massive simplifications here, but despite being fairly anti-AI, I do love the engineering behind it. So yeah, pretty literally a fancy text predictor, but it turns out that when you throw all the compute you can muster at a fancy word predictor, it makes the world go crazy.

Language needs to change because it’s completely undecipherable to lay people. I use all sorts of models to great effect in my work. Random forests, LSTMs, SVMs, etc. all have tons of great uses. I am pretty anti-“AI” as lay people understand it (though the technology is super cool on a technical level).
On the plus side, by calling it AI I can convince people to use any sort of machine learning model much more easily than I used to be able to.

Yeah, the price kept me away for a long time. I do now find it worth it but completely get that it wouldn’t be for many. So nice to be the consumer rather than the product, though.

It’s so bad it drove me to finally pay for Kagi, which I must say is far better than DDG or Ecosia.
I tried switching to DDG several times over the last decade or so and always gave up because of issues like this, especially with technical or obscure topics. I recently started paying for Kagi and do not have the same issues at all there. The results are good and nicely presented without ads. Of course, on the flip side, it’s pretty expensive.