Managers

inari@piefed.zip · 1 month ago

Managers

FiniteBanjo@feddit.online · 1 month ago

Pretty sure these AI companies are running at a loss, and due to AI scaling laws you hit the accuracy limit a lot sooner with a smaller model, so it would probably be both worse and more expensive.

I could see how you might think speedrunning bankruptcy is similar to being “ahead of the curve” in this economy, though.

ricecake@sh.itjust.works · 1 month ago

There’s a big difference between training a model, running a model, and running a model at scale.

A small, self-hosted setup will have lower accuracy and fewer queries per second, and it will have a cost, but that cost will be no more than that of playing a video game. You’ll still have something surprisingly accurate and responsive for some tasks, like acting as a wiki interface.
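As a rough sketch of that comparison: the running cost is mostly electricity, so the same GPU under the same load costs the same whether it is rendering a game or generating tokens. The wattage and electricity price below are made-up placeholders, not measurements.

```python
def electricity_cost(watts: float, hours: float, price_per_kwh: float) -> float:
    """Cost of running a device that draws `watts` for `hours`."""
    return watts / 1000 * hours * price_per_kwh

# Hypothetical figures, for illustration only.
GPU_WATTS = 300        # assumed draw of a consumer GPU under load
PRICE_PER_KWH = 0.15   # assumed electricity price per kWh

gaming = electricity_cost(GPU_WATTS, hours=2, price_per_kwh=PRICE_PER_KWH)
inference = electricity_cost(GPU_WATTS, hours=2, price_per_kwh=PRICE_PER_KWH)

print(f"2 h of gaming: {gaming:.2f}")
print(f"2 h of local inference: {inference:.2f}")
```

Swap in your own card's power draw and local rate; the point is only that the two lines come out in the same ballpark.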

Remember that some of these models can run on a standard smartphone, and remember all the hoopla when people found out that Chrome was downloading models onto their devices.

theunknownmuncher@lemmy.world · edit-2 1 month ago

No, that’s not how this works. Inference is cheap and efficient. AI companies are bankrupting themselves with training costs that they need to recoup by selling inference. Open-weight models have already been trained.

Also, going big in terms of model size shows diminishing marginal returns on accuracy, not economies of scale. Smaller models are way more efficient and consistently catch up to the largest models, which is why today’s SOTA 27-billion-parameter model competes with yesterday’s SOTA 500+ billion-parameter model.

GamingChairModel@lemmy.world · 1 month ago

> AI companies are bankrupting themselves with training costs that they need to recoup back by selling inference.

I think they hit a wall in actual returns on performance with pretraining, years ago. Then they started scaling up on post-training/reinforcement learning to continue improvement, but that might be hitting a plateau as well. More recently it looks like they’re relying more heavily on scaling up on inference, which is a significant problem for their long term business models.

If they’re not able to cheaply deliver inference (and charge at a premium), how will they be able to sustain their businesses?

It seems that the most recent, largest models use a lot more tokens to accomplish the same tasks, so even as the per-token cost drops, the actual cost of using the latest models is going up over time (even as performance improves).
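A quick sketch of that arithmetic, with entirely made-up numbers: what you pay per task is tokens used times price per token, so a model that is half the price per token but burns ten times the tokens still costs five times as much per task.

```python
def cost_per_task(tokens_used: int, price_per_million_tokens: float) -> float:
    """What one task costs: tokens consumed times the per-token price."""
    return tokens_used / 1_000_000 * price_per_million_tokens

# Hypothetical models and prices, not real figures for any product.
older = cost_per_task(tokens_used=2_000, price_per_million_tokens=10.0)
newer = cost_per_task(tokens_used=20_000, price_per_million_tokens=5.0)

print(f"older model: {older:.3f} per task")
print(f"newer model: {newer:.3f} per task")
print(f"newer / older: {newer / older:.1f}x")
```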

theunknownmuncher@lemmy.world · 1 month ago

> If they’re not able to cheaply deliver inference (and charge at a premium), how will they be able to sustain their businesses?

I definitely agree that they have a big problem on their hands, and are in deep, deep trouble. They are in a position where they must sell a service that is very cheap in order to pay for upfront costs that were very expensive.

This is also why the release of Deepseek was such a devastating blow to US AI companies. It proved that:

- They don’t really have a moat that would lock users into their service, or secret special knowledge that prevents other companies from training competitive models. They’re in a race to the bottom.
- Deepseek was not only able to train a model of the same caliber, but they were able to do it at a tiny fraction of the cost that US AI companies spent on training US models. Because they spent so much less on training, it means that Deepseek is able to undercut the US companies and offer inference at a much lower price.
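To put the squeeze in rough numbers, here is a back-of-the-envelope sketch. Every figure is a hypothetical placeholder, and "lab A" and "lab B" are stand-ins, not estimates for any real company. It just divides the training bill by the margin earned per million tokens of inference sold.

```python
def tokens_to_break_even(training_cost: float,
                         price_per_million: float,
                         serving_cost_per_million: float) -> float:
    """Tokens that must be sold before inference margin covers training."""
    margin = price_per_million - serving_cost_per_million
    if margin <= 0:
        raise ValueError("no margin: training cost is never recouped")
    return training_cost / margin * 1_000_000

# Hypothetical lab A: expensive training run, higher inference price.
lab_a = tokens_to_break_even(1_000_000_000, price_per_million=10.0,
                             serving_cost_per_million=2.0)
# Hypothetical lab B: cheap training run, undercuts on price.
lab_b = tokens_to_break_even(10_000_000, price_per_million=2.0,
                             serving_cost_per_million=1.0)

print(f"lab A breaks even after {lab_a:.3g} tokens")
print(f"lab B breaks even after {lab_b:.3g} tokens")
```

With these placeholder numbers, lab B charges a fifth of the price and still breaks even on far fewer tokens, because the training bill it has to pay off is so much smaller.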