Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.

AbuTahir@lemm.ee · edit-2 6 months ago

Nanook@lemm.ee · 6 months ago

lol is this news? I mean we call it AI, but it’s just an LLM and variants; it doesn’t think.

Melvin_Ferd@lemmy.world · 6 months ago

This is why I say these articles are so similar to how right wing media covers issues about immigrants.

There’s some weird media push to convince the left to hate AI. Think of all the headlines for these issues. There are so many similarities. They’re taking jobs. They are a threat to our way of life. The headlines talk about how they will sexually assault your wife, your children, you. Threats to the environment. There are articles like this where they take something known and twist it to make it sound nefarious to keep the story alive and avoid decay of interest.

Then when they pass laws, we’re all primed to accept them removing whatever it is, in a way that’s advantageous to them and disadvantageous to us.

SoftestSapphic@lemmy.world · 6 months ago

Wow it’s almost like the computer scientists were saying this from the start but were shouted over by marketing teams.

aidan@lemmy.world · 6 months ago

And engineers who stood to make a lot of money

minoscopede@lemmy.world · edit-2 6 months ago

I see a lot of misunderstandings in the comments 🫤

This is a pretty important finding for researchers, and it’s not obvious by any means. This finding is not showing a problem with LLMs’ abilities in general. The issue they discovered is specifically for so-called “reasoning models” that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that’s a flaw that needs to be corrected before models can actually reason.

Knock_Knock_Lemmy_In@lemmy.world · 6 months ago

When given explicit instructions to follow, models failed because they had not seen similar instructions before.

This paper shows that there is no reasoning in LLMs at all, just extended pattern matching.

theherk@lemmy.world · 6 months ago

Yeah these comments have the three hallmarks of Lemmy:

AI is just autocomplete mantras.
Apple is always synonymous with bad and dumb.
Rare pockets of really thoughtful comments.

Thanks for being at least the latter.

Tobberone@lemm.ee · 6 months ago

What statistical method do you base that claim on? The results presented match expectations given that Markov chains are still the basis of inference. What magic juice is added to “reasoning models” that allow them to break free of the inherent boundaries of the statistical methods they are based on?

minoscopede@lemmy.world · edit-2 6 months ago

I’d encourage you to research more about this space and learn more.

As it is, the statement “Markov chains are still the basis of inference” doesn’t make sense, because Markov chains are a separate thing. You might be thinking of Markov decision processes, which are used in training RL agents, but that’s also unrelated because these models are not RL agents, they’re supervised learning agents. And even if they were RL agents, the MDP describes the training environment, not the model itself, so it’s not really used for inference.

I mean this just as an invitation to learn more, and not pushback for raising concerns. Many in the research community would be more than happy to welcome you into it. The world needs more people who are skeptical of AI doing research in this field.

Tobberone@lemm.ee · 6 months ago

Which method, then, is the inference built upon, if not the embeddings? And the question still stands, how does “AI” escape the inherent limits of statistical inference?

AbuTahir@lemm.ee · 6 months ago

Cognitive scientist Douglas Hofstadter (1979) showed reasoning emerges from pattern recognition and analogy-making - abilities that modern AI demonstrably possesses. The question isn’t if AI can reason, but how its reasoning differs from ours.

billwashere@lemmy.world · 6 months ago

When are people going to realize that, in its current state, an LLM is not intelligent? It doesn’t reason. It does not have intuition. It’s a word predictor.

x0x7@lemmy.world · edit-2 6 months ago

Intuition is about the only thing it has. It’s a statistical system. The problem is it doesn’t have logic. We assume that because it’s computer-based it must be more logic-oriented, but it’s the opposite. That’s the problem. We can’t get it to do logic very well because it basically feels out the next token by something like instinct. In particular, it doesn’t mask or disregard irrelevant information very well if two segments are near each other in embedding space, and proximity there doesn’t guarantee relevance. So the model just weighs all of this info, relevant or irrelevant, into a weighted feeling for the next token.

This is the core problem. People can handle fuzzy topics and discrete topics. But we really struggle to create any system that can do both like we can. Either we create programming logic that is purely discrete or we create statistics that are fuzzy.

Of course this issue of masking out information that is close in embedding space but is irrelevant to a logical premise is something many humans suck at too. But high functioning humans don’t and we can’t get these models to copy that ability. Too many people, sadly many on the left in particular, not only will treat association as always relevant but sometimes as equivalence. RE racism is assoc with nazism is assoc patriarchy is historically related to the origins of capitalism ∴ nazism ≡ capitalism. While national socialism was anti-capitalist. Associative thinking removes nuance. And sadly some people think this way. And they 100% can be replaced by LLMs today, because at least the LLM is mimicking what logic looks like better though still built on blind association. It just has more blind associations and finetune weighting for summing them. More than a human does. So it can carry that to mask as logical further than a human who is on the associative thought train can.

NotASharkInAManSuit@lemmy.world · 6 months ago

People think they want AI, but they don’t even know what AI is on a conceptual level.

Buddahriffic@lemmy.world · 6 months ago

They want something like the Star Trek computer or one of Tony Stark’s AIs that were basically deus ex machinas for solving some hard problem behind the scenes. Then it can say “model solved” or they can show a test simulation where the ship doesn’t explode (or sometimes a test where it only has an 85% chance of exploding when it used to be 100%, at which point human intuition comes in and saves the day by suddenly being better than the AI again and threads that 15% needle or maybe abducts the captain to go have lizard babies with).

AIs that are smarter than us but for some reason don’t replace or even really join us (Vision being an exception to the 2nd, and Ultron trying to be an exception to the 1st).

NotASharkInAManSuit@lemmy.world · 6 months ago

They don’t want AI, they want an app.

StereoCode@lemmy.world · 6 months ago

You’d think the M in LLM would give it away.

jj4211@lemmy.world · 6 months ago

And that’s pretty damn useful, but obnoxious to have expectations wildly set incorrectly.

GaMEChld@lemmy.world · 6 months ago

Most humans don’t reason. They just parrot shit too. The design is very human.

El Barto@lemmy.world · 6 months ago

LLMs deal with tokens. Essentially, predicting a series of bytes.

Humans do much, much, much, much, much, much, much more than that.

Zexks@lemmy.world · 6 months ago

No. They don’t. We just call them proteins.

stickly@lemmy.world · 6 months ago

You are either vastly overestimating the Language part of an LLM or simplifying human physiology back to the Greeks’ Four Humours theory.

Zexks@lemmy.world · 6 months ago

No. I’m not. You’re nothing more than a protein based machine on a slow burn. You don’t even have control over your own decisions. This is a proven fact. You’re just an ad hoc justification machine.

stickly@lemmy.world · 6 months ago

How many trillions of neuron firings and chemical reactions are taking place for my machine to produce an output? Where are these taking place and how do these regions interact? What are the rules for storing and reshaping memory in response to stimulus? How many bytes of information would it take to describe and simulate all of these systems together?

The human brain alone has the capacity for about 2.5PB of data. Our sensory systems feed data at a rate of about 10⁹ bits/s. The entire English language, compressed, is about 30MB. I can download and run an LLM with just a few GB. Even the largest context windows are still well under 1GB of data.

Just because two things both find and reproduce patterns does not mean they are equivalent. Saying language and biological organisms both use “bytes” is just about as useful as saying the entire universe is “bytes”; it doesn’t really mean anything.
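
To put the figures above side by side, here is a rough back-of-the-envelope sketch. The brain capacity, sensory rate, and compressed-English sizes are the ones quoted in the comment; the 4 GB model size is only an assumed stand-in for “a few GB”, and decimal units (1 GB = 10⁹ bytes) are assumed throughout.

```python
# Back-of-the-envelope comparison using the figures quoted in the comment.
# Assumption: decimal units, and 4 GB as a stand-in for "a few GB".
MB = 10**6
GB = 10**9
PB = 10**15

brain_capacity_bytes = 2500 * 10**12      # 2.5 PB, as quoted
sensory_rate_bits_per_s = 10**9           # 10^9 bits/s, as quoted
english_compressed_bytes = 30 * MB        # 30 MB, as quoted
llm_size_bytes = 4 * GB                   # hypothetical "few GB" model

# How many such models would fit in the quoted brain capacity?
brain_to_llm_ratio = brain_capacity_bytes // llm_size_bytes

# How long would the quoted sensory stream take to deliver one model's worth of bits?
seconds_of_senses_per_llm = (llm_size_bytes * 8) // sensory_rate_bits_per_s

# How many copies of compressed English fit in the model?
llm_to_english_ratio = llm_size_bytes / english_compressed_bytes

print(f"brain / LLM size:        {brain_to_llm_ratio:,}x")
print(f"sensory seconds per LLM: {seconds_of_senses_per_llm} s")
print(f"LLM / compressed English: {llm_to_english_ratio:.0f}x")
```

Under those assumptions the quoted brain capacity is several hundred thousand times the model size, and the quoted sensory stream delivers a model’s worth of bits in about half a minute. None of this settles the argument; it only shows the scale gap the comment is pointing at.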

El Barto@lemmy.world · 6 months ago

“They”.

What are you?

joel_feila@lemmy.world · 6 months ago

That’s why CEOs love them. When your job is 90% spewing BS, a machine that does that is impressive.

Jhex@lemmy.world · 6 months ago

this is so Apple, claiming to invent or discover something “first” 3 years later than the rest of the market

postmateDumbass@lemmy.world · 6 months ago

Trust Apple. Everyone else who was in the space first is lying.

Mniot@programming.dev · 6 months ago

I don’t think the article summarizes the research paper well. The researchers gave the AI models simple-but-large (which they confusingly called “complex”) puzzles. Like Towers of Hanoi but with 25 discs.

The solution to these puzzles is nothing but patterns. You can write code that will solve the Tower puzzle for any size n and the whole program is less than a screen.

The problem the researchers see is that on these long, pattern-based solutions, the models follow a bad path and then just give up long before they hit their limit on tokens. The researchers don’t have an answer for why this is, but they suspect that the reasoning doesn’t scale.
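
As a rough illustration of the “less than a screen” claim above, here is a minimal sketch of the standard recursive Tower of Hanoi solution. It is not the code or prompt used in the paper; the peg names and the `is_valid_solution` checker are hypothetical helpers added for illustration.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Yield (from_peg, to_peg) moves that transfer n discs from source to target."""
    if n == 0:
        return
    # Move the top n-1 discs out of the way, onto the spare peg.
    yield from hanoi(n - 1, source, spare, target)
    # Move the largest remaining disc directly to the target.
    yield (source, target)
    # Move the n-1 discs from the spare peg onto the target.
    yield from hanoi(n - 1, spare, target, source)


def is_valid_solution(n, moves):
    """Replay the moves and check that no larger disc ever lands on a smaller one."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for src, dst in moves:
        if not pegs[src]:
            return False  # tried to move from an empty peg
        disc = pegs[src].pop()
        if pegs[dst] and pegs[dst][-1] < disc:
            return False  # larger disc placed on a smaller one
        pegs[dst].append(disc)
    # Solved when every disc sits on peg C in order.
    return not pegs["A"] and not pegs["B"] and pegs["C"] == list(range(n, 0, -1))


if __name__ == "__main__":
    moves = list(hanoi(3))
    print(len(moves), is_valid_solution(3, moves))
```

The recursion emits 2ⁿ − 1 moves, so 25 discs means 33,554,431 moves. The pattern itself is tiny; what grows is the length of the output that has to be produced without a single slip.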

FreakinSteve@lemmy.world · 6 months ago

NOOOOOOOOO

SHIIIIIIIIIITT

SHEEERRRLOOOOOOCK

800XL@lemmy.world · 6 months ago

Except for Siri, right? Lol

Threeme2189@lemmy.world · 6 months ago

Apple Intelligence

jj4211@lemmy.world · 6 months ago

Without explicit, well-researched material, the marketing presentation gets to stand largely unopposed.

So this is good even if most experts in the field consider it an obvious result.

RampantParanoia2365@lemmy.world · edit-2 6 months ago

Fucking obviously. Until Data’s positronic brain becomes reality, AI is not actual intelligence.

AI is not A I. I should make that a tshirt.

HeyListenWatchOut@lemmy.world · 6 months ago

It’s an expensive carbon spewing parrot.

Threeme2189@lemmy.world · 6 months ago

It’s a very resource intensive autocomplete

vala@lemmy.world · 6 months ago

No shit

surph_ninja@lemmy.world · 6 months ago

You assume humans do the opposite? We literally institutionalize humans who do not follow set patterns.

LemmyIsReddit2Point0@lemmy.world · 6 months ago

deleted by creator

silasmariner@programming.dev · 6 months ago

Some of them, sometimes. But some are adulated and free and contribute vast swathes to our culture and understanding.

petrol_sniff_king@lemmy.blahaj.zone · 6 months ago

Maybe you failed all your high school classes, but that ain’t got none to do with me.

surph_ninja@lemmy.world · 6 months ago

Funny how triggering it is for some people when anyone acknowledges humans are just evolved primates doing the same pattern matching.

ZILtoid1991@lemmy.world · 6 months ago

Thank you Captain Obvious! Only those who think LLMs are like “little people in the computer” didn’t know this already.

TheFriar@lemm.ee · 6 months ago

Yeah, well there are a ton of people literally falling into psychosis, led by LLMs. So it’s unfortunately not that many people that already knew it.

joel_feila@lemmy.world · 6 months ago

Dude they made ChatGPT a little more boot-licky and now many people are convinced they are literal messiahs. All it took for them was a chat bot and a few hours of talk.

flandish@lemmy.world · 6 months ago

stochastic parrots. all of them. just upgraded “soundex” models.

this should be no surprise, of course!

atlien51@lemm.ee · 6 months ago

Employers who are foaming at the mouth at the thought of replacing their workers with cheap AI:

🫢

monkeyslikebananas2@lemmy.world · 6 months ago

Can’t really replace. At best, this tech will make employees more productive at the cost of the rainforests.

atlien51@lemm.ee · 6 months ago

Yes but asshole employers haven’t realized this yet

LonstedBrowryBased@lemm.ee · 6 months ago

Yah of course they do, they’re computers

finitebanjo@lemmy.world · 6 months ago

That’s not really a valid argument for why, but yes the models which use training data to assemble statistical models are all bullshitting. TBH idk how people can convince themselves otherwise.

Encrypt-Keeper@lemmy.world · 6 months ago

“TBH idk how people can convince themselves otherwise.”

They don’t convince themselves. They’re convinced by the multi-billion-dollar corporations pouring unholy amounts of money into not only the development of AI, but its marketing. Marketing designed to not only convince them that AI is something it’s not, but also that anyone who says otherwise (like you) is just a Luddite who is going to be “left behind”.

turmacar@lemmy.world · edit-2 6 months ago

I think because it’s language.

There’s a famous quote from Charles Babbage about presenting his difference engine (a gear-based calculator): someone asked, “if you put in the wrong figures, will the correct ones be output?”, and Babbage could not understand how someone could so thoroughly misunderstand that the machine is just a machine.

People are people; the main thing that’s changed since the Cuneiform copper customer complaint is our materials science and networking ability. Most people just assume that the things they interact with every day work the way they appear to on the surface.

And nothing other than a person can do math problems or talk back to you. So people assume that means intelligence.

finitebanjo@lemmy.world · 6 months ago

I often feel like I’m surrounded by idiots, but even I can’t begin to imagine what it must have felt like to be Charles Babbage explaining computers to people in 1840.

intensely_human@lemm.ee · 6 months ago

They aren’t bullshitting because the training data is based on reality. Reality bleeds through the training data into the model. The model is a reflection of reality.

finitebanjo@lemmy.world · edit-2 6 months ago

An approximation of a very small, limited subset of reality, with more than a 1 in 20 error rate, that produces massive amounts of tokens in quick succession is a shit representation of reality, one that is in every way inferior to human accounts, to the point of being unusable for the industries in which these models are promoted.

And that Error Rate can only spike when the training data contains errors itself, which will only grow as it samples its own content.

intensely_human@lemm.ee · 6 months ago

Computers are better at logic than brains are. We emulate logic; they do it natively.

It just so happens there’s no logical algorithm for “reasoning” a problem through.

archive.is