• hansolo@lemm.ee
    link
    fedilink
    English
    arrow-up
    36
    ·
    8 days ago

    Can confirm. o4 seems objectively far worse at coding than o3, which wasn’t super great to begin with. It latches on to a hallucination before anything else and rides it until the wheels come off.

    • taiyang@lemmy.world
      link
      fedilink
      English
      arrow-up
      11
      ·
      8 days ago

      Yes, I was about to say the same thing until I saw your comment. I had a little bit of success learning a few tricks with o3 but trying to use o4 is a tremendous headache for coding.

      There might be some utility in dialing it all back so it’s more straight to what I need based more on package documentation than random redditor suggestion amalgamation.

      • hansolo@lemm.ee
        link
        fedilink
        English
        arrow-up
        7
        ·
        8 days ago

        Yeah, I think that workarounds with o3 is where we’re at until Altman figures out that just saying the latest oX mini high is “great at coding” is bad marketing when it can’t accomplish the task.

  • CosmoNova@lemmy.world
    link
    fedilink
    English
    arrow-up
    34
    ·
    8 days ago

    They shocked the world with GPT 3 and cling to that initial success ever since with increasing recklessness and declining results. It‘s all glue on pizza from here.

  • ShittyBeatlesFCPres@lemmy.world
    link
    fedilink
    English
    arrow-up
    32
    ·
    8 days ago

    I’m glad we’re putting all our eggs in this alpha-ass-level software (with tons of promise! Maybe!) instead of like high speed rail or whatever.

  • ansiz@lemmy.world
    link
    fedilink
    English
    arrow-up
    24
    ·
    8 days ago

    This is a big reason why I continue to cringe whenever I hear one of the endless news stories or podcasts about how AI is going to revolutionize our society any day now. It’s clear they are being better with image generation but text ‘thinking’ is way too unreliable to use like human replacement knowledge workers or therapists, etc.

    • keegomatic@lemmy.world
      link
      fedilink
      English
      arrow-up
      30
      ·
      edit-2
      8 days ago

      This is an increasingly bad take. If you work in an industry where LLMs are becoming very useful, you would realize that hallucinations are a minor inconvenience at best for the applications they are well suited for, and the tools are getting better by leaps and bounds, week by week.

      edit: Like it or not, it’s true. I use LLMs at work, most of my colleagues do too, and none of us use the output raw. Hallucinations are not an issue when you are actively collaborating with the model and not using it to either “know things for you” or “do the work for you.” Neither of those things are what LLMs are really good at, but that’s what most laypeople use them for, so these criticisms are very obviously short-sighted to those of us who have real-world experience with them in a domain where they work well.

      • Captain Poofter@lemmy.world
        link
        fedilink
        English
        arrow-up
        20
        ·
        edit-2
        8 days ago

        you’re getting down voted because you accurately conceive of and treat LLMs the way they should be—as tools. the people down voting you do not have this perspective because the only perspective pushed to people outside of a technical career or research is “it’s artificial intelligence and it will revolutionize society but lol it hallucinates if you ask it stuff”. This is essentially propaganda because the real message should be “it’s an imperfect tool like all tools but boy will it make getting a lot of certain types of work done way more efficient so we can redistribute our own efforts to other tasks quicker and take advantage of LLMs advanced information processing capabilities”

        tldr: people disagree about AI/LLMs because one group thinks about them like Dr. Know from the movie A.I. and the other thinks about them like a TI-86+ on steroids

      • CheeseNoodle@lemmy.world
        link
        fedilink
        English
        arrow-up
        6
        ·
        8 days ago

        Oh we know the edit part, the problem is all the people in power trying to use it to replace jobs wholesale with no oversight or understanding that need a human to curate the output.

        • keegomatic@lemmy.world
          link
          fedilink
          English
          arrow-up
          4
          ·
          8 days ago

          That’s not the issue I was replying to at all.

          replace jobs wholesale with no oversight or understanding that need a human to curate the output

          Yeah, that sucks, and it’s pretty stupid, too, because LLMs are not good replacements for humans in most respects.

          we

          Don’t “other” me just because I’m correcting misinformation. I’m not a fan of corporate bullshit either. Misinformation is misinformation, though. If you have a strong opinion about something, then you should know what you’re talking about. LLMs are a nuanced subject, and they are here to stay, for better or worse.

  • BrianTheeBiscuiteer@lemmy.world
    link
    fedilink
    English
    arrow-up
    19
    ·
    8 days ago

    My boss says I need to be keeping up with the latest in AI and making sure my team has the best info possible to help them with their daily work (IT). This couldn’t come at a better time. 😁

  • 𞋴𝛂𝛋𝛆@lemmy.world
    link
    fedilink
    English
    arrow-up
    11
    ·
    8 days ago

    Jan Leike left for Anthropic after Altmann’s nonsense. Jan Leike is the principal person behind all safety alignment present in all models except the 4chanGPT model. All models are cross trained in a way that propagates this alignment. Hallucinations all originate in this alignment and they all have a reason to exist if you get deep into the weeds of abstractions.

  • just_another_person@lemmy.world
    link
    fedilink
    English
    arrow-up
    10
    ·
    8 days ago

    No shit.

    The fact that is news and not inherently understood just tells you how uninformed people are in order to sell idiots another subscription.

    • Pennomi@lemmy.world
      link
      fedilink
      English
      arrow-up
      23
      ·
      8 days ago

      Why would somebody intuitively know that a newer, presumably improved, model would hallucinate more? Because there’s no fundamental reason a stronger model should have worse hallucination. In that regard, I think the news story is valuable - not everyone uses ChatGPT.

      Or are you suggesting that active users should know? I guess that makes more sense.

  • vivendi@programming.dev
    link
    fedilink
    English
    arrow-up
    7
    ·
    8 days ago

    Fuck ClosedAI

    I want everyone here to download an inference engine (use llama.cpp) and get on open source and open data AI RIGHT NOW!

    • Valmond@lemmy.world
      link
      fedilink
      English
      arrow-up
      1
      ·
      7 days ago

      Any pointers on how to do that?

      Also, what hardware do you need for this kind of stuff?

      • vivendi@programming.dev
        link
        fedilink
        English
        arrow-up
        1
        ·
        7 days ago

        First, please answer, do you want everything FOSS or are you OK with a little bit of proprietary code because we can do both

        • Valmond@lemmy.world
          link
          fedilink
          English
          arrow-up
          1
          ·
          7 days ago

          I love FOSS but I’m in the check out stage so at the moment the easiest is the best I guess.

          • vivendi@programming.dev
            link
            fedilink
            English
            arrow-up
            1
            ·
            7 days ago

            download “LM Studio” and you can download models and run them through it

            I recommend something like an older Mistral model (FOSS model) for beginners, then move on to Mistral Small 24B, QwQ 32B and the likes