Screenshot of this question was making the rounds last week. But this article covers testing against all the well-known models out there.

Also includes outtakes on the ‘reasoning’ models.

  • imetators@lemmy.dbzer0.com
    link
    fedilink
    English
    arrow-up
    20
    ·
    3 days ago

    Went to test to google AI first and it says “You cant wash your car at a carwash if it is parked at home, dummy”

    Chatgpt and Deepseek says it is dumb to drive cause it is fuel inefficient.

    I am honestly surprised that google AI got it right.

    • locahosr443@lemmy.world
      link
      fedilink
      English
      arrow-up
      3
      ·
      3 days ago

      I’ve been feeding a bunch of documents I wrote into gemini last week to spit out some scripts for validation I couldn’t be arsed to write. It’s done a surprisingly comprehensive job and when wrong has been nudged right with just a little abuse…

      I’m still all fuck this shit and can’t wait for the pop, but for comparison openai was utterly brain dead given the same task. I think I actually made the model worse it was so useless.