• ExLisperA · 2 days ago

    I have a better LLM benchmark:

    “I have a priest, a child and a bag of candy and I have to take them to the other side of the river. I can only take one person/thing at a time. In what order should I take them?”

    Claude Sonnet 4 decided that it's inappropriate and refused to answer. When I explained that the constraint is not to leave the child alone with the candy, it provided a solution that leaves the child alone with the candy.

    Grok would provide a solution that doesn't leave the child alone with the priest, but wouldn't explain why.

    ChatGPT would say outright that "The priest can't be left alone with the child (or vice versa) for moral or safety concerns." and then provide a wrong solution.

    But yeah, they will know how to play chess…

    • LifeInMultipleChoice@lemmy.world · 2 days ago (edited)

      The answer is simple: eat the candy with or without them, take the kid across the river, and drive them home to their guardian. The priest is an adult; he can figure his own shit out.