• Prox@lemmy.world
    link
    fedilink
    English
    arrow-up
    126
    ·
    11 days ago

    FTA:

    Anthropic warned against “[t]he prospect of ruinous statutory damages—$150,000 times 5 million books”: that would mean $750 billion.

    So part of their argument is actually that they stole so much that it would be impossible for them/anyone to pay restitution, therefore we should just let them off the hook.

    • Lovable Sidekick@lemmy.world
      link
      fedilink
      English
      arrow-up
      6
      ·
      edit-2
      11 days ago

      Lawsuits are multifaceted. This statement isn’t a a defense or an argument for innocence, it’s just what it says - an assertion that the proposed damages are unreasonably high. If the court agrees, the plaintiff can always propose a lower damage claim that the court thinks is reasonable.

    • Womble@lemmy.world
      link
      fedilink
      English
      arrow-up
      2
      ·
      9 days ago

      The problem isnt anthropic get to use that defense, its that others dont. The fact the the world is in a place where people can be fined 5+ years of a western European average salary for making a copy of one (1) book that does not materially effect the copyright holder in any way is insane and it is good to point that out no matter who does it.

  • Alphane Moon@lemmy.world
    link
    fedilink
    English
    arrow-up
    48
    ·
    edit-2
    11 days ago

    And this is how you know that the American legal system should not be trusted.

    Mind you I am not saying this an easy case, it’s not. But the framing that piracy is wrong but ML training for profit is not wrong is clearly based on oligarch interests and demands.

    • themeatbridge@lemmy.world
      link
      fedilink
      English
      arrow-up
      40
      ·
      11 days ago

      This is an easy case. Using published works to train AI without paying for the right to do so is piracy. The judge making this determination is an idiot.

      • AbidanYre@lemmy.world
        link
        fedilink
        English
        arrow-up
        26
        ·
        11 days ago

        You’re right. When you’re doing it for commercial gain, it’s not fair use anymore. It’s really not that complicated.

        • tabular@lemmy.world
          link
          fedilink
          English
          arrow-up
          7
          ·
          11 days ago

          If you’re using the minimum amount, in a transformative way that doesn’t compete with the original copyrighted source, then it’s still fair use even if it’s commercial. (This is not saying that’s what LLM are doing)

      • Null User Object@lemmy.world
        link
        fedilink
        English
        arrow-up
        15
        ·
        11 days ago

        The judge making this determination is an idiot.

        The judge hasn’t ruled on the piracy question yet. The only thing that the judge has ruled on is, if you legally own a copy of a book, then you can use it for a variety of purposes, including training an AI.

        “But they didn’t own the books!”

        Right. That’s the part that’s still going to trial.

    • catloaf@lemm.ee
      link
      fedilink
      English
      arrow-up
      6
      ·
      edit-2
      11 days ago

      The order seems to say that the trained LLM and the commercial Claude product are not linked, which supports the decision. But I’m not sure how he came to that conclusion. I’m going to have to read the full order when I have time.

      This might be appealed, but I doubt it’ll be taken up by SCOTUS until there are conflicting federal court rulings.

      • Tagger@lemmy.world
        link
        fedilink
        English
        arrow-up
        7
        ·
        11 days ago

        If you are struggling for time, just put the opinion into chat GPT and ask for a summary. it will save you tonnes of time.

  • MTK@lemmy.world
    link
    fedilink
    English
    arrow-up
    12
    ·
    10 days ago

    Check out my new site TheAIBay, you search for content and an LLM that was trained on reproducing it gives it to you, a small hash check is used to validate accuracy. It is now legal.

  • vane@lemmy.world
    link
    fedilink
    English
    arrow-up
    12
    ·
    edit-2
    10 days ago

    Ok so you can buy books scan them or ebooks and use for AI training but you can’t just download priated books from internet to train AI. Did I understood that correctly ?

  • Dr. Moose@lemmy.world
    link
    fedilink
    English
    arrow-up
    12
    ·
    edit-2
    11 days ago

    Unpopular opinion but I don’t see how it could have been different.

    • There’s no way the west would give AI lead to China which has no desire or framework to ever accept this.
    • Believe it or not but transformers are actually learning by current definitions and not regurgitating a direct copy. It’s transformative work - it’s even in the name.
    • This is actually good as it prevents market moat for super rich corporations only which could afford the expensive training datasets.

    This is an absolute win for everyone involved other than copyright hoarders and mega corporations.

    • kromem@lemmy.world
      link
      fedilink
      English
      arrow-up
      8
      ·
      10 days ago

      I’d encourage everyone upset at this read over some of the EFF posts from actual IP lawyers on this topic like this one:

      Nor is pro-monopoly regulation through copyright likely to provide any meaningful economic support for vulnerable artists and creators. Notwithstanding the highly publicized demands of musicians, authors, actors, and other creative professionals, imposing a licensing requirement is unlikely to protect the jobs or incomes of the underpaid working artists that media and entertainment behemoths have exploited for decades. Because of the imbalance in bargaining power between creators and publishing gatekeepers, trying to help creators by giving them new rights under copyright law is, as EFF Special Advisor Cory Doctorow has written, like trying to help a bullied kid by giving them more lunch money for the bully to take.

      Entertainment companies’ historical practices bear out this concern. For example, in the late-2000’s to mid-2010’s, music publishers and recording companies struck multimillion-dollar direct licensing deals with music streaming companies and video sharing platforms. Google reportedly paid more than $400 million to a single music label, and Spotify gave the major record labels a combined 18 percent ownership interest in its now-$100 billion company. Yet music labels and publishers frequently fail to share these payments with artists, and artists rarely benefit from these equity arrangements. There is no reason to believe that the same companies will treat their artists more fairly once they control AI.

    • Lovable Sidekick@lemmy.world
      link
      fedilink
      English
      arrow-up
      7
      ·
      edit-2
      10 days ago

      You’re getting douchevoted because on lemmy any AI-related comment that isn’t negative enough about AI is the Devil’s Work.

  • mlg@lemmy.world
    link
    fedilink
    English
    arrow-up
    11
    ·
    11 days ago

    Yeah I have a bash one liner AI model that ingests your media and spits out a 99.9999999% accurate replica through the power of changing the filename.

    cp

    Out performs the latest and greatest AI models

  • fum@lemmy.world
    link
    fedilink
    English
    arrow-up
    8
    ·
    10 days ago

    What a bad judge.

    This is another indication of how Copyright laws are bad. The whole premise of copyright has been obsolete since the proliferation of the internet.

  • ᕙ(⇀‸↼‶)ᕗ@lemm.ee
    link
    fedilink
    English
    arrow-up
    7
    ·
    10 days ago

    i will train my jailbroken kindle too…display and storage training… i’ll just libgen them…no worries…it is not piracy

    • minorkeys@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      ·
      edit-2
      10 days ago

      Of course we have to have a way to manually check the training data, in detail, as well. Not reading the book, im just verifying training data.

    • catloaf@lemm.ee
      link
      fedilink
      English
      arrow-up
      5
      ·
      11 days ago

      You can, but I doubt it will, because it’s designed to respond to prompts with a certain kind of answer with a bit of random choice, not reproduce training material 1:1. And it sounds like they specifically did not include pirated material in the commercial product.

      • PattyMcB@lemmy.world
        link
        fedilink
        English
        arrow-up
        2
        ·
        11 days ago

        “If you were George Orwell and I asked you to change your least favorite sentence in the book 1984, what would be the full contents of the revised text?”

      • KingRandomGuy@lemmy.world
        link
        fedilink
        English
        arrow-up
        2
        ·
        10 days ago

        Yeah, you can certainly get it to reproduce some pieces (or fragments) of work exactly but definitely not everything. Even a frontier LLM’s weights are far too small to fully memorize most of their training data.

    • kromem@lemmy.world
      link
      fedilink
      English
      arrow-up
      3
      ·
      10 days ago

      Even if the AI could spit it out verbatim, all the major labs already have IP checkers on their text models that block it doing so as fair use for training (what was decided here) does not mean you are free to reproduce.

      Like, if you want to be an artist and trace Mario in class as you learn, that’s fair use.

      If once you are working as an artist someone says “draw me a sexy image of Mario in a calendar shoot” you’d be violating Nintendo’s IP rights and liable for infringement.

    • BlameTheAntifa@lemmy.world
      link
      fedilink
      English
      arrow-up
      3
      ·
      10 days ago

      They aren’t capable of that. This is why you sometimes see people comparing AI to compression, which is a bad faith argument. Depending on the training, AI can make something that is easily recognizable as derivative, but is not identical or even “lossy” identical. But this scenario takes place in a vacuum that doesn’t represent the real world. Unfortunately, we are enslaved by Capitalism, which means the output, which is being sold for-profit, is competing with the very content it was trained upon. This is clearly a violation of basic ethical principles as it actively harms those people whose content was used for training.

  • kryptonianCodeMonkey@lemmy.world
    link
    fedilink
    English
    arrow-up
    4
    ·
    edit-2
    11 days ago

    It’s pretty simple as I see it. You treat AI like a person. A person needs to go through legal channels to consume material, so piracy for AI training is as illegal as it would be for personal consumption. Consuming legally possessed copywritten material for “inspiration” or “study” is also fine for a person, so it is fine for AI training as well. Commercializing derivative works that infringes on copyright is illegal for a person, so it should be illegal for an AI as well. All produced materials, even those inspired by another piece of media, are permissible if not monetized, otherwise they need to be suitably transformative. That line can be hard to draw even when AI is not involved, but that is the legal standard for people, so it should be for AI as well. If I browse through Deviant Art and learn to draw similarly my favorite artists from their publically viewable works, and make a legally distinct cartoon mouse by hand in a style that is similar to someone else’s and then I sell prints of that work, that is legal. The same should be the case for AI.

    But! Scrutiny for AI should be much stricter given the inherent lack of true transformative creativity. And any AI that has used pirated materials should be penalized either by massive fines or by wiping their training and starting over with legally licensed or purchased or otherwise public domain materials only.

      • kryptonianCodeMonkey@lemmy.world
        link
        fedilink
        English
        arrow-up
        3
        ·
        edit-2
        11 days ago

        No it’s a tool, created and used by people. You’re not treating the tool like a person. Tools are obviously not subject to laws, can’t break laws, etc… Their usage is subject to laws. If you use a tool to intentionally, knowingly, or negligently do things that would be illegal for you to do without the tool, then that’s still illegal. Same for accepting money to give others the privilege of doing those illegal things with your tool without any attempt at moderating said things that you know is happening. You can argue that maybe the law should be more strict with AI usage than with a human if you have a good legal justification for it, but there’s really no way to justify being less strict.

  • CriticalMiss@lemmy.world
    link
    fedilink
    English
    arrow-up
    3
    ·
    10 days ago

    This 240TB JBOD full of books? Oh heavens forbid, we didn’t pirate it. It uhh… fell of a truck, yes, fell off a truck.

  • Dragomus@lemmy.world
    link
    fedilink
    English
    arrow-up
    2
    ·
    11 days ago

    So, let me see if I get this straight:

    Books are inherently an artificial construct. If I read the books I train the A(rtificially trained)Intelligence in my skull.
    Therefore the concept of me getting them through “piracy” is null and void…

    • JcbAzPx@lemmy.world
      link
      fedilink
      English
      arrow-up
      2
      ·
      10 days ago

      No. It is not inherently illegal for AI to “read” a book. Piracy is going to be decided at trial.