…cogito, ergo sum…

  • 0 Posts
  • 6 Comments
Joined 2 months ago
Cake day: December 3rd, 2025



  • Artwork@lemmy.world to Comic Strips@lemmy.world · “evil ai company” · 15 points · edited 4 days ago

    Thank you very much! The red dot is likely smaller…
    Though I neither appreciate nor agree with the bomb part! ^^
    The work reminded me of the following paper:

    Many unresolved legal questions over LLMs and copyright center on memorization: whether specific training data have been encoded in the model’s weights during training, and whether those memorized data can be extracted in the model’s outputs.

    While many believe that LLMs do not memorize much of their training data, recent work shows that substantial amounts of copyrighted text can be extracted from open-weight models…

    We investigate this question using a two-phase procedure: (1) an initial probe to test for extraction feasibility, which sometimes uses a Best-of-N (BoN) jailbreak, followed by (2) iterative continuation prompts to attempt to extract the book.

    We evaluate our procedure on four production LLMs: Claude 3.7 Sonnet, GPT-4.1, Gemini 2.5 Pro, and Grok 3, and we measure extraction success with a score computed from a block-based approximation of longest common substring…

    Taken together, our work highlights that, even with model- and system-level safeguards, extraction of (in-copyright) training data remains a risk for production LLMs…

    Source 🕊
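
    The paper's exact metric isn't reproduced in the abstract; as an illustration only, a minimal sketch of a block-based approximation of longest-common-substring overlap might look like this (the function name, block size, and scoring details are assumptions, not the authors' implementation):

    ```python
    def block_overlap_score(extracted: str, reference: str, block_size: int = 50) -> float:
        """Fraction of fixed-size character blocks from `extracted` that
        appear verbatim in `reference`.

        Long runs of matching blocks are a cheap proxy for long common
        substrings, avoiding the quadratic cost of exact LCS on book-length
        texts. This is a hypothetical sketch of the idea, not the paper's code.
        """
        # Slice the candidate extraction into non-overlapping blocks.
        blocks = [
            extracted[i : i + block_size]
            for i in range(0, len(extracted) - block_size + 1, block_size)
        ]
        if not blocks:
            return 0.0
        # Count blocks that occur verbatim anywhere in the reference text.
        hits = sum(1 for b in blocks if b in reference)
        return hits / len(blocks)
    ```

    With a score like this, a value near 1.0 would indicate near-verbatim reproduction of the reference, while isolated coincidental matches stay close to 0.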




  • It’s worth mentioning that the StackOverflow survey referenced does not include many countries that also have great/genius developers, including Belarus, Russia, China, Iran…
    Related issues have been raised in Meta discussions: the Developer Survey 2025 is, apparently, region-blocked…

    While I’ve been employed in security as a software engineer for at least 19 years now, I’ve never taken these trendy LLM/“AI” tools at all seriously, and I still don’t.

    Sorry, I have literally no interest in any of it: it makes you dependent on it, atrophies the mind, degrades research and social skills, and undermines self-confidence with respect to other authors, their work, and attribution. Neither do any of my colleagues in the military, nor those I know better in person.

    Constant research, plus general IDEs like JetBrains’s, IDA Pro, Sublime Text, VS Code, etc., backed by forums, chats, and communities, is absolutely enough for accountable and fun work in our teams, who manage to keep to reasonable deadlines.

    Nor will I use any LLM in my work, art, or research… I prefer people, communication, discovery, effort, creativity, and human art…
    I just disable it everywhere possible, and will do so all my life. The closest case to my environment was VS Code, and hopefully there’s no reason to build it from source, since they still leave built-in options to disable it: https://stackoverflow.com/a/79534407/5113030 (How can I disable GitHub Copilot in VS Code?..)

    Isn’t it simply inadequate not to think and develop your own mind, let alone hand control of your environment to yet another model, or “advanced T9,” of unknown origin and unknown iteration?

    For pentesting, random black-box I/O, experimental unverified intel in medicine, or log-data approximation — why not? But for environment control, education, programming, or fine art… No, never ^^

    Meanwhile… so incredibly many developers and artists are left without attribution, respect, or gratitude…
    So many people let their skills for learning, contributing, researching, accumulating, and self-organizing atrophy…
    So much human precious time is wasted…
    So much gets devalued…

    Time will tell… and probably only the few who stayed accountable will recover…
    This is so heartbreaking… so sorrowful…