Are there any AI services that don't work on stolen data?

7 days ago

partial_accumen@lemmy.world · 6 days ago

Are there any AI services that don’t work on stolen data?

Yes, absolutely, but I don’t think that’s the question you want the answer to. There are many places where AI is used inside companies or hobby projects where the problem to be solved is so specific that other people’s stolen data wouldn’t help you anyway.

Let’s say you’re a company that sells items at retail online, like Walmart or Amazon. You want an AI model to help your workers select the right size of box to pack items in for shipment to customers. You would input your data from past shipments, including the dimensions of all the products you sell (your own data, so not stolen) and the sizes of all the boxes you stock (your boxes, so also not stolen). You could then train a supervised classifier on it, for example one based on logistic regression. The next time you have a set of items to ship, you’d input those items and the model would tell you the best box size to use. No stolen data in any of this.
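To make the box example concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the shipment records, the box labels, and the `predict_box` helper are made up, and it uses a simple k-nearest-neighbours vote as a stand-in for whatever model a real retailer would actually train. The point is only that the training data is the company’s own shipment history.

```python
from collections import Counter
from math import dist

# Hypothetical in-house history: the (length, width, height) in cm of the
# packed items, and the box size that was actually used for that shipment.
PAST_SHIPMENTS = [
    ((10, 8, 4), "small"),
    ((12, 10, 5), "small"),
    ((9, 9, 6), "small"),
    ((25, 20, 10), "medium"),
    ((30, 18, 12), "medium"),
    ((28, 22, 9), "medium"),
    ((50, 40, 30), "large"),
    ((55, 35, 28), "large"),
    ((48, 42, 33), "large"),
]


def predict_box(dims, history=PAST_SHIPMENTS, k=3):
    """Return the most common box among the k past shipments closest to dims."""
    # Sort past shipments by Euclidean distance to the new item dimensions.
    nearest = sorted(history, key=lambda row: dist(row[0], dims))[:k]
    # Majority vote over the box labels of those k neighbours.
    votes = Counter(box for _, box in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(predict_box((11, 9, 5)))
    print(predict_box((27, 19, 11)))
```

A real system would need far more history and would have to handle multi-item orders, but nothing in it requires data from outside the company.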

Now, the question I think you’re asking is actually:

“Are there any LLM AI chatbot services that don’t work on stolen data?”

That one, I don’t know. Most of the models we’re given to set up chatbots are pretrained by the vendor, and you simply add your own data to make them knowledgeable on specific niche subjects.

TheLeadenSea@sh.itjust.works · 6 days ago

You can’t steal data, just illegally copy it. So no LLM is trained on stolen data.

piecat@lemmy.world · 6 days ago

Okay, so conversion or unjust enrichment instead of literal theft.

Platypus@sh.itjust.works · 6 days ago

Getty Images has an image generator trained exclusively on licensed images. I’m not aware of any text generators that do the same.

6 days ago

Oh, interesting! I’ll also take a look at that one

valek879@sh.itjust.works · 6 days ago

I heard about NotebookLM recently. I couldn’t tell you what it’s trained on, but in order to use the LLM you need to provide it with source material.

So say you’re writing something for school. You can gather 50+ papers on the subject you’re trying to write about, upload them, then ask the LLM about what you uploaded. Sounds like turning research from a search for info to an interview with an “expert.”

Again, I can’t speak to how it was trained in the background, but this seems genuinely useful.

TriflingToad@sh.itjust.works · 6 days ago

iirc the AI in Adobe Photoshop is only trained off of the stock images they have the rights to

could be wrong tho, I don’t use Adobe