Hacker News — vinext + Cloudflare Workers

▲Zuckerberg 'Personally Authorized and Encouraged' Meta's Copyright Infringement (variety.com)

373 points by spankibalt 13 hours ago | 334 comments

glaslong 1 hours ago [-]

All those lawsuits against students who downloaded but didn't even redistribute mp3s. Less than a fair use transformation. Just the file download itself. ... Lesson learned: those students should have stolen millions instead!

ben_w 13 hours ago [-]

A lot of people would be very pleased if this leads to Zuckerberg getting even the statutory minimum damages ($750?) on each infringement.

The previous infringement case with Anthropic said that while training an AI was transformative and not itself an infringement, pirating works for that purpose still was definitely infringement all by itself. The settlement was $1.5bn, so close to $3k for each of the 500k they pirated, so if Zuckerberg pirated "millions" (plural) it is quite plausible his settlement could be $6bn.
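The back-of-the-envelope estimate above can be reproduced with a short sketch. The dollar figures and work counts are the ones quoted in the comment and are not independently verified here; reading "millions" as exactly two million works is an assumption made only to arrive at the $6bn figure.

```python
# Figures as quoted in the comment above (not independently verified).
ANTHROPIC_SETTLEMENT_USD = 1_500_000_000
ANTHROPIC_WORKS = 500_000

# Per-work rate implied by that settlement.
per_work_usd = ANTHROPIC_SETTLEMENT_USD // ANTHROPIC_WORKS

# Assumption: "millions" (plural) is read as the smallest plural, two million.
ASSUMED_META_WORKS = 2_000_000
projected_usd = per_work_usd * ASSUMED_META_WORKS

print(f"implied rate per work: ${per_work_usd:,}")
print(f"projected settlement:  ${projected_usd:,}")
```

A larger work count scales the projection linearly, so the $6bn figure is best read as a floor under the comment's own assumptions.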

qingcharles 7 hours ago [-]

What's frustrating is all those kids who got criminal charges for running MP3 sites back in the day [1], and this guy rips off every piece of media in existence and will walk away literally because he's too rich to be charged.

[1] See, e.g. https://en.wikipedia.org/wiki/Oink%27s_Pink_Palace#Legal_pro...

shrubby 3 hours ago [-]

https://pluralistic.net/2025/04/23/zuckerstreisand/

Cory Doctorow wrote a nice summary of the Zuckerstreisand book by Sarah Wynn-Williams.

"First, Facebook becomes too big to fail.

Then, Facebook becomes too big to jail.

Finally, Facebook becomes too big to care."

AgentME 23 minutes ago [-]

I liked Doctorow better before he cheered for stricter copyright enforcement.

matheusmoreira 1 hours ago [-]

> When Wynn-Williams give birth to her second child, she hemorrhages, almost dies, and ends up in a coma.

> Afterwards, Kaplan gives her a negative performance review because she was "unresponsive" to his emails and texts while she was dying in an ICU.

Holy shit.

qingcharles 2 hours ago [-]

Thank you, that was the quote I was thinking of, but couldn't remember.

nadermx 3 hours ago [-]

I just don't see why nobody seems to be cheering that perhaps we are not going back to the days when all those kids got charged. It almost feels like everyone wants to go back to labels carpet-bombing students with lawsuits [0]

[0] https://w2.eff.org/IP/P2P/riaa-v-thepeople.html

davkan 2 hours ago [-]

As someone who’s engaged in private piracy basically my entire life I’ve never even considered venturing into gray areas of licensing when procuring for my company. In fact I’ve done the opposite and rooted it out wherever I’ve found it.

It just seems obvious to me that a profit seeking venture should be held to a higher standard when it comes to infringing on the property rights of other companies and individuals, especially if they seek to enforce their own.

Those kids weren’t hypocritically enforcing their own property rights and making employees sign ndas while downloading shit from tpb.

justacrow 1 hours ago [-]

Do you think if there was a mass movement of students moving off Spotify and downloading MP3s, they would _not_ be charged today?

The hypocrisy is what has at least me upset

themafia 3 hours ago [-]

False dichotomy. We can obviously have both. We can destroy corporations that rely on copyright to exist and then abuse that system to profit. We can also ignore college students and minor contributory copyright infringement.

The difference in scope here should be obvious.

We can similarly punish drug dealers while not punishing drug users. In fact it's already policy in large parts of the USA.

nadermx 3 hours ago [-]

To quote another user in this thread

"Thats such a non sequitur. This isnt a weed legalisation argument, its "Do we make IP worse for everyone, because you dont like some people benefiting from fair use"."

armada651 2 hours ago [-]

When corporations were posed with this question numerous times in the past, their answer has always been an emphatic "Yes!".

Teever 2 hours ago [-]

Because the 'perhaps' there is a load-bearing word that is doing a lot of work and it's going to come crashing down sooner or later.

Of course some kids are going to be charged for this kind of shit, it's still a rules for thee but not for me world, the 'not for me' folks are just a hell of a lot more brazen about it.

falsemyrmidon 1 hours ago [-]

https://en.wikipedia.org/wiki/Capitol_Records%2C_Inc._v._Tho...

24 songs and was at one point $80k per song, almost 20 years ago. Let's let Zuck off with an even 100k per infringement.

matheusmoreira 1 hours ago [-]

Definitely what pisses me off the most. All these "pirates"? Arrested. Why isn't the copyright industry raiding the homes of these tech billionaires then? Why isn't SWAT pointing guns at their faces while the squad seizes all of their computers and equipment? Why aren't these CEOs in cuffs?

NoMoreNicksLeft 6 hours ago [-]

What's frustrating is that I don't even consider infringement to be a crime. Why are you all so upset about this, rather than his real crimes?

matheusmoreira 57 minutes ago [-]

I'm a copyright abolitionist. I don't care at all that they're training AIs on copyrighted works. I care a lot that they're not getting relentlessly hunted down by the copyright industry for it like all the "pirates" that came before them. The copyright industry has actually ruined lives by litigating their "infringement" nonsense. It's only fair that they go after this guy as well.

His constant violation of people's privacy is also horrendous and worthy of condemnation, but that's not directly related to the copyright infringement matter. It's a separate issue.

ethbr1 6 hours ago [-]

You get Al Capone on the charge you can make stick.

timcobb 2 hours ago [-]

Right but Al Capone did jail time, here Zuck gets to break and enter into people's homes, take their stuff, then haggle for it after-the-fact, all the while keeping the civilization-domination apparatus that he built using the stuff he stole? That is super not fair. Ordinary people could certainly not get away with that.

stubish 6 hours ago [-]

Lets define more things society doesn't want to happen as not-crimes so we can do more of them.

verisimi 3 hours ago [-]

Principles and law (that determines 'crime', a legal word) are not the same thing.

bix6 6 hours ago [-]

What are his real crimes?

hsuduebc2 5 hours ago [-]

I'm kind of upset because, on top of his ridiculously amoral and sometimes illegal behavior, there are people whose lives were ruined because they shared a few mp3 files. Now this person once again has absolutely no responsibility for his actions, even for something as idiotic as copyright infringement, when others were severely punished.

qingcharles 6 hours ago [-]

Why not both?

archagon 4 hours ago [-]

Because the rich can do it and we can’t.

protocolture 4 hours ago [-]

I do it literally all the time.

j-bos 5 hours ago [-]

It's the increase in emotionality: principles are loosely held, and if tossing them allows a particular goal, they get tossed. To be clear, this extends far beyond the current topic and commenters.

_s_a_m_ 36 minutes ago [-]

In a just world he should end up in jail forever for the things he has done.

timcobb 2 hours ago [-]

Okay but... I am very unimpressed by this. How is it that he then gets to still be an AI monopolist/hegemonist? How's that fair? He basically force-acquired all this stuff without asking, now he's haggling for it later. Where are the criminal charges? Where is the deprivation of, if not freedom, then equity assets?

grebc 11 hours ago [-]

Nothing will happen to him/Meta while DJT is president.

He bought the best protection around for breaking the law.

dehrmann 7 hours ago [-]

I'm not sure what Trump's levers are with this since it's a civil matter. There's no DOJ--it's publishers and an individual vs. Meta.

kevin_thibedeau 7 hours ago [-]

He likes sham investigations of attorneys general.

GolfPopper 10 hours ago [-]

[flagged]

stackghost 8 hours ago [-]

[flagged]

LastTrain 7 hours ago [-]

Psst. The Epstein Files are the distraction…

fzil 6 hours ago [-]

i thought the iran war was distraction from the epstein files. i'm losing all track of all these distractions.

alex1138 4 hours ago [-]

Before people comment on this, I'd like to point out the regime killed 12,000 people in a span of 2 days. They've been brutally murdering them for decades. Distraction or not.

utopiah 3 hours ago [-]

Here I am, finally cheering for IP lawyers. /$

gloxkiqcza 12 hours ago [-]

For context, his net worth is ~$220 billion.

azinman2 11 hours ago [-]

And meta's worth is much more than that. He's not personally paying.

ben_w 10 hours ago [-]

A company being "worth" some amount doesn't mean it has that much money and real property; it means there exist people willing to buy shares, on the margin, at a price which works out like that. One of the common (very rough) approximations is that a business is worth as much as the profit it's expected to make over the next 20 years. But one of the reasons (there are many) that this is only a rough guide, is that if you tried to sell too much of a big company all in one go, it usually depresses the price a lot, and the other way around (trying to buy a whole company) tends to raise the price a lot; both effects are because most people have different ideas about how much any given company is really worth despite that rough guide, and trade their shares at different prices while you're doing it. You may note this is a circular argument, this is indeed part of the problem.

IIRC, Facebook's cash is more like $81-82 billion.

Nevermark 1 hours ago [-]

Yes it is a different kind of worth, but it is not worth less because of it.

This common argument to not take market cap valuations seriously doesn't hold.

True, Meta as an entire entity is not liquid. A forced sale in entirety would produce a massive reduction in compensation. But that is a highly unlikely and contingent reduction.

It is also true that if you have Meta's equivalent in cash, the value of the cash is likely to drop, while the value of Meta likely to grow, over any appreciable time. In that sense, $X cash is worth much "less" than the $X market cap.

These seeming contradictions are the result of different practical tradeoffs in structures of wealth. Not because market caps reflect misleading or overstated accounting.

dylan604 9 hours ago [-]

At the same time, isn't Zuck's worth based on his shares of evilCorp while evilCorp's shares are what you just said. Ergo, the Zuck isn't worth all that either???

ben_w 9 hours ago [-]

Yup. All the headlines following the pattern "${billionaire} {gains|loses} ${x} billion this week" are mostly just fluff, the marginal share price of any given stock wanders all over the place even without forced sales or people trying to buy them out.

There's some interesting exceptions, like how Musk has managed to sell Tesla shares totalling more or less as much as the business itself has made in total lifetime revenue; but even then, Musk's theoretical net worth is very different from how much he could get if he was forced to sell all his shares suddenly.

Owner-CEOs like Musk and Zuckerberg get all the effects of such randomness, but the only examples I can think of such people getting into billion-dollar legal troubles tend to be examples which go on to sink their companies completely, so I'm not sure what impact a fine of "merely" 10% of cash reserves would do to investor confidence as expressed in share price. And this is not the only legal case Meta's facing right now.

ScoobleDoodle 9 hours ago [-]

It doesn't seem to be mostly just fluff to me.

MacKenzie Scott (Jeff Bezos' ex-wife) shows it can be turned into real money. As of December 2025 she had given away $7.1 billion in 2025 charitable donations, and $26.3 billion since 2019.

In reality there is the ability to execute on the shares to turn them into real money.

Jeff Bezos holds less than 10% of Amazon stock himself. Which is a huge amount of money, and a not insignificant amount of which can be turned into "real" money and even with some decline is still a phenomenal amount.

In that same time period the stock valuation has more than doubled.

financetechbro 9 hours ago [-]

Zuck can just take out loans against his equity. He doesn’t need to sell any of it to benefit from Meta’s “worth”.

litoE 9 hours ago [-]

Plus, the money he borrows is not taxable. If he sold stock he would have to pay taxes before he could spend the income. Sure, he now owes money to someone, but he can refinance those loans again and again, and live tax-free the rest of his life while we, poor working stiffs, pay the taxes that built the airport where he parks the private jet he bought with the money he borrowed.

naniwaduni 8 hours ago [-]

People seem to get the weird idea that borrowing against their stock holdings is some special thing rich people get to do with products that the rest of us don't have access to. It's not. Margin loans are widely available to the tune of ff+1%ish or lower, and while your brokerage's publicly offered rates are probably a ripoff, they're almost certainly negotiable. The bar for access to "institutional" rates is basically 100k, the regulatory requirement for portfolio margin.

Yes, there are specialized products catered to billionaires. But those aren't getting them better rates than someone with a $200k portfolio (Zuck is not conventionally a less risky borrower than the Options Clearing Corporation!). They exist to work around the fact that some borrowers can't just casually liquidate their stock on the open market, let alone at face value. By all accounts these products are more expensive than retail.

Mostly this is an expensive (but maybe still less expensive than taxes, depending on the rate environment—it's more of a no-brainer in ZIRPland) way to diversify out of a single-stock portfolio without selling by adding leverage. At Zuck's age, it's still very unlikely to make sense to borrow instead of sell to spend. He's been known to pay real taxes in the past, they just look small relative to his imputed wealth growth because rich people don't spend a lot relative to their wealth growth because they, quite by definition, have a lot of wealth.

_DeadFred_ 7 hours ago [-]

I think people take issue with the taxes loophole. They have GAINED from the VALUE of their stocks, but they don't pay taxes on that. It should be law if you realize value from stocks you pay capital gains on those stocks. So if a loan is collateralized by $1,000,000 worth of stock value taxes should be paid on $1,000,000.

naniwaduni 6 hours ago [-]

The trouble is that a bank is not lending against the nominal value of the stock as collateral. That number is almost entirely fictional. Taxation of capital gains at time of sale is less a loophole than a reflection of the difficulty of assigning a fair price to assets that are not perfectly liquid.

Also, you'd totally gut retail home equity lending as collateral damage, with disastrous social policy consequences.

grebc 7 hours ago [-]

I wouldn’t exactly call it a loophole as such. And you can’t just willy-nilly tax loan values.

Any asset a bank is willing to take as collateral has the same issue, it’s just very pronounced in this instance.

If you take your idea at face value, anyone who borrows against their property to renovate/upgrade would be up for tax.

thomastjeffery 8 hours ago [-]

That's why billionaires use shares as collateral to get loans. It's money once removed, and it continues to be spendable so long as the share price stays high.

I sincerely doubt that Meta's share price would crash as a result of Zuckerberg getting an expensive judgement.

bamboozled 8 hours ago [-]

There will not be a single consequence for any of this.

nielsbot 7 hours ago [-]

In a just system there would be jail time (if found guilty). Barring that a modest fine. Say, $1T.

LastTrain 7 hours ago [-]

That’ll keep him from even thinking of doing something like that again! /s

jcalvinowens 10 hours ago [-]

I had to block meta's ASN on my personal cgit server a few weeks ago because they were ignoring robots.txt and torching it. Like hundreds of megabytes of access logs just from them, spread around different network blocks to clearly try and defeat IP based limiting. I couldn't believe it.

dawnerd 3 hours ago [-]

I had to last year too, nonstop crawling, random urls that didn't exist. It looked like they were trying to proxy user queries through to a search endpoint too. The ASN matched so I know it wasn't someone spoofing them.

bflesch 9 hours ago [-]

IMO ASN-based blocking should be much more common, but unfortunately it is not supported as a first-class configuration option in many common tools.

jcalvinowens 9 hours ago [-]

Yeah, I don't know how anybody stays sane without it. I have a list of over a thousand ASNs I blackhole at this point...

Mine is a daily bash cronjob that fetches a text-based database and uses grep to build an nftables-apply script with all the IPs for the blocked ASNs. I keep meaning to share it, but it's embarrassingly messy and I haven't had time to clean it up...
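For readers who want a starting point, the pipeline described above (a text database of prefixes per ASN, filtered down to an nftables script) might look roughly like the sketch below. This is not the commenter's script. The database format (one `prefix ASN` pair per line), the table name `inet filter`, and the set name `blocked_v4` are all hypothetical, and the nftables set is assumed to already exist as an IPv4 set with `flags interval`.

```python
import ipaddress

def build_nft_script(db_lines, blocked_asns,
                     table="inet filter", set_name="blocked_v4"):
    """Return nft commands that load every IPv4 prefix of the blocked ASNs.

    db_lines: iterable of "prefix ASN" lines (hypothetical format).
    blocked_asns: set of ASN strings such as "AS64500".
    """
    nets = []
    for line in db_lines:
        parts = line.split()
        if len(parts) < 2 or parts[0].startswith("#"):
            continue  # skip blanks, comments, and malformed rows
        prefix, asn = parts[0], parts[1]
        if asn not in blocked_asns:
            continue
        net = ipaddress.ip_network(prefix, strict=False)
        if net.version == 4:  # this sketch only handles IPv4
            nets.append(net)

    # Merge adjacent or overlapping prefixes so the set stays small.
    merged = list(ipaddress.collapse_addresses(nets))
    if not merged:
        return ""
    elements = ", ".join(str(n) for n in merged)
    return (f"flush set {table} {set_name}\n"
            f"add element {table} {set_name} {{ {elements} }}\n")

# Example with documentation-only prefixes and ASNs.
sample_db = [
    "192.0.2.0/25 AS64500",
    "192.0.2.128/25 AS64500",
    "198.51.100.0/24 AS64501",
    "203.0.113.0/24 AS64502",
]
script = build_nft_script(sample_db, {"AS64500", "AS64502"})
print(script)
```

The output is meant to be written to a file and loaded with `nft -f`; flushing and refilling a named set keeps the rule that references the set unchanged between daily runs.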

dlivingston 2 hours ago [-]

It would still be useful to share as an example and reference point. People can use Claude Code / etc. to re-write it to their specific situation.

noxvilleza 6 hours ago [-]

It's been a real game of cat and mouse over the last few years. I used to do daily iptables updates to block repeat scrapers on my small niche stats site I run. About 5-6 years ago it became more common to see broader ranges - so I started blocking ASNs which worked great (esp for the regulars like Alibaba, Tencent, compromised DigitalOcean/OVH, ...). In the last 2-3 years though the overall bot traffic has skyrocketed - it's easy to spot bot activity after the fact (no requests to the CDN for static assets, user agent changes from one request to the next, predictable ID enumeration, etc) but not in real time. They're also often using residential-based proxies and Cloudflare bot detection has become pretty bad.
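The after-the-fact signals listed above (no static asset requests, user agent changes, predictable ID enumeration) can be expressed as a small log-analysis sketch. Everything here is an assumption for illustration: the record shape (`ip`, `path`, `user_agent`), the idea that CDN and origin logs have been merged into one list, the static-file suffixes, and the run length of three.

```python
import re
from collections import defaultdict

STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".svg", ".woff2")
ID_PATH = re.compile(r"/(\d+)$")  # paths ending in a numeric ID

def flag_suspects(requests, min_run=3):
    """Group requests by IP and list which bot-like signals each IP shows."""
    by_ip = defaultdict(list)
    for r in requests:
        by_ip[r["ip"]].append(r)

    suspects = {}
    for ip, reqs in by_ip.items():
        reasons = []
        # Signal 1: the client never fetched a static asset.
        if not any(r["path"].endswith(STATIC_SUFFIXES) for r in reqs):
            reasons.append("no static assets")
        # Signal 2: more than one user agent from the same address.
        if len({r["user_agent"] for r in reqs}) > 1:
            reasons.append("user agent changes")
        # Signal 3: a run of consecutive numeric IDs in request order.
        ids = [int(m.group(1)) for r in reqs
               if (m := ID_PATH.search(r["path"]))]
        run = longest = 1 if ids else 0
        for prev, cur in zip(ids, ids[1:]):
            run = run + 1 if cur == prev + 1 else 1
            longest = max(longest, run)
        if longest >= min_run:
            reasons.append("sequential IDs")
        if reasons:
            suspects[ip] = reasons
    return suspects

sample_log = [
    {"ip": "203.0.113.5", "path": "/stats/101", "user_agent": "A"},
    {"ip": "203.0.113.5", "path": "/stats/102", "user_agent": "B"},
    {"ip": "203.0.113.5", "path": "/stats/103", "user_agent": "A"},
    {"ip": "192.0.2.7", "path": "/stats/57", "user_agent": "C"},
    {"ip": "192.0.2.7", "path": "/static/site.css", "user_agent": "C"},
]
print(flag_suspects(sample_log))
```

Grouping by IP is the weak point the comment already hints at: with residential proxies each address may send only a handful of requests, so any single signal here would produce false positives and is better treated as one input among several.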

walrus01 9 hours ago [-]

It's a real pain in the ass because in the absence of ASN based blocking, you often have to give something a long list of IP ranges in CIDR notation, and be certain you don't "miss" even one ipv4 /23 or /24 or a crawler will get through.
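The failure mode described above, where one forgotten range lets a crawler through, is easy to demonstrate with Python's standard `ipaddress` module. The prefixes below are documentation ranges standing in for a crawler's real allocations, and the helper name `is_blocked` is made up for this sketch.

```python
import ipaddress

def is_blocked(ip, cidr_blocks):
    """Return True if ip falls inside any of the listed CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(block) for block in cidr_blocks)

# Suppose the crawler uses three /24s but the blocklist only has two of them.
blocklist = ["192.0.2.0/24", "198.51.100.0/24"]  # 203.0.113.0/24 was missed

print(is_blocked("192.0.2.44", blocklist))    # covered range
print(is_blocked("203.0.113.9", blocklist))   # the missed /24 gets through
```

Blocking by ASN avoids this because the prefix list is derived from routing data instead of being maintained by hand.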

hsuduebc2 5 hours ago [-]

Hey, how do you identify them? Is there a service to recognize which of these companies scraped you?

websap 9 hours ago [-]

[flagged]

jesse_dot_id 9 hours ago [-]

The world would be a much better place if these kinds of engineers had a spine.

websap 5 hours ago [-]

Yeah they’d have to use it to stand at the back of the unemployment line. Companies don’t care, someone more desperate will take the job.

dlivingston 2 hours ago [-]

Are you one of those engineers building said crawlers, by any chance?

scottyah 8 hours ago [-]

Some spines are just crooked, and the extra rigidity would hurt more than help.

debo_ 8 hours ago [-]

"One moment: reticulating spines..."

ttoinou 8 hours ago [-]

They could even feed 20 kids

modeless 7 hours ago [-]

Funny how people are suddenly on Elsevier's side. It's clear to me that AI training is transformative fair use under existing law. Maybe this will be the case to prove it.

eloisius 7 hours ago [-]

I find it grating that so many AI boosters try to frame pushing back against the AI industry as a sudden about-face for everyone that spent the last 20 years pushing back against the copyright industry. I’m also in favor of decriminalizing or legalizing small amounts of pot for personal use. That doesn’t mean I’m behind industrialized narcotic production on such a huge scale that it starts to distort the economy, and companies looking for new ways to add methamphetamine to every goddamn product.

protocolture 4 hours ago [-]

>I find it grating that so many AI boosters try to frame pushing back against the AI industry as a sudden about-face for everyone that spent the last 20 years pushing back against the copyright industry.

What do you think the outcome of tightening fair use is going to be? Do you think it's going to be most effectual against these big evil AI companies we are meant to fear? Or is it going to end up putting more individual creators on the end of Disney's pitchforks?

Like if you support creating a gun to kill a monster, that's great. But you need to understand that weapons rarely only target the person you want them to. And it's unlikely that any bill that specifically targets a certain size or profit margin is going to make it all the way into law without being generalised to the approval of large IP holders.

It's much much (much) better to look at this as an opportunity to erode IP laws for everyone, than to make them worse and hope that your particular enemies are the only ones that are affected.

>That doesn’t mean I’m behind industrialized narcotic production on such a huge scale that it starts to distort the economy, and companies looking for new ways to add methamphetamine to every goddamn product.

That's such a non sequitur. This isn't a weed legalisation argument, it's "Do we make IP worse for everyone, because you don't like some people benefiting from fair use".

citadel_melon 3 hours ago [-]

One could imagine different legal standards for recreational, research, and commercial uses.

warkdarrior 2 hours ago [-]

> One could imagine a different legal standards for recreational, research, and commercial uses.

Meta used allegedly stolen copyrighted materials to train a model they shared for free with the whole world. Is this a recreational use?

dfxm12 5 hours ago [-]

It would be disingenuous framing because the argument against copyright stems from a belief that information should be free. Meta does not do things in this spirit. There's no about face needed...

AnthonyMouse 4 hours ago [-]

> It would be disingenuous framing because the argument against copyright stems from a belief that information should be free. Meta does not do things in this spirit.

Don't they? They release the llama model weights, they do things like this:

https://www.opencompute.org/wiki/Open_Rack/SpecsAndDesigns

They also make significant contributions to Linux and are the originators of popular open source projects like zstd and React.

They make their money from selling ads, not selling licenses.

xigoi 3 hours ago [-]

They only released the weights because someone leaked them.

AnthonyMouse 44 minutes ago [-]

Someone leaked the llama 1 weights before they were released. That doesn't explain why they would release the subsequent versions except that they wanted to.

2ndorderthought 6 hours ago [-]

Speaking of AI and meth, have you seen videos of the Palantir CEO Alex Karp? Dude looks like he's regularly getting the same meth shots Hitler used to get.

But I hear you. One of my biggest tells that someone can't be reasoned with is when they resort to whataboutism without any consideration for how 2 situations can actually be different even if there is some commonality. It's a powerful bad faith argument technique. When that style of argument comes up I nod my head and walk away. Some people are just doomed.

chungusamongus 6 hours ago [-]

[flagged]

2ndorderthought 6 hours ago [-]

I am not a copyright maximalist, but I would tell you to be careful of a world where copyright and IP are meaningless. Might as well let any other country/company one-shot your entire industry.

chungusamongus 6 hours ago [-]

Slippery slope, false dilemma, etc. What other fallacies do you have in your utility belt, batman?

2ndorderthought 6 hours ago [-]

How did you know I was Bruce Wayne?

malfist 4 hours ago [-]

Where's my goddamn electric car Bruce?

nadermx 6 hours ago [-]

I also find it funny, I said this regarding the other thread and article[0]

'"They then copied those stolen fruits"

How are these fruits "stolen" if they still have what was allegedly stolen?

Dowling v. United States, 473 U.S. 207 (1985): The Supreme Court ruled that the unauthorized sale of phonorecords of copyrighted musical compositions does not constitute "stolen, converted or taken by fraud" goods under the National Stolen Property Act.

And even if, arguendo, sure it's stolen. The purpose of copyright is "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries"

And you would be hard pressed to prove that LLMs haven't advanced the arts and sciences, so at bare minimum transformative, i.e. fair use.'

[0] https://news.ycombinator.com/item?id=48026207#48029072

Johnny555 6 hours ago [-]

>How are these fruits "stolen" if they still have what was allegedley stolen?

If you write a book and I take it and embed its knowledge into my product that is so pervasive that no one needs to buy your book any more (and I don't even credit you so no one knows where that knowledge came from), do you really still have what was stolen? And I didn't even buy a copy of your book to copy it.

AnthonyMouse 4 hours ago [-]

> If you write a book and I take it and embed its knowledge into my product that is so pervasive that no one needs to buy your book any more (and I don't even credit you so no one knows where that knowledge came from), to you really still have what was stolen?

The trouble with this analogy is that it proves too much.

Suppose you write a book, and so does someone else, but they have better marketing than you and then people in the market for that genre buy theirs instead of yours. Let's even stipulate that the existence of their book actually lowers your sales, because people who want that kind of book already bought theirs by the time they find out about yours and then some people don't have time to read or can't afford to buy both.

Notice that we haven't yet said a word about the contents of either book. They could be completely independent and they've never even heard of you or your book -- they "didn't even buy a copy of your book to copy it". All we know is that they're the same genre and the existence of theirs is costing you sales. By that logic all competition would thereby be "stealing", and that can't be right.

Which implies that you don't have a property right to the customers.

m4x 3 hours ago [-]

A better analogy would be that you do original research or work and produce a valuable book. Somebody else looks at your work, decides it has value, and reproduces it in a new book under their name. The new book is cheaper, or easier to find, or for whatever reason displaces your original book created through your own research and investment. Now somebody else is profiting off your creativity or work, without payment or even acknowledgement.

I'm not sure how this plays out legally, but it certainly seems unethical

AnthonyMouse 51 minutes ago [-]

So for example, when Disney sees value in public domain stories like Cinderella, Rapunzel/Tangled or Snow White, and they make movies out of them, profiting from the creativity and work of the Brothers Grimm without paying anything to their estate, or high school plays do Shakespeare, that seems unethical to you?

Would it be fair for Greece to do retroactive term extensions all the way back to Plato and then sue anyone who copies the idea of having a university or uses the Platonic solids or distributes religious texts that incorporate the dualistic theory of the soul?

blks 2 hours ago [-]

Why are you talking about this case that has nothing to do with the topic at hand? The comment you’re replying to gives a very clear and narrow analogy, and you’re talking about something else.

AnthonyMouse 30 minutes ago [-]

How is it something else? It's the same analogy. The problem with it is that the harm from the alleged theft doesn't require any use of the original material in order to happen, since that "harm" is competition rather than expropriation.

The attempt to distinguish them is through copying, but that's the part that isn't depriving anyone of anything.

throwawayIche9j 6 hours ago [-]

Yes. That's not to say that something damaging wasn't done, but nothing was stolen. Stealing/theft requires deprivation of property. It's like receiving a normal nonlethal punch in the face and calling it murder. Murder requires someone dying.

> Theft [...] is the act of taking another person's property or services without that person's permission or consent with the intent to deprive the rightful owner of it. --- https://en.wikipedia.org/wiki/Stealing

skeeter2020 5 hours ago [-]

>> Stealing/theft requires deprivation of property

maybe you should look up the definition of property, which is a set of legally recognized rights over a thing, typically including:

* possession (what you're focusing on)

* use

* exclusion

* transfer

The last 3 seem like they have been breached, and legally that's theft.

jasomill 5 hours ago [-]

Violation of these rights may be criminal without meeting the strict legal definition of theft.

This can even extend to stealing physical property.

Depending on local laws, stealing a car may not actually be theft if the defendant can prove they intended to return it before the owner got home from work, though it would certainly be considered theft in the colloquial sense of the term, and they would still be guilty of a lesser offense like civil and/or criminal conversion.

throwawayIche9j 4 hours ago [-]

> Depending on local laws, stealing a car may not actually be theft if the defendent can prove they intended to return it before the owner got home from work

I doubt there's even one place where the law works like that.

KPGv2 4 hours ago [-]

> I doubt there's even one place where the law works like that.

In a lot of places, that's how it works. A key element of theft is the intent to permanently deprive someone of property.

This is why joyriding isn't classified as auto theft and is instead a lesser offense. It's because joyriding is an intent to temporarily deprive, while GTA is an intent to permanently deprive.

In some jurisdictions (the UK is one), there is a tort called trespass to goods, and an example of this would be "stealing" someone's property to deliver to another location for them to use there. The tort of conversion is similar: interference with someone's property right to treat it as your own (silent as to length of time).

throwawayIche9j 5 hours ago [-]

Theft is not the breach of any property right. It's specifically the deprivation of property without consent. Yes, I have checked the definition in my jurisdiction.

Getting punched in the face also violates rights, yet isn't murder. Murder is specifically about dying.

odo1242 5 hours ago [-]

You’re splitting hairs over a definition that isn’t relevant here (theft and copyright infringement are different things) to defend something that even you agree is bad.

throwawayIche9j 4 hours ago [-]

It isn't splitting hairs. The damages are completely different in nature.

With theft, the entire damage is the deprivation. It could be an heirloom or some other object that may have been entrusted to you, something that can never be replaced, memorabilia of loved ones. Something that you may have needed in your possession to survive (e.g. a car to go to your job).

With a given copyright violation, the damage is that maybe[1] you made less profit than you could have. The potential for profit is not property. Profit isn't guaranteed.

[1] The loss is not certain, because there's no guarantee that the ones consuming the copyrighted content could have even afforded it.

rustystump 5 hours ago [-]

You forget that laws are made by people and can change at any time. Interpretations are arbitrary: Roe v. Wade today but not tomorrow.

People seem to think what AI is today is theft. If enough people agree, it will be theft. Big companies don't like this and push the other way. Objectivity doesn't exist here. It is too wiggly.

KPGv2 4 hours ago [-]

My God, I can't believe chodes are still playing this "how many angels can you fit on the head of a pin" navel gazing semantic argument. Thirty years at least, it was all you saw on fin de siècle Slashdot from anyone with a six-digit UID. No one cares about your hyper literalist meaning of "theft," that's not the goddamn point. Christ, this place looks like Reddit more and more.

This isn't a court of law. We don't have to talk like lawyers. If you replaced "theft" with "copyright infringement" in the comment you had such a problem with, what meaningfully changes besides we all have about five additional brain cells?

visarga 4 hours ago [-]

Even the case for copyright infringement is weak. LLMs are not copying machines; we already have copying machines at a much lower price, almost zero, with perfect fidelity, and much faster than generating text probabilistically. So it makes no economic sense to spend billions on training and inference to make a copier. In fact the value of LLMs is where they do not copy but apply knowledge to a new situation.

AnthonyMouse 4 hours ago [-]

> If you replaced "theft" with "copyright infringement" in the comment you had such a problem with, what meaningfully changes besides we all have about five additional brain cells?

The obvious difference is that copyright is subject to fair use and various other limitations that personal property isn't.

2ndorderthought 6 hours ago [-]

Cool cool cool. So all the code and data you send to Anthropic and ChatGPT should be mass distributable to forward other people's arts and sciences? All your meeting notes with AI summarizers, Slack chats with bots? Might as well put your entire company and all plans for it on GitHub, MIT licensed. I'll take a peek, see if there's anything valuable to me in that. Don't worry, you can keep it all on your GitHub too. It's still yours after all. Copilot will be training on it too though btw

IAmLiterallyAB 6 hours ago [-]

That's a privacy violation, not relevant.

2ndorderthought 6 hours ago [-]

No it's not. You exposed that data to an LLM. Should have read the fine print. The laws around that don't make sense to me anymore so therefore I own that stuff now. That's how this works right? You do know chatgpt etc can read everything you write, right?

Also social media profile pics. Great way to get faces for deep fake ads. Most people are just 1 phone call away from being voice cloned. Our likeness isn't all that important either if you think about it.

Maybe meta will clone your writing style and sign into your meta account and message your friends telling them about this awesome new product. Meta owns the account and you uploaded data to it.

Our_Benefactors 6 hours ago [-]

Literally none of these things are defensible positions, so nobody will take you seriously.

2ndorderthought 6 hours ago [-]

Many of the things I wrote are already happening. The others probably are but haven't been reported yet.

collabs 5 hours ago [-]

I think Anthropic has pledged to not use team and enterprise users' data for training purposes. I don't mind if they do some verification or whatever as long as it doesn't end up in the responses it gives others.

HWR_14 3 hours ago [-]

What Silicon Valley company over a decade old has respected the limitations on using data that they agreed to? At least any valuable data.

KPGv2 4 hours ago [-]

yes yes and google pledged "don't be evil"

Don't be naïve. A corporation would tear the flesh from your body if it meant a better quarterly earnings report.

albedoa 5 hours ago [-]

You were swiftly corrected about your misunderstanding under your original comment. Reposting it here, removing the quote farther from its context, and hoping to not be downvoted again is very weird!

nadermx 3 hours ago [-]

I don't see how me quoting the actual complaint the news was about, in both threads, was me being swiftly corrected. If you were to base it on upvotes then this one shows I'm right and you got swiftly corrected here. In both cases it was relevant, as both threads were not yet merged and were about the same complaint. They held two positions on the front page and I was adding to the discourse.

protocolture 4 hours ago [-]

>It's clear to me that AI training is transformative fair use under existing law.

I wouldn't even go that far. It's an entirely new product. It's like the guy who sold you the keyboard demanding royalties for the software you built.

That the person who wrote the book couldn't predict a new use case for the book in training LLMs is irrelevant. The book isn't in the LLM. It's not being sold with the LLM. It's one of billions of tools used to create the LLM.

People try and sell this as the AI companies extracting value from the poor little IP holders like Disney. It's maddening. That content is your cultural heritage. It already belongs to you, just some idiot has been granted a lifetime of exclusive exploitation. An LLM is trained on data you already own. Disney et al. want to exploit the new technology to extract even more money out of stuff created often decades ago.

At absolute worst it's reverse engineering, which was supposed to be fair use protected in the US but apparently that's been somewhat eroded.

xigoi 3 hours ago [-]

> The book isn't in the LLM.

An LLM is essentially a lossy compression of the training data. The book absolutely is in there, it’s just mangled to the point of unrecognizability.

protocolture 3 hours ago [-]

The wood tends to have an impression of the hammer that hits it. The book isn't in there, the weights are just shaped by what tools were used to form it.

When large quantities of source material are replicable by prompting, it's a bug, not a feature.

gizajob 3 hours ago [-]

If my book isn’t in your LLM, then prove it and don’t use my book to train your LLM.

protocolture 38 minutes ago [-]

>don’t use my book to train your LLM.

What makes you think you are entitled to tell people what they can and can't do with data they purchased (or otherwise acquired) from you? Extremely honest question. I just can't put myself in your shoes.

Like if I had written anything useful I would be overwhelmingly flattered that my content be considered so worthy for inclusion.

Your profile suggests that you are a philosopher. Did you get into philosophy hoping to exploit the publishing industry to the extent that you can squeeze every cent out of your thoughts, and deny their potential uses downstream?

It's actually crazy how bad things are. I am usually keen on capitalism and exclusivity, but with the whole LLM thing, I see people pushing hard to tighten the grip of intellectual property. I see people making 50 cents a month on Kindle Unlimited suddenly shocked that someone's LLM generated output might be ever so slightly influenced by weights ever so slightly influenced by their work, seemingly thinking they might get some big payday out of it.

Give me a tiny little wedge of understanding of your thought process. Your book is right now, doing a greater social good on your behalf than me running around and removing all the trash from my neighborhood, and the benefits of that social good are going to accrue long after you and I are gone. Your work is now going to live on, in a very tiny way, in these systems forever. I am honestly envious.

If anything, I would be trying to get bad writing removed from LLM training data. Things that I don't want to influence others. But as a potentially honest promoter of your work, you want it removed?

What's the number? If not 1:1 exactly what you charge for the book, what do you think is the proper compensation you should receive for slightly influencing training weights?

eloisius 9 minutes ago [-]

> What makes you think you are entitled to tell people what they can and cant do with data they purchased

Hundreds of years of copyright law. I bought a copy of Windows, but I’m not allowed to modify that data with a cracker and sell a bootleg DVD of it.

I should edit to clarify that I’m not a big fan of Lars Ulrich or Disney, but I don’t think we’re going to get a win here for the recreational IP pirates. What’s more likely is that we’ll end up with some Frankenstein law that favors both Mickey Mouse and OpenAI, and you and I will get neither free movies nor the ability to earn a living off of our creative labor.

conception 7 hours ago [-]

Illegally obtaining copyrighted materials is usually the issue, not the transformation part.

akerl_ 6 hours ago [-]

Looking at the complaint ( https://publishers.org/wp-content/uploads/2026/05/2026-05-05... ), that seems like the part that's got the most solid foundation, especially given that while torrenting the books, they were also seeding to other peers.

The items they call out around training the models (and attempting to claim that each subsequent model generation should count as an additional instance of infringement) seem far less grounded in the current court interpretations of AI training.

King-Aaron 6 hours ago [-]

Absorb all "our" IP without consent, in doing so remove "our" own source of revenue, and then repackage it as their own product. Not really fair use IMO.

visarga 4 hours ago [-]

How does that work? Is it a kind of infringement without substantial similarity?

King-Aaron 2 hours ago [-]

I find it hard to think of a reasonable analogy. But it's like coming into your house, stealing all your belongings, and then building a new house with all your shit inside and then selling it back to you.

matheusmoreira 47 minutes ago [-]

The enemy of my enemy, and all that.

stiray 6 hours ago [-]

It actually depends on the evilness of the company. Elsevier is just less evil than Zuckerberg and Meta, while publishers are even less problematic. I don't think there is anything funny in that.

Or anything to defend on Meta. If they go out of business, humanity profits.

blks 2 hours ago [-]

When you use millions of copyrighted materials to bundle together to produce a commercial product, I wouldn’t call that a fair use. Especially when licensing of such material doesn’t explicitly allow that, the material wasn’t even purchased on consumer markets and your commercial product may be a competitor/analogue to the copyrighted material.

Not even going to all GPL stuff, that in a better world should have screwed all the slop companies

4k0hz 5 hours ago [-]

Elsevier is shitty to people doing stuff that (imo) should be allowed. Meta is making money doing the same thing and not getting the same shittiness from Elsevier.

Elsevier at least works within the (admittedly broken) system, Meta does not.

whattheheckheck 6 hours ago [-]

If I could ask for a summary from an LLM vs. buy a book, I'd go with the summary. That eats into commercial use, and the Supreme Court sided with Gerald Ford when a newspaper published a small gist of his autobiography, because it ate into the sales.

Larrikin 6 hours ago [-]

Every single Wikipedia article of a book or TV show has this summary. Ford should have lost.

2ndorderthought 6 hours ago [-]

Yea nope. I like the full book without any loss of information. Even if I don't want to read the entire book. LLMs love to respond even when something is outside of their training set.

rvz 6 hours ago [-]

> It's clear to me that AI training is transformative fair use under existing law. Maybe this will be the case to prove it.

That is not what this case is about. It is more about the illegal violation and piracy of copyrighted content done by Meta for commercial use and Zuck knew they were doing it.

Why did Anthropic settle [0] with a multi-billion dollar payout to authors after commercializing their LLMs that were trained on copyrighted content that was illegally obtained and kept without the authors' permission?

There's a reason why they (Anthropic) did not want it to go to trial. (Anthropic knew they would lose and it would completely bankrupt them in the hundreds of billions.)

AI boosters will do anything to justify the mass piracy and illegal obtainment of copyrighted material for commercial use (not research), which is not fair use in the US. There is no debate on this. [0]

[0] https://images.assettype.com/theleaflet/2025-09-27/mnuaifvw/...

visarga 4 hours ago [-]

I think copyright is far from being the most important aspect related to AI; the geopolitical and economic aspects matter more. And even if it were the most important, there is only a case to be made for 1. the copy used to train models and 2. rare or induced regurgitation by targeted prompting.

The original work is not replicated identically. Why would we replicate a work when it can be more easily seen in the original or replaced with an alternative option online? We use AI to produce new outputs for new situations. We already have hard drives and networking for plain copying.

platevoltage 4 hours ago [-]

Such a garbage take. This is not a parody or a critique. Mark Zuckerberg is not Weird Al Yankovic.

__loam 4 hours ago [-]

It's not settled law so I'm not sure how that's clear to you.

brendoelfrendo 4 hours ago [-]

I think this completely misses the point... the point is that Meta pirated the media they used to train their model.

I am not a fan of US copyright law, but if I torrented millions of books, I would be facing a felony charge in criminal court and a (with statutory damages as high as $150,000 per title in cases of willful infringement) multi-billion dollar lawsuit in civil court.

In my opinion, this has nothing to do with whether or not AI training is transformative and thus fair use, and everything to do with whether or not the laws apply to everyone equally. If Facebook isn't forced to pay billions and elect a sacrificial executive to serve prison time, then I will remain angry.

stackghost 6 hours ago [-]

I'm not on Elsevier's side, but I still think it's bullshit that giant companies are allowed to do things at a scale that I'd go to prison for.

platevoltage 4 hours ago [-]

That's always going to be true for the Capitalist class.

stackghost 2 hours ago [-]

And yet I continue to rage against the dying of the light.

happytoexplain 6 hours ago [-]

"Funny" is how dishonest snipes are framed. It's such a common trope of internet quips, it's wearing me out. Can we please try to just format our disagreements without the snideness?

nullsanity 6 hours ago [-]

[dead]

Telaneo 8 hours ago [-]

Looking forward to the personal liability.

I've wondered what the legalese justification is for letting liability evaporate as it so often does with corps. So far the reasons I'm left with are 'shrugs' and 'the relevant provisions (seemingly? apparently?) simply don't apply', neither of which is any good.

I was going to make a joke about how we should attach magnets to Aaron Swartz' corpse, since that'd make for a pretty potent energy source, given how fast he must be spinning. But honestly, I think he would have seen this sort of thing coming, given how his case was handled and how things really haven't gotten any better.

Aurornis 4 hours ago [-]

The handling of Aaron Swartz’s case was a travesty, but he wasn’t indicted for piracy. The charges were for fraud, unlawfully accessing a protected computer, and damaging a computer.

In the years since the basis of the case has been forgotten and replaced with an assumption about piracy, but it was a case about unlawful access.

woah 8 hours ago [-]

Alternate reality Aaron Swartz escaped canonization and is now running an AI/crypto startup that pays you to upload training data with his YC alum buddies

Telaneo 8 hours ago [-]

Every now and then, I feel like we live in the worst possible world. Then I realise it could be much worse.

This does not comfort me.

forestingfisher 23 minutes ago [-]

Based. If I read a book from a piracy site, I can still cite that book publicly. This should also apply to AI models. I am also opposed to copyright altogether, but that’s another question.

_s_a_m_ 37 minutes ago [-]

Can't wait for absolutely no consequences. Consequences are for peasants like us.

soundworlds 8 hours ago [-]

I should hope that if Zuckerberg isn't severely punished for this, it at least sets a legal precedent for every other person to do the same with immunity.

All the Aaron Swartzes of the future could freely share scientific papers with the world.

agnosticmantis 6 hours ago [-]

Willing to bet they'll lobby for regulatory capture and raise the drawbridge for the little guys.

motbus3 9 hours ago [-]

I personally know of a case where an engineer was told to do something despite all the legal problems, because the company had lawyers for a reason.

Telaneo 8 hours ago [-]

I'd love for that to come out during discovery when the lawsuit hits, but it probably never will. Blowing the whistle is also not a great option in this economy, although I wish more people did.

28304283409234 12 hours ago [-]

So... "move fast and steal things"?

lm411 8 hours ago [-]

When the AI scrapers were just getting started, that is basically what I thought - their plan was to scrape / suck up everything they possibly could before people realized what was happening and blocked them.

The rate at which they were spidering and scraping was so far beyond what any other supposedly legit spider was doing, it seemed like the logical explanation.

eowln 1 hours ago [-]

Steal things? What is this, the “you wouldn’t pirate a car” argument again? I thought we were well over that.

pseudalopex 5 hours ago [-]

Move fast and break laws.

mil22 9 hours ago [-]

It started at the top and at the beginning.

vips7L 9 hours ago [-]

The biggest theft from the working class that has ever happened.

platevoltage 4 hours ago [-]

In Mark's case, he's still breaking things too.

MengerSponge 11 hours ago [-]

Always Has Been

bawolff 7 hours ago [-]

Does it matter? The company's liability would (I assume) not change whether the CEO or some other high level figure authorized it.

The question to answer is, did it happen and if so is this copyright infringement (not covered by fair use), not which company official authorized it.

ipython 11 hours ago [-]

Just gonna say... Aaron Swartz faced years of prison time and ultimately decided to take his own life... for downloading scientific journal articles... to share freely with the world (aka not even profiting from it).

But a multi-billion dollar corporation downloading millions of copyrighted creative works so that they can reshape the entire labor market by training a new type of artificial intelligence model on that data set? Meh, sounds like Silicon Valley disruption, give the man a medal!

defen 8 hours ago [-]

One man illegally downloading copyrighted material is a crime. Multinational corporations illegally downloading copyrighted material is the only remaining growth area in the US economy and vital to national security.

platevoltage 4 hours ago [-]

They should make another one of those PSAs. "You wouldn't steal 10,000,000 cars".

spongebobstoes 8 hours ago [-]

Aaron Swartz was treated unjustly because copyright sucks. we should oppose such laws and treatment, not wield them as retributive tools against our opponents

it is wrong to advocate for everyone to be treated equally unjustly. better to advocate for the removal of the bad laws/structures

ipython 7 hours ago [-]

It would be easier to advocate for the reform of those laws if they were actually applied evenly.

I’m not calling for its use as a “retributive tool”. Just that it be applied evenly.

spongebobstoes 6 hours ago [-]

advocating for more punishment under copyright law is directly opposed to reform or removal of the laws

court precedent is a useful tool of advocacy

ipython 6 hours ago [-]

Technically he wasn’t charged with any copyright violations. See indictment: https://www.documentcloud.org/documents/217117-united-states...

jmye 4 hours ago [-]

What an asinine argument. Advocating for enforcement of existing laws is advocating for enforcement of existing laws. That’s it. Good god.

jmye 4 hours ago [-]

> not wield them as retributive tools against our opponents

No, we should apply them equally to Mark Fucking Zuckerberg (which is decidedly not retributive, however much you want to make an emotional appeal) until such time as they are repealed as laws. It’s not really that complicated.

lesuorac 9 hours ago [-]

And Jstor dropped the lawsuit when Aaron deleted his local copy. DOJ didn't drop theirs.

I doubt Meta has deleted their local copy though ...

qingcharles 7 hours ago [-]

It's absolutely unthinkable that Meta and friends aren't still using a corpus containing the entirety of every book they can obtain. There is no way they're building frontier LLMs without it. You can be sure as hell the Chinese are doing it, so the US corps are absolutely still doing it.

alex1138 8 hours ago [-]

And also I think MIT didn't defend Aaron but maybe I'm wrong about that

zajio1am 8 hours ago [-]

Well, Meta also shared their AI models freely with world

Melatonic 10 hours ago [-]

Truly ahead of his time

TiredOfLife 2 hours ago [-]

> Aaron Swartz faced years of prison time and ultimately decided to take his own life.

According to comments here that was totally deserved. You should not mess with copyright.

alex1138 10 hours ago [-]

Had Aaron copied Snapchat 5 times the DOJ would've been fine with it all. His fault for not having the foresight

alex1138 10 hours ago [-]

(I'm being sarcastic. Zuck gets rewarded for continually copying Snapchat features into his products)

dbg31415 28 minutes ago [-]

https://www.tomshardware.com/tech-industry/artificial-intell...

> "81.7TB"

https://en.wikipedia.org/wiki/United_States_v._Swartz

> "approximately 70 gigabytes"

SrslyJosh 11 hours ago [-]

Rules for thee but not for me.

zx8080 7 hours ago [-]

Can someone explain why we are reading this instead of "Meta was fined for copyright infringement" news?

2ndorderthought 7 hours ago [-]

Because meta will delay any case for several years. Then the lawyers will settle for 1/100th to 1/1000th of what they stole quietly. Meta will rebrand and change its name again just like it did after its last major scandal.

No accountability for rich people has funny patterns like this.

Cider9986 5 hours ago [-]

They might not need to change their name. I don't think that copyright infringement is seen as bad by Americans compared to the privacy stuff that Facebook is known for—not that most Americans care about privacy, I guess I don't really know why Facebook rebranded.

Personally, I would be happy if AI companies are what finally take down intellectual monopoly (intellectual property). I know being anti-intellectual-monopoly isn't a common view, but I don't see average people thinking it is so important, as you can see by the huge increases in piracy recently. Could be wrong about this, I haven't done research on public opinion about copyright.

Honestly, this whole case could be great. Either copyright loses, good for us. Or Zuckerberg loses, also good for us.

I would say that copyright losing is better for society than Zuckerberg losing, because my wish for Zuckerberg to lose comes from hatred, while my wish for copyright to be abolished comes from my wish to help humanity.

Even Supreme Court justices[1] have said the case for copyright is thin.

[1] (before he became a justice) https://en.wikipedia.org/wiki/The_Uneasy_Case_for_Copyright

gizajob 2 hours ago [-]

They don’t need to rebrand - “Meta” (after / exceeding) is a catch all for whatever they’re being meta at today: piracy / privacy infringement / theft / slop production etc.

wrxd 6 minutes ago [-]

Nah, it’s short for metastasis. The only apt name for a company that is after growth at any cost.

tbrownaw 6 hours ago [-]

Well the article says this is the start of a lawsuit, so maybe wait for it to work its way through the courts?

solid_fuel 7 hours ago [-]

In 2024, voters signaled that they don't care about corruption when they reelected the most corrupt administration in American history. Since then, there has been a widespread understanding that the rich will not face consequences in this country. For example, take a look at the Trump administration's suppression of the Epstein files. Or the Trump family's cryptocurrency schemes. Or the ridiculous ballroom.

Anyway, the point is - there will be no justice until the citizens of the United States demand it.

k33n 7 hours ago [-]

[flagged]

2ndorderthought 7 hours ago [-]

This is rage bait and isn't worth spending any oxygen on it.

k33n 7 hours ago [-]

If it elicits rage that has nothing to do with me. It's interesting that no one was able to defend their positions. So of course you guys jump to just flagging everything and doing a whole "investigation" of my posts.

Trump won because he's popular and none of the slander is sticking. Take it out on me I guess.

jkubicek 7 hours ago [-]

https://www.readtangle.com/the-everything-everywhere-all-at-...

This article doesn’t even remotely itemize all of Trump’s corruption, but it’s long and extremely damning.

I would hope that anyone still supporting this administration reads this article and does some introspection on why. I’m guessing that ship probably sailed 6 years ago, though.

k33n 2 minutes ago [-]

None of that is really very damning at all. I was excited to support Trump from day one. When people claimed he supported white supremacy, and it turned out that he condemned it out loud in press conferences countless times, I stopped taking the criticism too seriously. The Russian agent allegations increased my skepticism. Then when his opposition claimed he instructed the nation to inject bleach I just tuned it out for good. None of it is real. Egg prices were the big issue until they drastically decreased. It will be the same with gas prices.

throwaway-11-1 7 hours ago [-]

probably this, since it was 7 years after he was convicted for prostituting a minor, so it's hard to believe any excuse saying they didn't know his background: https://www.yahoo.com/news/articles/epstein-secret-pic-wild-...

Also that there are over 2,000 emails with Peter Thiel. Or maybe the part where Sergey Brin was helping Epstein shop for an aircraft carrier (also after conviction). Honestly it was incredibly revealing that none of these people care that he raped kids. I would love to see the Trump files which were withheld, but clearly that's never gonna happen.

Anyway, congrats to everyone involved on the MAGA golden age!

k33n 7 minutes ago [-]

It’s terrible that Epstein did that. And Thiel is a really odd duck, that’s for sure.

Do you have any evidence that files related to Trump were held back? I don’t believe that’s the case.

He’s mentioned in many of the files. I found it particularly interesting that Trump was an FBI informant that worked with the government to get Epstein convicted.

Have you done more than Trump has done to stop human trafficking? If so, please be specific.

And thank you. I’m really happy that Trump was elected. I found this year’s tax credits for social security income, overtime, and car payments on American vehicles to be especially great. Most favored nation drug pricing was also a really impressive achievement!

solid_fuel 6 hours ago [-]

[flagged]

k33n 17 minutes ago [-]

It’s hilarious to me that you’re under the impression you speak for a lot of people and that your anger over my personal views is so vitriolic.

You should also review the code of conduct for this website and learn to communicate in good faith if you expect to ever be taken seriously. Until then, I hope things get better for you man.

Larrikin 6 hours ago [-]

It is a leap of faith that they are speaking in good faith as a useful idiot.

k33n 15 minutes ago [-]

I’m simply a regular guy who got exactly what I wanted when Trump was elected. Of course I’m speaking in good faith. That’s why misguided, bad faith participants flag my reasonable remarks or send insults instead of staying on topic. It’s just emotional meltdowns.

solid_fuel 6 hours ago [-]

Granted, it’s far more likely that they don’t believe a single word of the drivel they’ve been spreading across this forum, but regardless of intentionality the result is the same.

spate141 11 hours ago [-]

> a Meta spokesperson said, “AI is powering transformative innovations, productivity and creativity for individuals and companies, and courts have rightly found that training AI on copyrighted material can qualify as fair use. We will fight this lawsuit aggressively.”

> Authors have sued AI companies for copyright infringement before - and lost.

So, basically nothing will come out of this

fantasizr 11 hours ago [-]

they'll litigate how meta acquired those materials to train. you can do whatever you want with a book after it's in your house. but how did it get there?

gizajob 9 hours ago [-]

They’re already on record as hoovering up Library Genesis and Anna’s Archive. For their “fair use” copyright bonfire to train their LLM.

So not only are these publishers rightfully pissed, Meta didn’t even give them the $6.99 for each epub to begin with. They’ve stolen the whole thing as part of this “fair use” campaign to destroy human authorship free of even the most basic remuneration.

alex1138 6 hours ago [-]

Fun fact, if you link AA on FB it gets removed

gizajob 3 hours ago [-]

I’m not a user but that doesn’t surprise me.

It’s also that Library Genesis was one of the best things on the internet until it came out that Meta had scraped it, at which point it became harder and harder to access. So not only did they pirate, their doing so made it harder for everyone else to enjoy piracy too.

anthk 9 hours ago [-]

Until Sony, Nintendo, Disney... sue them and Zuck craps his pants. And the NSA themselves, too, because for sure they are half-backed by them. If they keep pirating Japanese and European media, these can just wipe their asses with USA licenses and declare all media from the US uncopyrightable in Europe and Japan.

pessimizer 9 hours ago [-]

Shouldn't this stuff trigger RICO? Why do torrent site operators get led off in cuffs for running operations that usually lose money, but Zuck doesn't?

RICO specifically cites "criminal infringement of a copyright" as laid out in 18 U.S. Code § 2319. If the CEO tells his employees to download hundreds of thousands of works illegally in order to carry out his money-making scheme, how is that not organized crime even if (dubiously) LLM training on the material is fair use?

-----

RICO: https://www.law.cornell.edu/uscode/text/18/part-I/chapter-96

Definitions: https://www.law.cornell.edu/uscode/text/18/1961

> As used in this chapter — (1) “racketeering activity” means (A)[...]; (B) any act which is indictable under any of the following provisions of title 18, United States Code: [...], section 2319 (relating to criminal infringement of a copyright),[...]

18 U.S. Code § 2319 - Criminal infringement of a copyright: https://www.law.cornell.edu/uscode/text/18/2319

-----

edit:

> 18 U.S. Code § 1962 - Prohibited activities

> (c) It shall be unlawful for any person employed by or associated with any enterprise engaged in, or the activities of which affect, interstate or foreign commerce, to conduct or participate, directly or indirectly, in the conduct of such enterprise’s affairs through a pattern of racketeering activity[...].

https://www.law.cornell.edu/uscode/text/18/1962

From the lawsuit:

“Meta — at Zuckerberg’s direction — copied millions of books, journal articles, and other written works without authorization, including those owned or controlled by Plaintiffs and the Class, and then made additional copies of those works to train Llama,” the suit says. “Zuckerberg himself personally authorized and actively encouraged the infringement. Meta also stripped [copyright management information] from the copyrighted works it stole. It did this to conceal its training sources and facilitate their unauthorized use.”

alex1138 8 hours ago [-]

> Meta also stripped [copyright management information] from the copyrighted works it stole. It did this to conceal its training sources and facilitate their unauthorized use.

WTF

andai 4 hours ago [-]

And thus sparked the entire sector of open weight LLMs...

dmitrygr 9 hours ago [-]

Who will be the first to implement a one-layer three-weight model and add it to BitTorrent? Let it “train” on all downloaded files. That makes it fair use. Am I doing this right?

nadermx 10 hours ago [-]

"They then copied those stolen fruits"

How are these fruits "stolen" if they still have what was allegedly stolen?

And you would be hard pressed to prove that LLMs haven't advanced the arts and sciences, so at bare minimum it's transformative, i.e. fair use.

RIMR 9 hours ago [-]

I think you are confusing the idiom "stolen fruits" with an actual accusation of criminal theft. Aside from its use in this phrasing, neither "theft" nor "steal" appears anywhere else in the article.

nadermx 9 hours ago [-]

The article references the complaint. And even then, why use it at all?

_doctor_love 7 hours ago [-]

"You can be unethical and still be legal that’s the way i live my life"

- Mark Zuckerberg

_doctor_love 4 hours ago [-]

Note for the downvoters: this is literally what he said.

HumblyTossed 8 hours ago [-]

Waiting for the perp walk.

Tired of the double standard where CEOs get away with it when bad things happen (because they can’t be everywhere all the time) but get all the benefits when the company makes a great profit (because they’re personally driving results!).

runjake 10 hours ago [-]

I don't have strong opinions on Zuck needing to be punished for this, because I have friends and family doing the same thing, although perhaps not at the same scale. I myself do not download copyrighted content. I think "rules for thee, not for me" goes both ways.

FireBeyond 10 hours ago [-]

How much revenue have your friends and family made from "doing the same thing"?

runjake 10 hours ago [-]

Some. In some cases they've "stolen" tens of thousands in content. Like I said, not at the same scale, but the same "crime" nonetheless.

I'd much rather prosecution focus on Zuck's more serious crimes against privacy and civilization as a whole. But maybe this is a small start?

pessimizer 9 hours ago [-]

> Some. In some cases they've "stolen" tens of thousands in content.

That's not revenue.

palata 7 hours ago [-]

Too rich to care.

alex1138 7 hours ago [-]

Honestly, too rich potentially off fraud

Consider the case of someone who gets banned but Facebook keeps collecting money on their business account. Or consider the case of Facebook's video metrics scandal, or... whatever. It's a little fuzzy translating how much value equates to how much stock price equates to how much real-world is-this-useful-to-me, but it does matter when FB is accused of marketing advertising to more people in a region or country than actually physically exist (Aaron Greenspan, thinkcomp, brought this up in his 2019 testimony to UK parliament).

So fraud builds on itself, you have more fraud money to pay lawyers to try to defend you in fraud cases

danielmarkbruce 8 hours ago [-]

Except, as the article says.... it's not copyright infringement. Whether it should be or not is another issue.

hoppyhoppy2 8 hours ago [-]

>But the latest lawsuit alleges that Meta and Zuckerberg deliberately circumvented copyright-protection mechanisms — and had considered paying to license the works before abandoning that strategy at “Zuckerberg’s personal instruction.” The suit essentially argues that the conduct described falls outside protections afforded by fair-use provisions of the U.S. copyright code.

danielmarkbruce 6 hours ago [-]

One can allege all manner of things.

The title is clickbait at its worst. The situation around copyright and AI is stock standard "CEO makes a decision in an area that is clear as mud".

lenerdenator 11 hours ago [-]

The behavior will continue until a consequence is imposed.

UltraSane 6 hours ago [-]

Remember when nerds loved saying "information wants to be free"?

phyzome 5 hours ago [-]

That was intended as a warning, not an aspiration. Some people misunderstood.

UltraSane 3 hours ago [-]

No, it was always meant as a good thing and was usually said in the context of censorship, which copyright is really just a form of.

josefritzishere 12 hours ago [-]

I would rather Zuckerberg do 6 months in jail and probation than fine Meta.

Lammy 10 hours ago [-]

You aren't going to be able to make me anti-piracy just because some corpo benefits from it too.

idle_zealot 10 hours ago [-]

I think this is an easy distinction to make: copyright is bullshit and knowledge should be free. I have no problem with pirates sharing information freely. I do have a problem with a company taking someone else's work and profiting from it. The only thing worse than copyright as it exists is copyright that can be selectively ignored when the powerful will it. Attempt to use copyright to promote Free software with the GPL? Ha, nope, copyright for me and not for thee; I'll train on your code and sell it back to you. You want to preserve access to a game or film that's unavailable or unplayable? Time to send the C&D and destroy you. Only bad things are possible.

Until we progress as a society to the point that we can put this system behind us we should at least fight to make enforcement uniform. In fact, uniform enforcement is probably a good starting point for arguing for abolition, as the pain of that enforcement is felt by proles and elites alike.

ginko 9 hours ago [-]

People who don't believe in copyright shouldn't be punished for "breaking" it.

Corporations believe in copyright so if they "break" it they should get punished for breaking rules they made up themselves.

Generally the law should be more strict for corporations than for real people.

edit: People downvoting can you argue why you disagree? I do think it's fair for the law to be more strict on the powerful rather than on the powerless.

tintor 7 hours ago [-]

but it is easier to enforce law on the powerless

jmclnx 11 hours ago [-]

I agree, time to start handing out real punishments. I think 6 months is way too small.

If this was you or me, we would be in prison for decades and have a fine in the millions. Time for these people to feel consequences.

As someone said, they will probably settle for around 6 billion, that is the same as say a $100 fine for us.

karanbhangui 11 hours ago [-]

This comment could get its own DSM classification for how insane it is.

I'm all for strong justice, but you want to imprison an executive for decades for copyright violations?

rpdillon 11 hours ago [-]

I'm gonna have to go dig up the link, but isn't there a guy that Nintendo basically has on indentured servitude for the rest of his life?

Ah, found it:

>In April 2023, a 54-year-old programmer named Gary Bowser was released from prison having served 14 months of a 40-month sentence. Good behaviour reduced time behind bars, but now his options are limited. For a while he was crashing on a friend’s couch in Toronto. The weekly physical therapy sessions, which he needs to ease chronic pain, were costing hundreds of dollars every week, and he didn’t have a job. And soon, he would need to start sending cheques to Nintendo. Bowser owes the makers of Super Mario $14.5m (£11.5m), and he’s probably going to spend the rest of his life paying it back.

I'm not even a tiny bit supportive, but there is precedent.

https://www.theguardian.com/games/2024/feb/01/the-man-who-ow...

masfuerte 11 hours ago [-]

American executives have been pushing to criminalise copyright infringement for decades, and America has worked hard to pressure countries all round the world to do this as part of trade deals. There is, for example, a Brit serving an eleven year sentence right now *.

Why should Zuckerberg be exempt?

* https://www.bbc.co.uk/news/uk-65697595

j-bos 10 hours ago [-]

Facebook isn't one of the companies that's been pushing for that.

esseph 9 hours ago [-]

How is that relevant?

j-bos 8 hours ago [-]

"American executives have been pushing to criminalise copyright infringement...Why should Zuckerberg be exempt?" Implicit relevence in the comment to which I'm replying.

esseph 8 hours ago [-]

I think we're misunderstanding one another.

Zuckerberg saying anything about copyright infringement is irrelevant to the actions Meta has taken in consuming and promoting the practice, and he should face criminal liability.

j-bos 5 hours ago [-]

I hear you, though I was replying only to the comment I replied to, so the misunderstanding is more one of target. I don't really care either way; I was more being pedantic regarding the comment's internal premise and conclusion.

AlotOfReading 11 hours ago [-]

The non-strawman way to interpret the parent comment is that they want them to be treated the same as normal copyright violators. Jail is a common result of (criminal) copyright prosecution, with 44% of convicted offenders being imprisoned, averaging 25 months [0].

Now, I personally find the idea of imprisoning people for copyright offenses horrific, but I don't think it's remotely insane that someone else might come to that conclusion, given that we broadly accept it as a society.

[0] https://www.ussc.gov/sites/default/files/pdf/research-and-pu...

yorwba 11 hours ago [-]

From [0]: "In fiscal year 2017, there were 80 copyright/trademark infringement offenders who accounted for 0.1% of all offenders sentenced under the guidelines." This is such a low number that I assume most prosecuted cases are settled without ever making it to sentencing, or alternatively copyright infringement is just hardly ever prosecuted criminally at all.

pessimizer 9 hours ago [-]

I don't understand how the fact that 80 people were prosecuted for copyright violation in one year is an argument that one person shouldn't be prosecuted for copyright violation.

ginko 11 hours ago [-]

Is this controversial? Executives should be held liable, certainly more so than just regular people sharing files.

lenerdenator 10 hours ago [-]

For better or for worse, the idea behind incorporation is that you, as an owner of part or all of the company, are separated from it financially and legally in most circumstances.

Zuckerberg may be CEO, majority shareholder, and on the board of Meta, but he didn't break copyright law, Meta did. So if there were to be a consequence, Meta would pay out the fine. Not sure how you jail a company.

Now, in a company with a real corporate governance structure, the board would look at the loss incurred by said fine, look at Zuckerberg, and immediately fire him for causing the loss. However, like I said before, Zuck's in charge of Meta, so that's not going to happen, and the fine is unlikely to be enough to drastically impact the company's profitability enough to sink his shares, which are the main repository of his wealth. So if he thinks he can make himself richer violating copyright law in the future, he will likely direct Meta to do so.

TL;DR, in the famous words of Bender from Futurama, "Hooray, the system fails again!"

Telaneo 8 hours ago [-]

> Zuckerberg may be CEO, majority shareholder, and on the board of Meta, but he didn't break copyright law, Meta did.

I'm still stuck on how Z telling Meta (or the relevant people at Meta, whatever) to go out there and do illegal shit doesn't make a court say that he's functionally done said illegal shit, or at least encouraged the company to do it, and that he should thus be liable for that. It's not like there's much plausible deniability here. It'd be one thing if the lower ranks thought it'd be fine and did it of their own accord. It's quite another for Z to tell people to go nuts doing illegal shit.

The DMCA makes facilitation of copyright infringement illegal. Telling people to do copyright infringement is surely facilitation of copyright infringement. Surely then, Z having broken the DMCA is a fairly open and shut case, modulo calculating the damages. But apparently not?

lenerdenator 7 hours ago [-]

So, I'm not a lawyer.

I don't even play one on TV.

I wonder if, somehow, you could use or extend RICO statutes to cover this sort of thing.

triceratops 9 hours ago [-]

> Not sure how you jail a company.

> the fine is unlikely to be enough to drastically impact the company's profitability enough to sink his shares

You lack imagination :-) but you've identified both the problem and the solution.

gizajob 9 hours ago [-]

I’ve sometimes pondered this about the legal personhood of a company - it has most of the rights as a human being but can’t suffer any of the major consequences, such as jail.

It could be possible to construct a legalistic jail for a company whereby if it has committed the type of crime that a human could be jailed for, then it could be frozen for the duration, say ten years, and all its assets, shareholder funds, contracts, everything were frozen and impounded.

Of course this seems completely ludicrous because it’s so “out there” but it’s worth having the thought experiment. Things like “corporate manslaughter” really have few consequences for the corporation itself - if it was actually jailed for twenty years and shareholders and officers left frozen out and on pause, then it might be the kind of punishment that really counted for something.

esseph 9 hours ago [-]

> Not sure how you jail a company.

You jail the CEO and the others will stand up and take note.

"But they'll complain" who gives a fuck.

lenerdenator 7 hours ago [-]

In this case, they'll be right. That, again, is the purpose of incorporation. It's also the same concept that keeps someone from emptying out all of your personal bank accounts if your small business gets sued.

What you'd need is something that either removes that protection past a certain amount of value, or, to tell entities like Meta - which are basically sole proprietorships with window dressing - that they're not entitled to the protection of incorporation if they don't enact a real corporate governance model.

esseph 5 hours ago [-]

> It's also the same concept that keeps someone from emptying out all of your personal bank accounts if your small business gets sued.

Unless you have an SBA loan. Then the suing party can't get blood from a stone, but the federal government sure can.

ginko 9 hours ago [-]

Well, I guess the idea of incorporation is wrong then. Execs and major shareholders should absolutely be held personally liable.

surgical_fire 11 hours ago [-]

I would prefer a harsher punishment, but I would begrudgingly accept throwing him in jail for decades.

I always heard that criminals should be thrown in jail, it's time we started doing it to the real criminals.

jaredcwhite 8 hours ago [-]

Decades? Maybe not. A few years at minimum? Hell yeah!

jacques_chester 11 hours ago [-]

There aren't enough things an executive can go to jail for.

Fines don't do anything to deter bad behavior. Either:

* The company pays

* They pay and the company mysteriously increases next year's comp / grants a "loan" / etc

* D&O insurer pays

In all three cases the money comes out of the shareholders' hides. It provides zero personal deterrence. The payoff matrix, as seen by a sociopath, makes it rational to always defect against the common good.

The only punishment that can really focus attention is physical imprisonment in a facility they can't choose.

SOX did this for financial reporting and gee shucks it turned out executives can follow the law after all!

esseph 11 hours ago [-]

> I'm all for strong justice, but you want to imprison an executive for decades for copyright violations?

They stole the life's work of millions of people.

In less civilized times, they likely would have been drawn and quartered by strong horses, and had their limbs dragged to the 4 corners of the continent as a warning to anyone else that would consider doing it again.

ghstinda 7 hours ago [-]

this dude got in over his head with the evil empire, it is interesting how he learned judo and tried to surf, that being said I despise social media and what it did to society

Der_Einzige 8 hours ago [-]

Good.

qarl 12 hours ago [-]

I know people really hate AI training on their work - but is it really any different than a human reading it?

I know there's a complaint that AI can verbatim repeat that work. But so can human savants. No one is suing human savants for reading their books.

Producing copyrighted material, of course. Training on copyrighted material... I just don't see it.

EDIT: Making a perfectly valid point, but it's unpopular, so down I go.

jryan49 11 hours ago [-]

I had to buy the copyrighted material before reading it... Meta apparently operates in a different legal system than me. That's my issue with it.

qarl 11 hours ago [-]

Yes, I have no objection to that part. It's the arguments that training itself is the problem.

Sarah Silverman as the most prominent example.

jryan49 7 hours ago [-]

I mean the act of reproducing the copyrighted material is what is illegal. LLMs I've used for coding have outputted exact copyrights for code verbatim into my code before. When that happens it feels kind of fishy to be honest.

qarl 6 hours ago [-]

Yes. I agree. But many people argue that training itself is a copyright violation. That's the position I'm countering here.

Quarondeau 10 hours ago [-]

There's a huge difference in scale. The human mind can only process a limited portion of all works available over a lifetime. Human learning is therefore naturally limited to small-scale reuse, which serves to keep it proportional.

A machine training on all copyrighted materials in the world for commercial purposes at an industrial scale makes it disproportionate.

qarl 10 hours ago [-]

I see that as a distinction - but does it make a difference?

If a company hired hundreds of savants, then it would be illegal for them to read books?

I don't follow.

Quarondeau 10 hours ago [-]

It would hardly make a dent. And if you hired hundreds of savants, the knowledge would still be spread over hundreds of separate minds.

And even if we grant that those savants are also very skilled at creating "market substitutes" based on their training that are capable of competing with the original works, their maximum creative output would only be a relatively small number of new works, because they can only work at human speed.

qarl 10 hours ago [-]

Ok - but if a company were able to hire one million savants, you feel it should be illegal, because why?

Can you cite something in the copyright laws themselves that suggests this scale distinction?

triceratops 5 hours ago [-]

Your arguments boil down to "If someone were doing a completely different thing and that's ok, then why isn't this ok?" and "It's not in the text of the law so it's definitely fine."

The one million savants are humans, not machines. Humans get more rights automatically in our world today. That's the moral reason for why your example is not the same. The legal stuff will be worked out in the courts and legislatures of every country in the next 5 years.

Quarondeau 9 hours ago [-]

This goes back to the original purpose of copyright, which is to serve as an economic incentive for individual creators and artists to make more art, by securing exclusive rights to use their own works commercially for a specified time. The goal is both the creation of more works, but also to protect the economic viability of artists.

This principle is quite universal and can be found in many places, including the US constitution and US (supreme) court decisions, many international jurisdictions, treaties and conventions.

qarl 9 hours ago [-]

But my question is about your point of scale.

I don't understand why it should be allowed for one savant to study and answer questions about one book, but wrong for a company to hire one million savants to answer questions about one million books.

And I'm asking where in the law or case law this is supported.

thomasahle 11 hours ago [-]

The human savant will remember where they read it and give you credit. It might lead more people to read your work, and ultimately you make money.

The AI won't even know where the page of text it's seeing came from, and people will avoid your book as they can just ask the AI. So you make less money. (Talking about specialized technical books here.)

qarl 10 hours ago [-]

Not necessarily.

nancyminusone 12 hours ago [-]

No one is asking human savants about what they read 1 million times per day.

Suppose they did, and some guy was filling stadiums regularly to hear him recite an entire audio book. That would probably get the attention of someone's lawyers.

qarl 12 hours ago [-]

I don't see your point. The problem is producing the copyrighted work, not processing it beforehand.

If it's illegal for AIs it should be illegal for humans, too. Is that really what you're arguing? It should be illegal for savants to read books?

SahAssar 11 hours ago [-]

I don't think anyone is arguing that the consumption is illegal. It's the reproduction that is illegal.

Read a book, that's fine. Write a book, that's fine. Read a book and then write a book that is 99.9% the same as the book that you read and sell it for profit without a license from the original author, that's infringement.

qarl 11 hours ago [-]

No, if you read the article, the point is in the training, not the reproduction.

That's what all these lawsuits are about - it's the training not the reproduction. I already agreed in my first comment that the reproduction is off limits.

In this case, it appears that Meta torrented illegal copies of the work to do the training. Obviously that's bad. But conflating that with training itself doesn't follow.

SahAssar 10 hours ago [-]

The point of these lawsuits is the piracy. My parent comment was about the general situation, not this specific article.

Pirating content is illegal, regardless of if it is to train an LLM.

Usage of LLMs trained on unlicensed content (basically all of them) might or might not be illegal.

Using any method to reproduce a copyrighted work by using that original as input in a way that supplants the market value of the original is probably illegal.

At least that is my rudimentary understanding.

qarl 9 hours ago [-]

Well - maybe so. But the common belief is that training itself is a violation of copyright, no matter how it's done. That's the argument I'm countering here.

SahAssar 9 hours ago [-]

The issue is that the trainers have not sought licenses for the data and instead outright pirated it.

I don't think anyone thinks that all training is a copyright violation if all the training data is licensed. For example, an LLM trained on CC0 content would be fine with basically everyone.

The problem is that training happens on data that is not licensed for that use. Some of that data also is pirated which makes it even clearer that it is illegal.

qarl 9 hours ago [-]

But why should separate licensing be required at all? A search engine reads and indexes every word of every page it crawls. No one argues that requires licensing, only that the outputs must respect copyright. Why should training be different?

SahAssar 8 hours ago [-]

When Google started outputting summaries people asked the same questions.

If you supplant the value of the original with the original as input then you probably have some legal questions to answer.

qarl 6 hours ago [-]

But that's about the output, not the training. We agree: outputs that supplant the original are the problem. A model constrained to produce only fair use outputs causes no such harm — regardless of what it was trained on.

lobf 8 hours ago [-]

Sharing copyrighted material is illegal. Presumably, if Meta blocked all seeding on the torrents they downloaded, they wouldn't have broken copyright, right?

doublescoop 11 hours ago [-]

If copyright law doesn't extend to the works being used for training, why should it extend to the model that is produced as a result? AI model creators have set up an ethical scenario where the right thing to do is ignore copyright laws when it comes to AI, which includes model use. It might never be legal, but it has become ethical to pirate models, distill them against ToS, etc.

qarl 11 hours ago [-]

I'm not sure I follow. Can you say it a different way?

SahAssar 9 hours ago [-]

I think the parent is basically saying that if you can legally pirate a book to train an LLM, why can't you legally pirate an LLM?

It's a "rules for thee and not for me" argument.

qarl 9 hours ago [-]

AH. Thank you.

triceratops 10 hours ago [-]

Training requires making copies. Even if Meta had purchased each work they'd have had to make copies of it to distribute around the training cluster.

qarl 10 hours ago [-]

Does it though? If they bought a copy for each machine?

triceratops 9 hours ago [-]

Then no copying happened so they'd be on firmer legal ground.

qarl 9 hours ago [-]

Good, we're agreed. My only point here is that training is not inherently a copyright violation.

Barrin92 9 hours ago [-]

>The problem is producing the copyrighted work, not processing it beforehand.

the distinction isn't particularly clear cut with an open source model. If it is able to reproduce copyright protected work with high fidelity such that the works produced would be derivative, that's like trying to get around laws against distribution of protected works by handing them to you in a zip file.

It's a kind of copyright washing to hand you the data as a binary blob and an algorithm to extract them out of it. That wouldn't really fly with any other technology.

And that's really where a lot of the value is mind you, these models are best thought of as lossily compressed versions of their input data. Otherwise Facebook ought to be perfectly fine to train them on public domain data.

qarl 9 hours ago [-]

I tend to agree - but you assume that it would not be possible to create a model that can train on copyrighted work and only output text which would be considered fair use.

That seems very possible to me, and undermines the "training is copyright violation" argument. It's not the training, it's the output.

grebc 11 hours ago [-]

It’s different.

qarl 10 hours ago [-]

Hm. I'm not sure I follow your logic.

grebc 22 minutes ago [-]

You asked, I answered.

If you’re struggling to comprehend that a person reading a book is different then you’re a bad bot.

fantasizr 11 hours ago [-]

reading it after stealing it: gray area. producing & monetizing competing works devaluing the original is a problem

qarl 11 hours ago [-]

So is it a problem when humans produce and monetize competing works? My understanding is that there is quite an industry in humans reading books and synthesizing their points. Cliff's Notes, for example.

fantasizr 10 hours ago [-]

I did some quick googling and most of cliffs notes guides are on public domain works so no problem there, they've also paid to license content, and also have been protected by fair use as parody

qarl 10 hours ago [-]

To Kill a Mockingbird, The Catcher in the Rye, Beloved, The Kite Runner, The Handmaid's Tale are all copyrighted works with a Cliff's Notes guide.

NoOn3 11 hours ago [-]

Why should an AI have the same rights as a human?

How about granting AI all other rights then, for example, allowing it to vote? (sarcasm)

qarl 10 hours ago [-]

We're not talking about rights, we're talking about illegal acts. If it's illegal for a machine to do it, how can it be ok for a human?

Just from a rational argumentation point of view. Clearly if a law is written saying as much, then sure. But there is no such copyright law like that yet.

NoOn3 10 hours ago [-]

The issue is certainly not so simple. But it seems to me, purely theoretically, that the rules don't necessarily have to be the same for living people and non-living machines.

qarl 10 hours ago [-]

Well - actually - it is pretty simple. For something to be illegal, there must be a law saying it's illegal. There are no laws distinguishing humans from machines in copyright law.

triceratops 9 hours ago [-]

> There are no laws distinguishing humans from machines in copyright law

Correct. Because until very recently there was no need.

qarl 9 hours ago [-]

AH. So you agree that it's not illegal.

triceratops 5 hours ago [-]

What isn't?

pkaeding 9 hours ago [-]

But machines don't do things. People do things, and they use tools/machines to do those things more easily or efficiently.

qarl 9 hours ago [-]

My apologies - I'm speaking loosely of course. Translate all my claims about machines breaking the law into claims about humans using machines to break the law.

pkaeding 8 hours ago [-]

Sorry, I wasn't trying to be pedantic. I was trying to make the point (which I think is in line with your point) that the fact that AI is involved here doesn't make a difference. It is a tool, but the people using the tool are (as always) responsible for the outcome.

triceratops 9 hours ago [-]

> I know people really hate AI training on their work - but is it really any different than a human reading it?

Yes it's very different. Humans need to eat, sleep, and pay taxes. You also have to pay them competitive wages.

qarl 9 hours ago [-]

I'm not sure your argument is supported by the actual law as written.

triceratops 9 hours ago [-]

https://news.ycombinator.com/item?id=48029673

There's nothing in the law to support your argument either. The law, however, does say, very unambiguously, that copying without permission isn't allowed. There aren't exceptions for "training" just because it's superficially similar to a human activity (reading a book). A human isn't allowed to hand-copy Harry Potter. Even if they bought all the Harry Potter books.

qarl 9 hours ago [-]

Yes. But training is not copying.

triceratops 9 hours ago [-]

We already covered this: https://news.ycombinator.com/item?id=48029085

0x3f 11 hours ago [-]

HN really loves the copyright lobby when it's against someone they hate, huh

teddyh 10 hours ago [-]

The problem is people at large companies creating these AI models, wanting the freedom to copy artists’ works when using it, but these large companies also want to keep copyright protection intact, for their regular business activities. They want to eat the cake and have it too. And they are arguing for essentially eliminating copyright for their specific purpose and convenience, when copyright has virtually never been loosened for the public’s convenience, even when the exceptions the public asks for are often minor and laudable. If these companies were to argue that copyright should be eliminated because of this new technology, I might not object. But now that they come and ask… no, they pretend to already have, a copyright exception for their specific use, I will happily turn around and use their own copyright maximalist arguments against them.

(Copied from a comment of mine written more than three years ago: <https://news.ycombinator.com/item?id=33582047>)

frozenseven 8 hours ago [-]

>wanting the freedom to copy artists’ works when using it

Learning from copyrighted content is legal - for both humans and AI. If Meta is in hot water for anything, it's piracy and/or storage of copyrighted material.

amanaplanacanal 9 hours ago [-]

I think it's more that the little guy gets the book thrown at them while the rich bitch gets a slap on the wrist. This is widespread, and is BAD regardless of your personal opinion on copyright.

frozenseven 5 hours ago [-]

Yeah, it's very hypocritical.

swader999 11 hours ago [-]

I take issue with the tense used in this framing. It's not 'infringed', it's 'infringing'. To say that it happened is wrong; it's happening, and happening continuously, in these models that are in use. To say a one-time payment settles it is missing the whole scope of this theft.

Royalties are owed and continuously owed as these models are deployed and doing inference. How is it any different to paying a small pittance to someone every time a song is played?

ronsor 10 hours ago [-]

Royalties for inference are unrealistic in a way that even royalties for training aren't.

The LLaMA models were released openly. Copies exist everywhere in the world. You aren't going to be able to charge someone for running `llama.cpp`; a court order ceases to have practical relevance at that point.

eaglelamp 10 hours ago [-]

Inference might be unreasonable for a royalty agreement, but, in assessing damages, it is certainly relevant.

"I made enough copies for everyone" isn't a valid defense for copyright infringement.

swader999 10 hours ago [-]

These models can provide citations so I don't see why they can't tick a royalty owed. I'm sure many here could help build this pipeline.

Aurornis 10 hours ago [-]

First, LLMs do not reliably cite works. They are not looking things up in a database and repeating them. I think this false idea occurs a lot in people who don't understand what LLMs are or how they work.

Second, royalties are not required to cite a source.

Can you imagine how disastrous it would be to everything from news reporting to scientific publishing if that was the case?

swader999 10 hours ago [-]

Yeah well then I want my robot running this crap locally in its brain so I can get it to farm my two acres and haul water for me and I'll unplug from the rest of this nonsense going forward lol.

ronsor 10 hours ago [-]

... LLMs cannot reliably provide citations. If you ask for citations, and the model did not use a web search tool, then whatever "citations" you receive are unreliable. Please do not trust these models to be honest. Just because they can discuss a topic doesn't mean they "know" where the knowledge came from in the same way that you don't need to have studied physics to catch a ball.

platevoltage 4 hours ago [-]

Perhaps it's not. Let's force Meta to pay royalties in the same way you have to pay royalties if you want to sample someone else's song.

kodt 10 hours ago [-]

If you steal a book and read it, should you have to pay every time you use the knowledge gained or recall parts of it from memory?

teddyh 10 hours ago [-]

No. People are not LLMs. And even if some argue that they are mechanically similar, they are legally distinct.

drfloyd51 10 hours ago [-]

If I charged people for the privilege of listening to me recite relevant parts of the book to them for profit? Yes. Depending on the copyright.

kodt 8 hours ago [-]

So like a teacher?

swader999 10 hours ago [-]

If I perform a song in public then yes, I should pay the creator every time I play it. I fail to see the difference here.

kodt 8 hours ago [-]

What if you are performing your own song which was heavily influenced by other artists?

Also, I believe performing covers is legal.

mitthrowaway2 10 hours ago [-]

What if you steal a CD and then play it on your radio station each morning?

Lio 10 hours ago [-]

Even better, what if you transform that stolen CD into an MP3, so the data isn’t the same because a lossy process was used, then share the MP3 with the world as your own work?

I don’t get why the training process doesn’t count as any other form of transformation but then I’m not a lawyer.

kodt 7 hours ago [-]

even better if it is a pirate radio station
