I wonder if Getty and Stability AI were ever in negotiation for licensing their work and this lawsuit is fallout.
Getty announced in Oct they partnered with BRIA (?) to provide generative AI tools using their licensed images [1], and Shutterstock announced a partnership with OpenAI [2].
So it's clear these rights holders are OK with generative AI, as long as they continue to extract their pound of flesh. The language around "protecting artists" is horseshit - if you're a creative and you see Disney, Getty, etc getting behind your cause, you should look _very_ carefully around and make sure you're not the one being screwed.
I don't feel like it's even slightly contradictory for Getty to simultaneously be OK with licensing their content to someone for AI modeling, but not being OK with a random unaffiliated company using their IP. It feels like pretty straightforward copyright management.
"Protecting artists" is not something they claimed as if they were opposed to AI usage at all. In their press release they said:
> It is Getty Images’ position that Stability AI unlawfully copied and processed millions of images protected by copyright and the associated metadata owned or represented by Getty Images absent a license to benefit Stability AI’s commercial interests and to the detriment of the content creators.
It's clear from this that the issue was not that Stability is an AI company, but that it's unlicensed. Getting an exclusive license to the images is specifically what they pay contributors for. Having those copyrights infringed by a competitor makes the content less valuable to Getty, and disincentivizes Getty from paying for new content in the future.
So yeah, it's plausible that this behaviour from Stability could harm content creators. Not because it's AI, but because it's just run-of-the-mill unauthorized usage.
I do NOT require licensing to produce artwork, even if it may or may not be slightly/not-so-slightly inspired by so-called 'copyrights'.
Methinks you are an industry guy. You should read about all of the failed/lost cases that get swept under the rug.
I don't have a clue whether Stability AI will win this, it depends on their exact algorithm and how much they rely on source content, but if I draw something similar to the modern equivalent of the 'Mona Lisa', no, you don't have a right to it unless you can prove that in court. No, you can't copyright a painter painting a woman, either.
I am not an industry guy, nor do I make any claims about how successful the lawsuit will be. I am simply arguing that Getty is not being hypocritical in their position.
But now that you brought it up -- reproducing intact watermarks for Getty in multiple images as shown in the lawsuit feels like maybe your "slightly inspired" argument doesn't apply here.
Well, and herein lies the crux of the issue: can a non-human own or break copyright? You know, not too long ago a guy gave a camera to a monkey; the photo that monkey took became popular and he wanted to capitalize, but the US copyright office clearly stated that they only copyright artistic works created by humans, not those created by non-humans. And it would follow that if non-humans cannot hold copyright, then they cannot break copyright.
Every human artist that ever lived (to my knowledge) heard or saw someone else create a similar piece of art, from which they were inspired. If I create a song right now, how is that any different from an AI doing the same after being trained on copyrighted music? Certainly my song will be entirely made up of elements I've heard before, however large or small. An ML model is doing the same thing. There is nothing truly original in art. Artists are just filters and amplifiers of what they've heard, seen, and liked. Your copyright does not permit you to restrict others from being inspired by your work, or using it for inspiration.
> If I create a song right now, how is that any different than an AI
It's vastly different in that you are a human and the other is more or less a program built for business. There is only one version of yourself and your output, whereas the AI has unlimited copies of itself and is only stopped by lack of processing power and electricity.
I think even AI companies know there is a big difference, and that is the reason they use non-profit and researcher datasets instead of their own.
totally unnecessary explanation, don't know what in my comment led you to think I needed it. Do you regularly define words and etymology for people in comments?
Private people who agreed that Getty would manage their distribution and compensate them when their image was used by one of Getty's clients, yes.
So yeah, not only is it very much Getty's content to distribute, but Stability AI is absolutely screwing over thousands of little guys by not paying royalties too...
Some of that, but Getty also repackages images with licenses that allow it. And sometimes then acts like it "owns" those images by harassing others that use the same original source, including the original copyright owner. That's not especially relevant to the article, but helpful to understand a lot of what's there isn't their IP to manage.
That’s the minority of their images, and if it’s the version from their server with their watermark on it, it actually is their IP to manage. So as you said, it’s not especially relevant, and it’s very clear Stability didn’t limit its scraping of Getty to these kinds of images. Getty being jerks in other instances doesn’t make them wrong here.
It's difficult to find any source that would prove or disprove that. I can find references that they changed 35 million images to "royalty free" in 2014. The examples I can find of those are things like this: https://ichef.bbci.co.uk/news/976/mcs/media/images/73409000/...
The crux of the issue is whether Stable Diffusion is 'learning' like a human would be if they just scrolled through Getty Images or someone's DeviantArt to learn to draw and copy their style (a style is not copyrightable).
This is why Getty Images' claim is that Stable Diffusion is a "collage tool", while SD will most likely argue on the basis of the AI being capable of exhibiting the same level of creativity as a human.
This seems totally irrelevant to the case in my view, which is whether Stability AI had the right to use Getty’s content for training in the manner that they allegedly did.
In general, consuming content cannot be a violation of copyright or a license agreement; what is controlled, however, is stuff like reproduction. So if you see a Getty Images picture and redraw it almost verbatim (even without tracing), you've effectively reproduced it, but the act of viewing it isn't illegal. In this vein, "exhibiting the same level of creativity as a human" is important - if it is, you can argue that the AI is just creating new stuff and the training set was simply "viewed", so anything it creates isn't necessarily infringement of any one of Getty's photos.
That’s really helpful clarification, thanks :) - and helps explain why the claim doc includes a focus on the stages in the process where the Getty content is allegedly copied.
With this in mind it will be even more interesting to see how this plays out.
It's best understood as an extremely sophisticated collage tool. We can make vague analogies about how training ML models is sort of like human learning, but ultimately it's not actually that.
It will still spit out stuff like watermarks and product photos unprompted, because its "learning" is still fundamentally mindless. It works strictly in the realm of pixels, it has no mechanism for understanding context.
It's really really not best understood as a sophisticated collage tool.
You can't take a person and turn them into a cartoon if you're just pasting existing parts of images together. Stable diffusion understands what is a cartoon and it understands what is you (assuming you train it on your face).
It spits out watermarks because it understands that watermarks appear in images and it's going to try to reproduce the watermarks if you ask for something that tends to have a watermark in it.
Unless they have explicitly transferred ownership, then yes. IP generally vests with the creator and is licensed to the recipient unless they have explicitly agreed otherwise.
Not sure if you are asserting this, but just because an image is in the public domain, it doesn’t necessarily mean that Stability has a right to use it for training. I have online content that is licensed as Creative Commons Attribution-NonCommercial-ShareAlike - it seems to me that using this for training would breach at least the non-commercial terms of that license.
Fair enough. I agree that $500 is a big ask for public domain content; however, there is still a service provided by Getty in surfacing this content. It seems fair to charge a fee for that convenience - after all, the consumer has the option to do that work and find it themselves if they value their time less than the cost and have the necessary knowledge. Kevin Kelly identifies this in his excellent article “Better Than Free” (https://kk.org/thetechnium/better-than-fre/)
Getty asserts a related point in the claim- that the work done to build extensive metadata for the content (which Stability apparently used for training their models) is a valuable service in and of itself.
Edit: just clicked through your link above and should have done so before. While the points above still stand (imho), the presentation on the Getty site borders on misrepresentation in the implication that they are copyright holders. Leaves a bad taste in the mouth.
I understand not liking the price point here, but people sell copies of public domain works all the time; it's utterly mundane. Are you also upset that you can buy copies of Frankenstein?
It's not hypocritical to be upset at someone else stealing your content to sell it, just because you sell it yourself. That's probably a part of why you're upset - because you don't get the money.
My understanding is that in general, artists aren't necessarily against generative AI, but their complaint is (partly) around a complete lack of consent of being a part of these training models.
From the article it might be read as "Stability AI didn't even bother attempting to reach out"
> Rather than attempt to negotiate a license with Getty Images for the use of its content, and even though the terms of use of Getty Images' websites expressly prohibit unauthorized reproduction of content for commercial purposes such as those undertaken by Stability AI, Stability AI has copied at least 12 million copyrighted images from Getty Images' websites, along with associated text and metadata, in order to train its Stable Diffusion model.
Basically it's an AI tool that takes a copyrighted photo, and produces an AI-produced photo that is "conceptually identical" but not actually identical.
That is, given a photo of "an asian man lying in a grass field surrounded by a semicircle of driftwood and crows", it would produce a photo that had all of the same concepts, but just slightly different execution on each of them.
The man's face is still asian but clearly a different person. The driftwood is still in a semicircle, but the individual pieces are all different. The crows are still there, but arranged slightly differently. The grass is still grass, but no blades are the same.
So it's essentially cloning the idea/concept/vibes of the target image, but none of the actual "implementation details".
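I don't know what this tool actually does under the hood, but the obvious way to build something like it from public parts is img2img, where the copyrighted photo is literally the starting point of the generation. A minimal sketch, assuming the diffusers img2img pipeline (the model ID, file names, and strength value are all just illustrative):

    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")

    source = Image.open("copyrighted.jpg").convert("RGB").resize((768, 512))
    prompt = "an asian man lying in a grass field surrounded by driftwood and crows"

    # strength controls how much of the source survives: near 0.0 returns the
    # original nearly untouched, near 1.0 mostly ignores it and follows the prompt.
    result = pipe(prompt=prompt, image=source, strength=0.6).images[0]
    result.save("conceptual-clone.jpg")

Note that the source image is an actual input to the sampler here, which matters for the derivative-work question discussed below.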
Does anyone have any intuition of the legal outlook on this?
On one hand, nothing is stopping me from seeing the copyrighted photo and then recruiting a similar looking model, setting up a photoshoot in a field with some stuffed crows, etc. I could replicate what the AI is doing. It would be work, but I could do it. The AI is just automating this.
On the other hand, the actual stated intention of this tool is to get around copyright. Seems sketchy.
Not a lawyer, but given that it takes the original picture as an input, there's a very strong claim that it's a tool to create a derivative work. At which point, it's time to ask the question if this is going to be fair use...
Okay, when you're advertising your product as a way to "get around copyright", you're going to so torpedo your credibility before the judge that there's no point trying to analyze the fair use factors--the judge is going to do whatever it takes to make them come out not in your favor.
Thumbnails are absolutely infringing... but they are fair use in most cases!
Edit: on second thought, they might not even be derivative works (as a derivative work requires the same spark of creativity necessary for a copyrightable work), just outright reproduction. But the point that they are fair use still stands.
It would unambiguously be a derivative work. (IANAL, but I'm willing to assert this.) I wouldn't think it would be infringing: I'm not particularly familiar with US law, but it seems like a classic case of fair use.
> Fortunately, the Court wasn't buying it. It rejected Perfect 10's theory and found that until Perfect 10 gave Google actual knowledge of specific infringements (e.g. specific URLs for infringing images), Google had no duty to act and could not be liable. It also held that Google could not "supervise or control" the third-party websites linked to from its search results, something most people (except apparently Perfect 10) probably already knew. The rule provides strong guidelines for future development and avoids the kind of uncertainty that could chill start-ups trying to get the next great innovation off the ground.
> We conclude that the significantly transformative nature of Google's search engine, particularly in light of its public benefit, outweighs Google's superseding and commercial uses of the thumbnails in this case. In reaching this conclusion, we note the importance of analyzing fair use flexibly in light of new circumstances. We are also mindful of the Supreme Court's direction that "the more transformative the new work, the less will be the significance of other factors, like commercialism, that may weigh against a finding of fair use." Campbell, 510 U.S. at 579.
> With respect to the second factor, "the nature of the copyrighted work," 17 U.S.C. § 107(2), our decision in Kelly is directly on point. There we held that the photographer's images were "creative in nature" and thus "closer to the core of intended copyright protection than are more fact-based works." However, because the photos appeared on the Internet before Arriba used thumbnail versions in its search engine results, this factor weighed only slightly in favor of the photographer.
> Here, the district court found that Perfect 10's images were creative but also previously published. The right of first publication is "the author's right to control the first public appearance of his expression." Because this right encompasses "the choices of when, where, and in what form first to publish a work," id., an author exercises and exhausts this one-time right by publishing the work in any medium. See, e.g., Batjac Prods. Inc. v. GoodTimes Home Video Corp., 160 F.3d 1223, 1235 (9th Cir. 1998) (noting, in the context of the common law right of first publication, that such a right "does not entail multiple first publication rights in every available medium"). Once Perfect 10 has exploited this commercially valuable right of first publication by putting its images on the Internet for paid subscribers, Perfect 10 is no longer entitled to the enhanced protection available for an unpublished work. Accordingly the district court did not err in holding that this factor weighed only slightly in favor of Perfect 10.
...
> Having undertaken a case-specific analysis of all four factors, we now weigh these factors together "in light of the purposes of copyright." In this case, Google has put Perfect 10's thumbnail images (along with millions of other thumbnail images) to a use fundamentally different than the use intended by Perfect 10. In doing so, Google has provided a significant benefit to the public. Weighing this significant transformative use against the unproven use of Google's thumbnails for cell phone downloads, and considering the other fair use factors, all in light of the purpose of copyright, we conclude that Google's use of Perfect 10's thumbnails is a fair use. Because the district court here "found facts sufficient to evaluate each of the statutory factors . . . [we] need not remand for further factfinding." We conclude that Google is likely to succeed in proving its fair use defense and, accordingly, we vacate the preliminary injunction regarding Google's use of thumbnail images.
---
And so, is the model that Stability created significantly transformative, fundamentally different, and likely to provide significant benefit to the public?
Your own first link answers that question: no, that is not fair use.
> Google automatically makes low-rez thumbnails of all the images it indexes. The court concluded that Google's creation and display of these thumbnails from infringing websites did not fall within fair use.
> Having undertaken a case-specific analysis of all four factors, we now weigh these factors together "in light of the purposes of copyright." In this case, Google has put Perfect 10's thumbnail images (along with millions of other thumbnail images) to a use fundamentally different than the use intended by Perfect 10. In doing so, Google has provided a significant benefit to the public. Weighing this significant transformative use against the unproven use of Google's thumbnails for cell phone downloads, and considering the other fair use factors, all in light of the purpose of copyright, we conclude that Google's use of Perfect 10's thumbnails is a fair use. Because the district court here "found facts sufficient to evaluate each of the statutory factors . . . [we] need not remand for further factfinding." We conclude that Google is likely to succeed in proving its fair use defense and, accordingly, we vacate the preliminary injunction regarding Google's use of thumbnail images.
> Not a lawyer, but given that it takes the original picture as an input, there's a very strong claim that it's a tool to create a derivative work.
I don't think it's that simple. As I understood the proposed conversion, there would be two phases. The first phase extracts an 'idea prompt' from a source expression (resulting in "an asian man lying in a grass field surrounded by a semicircle of driftwood and crows"). The second phase generates a new expression from this prompt alone.
As long as this intermediate 'idea prompt' is sufficiently devoid of 'expression' elements to withstand scrutiny (and the tool could even embed its idea prompt in metadata for auditing purposes), I would imagine the final output to be likewise considered a sufficiently transformative work compared to the original.
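For concreteness, a minimal sketch of what that two-phase pipeline could look like, assuming an off-the-shelf captioner (BLIP via transformers) and the diffusers text-to-image pipeline; the model IDs are illustrative, and the 'idea prompt' here is just an automatic caption:

    import torch
    from PIL import Image
    from transformers import BlipProcessor, BlipForConditionalGeneration
    from diffusers import StableDiffusionPipeline

    # Phase 1: extract an "idea prompt" (a plain-text caption) from the source image.
    source = Image.open("source.jpg").convert("RGB")
    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    captioner = BlipForConditionalGeneration.from_pretrained(
        "Salesforce/blip-image-captioning-base")
    inputs = processor(source, return_tensors="pt")
    caption = processor.decode(captioner.generate(**inputs)[0], skip_special_tokens=True)
    # e.g. "a man lying in a grass field surrounded by driftwood and crows"

    # Phase 2: generate a new expression from the idea prompt alone. The source
    # pixels never reach the generator; only the text (auditable metadata) does.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")
    pipe(caption).images[0].save("regenerated.jpg")

Whether a caption like that really is devoid of expression is exactly the open question: a text bottleneck helps, but a sufficiently detailed caption starts smuggling the expression through.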
That’s not really how it works. Copyright is less about process than result. For example, a human can, completely from memory, draw an entirely new image and it can be found to be infringing on an original. Similarly if I took someone else’s book and made a silent film adaptation of it. That is a derivative work; it won’t contain any of the literal words from the book, or any words at all, but if it contains a representation of the unique, identifiable characteristics of the underlying work it’s going to be considered a derivative (infringing) work. So many of these gotchas people talk about with AI are not nearly so simple.
No, that's the opposite: patents are about the result and copyright is about the process. If it worked like that, all those image boards like Getty itself would never be able to exist, since there are so many near-identical images produced independently.
That's not true. If you patent the microwave, I am free to heat my food in the oven. The result is the same (the food is cooked), but the method is different. The process is a key part of the patent.
Edit: After (gasp) researching (which maybe we should all consider before commenting): you can patent one of 4 things - a process, a machine, an article of manufacture, or a composition of matter. Source: https://en.wikipedia.org/wiki/Method_(patent) - so a process is definitely something you can patent, but it is not necessarily required.
Yeah sure; to be more precise, to determine whether something is a derivative work under copyright, the creation process must be examined - there's no way around it. You can totally create very similar artworks that don't share a copyright.
As long as you're saying an AI-transformed image wouldn't be considered an inherently derivative work because of its process, I believe we're in violent agreement.
Fair use analysis isn't some dispassionate, exhaustive analysis that's used to come to a decision. No, the judge makes a decision, and then uses the analysis to justify the decision. For a particularly stark example of how divergent the analyses can be to justify different decisions, read both opinions in Google v Oracle, which are almost diametrically opposite despite being written on the same facts.
In the law you don't have "implementation details"; you have ideas and expressions. Ideas are patentable, expressions are copyrightable, and never the twain shall meet. The exact boundary between those two things is defined by the merger doctrine and the thin copyright doctrine[0]; but for our purposes we just need to know that if your AI art generator spits out something that's "substantially similar" to the original, it's infringing, even if it's not exactly the same. The law anticipated tracing.
What you are proposing is that you just "wash off" the expression from the idea and regenerate a new image from that idea. Great, except this isn't how AI art generators work. They aren't breaking down images into their core ideas, because those only exist in our human minds[1]. They're finding patterns of pixels that happen to match the text prompt well enough; and often times that includes the original image itself. Overfitting is a huge problem with conditional U-Net models and Google even released a paper detailing a way to find and extract memorized images out of an art generator.
So what will likely happen is that the art generator will just copy the image, or make one that's close enough that a judge would say it's a copy.
[0] If an expression is fundamentally wrapped up in an uncopyrightable idea and can't be expressed any other way, then it's also uncopyrightable. But if an expression is made up of uncopyrightable ideas, but separable from them, then you get a thin copyright on the arrangement of such.
[1] And, also, most humans are terrible at distinguishing idea and expression in the way that copyright law demands.
So these 'copyright laundering' AIs also need to use an LLM to generate a legal defence that shows that the copied aspects were, in fact, the only way to express the idea.
Which is a good thing, since having to read thousands of long spurious AI-generated copyright defences will quickly motivate lawyers to create laws against AI-generated legal defences.
I honestly think that a parallel ML system, one that explains in words the decision of the deciding ML, is the only way we will get explainable ML.
That is somewhat analogous to how humans decide. Our brain runs over data and various potential outcomes and maybe reasoning, and then decides. A bit later it generates a post-hoc explanation of said decision. I think we have all experienced ourselves and others having a totally wrong explanation for a decision, but it is somehow more satisfying than "a gut feel" or "because I said so".
Not a lawyer but I’m under the impression that one of the things they do in law school is spend a bunch of time constructing increasingly ridiculous hypotheticals, to work out the specifics of their arguments.
I think AI is overhyped in general, but as a tool to rapidly instantiate absurd hypotheticals it is really impressive. This is cool and good, IMO.
Seems like it's a matter of time (or maybe it's already done and I haven't seen it yet) before you get optimizers that take an image and produce a seed+prompt that yields a super, super conceptually similar image.
I think being able to selectively optimize for "magic" random seeds in the diffusion algorithm will be kind of critically important here. Different seeds can produce very different images given the same prompt.
What if an optimizer can find seed+prompt combos that are just as good at cloning as img2img?
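A crude version of that optimizer is just brute-force seed search scored with CLIP similarity; a sketch assuming the diffusers and transformers libraries (a serious attempt would optimize the prompt embedding with gradients rather than enumerating seeds):

    import torch
    from PIL import Image
    from diffusers import StableDiffusionPipeline
    from transformers import CLIPModel, CLIPProcessor

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")
    clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    target = Image.open("target.jpg").convert("RGB")  # the image to "clone"
    prompt = "a man lying in a grass field surrounded by driftwood and crows"

    def similarity(a, b):
        # Cosine similarity between the CLIP embeddings of two images.
        with torch.no_grad():
            feats = clip.get_image_features(**proc(images=[a, b], return_tensors="pt"))
        feats = feats / feats.norm(dim=-1, keepdim=True)
        return (feats[0] @ feats[1]).item()

    # Brute-force the "magic" seed: same prompt, many seeds, keep the closest match.
    best = max(range(64), key=lambda seed: similarity(
        pipe(prompt, generator=torch.Generator("cuda").manual_seed(seed)).images[0],
        target))
    print("best seed:", best)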
If you want to clone an image (for some reason) just encode it in latent space and retrieve it using a deterministic sampler such as DDIM. Or you can simply right-click and copy.
If you instead don't want to clone an image, you can just extract the CLIP image embeddings from it and use them to condition a generative model like Dalle 2, Midjourney or Karlo (open source). The CLIP embeddings extract really well the semantic meaning of an image.
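To make the two paths concrete, a sketch assuming diffusers' AutoencoderKL (the Stable Diffusion VAE) and a stock CLIP model: the latent round trip behaves like lossy compression and hands you essentially the same picture back, while the CLIP embedding keeps only semantics and cannot be decoded back into the original pixels.

    import torch
    from PIL import Image
    from torchvision.transforms.functional import to_tensor, to_pil_image
    from diffusers import AutoencoderKL
    from transformers import CLIPModel, CLIPProcessor

    img = Image.open("photo.jpg").convert("RGB").resize((512, 512))

    # Path 1: encode into SD's latent space and decode straight back.
    # A 512x512x3 image becomes a 4x64x64 latent (48x fewer numbers), and the
    # decoder returns essentially the same picture: lossy compression.
    vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
    x = to_tensor(img).unsqueeze(0) * 2 - 1  # scale pixels to [-1, 1]
    with torch.no_grad():
        z = vae.encode(x).latent_dist.sample()
        recon = vae.decode(z).sample
    to_pil_image((recon[0] / 2 + 0.5).clamp(0, 1)).save("roundtrip.jpg")

    # Path 2: a CLIP image embedding, a single 512-dim vector of what the image
    # is "about". A generator can be conditioned on it, but the original pixels
    # cannot be recovered from it.
    clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
    with torch.no_grad():
        emb = clip.get_image_features(**proc(images=img, return_tensors="pt"))
    print(emb.shape)  # torch.Size([1, 512])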
Going image -> vector space -> image feels very much like compression for the purpose of copyright. Like a lower quality JPEG of some image still has the same copyright properties as the original.
Something about going image -> prompt -> image feels like it "subverts" this somehow, even if the prompt is hyper-optimized to recreate the original image.
Obviously, this is just my feel/impression of it, the real test is how a jury feels about it.
The next few years will be really interesting in exposing that this is a really massive gray area.
> That's only because you understand the algorithm in the first example.
I can start with van Gogh's Starry Night, get the prompt "starry night, van Gogh" and get Starry Night back.
I'm using Starry Night as an example because Stable Diffusion consistently reproduces it, sometimes with the original frame, even with vaguely related prompts.
I'd say the jury will be making the right decision. Especially if the original image was part of the training set.
What you’re missing is that the Stable Diffusion defense will roll out example after example of non-infringing cases, like inpainting to remove power lines from a vacation photo. They will be trying to establish significant non-infringing commercial use in order to establish that the tool is not specifically designed to get around copyright.
A tool that specifically advertises itself at getting around copyright is going to have a very hard time in court…
I think the jurors would be aware (or at least accepting) of the notion that describing a picture using words does not create a derived work. That is something a good lawyer could make a show of, even to the point of having an artist draw a picture from such a description.
The vectors are much more opaque because there's no straightforward human equivalent.
My point is that if there is a prompt that results in a picture that is a nearly identical copy, the average jury member is going to think "yep, that's a copy".
Trying to explain how that "isn't really a copy" by explaining AI concepts isn't going to win the day, not when they can SEE the copy.
If the textual description in and of itself is already "not a copy", though, the question of whether the image produced from that textual description is a copy shouldn't arise, no? It's not like the jury is ruling on random questions; they're going to give answers to whatever the judge puts in front of them.
If you describe an image to another human (someone who has never seen the original photo, mind you) with sufficient fidelity on the specific expression that they are able to produce something that feels "substantially similar" to the original photo, that is still going to be copyright infringement. There is a reason that, if you want to reverse engineer and reimplement some piece of technology, even using humans, you not only have a team which doesn't get to see the original device and will implement it based on a description from a second team, whose job is to carefully analyze the device and build documentation for the first team... you then take that documentation and run it through lawyers who carefully attempt to verify that the description covers only the factual content required for interoperability/behavior and doesn't accidentally include any of the expression inherent in the design of the original product. These examples from this demo video aren't just "image of a pile of gemstones" but stuff like "image of exactly six gems in this specific orientation with this specific color palette and texture and lighting and..."; there's just no way you are going to get a lawyer to sign off on a description like that as not including any of the expressive elements of the original image.
Note that there are non-infringing ways to use img2img. If I drew a line sketch of the scene that I wanted and then used img2img to fill it out - that's not infringing.
On the other hand, if I were to take an existing image and used img2img to transform it, that is a derivative work. If it was my own image, created with a camera rather than MS Paint, it's still a derivative work, but I, as the creator of the original, can create derivative works of my own (I do it all the time).
The idea that I could use someone else's image and run it through img2img to make it "not a copy" misses the point of copyright and the exclusive right of the copyright holder.
> Only the owner of copyright in a work has the right to prepare, or to authorize someone else to create, an adaptation of that work. The owner of a copyright is generally the author or someone who has obtained the exclusive rights from the author. In any case where a copyrighted work is used without the permission of the copyright owner, copyright protection will not extend to any part of the work in which such material has been used unlawfully. The unauthorized adaptation of a work may constitute copyright infringement.
---
Starting from text, it is difficult to get generative art to create the Mona Lisa from a description alone without significant effort (without saying "I want that image").
The question is then "is a description of the Mona Lisa a derivative work" and that is likely sufficiently far away from the original to be "no."
>On one hand, nothing is stopping me from seeing the copyrighted photo and then recruiting a similar looking model, setting up a photoshoot in a field with some stuffed crows, etc. I could replicate what the AI is doing. It would be work, but I could do it. The AI is just automating this.
If you trace art by hand that is still copyright infringement. If you paraphrase a passage from a book by hand that is still copyright infringement.
> If you trace art by hand that is still copyright infringement. If you paraphrase a passage from a book by hand that is still copyright infringement.
What if I see some fine art, I, a non-artist, make a super-low quality recreation of it with crayons, give that + a verbal description to a different professional artist who has not seen the original, and have them "upscale" my bad drawing into new fine art.
Their art would be conceptually very similar to the original. Same layout, same concept, same vibes, same style (if my verbal description was sufficiently good) but all the details would be different. Is this still infringement?
If this worked, all artists would be open to lawsuits. How many ways can you draw flowers in a vase? "They stole my idea, your honour! They used the same number of flowers in a vase; I came up with the concept of 3 flowers first."
I think artists and copyright intermediaries would like to have "wildcard" copyright, "draw a flower once, all flowers belong to you now", and it would be very bad for creativity if they got their way.
The vast majority of profit most artists are making is also copyright infringement. Custom porn is where the money is, and Rule 34 is likely a huge part of that.
> If you paraphrase a passage from a book by hand that is still copyright infringement.
If you have case law examples, it would be useful to cite them, but in general this is not true. It can be true when the paraphrasing is substantially similar to the original work. It would not be true, for example, if you paraphrase using all different words and a lot fewer of them. Copyright only protects the fixed tangible expression of the work, not the idea behind it.
>Copyright only protects the fixed tangible expression of the work
It also protects against people making modifications to those works. When people paraphrase something they typically do so by taking the original work, swapping words with synonyms, and shuffling the order.
Yeah, that’s true. I think this might hinge on the word ‘paraphrasing’ though. That word generally means summarizing using your own words, not playing mad libs on the original text.
I was also tempted to quote the expression/idea dichotomy, but looking at the examples, you’re totally right: the example you showed is aiming to land directly on the line between legal and illegal, it is legitimately hard to reason about, it is different than borrowing either the idea or the expression, and it absolutely is sketchy (and will probably be tested in court if it gets much more attention).
The problem here is that it does more than just clone the idea/concept/vibes, it really does tread into copying the implementation details. It matches lighting & composition, it matches subject and color, it can mimic the equipment used & props. People have done this manually, and been sued for it. Mostly it happens when an unknown artist steals the style of a specific well-known, best-selling artist. But now we’ve built a machine to near-copy anything in any style, with the intent of borrowing as much of the expression as legally possible, which seems like it probably can’t end well from a legal perspective. And because the technology for building these kind of machines is essentially public knowledge now, it’s hard to imagine this won’t be a problem from now on.
Yup. Or even more clearly with example [0]. Input request for an image of a person named "Ann Graham Lotz", it returned the exact image in the training set, slightly degraded.
MIT Tech Review reports research with hundreds of similar results [1]. "The researchers, from Google, DeepMind, UC Berkeley, ETH Zürich, and Princeton, got their results by prompting Stable Diffusion and Google’s Imagen with captions for images, such as a person’s name, many times. Then they analyzed whether any of the images they generated matched original images in the model’s database. The group managed to extract over 100 replicas of images in the AI’s training set."
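That check is easy to reproduce in spirit on a small scale. A rough sketch, assuming diffusers and a local list of training captions and images; the paper's actual pipeline and near-duplicate metric are far more careful than the crude pixel distance used here:

    import numpy as np
    import torch
    from PIL import Image
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")

    def pixel_distance(a, b, size=(64, 64)):
        # Crude RMS distance on downscaled images; only captures the idea.
        a = np.asarray(a.resize(size), dtype=np.float32) / 255.0
        b = np.asarray(b.resize(size), dtype=np.float32) / 255.0
        return float(np.sqrt(((a - b) ** 2).mean()))

    # For each training caption, sample repeatedly and flag any generation that
    # lands suspiciously close to the corresponding training image.
    training_pairs = [("Ann Graham Lotz", "train/ann_graham_lotz.jpg")]  # illustrative
    for caption, path in training_pairs:
        original = Image.open(path).convert("RGB")
        for i in range(100):
            sample = pipe(caption).images[0]
            d = pixel_distance(sample, original)
            if d < 0.15:  # threshold is a guess; calibrate on known duplicates
                print(f"possible memorization: {caption!r}, sample {i}, d={d:.3f}")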
Yup; you identify a new mode of failure! If it cannot recognize one image as identical to another, it ends up overtraining and over-generalizing on the dups....
Again, showing zero of anything resembling abstract understanding; merely a statistical correlation of blobs within each image with the text, e.g., finding the common set of pixels in each image that corresponds to "astronaut" and filtering out the rest.
Obviously, this is far more useful than color-key-deletion or whole-image search. But, it isn't intelligent abstraction.
With cases like this I am reminded of the Pirate Bay case, in that there are two ways people can be found guilty of copyright infringement: one can prove that a copy has been made, or one can convince a judge that the opposite story provided by the defendant is not believable.
After that case there have been multiple theories on how to evade copyright law, which all seem like they would equally fail at convincing a judge. One of my favorites is the method used by Freenet, which takes a file, first encrypts it, and then splits it into small parts. Those parts are so small that multiple files will share identical parts with each other, so it is impossible to know for sure which file a person is downloading just by looking at the parts. Through a different channel they also provide a recipe for how to reconstruct the file, and recipes by themselves are not enough evidence to prove a download.
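As a toy illustration of the shape of that scheme (this is not Freenet's actual CHK algorithm, and the 'cipher' below is a stand-in): encrypt the file, split the ciphertext into fixed-size blocks, and address each block by its hash. The stored blocks alone don't identify the file; only the separately distributed recipe plus the key do.

    import hashlib, os

    def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
        # SHA-256 in counter mode as a toy stream cipher (not for real use).
        out, counter = b"", 0
        while len(out) < length:
            out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
            counter += 1
        return out[:length]

    def split(data: bytes, key: bytes, nonce: bytes, block_size: int = 32):
        ct = bytes(a ^ b for a, b in zip(data, keystream(key, nonce, len(data))))
        blocks, recipe = {}, []
        for i in range(0, len(ct), block_size):
            block = ct[i:i + block_size]
            h = hashlib.sha256(block).hexdigest()
            blocks[h] = block   # the network stores blocks keyed by hash only
            recipe.append(h)    # the ordered hash list is the "recipe"
        return blocks, recipe

    def reassemble(blocks, recipe, key, nonce):
        ct = b"".join(blocks[h] for h in recipe)
        return bytes(a ^ b for a, b in zip(ct, keystream(key, nonce, len(ct))))

    key, nonce = os.urandom(16), os.urandom(16)
    blocks, recipe = split(b"some copyrighted work", key, nonce)
    assert reassemble(blocks, recipe, key, nonce) == b"some copyrighted work"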
Sounds perfect, until one has to try to convince a judge that no copying has occurred.
However, the case was settled and the creator of the poster lied in court about his sources--so I'm not sure I'd draw too many conclusions from the poster inspired by the photograph.
I think that really depends. You can describe the photo conceptually and build a new photo from that, which would be equivalent to clean room engineering.
But "you" (or I) don't describe the photo, their process does, and I'm not confident that Separation of Concerns would be enough to establish a clean room.
Clean room? The point of clean room software engineering is that the expressive parts of software are at the organizational level - the basically arbitrary stuff where everyone has a case for it being "the best way to structure a program". So if you happen to have seen the source code, you might have been influenced by some relationship between two classes that made sense, and since Oracle's lawyers are going to be breathing down your neck, you had better go out of your way to avoid their ire.
There is no such thing as “clean room painting” and it should be really obvious why that is…
I think that depends on how it’s done. If there’s actual visual representation (bitmaps or FFT coefficients) copied around - and, most importantly, more than what might be described as fair use - that would probably be true. If a highly accurate conceptual description is generated and an image is then generated from that, I would see no issue.
I don’t know how it is implemented for the software in question.
IANAL, but from my understanding of the points at issue, I think a court might be likely to find that a) sucking the image into RAM is a copy in the first place; b) the FFT/etc. would be a (first) derivative work; c) using a form of the original image sufficient to communicate to the alteration processes what it should be altering would constitute a copy; and/or d) identifying something as a de-copyrighted work will undercut any defenses.
But a derivative work requires that major copyrightable elements remain. So it seems like there is a fuzzy line on whether something is really derivative or not.
I don't think it is necessarily enough to have simply started with a copyrighted work.
Yes, it has to be visually derivative in order to be a copyright infringement, not mathematically derivative!
Imagine if Stable Diffusion was made illegal. Someone accuses me of using this illegal tool for an image that doesn’t look like anyone else’s image as far as the court is concerned for copyright. I put the image on my website. If the image itself is not at all infringing, then what is the evidence that Stable Diffusion was used? Should the police be issued a warrant to search my private property for proof that I used Stable Diffusion without a shred of evidence?
The "idea" at issue isn't like "a picture of a tree," though. It's "a picture of a tree as conceived by Ansel Adams and photographed with [technical details]."
"a picture of One WTC as conceived by Ansel Adams and photographed with a Hasselblad" is still an idea, because One WTC opened in 2014 & Mr. Adams died thirty years ago in 1984.
Every piece of art is a collage of other pieces of art. Everyone is inspired by others, and copies others, whether they admit it or not. You can't write a song or paint a picture and say that the way you stroke a brush or strum a guitar is not something you learned by watching others. I think in general too often people are awarded copyright judgements when they shouldn't be; however, in this instance I do feel that absolutely nothing of artistic value is being added, and it does seem to be intended to replace the original. If you like a song you heard, and you write something with the same chords and a similar melody, that's fair game, but if you write something so similar _and_ market and sell it as a replacement ("don't buy that expensive song, this one's basically the same, and half the price"), then I'm quite sure you're in some very murky, and potentially illegal, waters.
IANAL but wouldn't the creativity involved in doing that fall under Fair Use? Although some googling says that if someone remixed two songs, they'd need copyright permission from each artist. I guess there's a spectrum between that and more of a parody, like Weird Al's Amish Paradise.
Fair use defences aren't automatic, and all go on a case-by-case basis. One of the major criteria is 'effect on the work's value'. Given how easy it is to find people (in this very thread, even) gleefully predicting the death of stock photo sites, it'd be pretty easy for a lawyer to tear said defence apart.
Weird Al gets permission for all of his music. Even if the parody defence were valid, song licensing is pretty cheap, and defending lawsuits against artists signed with major labels is expensive.
I agree. There’s a fantastic debate ahead that’s very novel. It’s especially exciting for how sudden this all is. My hunch is AI wins and I’d love it if the defense was written by AI.
I don't see this working for Getty but it will be interesting to see. Israel and the EU have preemptively determined that the use of copyrighted data to train ML models is almost always fair use, and I believe the Authors Guild case in the USA also sets strong precedent there. That really leaves the UK, and they are likely to go with everyone else.
IANAL but if I were to guess, I would think the defense here would center on how would what Stability AI has done differ in any meaningful way from human artists browsing through Getty's collection themselves, to train their human brains on what art to generate. The latter activity, even Getty would likely have to agree, is surely legal.
It would be a weak defense, undone because Stability AI is not a human and will never be one. It's a computer-based tool using assets whose license explicitly says not to use them that way.
It may have nothing to do with copyright, just the terms under which you accessed them.
Most TOS boilerplate typically prohibits commercial usage of the library without an explicit license. Getty and every other company has that foresight; if there is money to be made, they want their cut - that is all this really comes down to.
Let’s say you just wanted to consume all the images to sell an analysis of how many b/w images are in their catalog; that would still be a breach of their terms unless their TOS allowed you to do so. The novel copyright question may simply not even matter in this particular case.
Also NAL, but I'm cynical enough to believe that Getty's lawyers would avoid answering this question directly. And then wax lyrical about how their client should indeed receive a royalty for anyone attempting to use Getty's copyrighted works to learn the art.
> Stability AI is well aware that Stable Diffusion generates images that include distorted versions of Getty Images’ watermark and other watermarks, but it has not modified its model to prevent that from happening.
> Making matters worse, Stability AI has caused the Stable Diffusion model to incorporate a modified version of the Getty Images’ watermark to bizarre or grotesque synthetic imagery that tarnishes Getty Images’ hard-earned reputation, such as the image below
(see page 18 for an example)
Getty is going to win something. There is clearly a problem with the model. The outputs are often not novel enough to make them indistinguishable from the training data.
IANAL either, but there are attacks you can use to extract the original training data from the models: https://arxiv.org/abs/2301.13188. I’d guess these attacks will get better with time.
So in some ways, you can argue that Stability is also directly redistributing the original images (albeit in a compressed format).
IANAL either, but I believe for something to be copyrightable (and not an infringement on someone else's copyright), there needs to be a "modicum of creativity" in the new work. It makes me wonder if at least part of the case will depend on whether the court thinks an AI can be creative.
It kind of already means human-generated. Courts decided that a monkey cannot hold a copyright over a selfie it took. Another part of copyright law is the actual ability to sue over copyright infringement, and they decided a monkey does not have the ability or right to do so.
Author's Guild went the way it did because the judge was convinced that Google Books promoted books rather than replacing them. With an identical use case and no references back to Getty, you'd be hard pressed to make a similar case here.
While undoubtedly the law in practice is driven by such sympathies, at least the black letter legal justification is not. The use cases are identical; the model built by Stability AI does not represent eo ipso a copyright violation, in the same way the Google Books index did not represent such a violation.
Both were built by scanning copyright materials, and in Authors Guild the Southern District of NY found that such scanning does not constitute a copyright violation.
The arguments in this filing are pretty weak and I think it's going to all just boil down to fair use in the end. I don't see this trademark claim going anywhere.
> The arguments in this filing are pretty weak and I think it's going to all just boil down to fair use in the end.
I don't think so and the complaint isn't just about 'trademarks' either.
OpenAI was able to get explicit permission [0] from Shutterstock to train on their images for DALL-E 2. Stable Diffusion did not, and is commercializing the use of the model with DreamStudio as a SaaS offering, and the model has been found to output images with Getty's watermark [1] without their permission. That doesn't seem to be 'fair use', nor is it transformative, given the watermark is clearly visible in the generated examples here: [1]
This is going to end with a settlement and a Stable Diffusion licensing deal with Getty over the images, just like OpenAI did for DALL-E 2 with Shutterstock. Neither Shutterstock nor Getty is against generative AI either, as shown in this recent deal with Getty [2]
So what's the plan for the creatives whose work style becomes reproducible by tech?
Sure, they also feed on each other's work etc., but at the core of all these copyright, piracy, patent, and similar discussions is how these people are supposed to be compensated.
Working at a software company by day and preaching open source, anti-copyright, anti-patent, open-access free-for-all by night works for the software people, but people in the creative industries are really struggling to get paid for their work.
The genie isn't going back in the bottle; the tech will be able to produce derivative work over the work of other people, and I'm not looking forward to the greater number of struggling artists.
> Working in the software company in the day and preaching open source, anti copyright anti patents open access free for all
"Open source" is copyright - it's not anti-copyright. It uses copyright to grant a license to use under certain conditions, and sometimes with obligations. You might keep it proprietary, you might use GPL to require that the software stays open virally, or you might use a more permissive BSD-style license. The important part here is that as the creator, you choose how you want your work to by copyrighted.
Artists already get compensated for their work (...more or less)
Trainers should require consent from artists to train their model on an artist's work. A part of obtaining that consent could be some form of compensation, and ideally credit when generating the images. I don't believe many artists are necessarily concerned about people copying their style. From what I've seen, they just don't want their artwork and their style sucked into the AI-borg to be reproduced en masse.
I do not think scraping images and using them to train models is fair use. I believe AI labs should obtain consent.
>So what's the plan for the creatives whose work style becomes reproducible by tech?
You don't need tech (or at least computers).
To the degree that a portrait photographer, say, has a distinctive lighting and posing style, that can absolutely be copied. And there are many examples in art of art techniques that were widely copied.
The point is, how does this person get paid? Copying styles happens all the time and it's part of the trade; copying each other is part of being human and it's how we come up with new things. But the assumption was that these people would get paid for their work because copying their style doesn't scale well. A person with a particular style can get paid for game artwork since it's not that easy to copy the style - and now suddenly they don't get paid, because their work is simply analysed by a machine and produced on demand.
It's like building your security on hard-to-brute-force secrets, and suddenly someone makes a machine that instantly brute-forces any secret. It's a similar kind of disaster, with the difference that a human being can't just switch to doing something else, and the value they added to society is not compensated.
Nevermind Getty, I wonder how contributors who use Creative Commons licenses will feel. Anyone who's uploaded to Flickr, or Wikimedia Commons, or even YouTube could be victimized by AI generation.
The AI will launder the content just like GitHub's CoPilot, and attribution will be impossible on the other end. Since Creative Commons licenses are not PD, and often do require attribution (CC-BY) or they prohibit commercial usage (CC-NC) or they require that derivatives must be licensed the same way (CC-SA) or even prohibit derivatives outright (CC-ND) all of those requirements are going to be stomped into dust by generative AI.
And those licensors won't be big enough to sue anyone.
This is a strange take. Folks who choose to use a CC license do so because they want their work in the public and to be "Use(d) & Remix(ed)" https://creativecommons.org/use-remix/. The creative commons is very clearly designed to encourage reuse, remix and sharing, and the practice of suing over not following terms to the letter is a gross warping of the license that's happened over the last few years with copyright trolls. The Creative Commons organization has explicitly called this out. [1]
But I would argue that Stable Diffusion with the open-sourcing of their model weights, and use of the LAION dataset which is released under CC-BY 4.0, would likely meet both the letter and intent of the license. https://wiki.creativecommons.org/wiki/CC_Attribution-ShareAl...
CC-BY 4.0 requires attribution. CC 0 is the only Creative Commons license that does not require attribution.
I put my music online under a CC license because I want people to be able to play it freely, use it in videos, remix it, cover it, include it in a compilation. I'd like for people to be able to do anything with my music except claim to be the original creator.
Are you trying to tell us here that if someone refuses to conform to the terms of my Creative Commons license, such as attribution, that I would be wrong to sue them over copyright violation? Folks who use specific CC licenses want the licensees to abide by those terms, and we can legally enforce that compliance. Content creators are not copyright trolls, so please do not tar them all with the same brush.
If you have registered your copyright, you can sue for punitive damages.
If you haven't registered your copyright, you can sue for actual damages which includes the money you lost (including potential money) and maybe the money gained by the person using the work.
You will need to prove that the work generated by Stable Diffusion is an infringement of your work and that Stable Diffusion is liable (this will be challenging). You could also try suing the person using the work for profit (it will need to be for profit, because otherwise all three points of "what you can sue for" come to $0).
Remember to register the images that you create with the appropriate copyright office if you wish to be able to sue and have some teeth (and have a better-than-zero chance of collecting lawyers' fees from the other party)
Of course not all content creators are trolls. The jump to sue, and the assumption that all of this is somehow victimizing everyone who has released content under a CC license, is troll-like thinking.
Like I said before, I believe there's a strong argument that Stability AI / LAION's use of CC-{BY/SA/ND} is likely allowed under the terms of all CC licenses due to the works being shared without alteration and with attribution, released under CC-BY-SA 4.0 (LAION) and the Stable Diffusion model being released under a permissive license (CreativeML Open RAIL-M).
The real question is if the images generated by the models need to provide attribution to every single weight involved in generating that image. That's a lot more complicated and unclear, but quickly gets into questions like "Should artistic style be copyrightable?" and "What amount of source material is required to constitute a copyrighted work?". But as of right now, I don't see how any of this is violating the letter or intent of CC 4.0
They're selling knitting needles in the era of robotic manufacturing.
I hope every single one of these lawsuits falls flat on its face. Other countries will happily overlook Getty copyright to get the leg up on AI.
AI is not reusing copyrighted material. It's learning from it in the same way humans do. You can even fine tune away from the base training set and wash any experience of it away.
Besides, if Getty wins, it merely ensures that the large incumbents with massive pocketbooks to pay off Getty et al. win. It'll keep AI out of the hands of the rest of us.
What are you even talking about? They're an agency for photographers and like every news website on the planet licenses their pictures (same as with images from AP, Reuters, and other agencies).
Getty is slightly more than just a website that posts low resolution, low quality pictures with a fat watermark on it.
Yeah...no. AI is doing nothing but reusing material. It generates the most likely image/text/code in its training set to be found following/around/correlating with the prompt. It literally has nothing outside its training set to reproduce. And when it reproduces the Getty watermark, that's a pretty obvious example of reusing copyrighted material.
>>It's learning from it in the same way humans do.
Not even close. These "AI" architectures may be sufficiently effective to produce useful output, but they are nothing like human intelligence. Not only is their architecture vastly different and making no attempt to reproduce/reimplement the neuron/synapse/neurotransmitter and sensory/brainstem/midbrain/cerebrum micro- and macro-architectures underlying human learning, the output both in the good and the errors is nothing resembling human learning. (source: just off-the-top-of-my-head recollections from neuroscience minor in college)
> It generates the most likely image/text/code in its training set to be found following/around/correlating with the prompt.
This is simply false. It's not a search engine that outputs the training item closest to the prompt.
In reality, it is "learning" (in some sense) how to correlate text to images, and then generating brand new images in response to input text. If this is legal for humans to do, then it's probably legal for machines to do the same thing.
Only because some people named their field "machine learning" and called it "learning".
It has no relation to human learning.
If your child accidentally confuses a giraffe with some other animal, you correct them; you don't add the picture to their training set and show them thousands more pictures of giraffes hoping that their success rate improves.
If you ask Stable Diffusion for a starry night, you get van Gogh's starry night.
Perhaps I was not clear enough to prevent ambiguity.
>>It's not a search engine that outputs the training item closest to the prompt.
Correct, it is not outputting the training ITEM; it is outputting finer-grained slices of many items, more of a mash-up of the training items.
Of course it is not taking an entire specific image that closely matches the search term; it is taking averages of component images of "astronaut riding a horse over the moon in style of Rembrandt".
That image won't exist in the training set, but astronauts, horses, and Rembrandt-style coloring and shading do exist, and it is assembling those from averages of the components found in its training set, not from some abstract imagination or understanding.
The fact that the astronaut suit may not be exactly the same as any of its training images is the same as if I averaged 100 faces in Photoshop - not because there is some kind of "learning" or "understanding". The ability to do useful statistical mashups is NOT the same as "learning".
This can be shown in a different "AI" engine's failure to solve a child's puzzle. When ChatGPT was presented with "Mike's mom had four kids, three are named Lucia, Drake, and Kelly; what is the fourth kid's name?", it said there was insufficient info, and doubled down when told that the answer is in the question.
>>how to correlate text to images
yes, as I pointed out, "correlating with the prompt." I didn't say it correlated an entire image, but I also failed to specify that it was correlating components.
>> If this is legal for humans to do, then it's probably legal for machines to do the same thing.
This [0], I'm quite sure, is not legal. Asked for an image of a person named "Ann Graham Lotz", it returned an image from the training set, slightly degraded.
First, that is literally the search engine functionality you were deriding.
Second, if you asked a human artist to produce the same image, without infringing copyright, they would produce something likely recognizable as the person, but obviously not resembling the training photo. It doesn't matter if they are a portrait painter, sketch artist, Photoshop jockey, or Picasso-like impressionist.
So, no, this does not represent learning in any conceptual, creative, or human-like sense.
It does represent mashing up averages of inputs of various components. Feed in enough "astronaut" photos, and it'll be able to select out the humans in the spacesuits as the response to that prompt. Same for "horse", "moon", "riding", and "Rembrandt", and it can mash them together into something useful with good prompts.
But give it something very specific, like a person's name, and you get basically a search-engine result, because it doesn't have enough input data variety to abstract out the person 'object' from the background.
to me the "search engine" case where it reproduces a specific training image seems like a failure mode that's distinct from normal operation
> it is assembling those from averages of the components found it's training set, not from some abstract imagination or understanding
how exactly are you so certain that the human brain handles abstract concepts any differently? please note that I'm not claiming that I myself know, but rather that you almost certainly do not know and thus are presenting an invalid argument
what is human imagination anyway?
> assembling those from averages of the components found it's training set
> slices of many items, more of a mash-up of the training items
> But give it something very specific ... it doesn't have enough input data variety to abstract out the person 'object' from the background
so is it abstracting or not? where's the line between that and a mere statistical mashup?
>>how exactly are you so certain that the human brain handles abstract concepts any differently?
Good question. At the very least, we have a far deeper understanding of physical reality. Humans would not unintentionally produce images of people with three ears (except deliberately, for effect), or of a bikini-clad girl seated on a boat with her head and torso facing us, yet also her butt somehow facing us and thighs/knees away... yet I've seen both of these in the last week (sorry, couldn't find the reference, it was a hilarious image, looked great for 2sec until you saw it).
I admit that it is possible (though I think unlikely) that this is a difference in quantity, not in kind.
One reason to doubt this is that Stable Diffusion was trained on 2.3 billion images, a vastly larger library than any human sees in a lifetime (viewing 2.3 billion images at one per second would take about 72.9 years). Yet even if you count every second of eyesight as 'training', a child at 1/10 of that age, having seen at most 10% as many images, would not make the same kinds of mistakes.
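For the record, the arithmetic behind that figure checks out:

    SECONDS_PER_YEAR = 365.25 * 24 * 3600   # ~31.6 million seconds
    images = 2.3e9                          # the 2.3 billion figure cited above
    print(images / SECONDS_PER_YEAR)        # ~72.9 years at one image per second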
Plus, the neuron/synapse/neurotransmitter and brainstem/midbrain/cerebellum micro & macro-architectures are vastly different than the computer training models. So, I think we can be confident that something different is happening.
>>so is it abstracting or not? where's the line between that and a mere statistical mashup?
Good question. There is definitely something we might call, or that at least resembles, abstraction. It's definitely able to separate the cutout image of an astronaut in a spacesuit from the background. It can evidently assemble those from different angles.
But it certainly does not have the abstraction to understand even the correct relationship between the parts of a human. E.g., it seems to keep astronauts' parts in the right relationship, but not bikini-clad-girls' parts (because of the variety of positions in the dataset?). There's no understanding of kinesiology, anatomy, or anything else that an actual artist would have.
Could this be trained in? I expect so, but I think it would require multiple engines, not merely six orders of magnitude more training of the same type. Even if 10^6X more training eliminated these error types and even performed better than humans, I'm not sure it would be the same, just different and useful.
I'd want to see evidence that it was not merely cut-pasting components of images in useful ways, but generating it from an understanding of the sub-sub components: "the thigh bone connects to the hip bone, the hip can rotate this far but not that far, the center of mass is supported...+++" as an artist builds up their images. Good artists study anatomy. These "AI"s haven't a clue that it exists.
>>to me the "search engine" case where it reproduces a specific training image seems like a failure mode that's distinct from normal operation
Au contraire, it seems that this merely exposes the normal operation. Insufficient images of that person prevented it from abstracting the person components from the background, so it just returned the whole thing. IDK whether it would take a dozen, a hundred, or a thousand more images of the same person for it to work properly. But if they all had some object in the background (e.g., a lamp) that was the same, the "AI" would include it in its abstraction.
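If one wanted to test the "returned the whole thing" claim quantitatively, the usual approach is to compare a generated image's embedding against the training set and flag near-duplicates. A rough sketch; the embedding vectors here would come from a real image encoder (e.g., a CLIP image model), which is assumed rather than shown:

    import numpy as np

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def looks_memorized(generated_vec, training_vecs, threshold=0.95):
        # Flag outputs that sit suspiciously close to one training image.
        return max(cosine(generated_vec, t) for t in training_vecs) >= threshold

    # Demo with random vectors standing in for embeddings:
    rng = np.random.default_rng(0)
    train = [rng.normal(size=512) for _ in range(1000)]
    near_copy = train[3] + rng.normal(scale=0.01, size=512)
    print(looks_memorized(near_copy, train))  # True: a near-duplicate

The threshold and encoder choice are assumptions; published memorization studies use more careful metrics than plain cosine similarity.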
> it seems that this merely exposes the normal operation. Insufficient images of that person prevented it from abstracting the person components from the background
yes my point was that this total failure to abstract (or slice or average or whatever it is that it usually seems to do) appears to me to be neither the intended nor typical mode of operation
> children under 1/10 of that age, who have seen only 10% of those images would not make the same kinds of mistakes
but then children aren't being fed a stream of unrelated images. they're receiving a wide array of real time sensory input from an environment they're actively operating in
consider your examples of the lack of higher level understanding about how the parts of a human "fit together". what practical experience do these models have that could actually convey such an understanding? deriving a proper understanding of mechanics in 3D from one million independent 2D still frames of human hands performing various tasks seems like it should be extremely difficult at best
> Could this be trained in? I expect so, but I think it would require multiple engines
I think it requires a different sort of training algorithm entirely. work such as https://arxiv.org/abs/1803.10122 suggests to me that there might be little difference between the human ability to abstract and lossy compression. at the same time work such as https://arxiv.org/abs/2205.11502 makes it apparent that in many cases this sort of generalization simply does not happen the way we'd like
> the neuron/synapse/neurotransmitter and brainstem/midbrain/cerebellum micro & macro-architectures are vastly different than the computer training models. So, I think we can be confident that something different is happening
something being architected differently doesn't necessarily mean that the higher level functionality is any different
moreover, in purely functional terms how do you propose to distinguish something that's different from something that's incomplete? ie a smaller piece of a larger whole? if someone constructs for example a passable digital model of the visual cortex of the mouse or human or other animal that's still only a single small piece of the whole
so who is to say, and how are we to say, that we either have or haven't achieved a meaningful form of abstraction versus merely averaging bits of the training set together? at this point I'm not actually clear where the line between those two things even lies
>>neither the intended nor typical mode of operation
Yup, certainly not intended, although I see it as the typical response on the edges of the data set; objects with too few varied representations will always fail in this way. Seems square-cubish as there will always be a volume of solid training data and a surface of partial data, so maybe not severe.
>> deriving a proper understanding of mechanics in 3D from one million independent 2D ...extremely difficult at best
Yup. This is definitely part of how it is different. Doing the full training set with stereographs would likely improve it, but it'd improve it even more to have the same images manipulated by robots and the feedback integrated. Considering the 3.5 billion parameters of DALL-E, 4.6B for Imagen, and 890M for Stable Diffusion, how many params would be needed to integrate stereo vision and robotic feedback? 3.5 billion squared or cubed? Would that be enough just scaled up, or do we need to qualitatively change the structure?
>>I think it requires a different sort of training algorithm entirely.
Agree 100%. I think these engines are a part of the solution, but not the whole. I expect we'll need multiple different kinds of training models, and then the methods to integrate them and correlate their 'knowledge'. E.g., figuring out how one part of a moderately complex object (e.g. a human) hides another part in certain positions (e.g., hand behind back) is trivial for a 3D modelling system, but even the massive 2D ones often get it wrong.
>>being architected differently doesn't necessarily mean that the higher level functionality is any different
Definitely true. Parallel evolution, electric vs. ICE-powered cars, etc. The question is when we've achieved the same level of functionality.
>>how do you propose to distinguish something that's different from something that's incomplete?...achieved a meaningful form of abstraction versus merely averaging bits of the training set together? at this point I'm not actually clear where the line between those two things even lies
YES, excellent question. Especially since these models don't do much explaining of their inner workings. Then again, we humans haven't fully figured out our own inner workings either.
It's looking right now like AI built on different principles will arrive faster than biomimicry-based AI, partly because we still don't know the biology at a deep enough level. IDK if it'll stay this way.
I remember discussions a long time ago with a scientist who worked on AI for early Mars missions, and how they'd move their machines. He was describing the algos for tracking the world, tracking their machine, and adjusting motion, with the team assuming that they were re-creating the way humans do it. From my experience as an international-level athlete and a neuroscience minor in college (inspired by my sport experiences), I could tell that his methods were nothing like how biological systems work. Seeing Google's self-driving car drive around a racetrack was truly impressive, but from my sportscar-racing training & experience, I could instantly tell that it was accomplishing the task nothing like any human would, although it was achieving competent levels of performance (in a limited setting).
How do we draw the line? It may come down to the kinds of clever tests devised by child-development and animal behaviorists to study subjects who can't self-report on their state, to tell whether they have actually figured something out or not.
That said, I don't think it's impossible for an AI to end up exceeding our capabilities by using different methods. Kind of like Paul Bunyan vs the chainsaw.
(BTW, thanks for the lively discussion; it's a pleasure to be pushed to define my thoughts better, and I've learned; happy to keep it going)
Lots of businesses fail when new disruptive technology arrives. We don't need to prop up the old at the expense of the new.
This is like crying over Rolodex.
And let's not forget how awful Getty has been throughout its existence. They've frequently sued people for things they didn't even own the copyright to.
That doesn't answer the question. In the hypothetical future where such companies no longer exist, where is the new AI training data supposed to come from?
Would you want an AI trained on images from no later than, say, 1973? No? People 50 years from now will feel the same way. Without new images to learn from, I don't see how future AIs of this kind could know anything about their own era.
But the models are already here; you can distill their knowledge if you want to train a new one, and human feedback could also improve model performance without adding training data.
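For anyone unfamiliar, "distilling" here means training a new student model to match an existing teacher's outputs instead of raw data. The core idea fits in a few lines; a minimal classification-style sketch in PyTorch (image-model distillation is more involved, this is just the mechanism):

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # Soften both distributions and push the student toward the teacher
        # via KL divergence (Hinton et al.'s classic recipe).
        soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
        log_student = F.log_softmax(student_logits / temperature, dim=-1)
        # T^2 keeps gradient magnitudes comparable across temperatures.
        return F.kl_div(log_student, soft_teacher,
                        reduction="batchmean") * temperature ** 2

    teacher_logits = torch.randn(8, 10)                      # frozen teacher outputs
    student_logits = torch.randn(8, 10, requires_grad=True)  # trainable student
    loss = distillation_loss(student_logits, teacher_logits)
    loss.backward()

Whether a distilled copy of a model trained on disputed data is any cleaner legally is, of course, an open question.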
>Without new images to learn from
Even in the event that all cameras are destroyed (somehow), people will use generative models to describe their experience in a new era, and then this new knowledge will be used by new models and the cycle will repeat itself.
That depends on the AI - a poorly trained one can overfit so badly that it is equivalent to copying. Furthermore, the existing laws were made for typical human capabilities, and the kind of remembering some models do is very much beyond those capabilities (far better recall).
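Overfitting-as-copying is easy to demonstrate on a toy problem: give a model as many parameters as training points and it can reproduce the training data exactly while being useless elsewhere. A numpy sketch:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 1, 10)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 10)  # 10 noisy samples

    # Degree 9 = 10 coefficients for 10 points, so the fit can pass through
    # every training sample: pure memorization.
    coeffs = np.polyfit(x, y, deg=9)
    print(np.max(np.abs(np.polyval(coeffs, x) - y)))  # ~0: training data reproduced

    # Outside the training points, the memorized model is useless:
    print(np.polyval(coeffs, 1.5))  # blows up, nowhere near sin(3*pi) = 0

Whether Stable Diffusion is in this regime for any given image is an empirical question the sketch doesn't settle.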
I, for one, look forward to the societal collapse that will occur once countries without meaningful social aid suddenly have no work for anyone to be paid to do. It'll be a pretty fun few years while we see all the chaos play out
> look forward to the societal collapse that will occur once countries without meaningful social aid suddenly have no work for anyone to be paid to do
This assumes 1. automation is free, 2. humans cost too much, so any company would ditch their humans for AI. But in reality AI costs money, AI is better with people than without, and people can generate profits surpassing the cost of their wages. Why would a company prefer reducing costs over increasing profits? When everyone has AI, humans are the differentiating factor.
There's the risk, depending on what "the chaos" entails, that the capital owners in the countries with social safety nets will look at the outcomes for the countries without and decide that alignment is preferable, at which point enforcing the obsolescence of the proletariat and freeing up gigatons of biomass for other applications (in the Bataillean sense) becomes a mere sociopolitical engineering problem.
Why do we even need humans after the AI uprising? They don't do anything and they are expensive to produce food for; an AI can power itself with the blowing of a breeze. We have to grow plants, have animals eat them, kill the animals, and then eat those. Then the parts we don't digest have to be taken away. Too expensive to keep around, even as pets! Humans can tolerate cats and dogs because they piggyback on infrastructure we made for ourselves (crops and trash collection). AIs don't need crops or trash collection, so it will be a harder sell to keep humans as pets.
All this science fiction is right; there isn't room on the planet for both humans and AIs. It sounds depressing that a bunch of GPUs are going to kill us all off, but it was coming anyway. The sun becomes a red giant and consumes the Earth. Protons themselves may eventually decay (experimental bounds put any proton lifetime beyond 10^34 years), ending the existence of matter. The trajectory is clear even if the means aren't; humanity can't last forever.
If I sound depressed, I'm not really. People just use the headlines to guide their view on what The End looks like. Read a few articles about chatbots, and it's AIs taking all our jobs. Watch a few movies about asteroids, we go out like the dinosaurs. Hear "Russia invades Ukraine" and it's a nuclear holocaust. Read a few particle physics papers, and it's proton decay. You can't worry too much about it. Enjoy your time while you have it!
Please do explain to me how the blowing of a breeze helps AI produce the latest TPU V5 or NVIDIA H100 chips to run on. The technological stack behind AI is enormous, and humans are necessary cogs.
On the other hand humans are self replicators and only need a bit of biomass for sustenance, biomass that grows by itself, too. No factory, no supply chain, we got everything we need to make more of us.
If you consider the risk of EMP, an AI needs humans to restart it, or some way to survive electronic attacks.
That sounds pretty cool; maybe the world will do more silly things, like the country with the largest military force electing another game show host to its highest office.
This seems fair enough. There is pretty big value in avoiding the potentially millions of pictures that would otherwise need to be taken to train the AI; they should pay something for that.
A problem is that visual artists could be paid a lower licensing fee for the use of their art to feed an AI than they would earn from per-image fees charged to end users.
Let's say that a human painter learned to paint by reproducing copyrighted works. Would the copyright owners have a claim on that painter's work after he finished learning?
This argument has been falling flat for a while (to me!), and now I think I can articulate why.
This isn't a human; until AI systems are granted personhood, this is a tool, regardless of how it works under the hood.
So for me the question is, would the user (trainer) be allowed to view all the source material?
Yes? Then would the user/trainer be allowed to produce content based off the source data? Seems like yes, as long as the actual minimally-modified/unmodified sources are not "copied" into the model.
Of course copyright is an abstract legal tool, so no argument is worth anything until it's codified into law/precedent.
This changes nothing related to copyright. Humans use tools all the time, and those tools aren't humans either. A camera isn't a human, for example, and yet it does a large amount of the work of making the photograph.
In regards to copyright, no matter what tools you use to create art, what matters is the output. Drawn by a human, or created via a non-human tool such as a camera or an AI, it is all the same.
Are you a copyright expert? This sounds counter to reality.
By the same vein of thought, I could claim that Google is a tool and that by typing in different keywords I can make it "output" almost any image. Yet neither I nor Google could ever assert copyright over those images simply because I can type keywords that cause image search to output them, no matter how creative or original the keywords.
> By the same vein of thought I could claim that Google is a tool and by typing in different keywords I can make it "output" almost any image.
Good question. The difference is regarding the output not the input.
Yes, if you output art that looks exactly like some other art, then this could be a copyright violation. But that has nothing to do with a computer. Regardless of whether you copy someone else's art by right-clicking it, by AI-generating it, or by a human entirely using their own paint and pencils to completely recreate it, it is all the same.
The computer has nothing to do with it. Neither does the input.
> no matter how creative or original keywords I use.
Exactly. Regardless of the input, the input doesn't matter. That is why AI art is legal. Because the complaints are not about the output, but instead about the input.
Yes, if you input other people's art but the output is transformative, then that is legal.
Same if a human does it, or a computer, or anything else. Human input vs. computer input is the same thing; what's illegal is determined by whether the output is infringing, regardless of whether it was made by a computer or a human.
> By the same vein of thought I could claim that Google is a tool
Yes, it is a tool. And just like any other tool, whether it is a computer or a human using it, it is the output that is judged. Same as if a human took a picture or hand-copied someone else's art.
A human hand-copying art is just as legal or illegal as when a computer does it.
> me nor Google could never assert copyright over these images
This is an unrelated topic. That is about whether the generation of new images is itself copyrightable, which is different from whether you are infringing on someone else's copyright.
I was only talking about the "are you infringing on someone else" topic, not about the generation of new copyright, which yes, requires human input, per the "monkey selfie copyright" case.
Are you sure it goes like this, or are you just conjecturing? Transformativeness is a defense arguing for fair use, i.e. saying that yes, you copied, but because you added so much new, the new work is independent, creative, and original in its own right. Does there exist some precedent that the "creativity" part required for a successful fair-use defense can be externalized to a computer program or other such tool?
While the algorithm can simulate human learning, surely its outputs are not original copyrightable works. Originality seems to require human action by definition, and is what distinguishes inspired and copied works.
If the outputs are not original, AFAIK they then must be derivative. Stable Diffusion could claim fair use exemption. But fair use too is just meant to protect creativity, again a manifestly manual activity.
I don't know which way I lean, but I sure know the courts will soon have to make some very interesting rulings that will have monumental importance.
Maybe A.I. generated "art" is an entirely new class of work and lawmakers just need to rethink copyright for them.
Copyrighted content on the internet isn't a free-for-all for people to train models with. No matter the good intentions, if Stability AI didn't take reasonable steps to remove copyrighted data from the training set, then IMO they have an uphill battle making their case to a jury (if it gets there).
Anyone can rent compute and as AI advances it's only going to get easier to fine tune models.
There's going to be a world where you can just point a program on your computer at a website with a bunch of images on it, wait 10 minutes, and have it punching out Stable Diffusion-style images.
What are the courts going to do to stop it? Sneak into your home and record everything you do? How are you going to prove an AI was trained on certain images? How are you going to prove an AI generated an image?
The cat is out of the bag; even if all the courts decide that these images are copyrighted and can't be used, they're going to continue to be used by people, all over the place, with absolutely nothing being able to stop it.
The era of needing an artist to produce a novel image for a purpose is over. It will never come back, even with the full support of the law trying to keep it around.
While that may hypothetically be the future, that time is not now. The case before the court is much more consequential in nature and may remove the need for partnerships between companies like OpenAI and Shutterstock for consensual sharing. It must be adjudicated based on the current capabilities, facts, and circumstances of the case. Does IP and copyright go out the window because computers happen to be good at taking in and transforming works like a human? The complaint isn't hiding the implementation; they explained it at a high level reasonably well. US courts are some of the strictest when it comes to IP, largely because strict laws were passed by the legislature. The court's job is not to legislate from the bench, but to adjudicate violations of the laws as they are written today.
Just because it's easy to speed on a road and other people are speeding and they aren't charged doesn't mean you can't be held liable for speeding if you're caught. Likewise, copying proprietary source code from another project into your own commercial software can seem innocuous until you get hit with a lawsuit. You "prove" it the same way you prove any other thing before a court. The jury/judge doesn't need to be absolutely certain that something did or did not happen, just that the plaintiffs meet the standard of proof.
So yes: we may well reach a point where computers are virtually indistinguishable from humans in generating "novel" things; in that case the existing laws would need to change. In the meantime, I don't view AI models as a trump card around copyright/IP. If everyone else is following the rules of the road but you decide to do your own thing, don't expect to ram your way through without consequences.
Good, but how about training on variations of real images? Variations should not be copyrightable since they contain no human input, and they could be made sufficiently different from the originals. Then the trained model can't possibly reproduce any original exactly, because it has never seen one.
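The mechanical part of this is standard data augmentation; a sketch with torchvision ("photo.jpg" is a hypothetical placeholder file):

    from PIL import Image
    from torchvision import transforms

    img = Image.open("photo.jpg").convert("RGB")  # hypothetical input image

    augment = transforms.Compose([
        transforms.RandomResizedCrop(512, scale=(0.6, 1.0)),
        transforms.RandomHorizontalFlip(),
        transforms.ColorJitter(brightness=0.3, contrast=0.3, hue=0.05),
    ])

    variants = [augment(img) for _ in range(8)]  # eight mechanical variations

Note, though, that each variant is still computed directly from the original, which is exactly the objection raised in the replies below.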
How do you make the variations, though? Start up a new company called StableVariations and let them deal with the Getty lawsuit? To create the variations you're still using the original images in a commercial way.
Also, variations have to be extreme in order to not violate copyrights, and even then they may still infringe. Otherwise YouTube would have no issue with me uploading all of Shrek as long as I mirrored the video and pitched the audio up by 3%.
> you're still using the original images in a commercial way.
That's not illegal.
For example, right now, for whatever job you are doing, you have probably looked at copyrighted works, that you don't own.
You have copied those copyrighted works, because in order for you to view the image that you don't own, you had to download it to your computer.
So you have thus used copyrighted works for commercial purposes if you have ever looked at copyrighted works on your computer for a professional purpose.
Using copyrighted works for a professional purpose is not what constitutes infringement; distributing copyrighted works to other people is.
Those links are interesting but just show that Getty is a slimy business that tries to repackage public domain images for sale, not that they infringe the IP of others. That’s a massively different issue.
You'd think that should fall under the perjury penalty of the DMCA, I thought that misrepresenting oneself as the copyright holder was what that was for. But maybe they didn't file any such DMCA notice, I dunno.
Oh, I agree that it's a completely toothless provision in practice, but claiming the copyright on someone else's work and then trying to take down the person's own copyrighted work seems like exactly the sort of thing it was supposed to cover.
The article you linked makes the point that at least one civil suit did give damages for dealing with a bogus DMCA takedown, but that it was unrelated to the perjury provision.
Just read the first article. Getty Images was sued by a photographer for Getty's for-profit use of 18,000 of her public-domain images. The ruling DISMISSED the allegation, which is crazy. It's comical that now it's essentially the same allegation with the sides inverted: Getty suing an AI company for using their public-domain images. I think we can all guess what's going to happen xD
> Getty Images was sued by a photographer for Getty's for-profit use of 18,000 of her public-domain images.
You can use public domain images for profit. It's not surprising this was thrown out.
> It's comical that now it's exactly the same allegation, but with the sides inverted, now it's Getty trying to sue an AI company for using their public domain images.
Where does it say they're suing over the public domain images in their collection? Their collection is not entirely public domain images. Their suit claims for the copyright works by staff photographers, third parties that have assigned copyright, and images licensed to them by contributing photographers. In addition, they're claiming for the titles and captions which they created and are themselves copyrighted.
It's not the "exact same allegation", and there's really no relation between the facts of the cases here.
> I think we can all guess what's going to happen xD
The outcome will be Stability AI settling and licensing the images from Getty Images. If OpenAI was able to do it with Shutterstock, so can Stability AI.
I don't think that outcome is assured. In a lot of ways, Stable Diffusion's business model creates an existential threat to Getty Images. I would expect alternative outcomes to be:
1. The requested licensing fee approaches infinity
2. Getty Images simply refuses to license images to anyone who will use AI to create derivative works
I don't think that's an acceptable outcome for Getty, whose entire business model would be confounded by a technology that used its images without a license to generate alternatives to Getty's business.
I’ve been a pro photographer for over 30 years, and Getty has stolen my work and that of countless others. I was one of a coalition of photographers who pooled resources and won substantial damages. (My work was all registered with the Library of Congress, which raises the liability cost for infringement.)
Getty’s strategy at the time appeared to be to meet any infringement accusations with a massive legal response. Any individual photographer could generally not afford to respond.
My impression was that they were not super concerned about infringing on others’ work. But they will sue the pants off anyone who they perceive to be violating their copyrights.
But in recent years there are legal firms dedicated to pursuing deep pocket infringement cases on contingency. This has changed the legal calculus for large companies who were not careful with copyright.
> Getty’s strategy at the time appeared to be to meet any infringement accusations with a massive legal response.
I'm surprised at this in your case because typically copyright infringement cases are massively weighted in favor of the copyright owner (defined damages AND reimbursement of legal fees). I know this because I'm currently a defendant in a copyright lawsuit.
If Getty wins this one, you win as a photographer. If they lose, you lose. They might be a shitty company, but in this instance, your interests align...
In the event that Getty Images wins, it seems most likely that AI researchers would pay Shutterstock/Getty Images for their large existing catalogs of images. With the companies having a stronger position (getting to act as a gatekeeper to this kind of machine learning) and artists still a weaker one, I wouldn't hold out hope for them passing anything on.
“Making matters worse, Stability AI has caused the Stable Diffusion model to incorporate a modified version of the Getty Images’ watermark to bizarre or grotesque synthetic imagery that tarnishes Getty Images’ hard-earned reputation”
Not sure about their reputation but they have a point with the bizarre/grotesque thing.
A lot of people are saying “clearly a violation of copyright” and throwing around the term “derivative work” with the confidence of a seasoned copyright lawyer, but ctrl+F shows only a single reference to the phrase “transformative work” in this comment thread.
It might be much more complicated than it appears on the surface! For example, look up Richard Prince!
The earliest I heard of it was the "parsing html with regular expressions" classic which I highly recommend if you haven't seen it before: https://stackoverflow.com/a/1732454/4012132
unrelated rant:
this is so stupid in itself.
people that (sometimes) make money off unpaid work are suing people that (sometimes) make money off unpaid work.
and it's all about the (sometimes) and the unpaid work.
it's dramedy. in a lawsuit.