OK but... what's happening here is SD is capturing textures and some of their relationships. It clearly has no understanding of the objects it's generating.
So the output is a kind of Dali-esque mushed up melted version of the original content.
It's entertaining because it's simultaneously referential and heavily distorted in unexpected ways.
You could use this as a manual art technique, and it would be interesting-ish.
I'm curious if it's capable of getting to the next stage, of understanding and distorting the actual design relationships in these objects to the point where it could deepfake catalog pages and they'd be indistinguishable from the real thing.
I suspect there's quite a gap between that stage and this one.
Feels like you'd make more progress with discrete models working together. At the very simple level, Stable Diffusion is pretty terrible at words and fonts, but it's a reasonable model of where [patterns of pixels that look like] text might fit and how big it should be. Add a second step that recognises Stable-Diffusion-generated pseudotext blocks and replaces them with GPT-generated text on the same prompt, set in an actual font scaled to match Stable Diffusion's attempted font, and you'll more likely get something that passes the "zoom in and try to read it" test. Though there may not be much correspondence between the images and the text.
A more complex arrangement adds a model which chooses high-level structure to fit a prompt (subprompts on suitable topics for the images and text on each page), a 'house style' model to pick fonts and copy/paste stuff like the RadioShack logo direct from its source material, plain old Stable Diffusion to draw lots of individual pictures ("cassette player deluxe RadioShack 1970"), plain old GPT to write the text, which is typeset to the 'house style' model's specification, and probably an "observer" model that forces new iterations of really bad pages.
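For fun, here's a purely hypothetical skeleton of that arrangement in Python. Every function below is a made-up stand-in for one of the models described above (layout, house style, Stable Diffusion, GPT, observer), not a real API:

    # Purely hypothetical: each function is a stand-in for an imagined model.
    def layout_model(prompt):       # high-level structure: subprompts per page slot
        return [{"image": prompt + ", product photo", "text": prompt + ", catalog copy"}]

    def house_style(slots):         # fonts, logo, assets pasted from source material
        return {"font": "Helvetica", "logo": "radioshack_logo.png", "slots": slots}

    def stable_diffusion(p):        # plain old Stable Diffusion
        return "<image for %r>" % p

    def gpt(p):                     # plain old GPT
        return "<copy for %r>" % p

    def looks_terrible(page):       # the "observer" that forces a redo of bad pages
        return False

    def make_page(prompt):
        page = None
        while page is None or looks_terrible(page):
            slots = [{"picture": stable_diffusion(s["image"]), "copy": gpt(s["text"])}
                     for s in layout_model(prompt)]
            page = house_style(slots)
        return page

    print(make_page("cassette player deluxe RadioShack 1970"))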
Great thing about this magazine creation process is it also tends to work better for humans than having one person do everything!
The analogy in my mind is the heyday of x86 ubiquity vs the coming revolution of many SoCs in one device.
Huge, all-purpose, massive-capacity NNs are the norm for the current advances in SOTA. This is probably because of a limit on development complexity: designing a complex system of richly interacting parts is hard, so as long as development of these sorts of tools is manual, it will be much easier just to shove more compute at the problem with singular, massive models. Compare this to the all-encompassing architectural hegemony of x86, which makes it relatively easy to write software that people can run on devices they already own; that reduction in developer effort and complexity is enormous for enabling rapid growth in the amount of software that can be created per unit time.
As ML design becomes more automated and functional, new possibilities open up regarding breaking out sub-tasks to individual, hyper-specialized tools and combining them into a resilient and capable whole that is more than the sum of its parts. That is the true power of automatic differentiation frameworks: when your system can be end-to-end differentiable throughout many devices and specialized functions, and you can train parts as easily as you can train the whole, you begin to observe the creation -- the growth, really -- of a new kind of digital cognition machine.
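A toy illustration of that "train parts as easily as the whole" point, sketched in PyTorch (chosen here purely for illustration): two small specialist modules composed into one differentiable pipeline, trained first piecewise and then end to end.

    import torch
    import torch.nn as nn

    # Two "specialist" modules composed into one end-to-end differentiable whole.
    encoder = nn.Sequential(nn.Linear(32, 16), nn.ReLU())
    decoder = nn.Linear(16, 8)
    pipeline = nn.Sequential(encoder, decoder)

    x, y = torch.randn(4, 32), torch.randn(4, 8)
    loss_fn = nn.MSELoss()

    # Train just one part: freeze the encoder, update only the decoder.
    for p in encoder.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)
    opt.zero_grad(); loss_fn(pipeline(x), y).backward(); opt.step()

    # Or train the whole: unfreeze everything and backprop through both parts.
    for p in encoder.parameters():
        p.requires_grad_(True)
    opt = torch.optim.Adam(pipeline.parameters(), lr=1e-3)
    opt.zero_grad(); loss_fn(pipeline(x), y).backward(); opt.step()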
So the first "prompt" would be "write a series of prompts for an AI to use as steps to make variations of this image, and connect it together afterwards"
Thing is, SD in particular is relatively small (860M parameters), and quantity can be somewhat transformed into quality in this case. As Google's Parti landing page [0] conveniently demonstrates, more parameters with the same architecture yield more coherent output, including text and symbols. Given enough room (enough weights) in all parts of the model, starting from CLIP, it could even construct coherent text, as the compression ratio would be much lower. However, you'd need much beefier hardware to run it, and it might not be as efficient as a better architecture or a dataset/training skewed towards symbols.
"Looks like" and "resembles" are fairly orthogonal to "understands." Of one is looking for understanding, I would say Stable Diffusion is traveling in the wrong direction.
It's so interesting seeing how these txt2img models represent text. It's sort of like how someone who doesn't know how to read might represent language, as shapes instead of characters and words.
That said, it would be a fun experiment to try an img2txt model on these individual catalog items to find out what they actually do (or image search)
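A rough sketch of that experiment, assuming the BLIP captioning model available through Hugging Face transformers (the model name, file path, and exact API here are my assumption, not anything from the original post):

    from PIL import Image
    from transformers import BlipProcessor, BlipForConditionalGeneration

    # Caption one cropped catalog item with an off-the-shelf img2txt model.
    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

    img = Image.open("catalog_item.png").convert("RGB")   # hypothetical crop of one item
    inputs = processor(images=img, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    print(processor.decode(out[0], skip_special_tokens=True))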
If the computer was a bicycle for the mind then Stable Diffusion is LSD for the computer.
Who cares how theoretically soulless it is under the matrix transforms, the fact is that this spits out the weird, and us humans chew it up and spit it right back.
I think this is the first new era in Art since postmodernism.
How long until what is currently photography is either full length video or 3d navigable worlds?
These Star Wars galleries are just phenomenal. I can't imagine it is long until you can apply these as filters to movies, for example taking Rogue One and applying a Fritz Lang or Stanley Kubrick filter.
Or, akin to Cars being a remake of Doc Hollywood with anthropomorphized cars, being able to say something like "I want to see a remake of the 2013 movie Rush, in the world of Zootopia, with 15% styling from Speed Racer, set in the 70's, with flying animals, and the two main rival protagonists being flying squirrels, one of which has a birth defect and a prosthetic wing."
The thing that I find the most LSD-like, both in the visuals it generates and in principle, is Google DeepDream.
It works by using an image classifier in reverse. So for example, you have a neural network that identifies bicycles: feed it an image, get the results, and feed them back into the image, boosting the bicycle-like characteristics of the image, then repeat the process a number of times. In the end, you get something that looks like the original image, but made of bicycle parts. It is commonly done with faces, and it can also be done on intermediate layers, amplifying more abstract details like geometric shapes.
Originally intended as a way to reveal the inner workings of a neural network (for research, debugging, etc...), it has also been used by artists for really trippy visuals.
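For anyone curious, the core loop is only a few lines. Here's a bare-bones sketch with PyTorch and a recent torchvision (real DeepDream adds octaves, jitter, input normalization, and smoothing on top; the class index and step size are just placeholders):

    import torch
    import torchvision.models as models

    model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()
    target_class = 444                        # one ImageNet class index; pick any
    img = torch.rand(1, 3, 224, 224, requires_grad=True)   # or start from a real photo

    for step in range(50):
        score = model(img)[0, target_class]   # how strongly the net "sees" that class
        model.zero_grad()
        if img.grad is not None:
            img.grad.zero_()
        score.backward()                      # gradient of that score w.r.t. the pixels
        with torch.no_grad():
            # Nudge the image toward whatever boosts the class score, then keep it valid.
            img += 0.05 * img.grad / (img.grad.abs().mean() + 1e-8)
            img.clamp_(0, 1)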
This is what I'm talking about, amazing work! Generative AI can unleash infinite un-realities -- shared reality is at drastic risk of being lost even further.
Baudrillard's vision of the hyperreal is becoming overwhelmingly true, and at an exponential rate. I envision that soon we won't even need historical documents because we can just auto-generate documents based on historical data. Those will then become the new historical documents, in a process that keeps folding in on itself, until the bubble of our immediate reality shrinks and shrinks until there is only the smallest part of the inner ear vaguely recalling that gravity is acting upon us, while the rest of our senses and thoughts are wrapped in the warm blanket of an auto-generated fugue.
When I think about procedurally generated entertainment that can be materialized on demand, I find myself thinking about the movie Strange Days.
Will the future be one of societal fragmentation where 1) nobody ever rewatches anything, they just generate more new stuff, 2) nobody watches each other's work ("hey watch my feed I just generated, it was awesome" followed by "yeah sure, someday") and just watches more new custom stuff generated for them, which just becomes 3) everybody is watching an endless feed of new procedurally generated levels like endless runner games, and 4) there is no longer any shared experience between people that they can realistically use as a foundation to communicate or interact?
OR will the future be MOSTLY the previous, but with a counterculture market for vintage (and modern) "authentic experiences," some of which will be black market? And then, as part of that counterculture demand, how much of the "authentic experience" content will be counterfeit, procedurally generated to look real? At that point the act of consuming counterfeit "authentic experiences" en masse just becomes a role-play archeology treasure hunting game.
But at the moment, people watch new TV shows when they come out, rather than watching old shows which are just as good. I think it's because they enjoy watching the same thing as other people.
Would you, if there were endless new episodes? Or if, upon rewatch, they could change slightly?
What about your kids' kids? Would they look back on older generations who are watching non-procedurally-generated content that never changes as weirdos?
For story arcs, the thrill is an evolving story with a coherent plot.
I suppose singular X-Files and Seinfeld episodes could easily fit into the procedurally generated category. That is a very exciting prospect. Seinfeld infinity.
There are many mechanisms to collectively decide things, so you aren't even close about that (Democracy, the market, representative government at every level, the United Nations, proxy votes for corporations, school boards, boards of governors, juries, HOAs, ballot initiatives and referendums, elections generally, zoning commissions, family meetings, 4 friends debating where to get dinner, really far too many to list here, if anything almost all decisions of any importance are made via some mechanism to collectively decide things).
There isn't a way to collectively decide things with absolute authority, but that is why things don't suck worse in general. If we make collective decisions we could force people to do some things which are more optimal to our goals. However, that assumes that we universally agree on what the goals are (not even close) and that the decisions won't actually be worse for the chosen goals (sometimes they will be far worse) and that the decision making process will never be irreversibly hijacked by some group for their own benefit (it absolutely will be). So you are not just wrong about this, your premise is incorrect and your conclusion does not follow from that premise even if it were.
Where did I say we would or could decide against it collectively? Individual people can decide for themselves not to engage with harmful technologies in the future, just as many do today with cell phones, computers, television, etc. Not everything has to be done by governments.
But will we? Those of us who have been alive long enough to know what life was like before any of this may choose not to. But our youngest generation, and those to come, may grow up not knowing any different.
Not suggesting it will happen, but it is an unappealing thought.
If anything, I think technology like this has gradually empowered people to rise above the autogenerated fugue that has historically always been a part of everyday life (though historically the human mind did a pretty good job generating that on its own thanks to widespread ignorance, superstition, and fear). Speaking personally, I find AI chatbots and image generators when I'm using them myself to be like a refreshing drink of cool fresh water compared with the sensation of being only drip fed or waterboarded by businesses wielding the technology to influence my behavior without my explicit input or full consent.
as individuals we have enough trouble asserting will over impulse as it is--hard to imagine us collectively deciding on anything like this when institutions are even more vulnerable to reactivity
I miss getting these catalogs! Someone should send out a weekly newsletter of random interesting gadgets linked to online stores where you can buy them, with a layout that perfectly mimics these old catalogs. I'd eat that up.
I have been following this account and enjoying all their posts of amazing product designs - and I didn't even realise until I read this comment that it's AI-generated!
General Mastodon question here: I click login, but I can't log in with my credentials because I'm on mastodon.social. How do I log in from this interface to comment in this thread, or is this not possible?
You could write an algorithm that does image recognition and cut and paste from a plethora of image resources and build a similar catalog of actual products from a given time period.
This just looks like the typical machine learning throw-up: a bunch of statistically averaged and collaged images that have then been blurred, as if someone ran their finger across the image, plus random text. It isn't clear to me where the intrigue is, aside from a "hmph, I guess (?) that's cool".
I think it's just a common sort of ex nihilo first-steps example of the technology that's easy to show off to anyone, and it's a good example of how the results can be iterated and cherry-picked to filter out the most garbage-laden images to get stuff that's basically (what another commenter called) "visual lorem ipsum".
There's a lot more that stable diffusion can do when there is a feedback loop between the user and the computer, but I don't think it's very easy to convey with pop articles or even long form ones - I hope one day everyone has a chance to approach models like this and learn from them in their own way, and I appreciate articles like this in their attempt to get a wider audience interested in the technology.
That’s a well balanced perspective. And I don’t mean to be a Luddite or anything, but I just don’t think I see what this will be useful for outside of the typical advertising and abusive use cases.
One use case I can currently imagine, outside of advertising, is maybe storyboarding, because you don’t really care much about fidelity or even style there, and it’s primarily a scaffolding tool. However, I’m not terribly sure of the feedback loop being anywhere near that of a director or cinematographer or writer sitting down with a storyboard artist. But maybe you don’t have access to a storyboard artist.
There is a valid position of asking “why do we need this?”, and I don’t think it gets asked enough in technology. One thing I am sure of is that this type of machine learning art will be abused.
There is an unfortunate inevitability though with humans and technology.
I don't understand the motivation of the person who sees 60 hours of video uploaded to YouTube per second, 30+ new video games released on Steam per day, 100,000 songs uploaded to Spotify every day and 6,000 Tweets being made per second, and decides "You know what the World needs? A way of enabling more people to make more content faster."
> You could write an algorithm that does image recognition and cut and paste from a plethora of image resources and build a similar catalog of actual products from a given time period.
And if you did that I would think it was super interesting and cool too.
Because by typing three words and clicking two buttons you get to claim you're an artist and your ticket for the future at the same time
It's one more step in the general dumbing down of everything tech touches. You don't need skills, you don't need to devote time to it, you don't even need to understand how it works; just go on a website, give them your money, write something and boom, you're done.
I think being stuck in a content consumption cycle for a while made people slowly realise that creation is much more fulfilling than consumption; these things give them the illusion of creating things.
I use stable diffusion almost daily and this is a really creative use! I have leaned into the weird text that comes with these types of prompts and I love it.
The J. Peterman catalog, which was a storyline on Seinfeld, is a real catalog and apparently as pretentious as the show’s mockery made it out to be. It would be good grist for the SD/GPT mill.
I worked at a Radio Shack in 1986-87, sort of a dream job for an 18 year old. Now I know what it would have looked like if I showed up for work one day on LSD!
It’s funny though, one consequence of the federation is that the domain name on the link gives you no clue you’re going to see Mastodon when you click it. (Unlike centralized services like Twitter or Facebook.)
If you wanted to see all the Mastodon links posted to HN, you’d have to either start with a list of all known Mastodon server domains and search those, or scrape all the links and pick out the ones that land on a Mastodon instance.
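The second approach is easy to rough out. A quick Python sketch that probes a link's domain for the /api/v1/instance endpoint Mastodon servers expose (an assumption on my part; it won't catch every Fediverse server, and the example URL is made up):

    import requests
    from urllib.parse import urlparse

    def looks_like_mastodon(link, timeout=5):
        # Probe the domain for Mastodon's instance metadata endpoint.
        domain = urlparse(link).netloc
        try:
            r = requests.get("https://%s/api/v1/instance" % domain, timeout=timeout)
            return r.ok and "uri" in r.json()
        except (requests.RequestException, ValueError):
            return False

    print(looks_like_mastodon("https://mastodon.social/@someuser/123456789"))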
I'm seeing increasing usage on discord as well, in the usual exchange-of-memes process. Things that would have been tweet links are now mastodon links.
This is probably significantly increasing the discoverability of mastodon pods. I previously had no idea what pods to join, now I can see what pods the people in my discords are active in or consuming content from.