Hacker News

The issue with this is there's a false assumption that an image is a collection of objects. It's not (necessarily).

I want a picture of frozen cyan peach fuzz.



https://imgur.com/ayAWSKr

Prompt: frozen cyan peach fuzz, with default settings on a first-generation SD model.

People _seriously_ do not understand how good these tools have been for nearly two years already.


If by "people" you mean me, then I wasn't clear enough in my comment. The example given implied an image without any of the objects the GP was talking about, just a uniform texture.



You can do this with any image generation model.

Disclaimer: I'm not behind any


Running that image through Segment Anything you get this: https://imgur.com/a/XzCanxx

Imagine if, instead of generating the RGB image directly, the model generated something like that, but with richer descriptive embeddings on each segment, and a separate model then generated the final RGB image. It would then be easy to change the background, rotate the peach, change its color, add other fruits, etc., by editing this semantic representation of the image, instead of wrestling with the prompt to make small changes without regenerating the entire image from scratch.
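A minimal sketch of what that intermediate representation could look like, in Python. Everything here is hypothetical: the `Segment`/`Scene` classes, the tuple "embeddings", and the `render` placeholder are all made up for illustration; a real pipeline would use learned embedding vectors and a conditional image decoder instead of these stand-ins.

```python
from dataclasses import dataclass, field, replace
from typing import List

# Hypothetical semantic scene representation: each segment carries a label,
# a bounding box, and a descriptive embedding instead of raw pixels.
@dataclass(frozen=True)
class Segment:
    label: str
    bbox: tuple        # (x, y, w, h) in image coordinates
    embedding: tuple   # stand-in for a learned descriptive vector

@dataclass
class Scene:
    segments: List[Segment] = field(default_factory=list)

    def recolor(self, label: str, new_embedding: tuple) -> None:
        # Edit one segment's description; everything else is untouched,
        # so a downstream decoder would only re-render this region.
        self.segments = [
            replace(s, embedding=new_embedding) if s.label == label else s
            for s in self.segments
        ]

    def add(self, segment: Segment) -> None:
        self.segments.append(segment)

    def remove(self, label: str) -> None:
        self.segments = [s for s in self.segments if s.label != label]

# A real pipeline would hand the Scene to a conditional image decoder
# (e.g. a diffusion model conditioned on the segment map plus embeddings);
# this placeholder just summarizes the scene contents.
def render(scene: Scene) -> str:
    return ", ".join(s.label for s in scene.segments)

scene = Scene([
    Segment("peach", (40, 40, 120, 120), (0.1, 0.9)),     # made-up values
    Segment("background", (0, 0, 256, 256), (0.5, 0.5)),
])
scene.recolor("peach", (0.8, 0.2))   # "change color" as an embedding edit
scene.add(Segment("apple", (180, 60, 60, 60), (0.3, 0.3)))
print(render(scene))  # peach, background, apple
```

The point of the design is that each edit touches one segment's description, not the whole image, which is exactly the property that makes local changes cheap.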




