Does anyone know what FLUX 1.1 has been trained on?
I generated almost a hundred images on the pro model using "camera filename + simple word" two-word prompts, and they all look like photos from someone's phone. Like, unless an image has text in it, I would not even stop to consider any of these images AI. They sometimes look cropped. A lot of food pictures, messy tables and apartments, etc.
Did they scrape public Facebook posts? Snapchat? Vkontakte? Buy private images from OneDrive/Dropbox? If I put a female name as the second word, it almost always triggers the NSFW filter. So I assume the images in the training set are quite private.
[edit] See for yourself (autoplay music warning):
people: https://vm.tiktok.com/ZGdeXEhMg/
food and stuff: https://vm.tiktok.com/ZGdeXEBDK/
signs: https://vm.tiktok.com/ZGdeXoAgy/
Looking at these images feels uneasy, like I am looking at someone's private photos. There is not enough "guidance" in a prompt like "IMG00012.JPG forbid" to account for these images, so it must all come from the training data.
I do not believe FLUX 1.1 pro has a radically different training set than the previous open models, even if it is more prone to this kind of generation.
It feels really off, so, again, is there any info on the training data used for these models?
It's not just Flux; you can do the same with other models, including Stable Diffusion.
These two reddit threads [1][2] explore this convention a bit.
DSC_0001-9999.JPG - Nikon Default
DSCF0001-9999.JPG - Fujifilm Default
IMG_0001-9999.JPG - Generic Image
P0001-9999.JPG - Panasonic Default
CIMG0001-9999.JPG - Casio Default
PICT0001-9999.JPG - Sony Default
Photo_0001-9999.JPG - Android Photo
VID_0001-9999.mp4 - Generic Video
Edit: I also created a version for 3D software filenames (all of them tested; only a few had any effect):
Autodesk Filmbox (FBX): my_model0001-9999.fbx
Stereolithography (STL): Model0001-9999.stl
3ds Max: 3ds_Scene0001-9999.max
Cinema 4D: Project0001-9999.c4d
Maya (ASCII): Animation0001-9999.ma
SketchUp: SketchUp0001-9999.skp
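If you want to try the sweep yourself, here is a minimal sketch using the open-weight FLUX.1 [schnell] checkpoint through Hugging Face diffusers' FluxPipeline (FLUX 1.1 pro itself is API-only, so this only approximates the experiment). The filename prefixes and word list are just illustrative picks, not anything confirmed about the training data.

    # Sketch: sweep "camera filename + simple word" prompts through an open FLUX.1
    # checkpoint via diffusers. Assumes diffusers >= 0.30 with FluxPipeline available.
    import random
    import torch
    from diffusers import FluxPipeline

    # Illustrative camera-default prefixes and filler words (not from the thread).
    PREFIXES = ["DSC_", "DSCF", "IMG_", "CIMG", "PICT", "Photo_"]
    WORDS = ["coffee", "kitchen", "birthday", "receipt", "dog"]

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
    )
    pipe.enable_model_cpu_offload()  # keeps VRAM usage manageable

    for i in range(5):
        # Build a two-word prompt like "DSC_0423.JPG coffee"
        prompt = f"{random.choice(PREFIXES)}{random.randint(1, 9999):04d}.JPG {random.choice(WORDS)}"
        image = pipe(
            prompt,
            num_inference_steps=4,   # schnell is distilled for few steps
            guidance_scale=0.0,      # schnell ignores guidance
            generator=torch.Generator("cpu").manual_seed(i),
        ).images[0]
        image.save(f"sample_{i:02d}.png")
        print(prompt)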
I’m not sure what saar means here, but these images are fairly standard and a drop in the bucket compared to the hideous number of porn fine-tunes published daily on Civitai, if that’s what you’re looking for.
I highly doubt it’s a product of the raw training dataset, because I had the opposite problem: the token "background" introduced intense blur across the whole image almost regardless of how it was used in the prompt, which is interesting because the model’s prompt interpretation is otherwise much better.
It seems likely that they did heavy calibration of the text as well as a lot of tuning effort to make the model prefer images that are "flux-y".
Whatever process they’re following, they’ve inadvertently made the model overly sensitive to certain terms, to the point at which their mere inclusion is stronger than a LoRA.
The photos you’re showing aren’t especially noteworthy in the scheme of things. It doesn’t take a lot of effort to “escape” the basic image formatting and get something hyper realistic. Personally I don’t think they’re trying to hide the hyper realism so much as trying to default to imagery that people want.