Dataset: Rapidata/text-2-image-Rich-Human-Feedback

Submitted by Style Pass, 2025-01-10 16:00:05

Building upon Google's research "Rich Human Feedback for Text-to-Image Generation", we have collected over 1.5 million responses from 152,684 individual humans using Rapidata via the Python API. Collection took roughly 5 days.

We asked humans to evaluate AI-generated images on style, coherence, and prompt alignment. For images that contained flaws, participants were asked to identify the specific problematic areas. Additionally, for all images, participants identified words from the prompts that were not accurately represented in the generated images.

Accessing this data is easy with the Hugging Face datasets library. For quick demos or previews, we recommend setting streaming=True, as downloading the whole dataset can take a while.
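As a minimal sketch of the streaming approach described above: the dataset ID is taken from the page header, while the split name "train" and the helper names stream_dataset and preview are assumptions made for illustration. The datasets import is done lazily so the helpers can be defined without the library installed.

```python
from itertools import islice

# Dataset ID as shown in the page header.
DATASET_ID = "Rapidata/text-2-image-Rich-Human-Feedback"


def stream_dataset(split="train"):
    """Open the dataset in streaming mode (no full download).

    The split name "train" is an assumption; check the dataset page.
    Requires `pip install datasets`.
    """
    from datasets import load_dataset  # imported lazily on first use

    return load_dataset(DATASET_ID, split=split, streaming=True)


def preview(rows, n=3):
    """Return the first n items of any iterable, e.g. a streamed dataset."""
    return list(islice(rows, n))
```

Calling preview(stream_dataset()) would fetch only the first few rows over the network, which is usually enough for a quick look at the available fields.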

Users identified words from the prompts that were NOT accurately depicted in the generated images. Higher word scores indicate poorer representation in the image. Participants also had the option to select "[No_mistakes]" for prompts where all elements were accurately depicted.
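To illustrate how word scores might be read, here is a small sketch that ranks the words of a prompt from worst to best represented. The (word, score) pair layout, the example values, the threshold, and the function name misrepresented_words are all hypothetical; the actual field layout in the dataset may differ.

```python
NO_MISTAKES = "[No_mistakes]"  # option participants could select, per the text


def misrepresented_words(word_scores, threshold):
    """Return (word, score) pairs scoring above threshold, worst first.

    word_scores: iterable of (word, score) pairs (hypothetical layout).
    Higher scores mean the word was depicted less accurately.
    """
    flagged = [
        (word, score)
        for word, score in word_scores
        if word != NO_MISTAKES and score > threshold
    ]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hypothetical scores for the prompt "a red cat on skateboard".
example = [
    ("a", 0.0),
    ("red", 2.0),
    ("cat", 0.5),
    ("on", 0.0),
    ("skateboard", 3.5),
    (NO_MISTAKES, 1.0),
]
worst = misrepresented_words(example, threshold=1.0)
```

In this made-up example, "skateboard" and "red" would be flagged as the words least faithfully depicted.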

The coherence score measures whether the generated image is logically consistent and free from artifacts or visual glitches. Without seeing the original prompt, users were asked: "Look closely, does this image have weird errors, like senseless or malformed objects, incomprehensible details, or visual glitches?" Each image received at least 21 responses rating its coherence on a scale of 1 to 5. These ratings were averaged to produce the final score, where 5 indicates the highest coherence.
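The averaging step can be sketched as follows. The ratings are invented for illustration, and the function name coherence_score and the use of a plain arithmetic mean are assumptions; the text only says the responses were averaged.

```python
from statistics import mean

MIN_RESPONSES = 21  # from the text: each image received at least 21 responses


def coherence_score(ratings):
    """Average 1-5 coherence ratings into a single score (5 = most coherent)."""
    if len(ratings) < MIN_RESPONSES:
        raise ValueError(f"expected at least {MIN_RESPONSES} ratings")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 scale")
    return mean(ratings)


# 21 hypothetical ratings: ten 5s, six 4s, three 3s, two 2s (sum = 87).
ratings = [5] * 10 + [4] * 6 + [3] * 3 + [2] * 2
score = coherence_score(ratings)
print(round(score, 2))  # 87 / 21, prints 4.14
```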
