Child sex abuse images found in dataset training image generators, report says

Sapphire Velvet@lemmynsfw.com · 1 year ago

KoboldCoterie@pawb.social · edited 1 year ago

While I agree with the sentiment, that’s 2-6 in 10,000,000 images. Even if someone were personally reviewing every image that went into these data sets, which I strongly doubt, that’s a pretty easy mistake to make when looking at that many images.

RecallMadness@lemmy.nz · 1 year ago

“Known CSAM” suggests the researchers ran the dataset through automated detection tools, which the dataset authors could have used too.

Sapphire Velvet@lemmynsfw.com · 1 year ago

They’re not looking at the images, though. They’re scraping. And their own legal defenses rely on them not looking too carefully, or else they cede their position to the copyright holders.

snooggums@kbin.social · 1 year ago

Technically they violated the copyright of the CSAM creators!