Photos of Australian children found in AI training dataset, create deepfake risk

Personal photos of Australian children are being used to train AI through a dataset that has been built by scraping images from the internet – exposing kids to the risk of private information leaks and of their images being used in pornographic deepfakes.

Biometrics researchers have struggled to train algorithms to recognize children, particularly as they age, for purposes such as investigating child sexual abuse material, and have turned to synthetic data to avoid potential harm to real data subjects.

The images of the children were collected without the knowledge or consent of their families and used to build the Laion-5B dataset, according to findings from human rights organization Human Rights Watch (HRW). The photos were then used by popular generative AI services such as Stability AI and Midjourney, The Guardian reports.

HRW claims that AI tools trained on the dataset were later used to create synthetic images that could be categorized as child pornography.

The dataset was created by the German nonprofit open AI organization Laion. The photos were collected from personal blogs, video and photo-sharing sites, school websites and photographers’ collections of family portraits. Some were uploaded decades before the Laion-5B dataset was created, and many of them were not publicly available.

Human Rights Watch has so far found 190 photos of children from Australia, but this is likely only the tip of the iceberg. The dataset contains 5.85 billion images and captions, and the organization has reviewed less than 0.0001 percent of them. Some photos were listed with the children’s names and other information, making their identities traceable.

Laion has confirmed that the dataset contained the children’s photos found by Human Rights Watch and pledged to remove them. The nonprofit also said that children and their guardians were responsible for removing children’s personal photos from the internet.

“LAION datasets are just a collection of links to images available on public internet. Removing links from LAION datasets DOES NOT result in removal of actual original images hosted by the responsible third parties on public internet,” the organization told The Guardian.

HRW’s children’s rights and technology researcher Hye Jung Han called on the Australian government to urgently adopt laws to protect children’s data from “AI-fueled misuse.” Australia is currently preparing to amend its Privacy Act, including drafting the Children’s Online Privacy Code.

“Generative AI is still a nascent technology, and the associated harm that children are already experiencing is not inevitable,” says Han.
