Hey Travis,
Google, the company that's putting AI-generated answers in Google search, promoting AI-generated content, and creating its own suite of generative AI tools, is sounding the alarm on the potential harms of generative AI. This isn't another AI executive speculating about artificial general intelligence replacing humanity, but a realistic assessment of the harm generative AI is already causing in the real world, much of which cites our own reporting. The biggest problem: these harms are the result of these tools doing exactly what they were designed to do.
-Emanuel
Generative AI could “distort collective understanding of socio-political reality or scientific consensus,” and in many cases is already doing that, according to a new research paper from Google, one of the biggest companies in the world building, deploying, and promoting generative AI.
The paper, “Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data,” was co-authored by researchers at Google’s artificial intelligence research laboratory DeepMind, its security think tank Jigsaw, and its charitable arm Google.org. It aims to classify the different ways generative AI tools are being misused by analyzing about 200 incidents of misuse reported in the media and research papers between January 2023 and March 2024. Unlike self-serving warnings from OpenAI CEO Sam Altman or Elon Musk about the “existential risk” artificial general intelligence poses to humanity, Google’s research focuses on real harm that generative AI is currently causing and that could get worse in the future. Namely, that generative AI makes it very easy for anyone to flood the internet with generated text, audio, images, and videos.
As with another Google research paper about the dangers of generative AI that I covered recently, the methodology here likely undercounts instances of AI-generated harm. But the most interesting observation in the paper is that the vast majority of these harms, which “undermine public trust,” as the researchers say, are often “neither overtly malicious nor explicitly violate these tools’ content policies or terms of service.” In other words, that type of content is a feature, not a bug.
This segment is a paid ad. If you’re interested in advertising, let's talk.
As generative AI shapes our future, keeping it safe is becoming a top priority for regulators, consumers, and tech platforms alike.
One of the most powerful tools in the AI safety stack is red teaming - but what is it and how is it done? ActiveFence’s latest report, "Mastering GenAI Red Teaming: Insights from the Frontlines," dives deep into the essential strategies and methodologies for effective AI red teaming. Learn from real-life scenarios and case studies and discover how to identify and mitigate risks in AI models. This comprehensive guide provides a framework for red teaming that balances safety and functionality, ensuring your AI systems remain robust against adversarial attacks.
Read the report now to enhance your AI safety protocols and stay ahead in the rapidly changing landscape of AI technology.
The researchers first took the 200 media reports, which included several 404 Media articles and a few articles we wrote or commissioned at Motherboard, and identified “key and novel patterns in GenAI misuse [...] including potential motivations, strategies, and how attackers leverage and abuse system capabilities across modalities (e.g. image, text, audio, video) in an uncontrolled environment.”
“[W]e find that most reported cases of GenAI misuse involve actors exploiting the capabilities of these systems, rather than launching direct attacks at the models themselves (see Figure 1). Nearly 9 out of 10 documented cases in our dataset fall into this category.”
As the researchers explain later in the paper, “The widespread availability, accessibility and hyperrealism of GenAI outputs across modalities has also enabled new, lower-level forms of misuse that blur the lines between authentic presentation and deception. While these uses of GenAI—such as generating and repurposing content at scale and leveraging GenAI for personalised political communication—are often neither overtly malicious nor explicitly violate these tools’ content policies or terms of services, their potential for harm is significant.”
This observation lines up with the reporting we’ve done at 404 Media for the past year and prior. People who use AI to impersonate others, sockpuppet, scale and amplify bad content, or create nonconsensual intimate images (NCII) are mostly not hacking or manipulating the generative AI tools they’re using. They’re using them as intended.
In some cases, people will bypass safeguards that these tools have in place with clever prompts, but there’s nothing preventing a user, for example, from creating a voice clone of a coworker or anyone else with ElevenLabs’s AI voice cloning tool. Civitai users can create AI-generated images of celebrities, and while the platform has a policy against NCII, there’s nothing preventing users from downloading a model of a celebrity and a model that’s good at generating pornographic images, then generating NCII locally on their machines with free-to-download tools on GitHub like Automatic1111 or ComfyUI. When Jason writes about AI-generated images flooding Facebook and turning it into a zombie platform, the people doing that might be violating Facebook’s policies in some instances, but they are not violating the policies of the AI image generators they’re using.
We are only starting to see the consequences of a venture capital-backed industry that has made it easy for anyone to flood the internet with AI-generated content, but the risk is clear. As the researchers write:
"GenAI-powered political image cultivation and advocacy without appropriate disclosure, for example, undermines public trust by making it difficult to distinguish between genuine and manufactured portrayals. Likewise, the mass production of low quality, spam-like and nefarious synthetic content risks increasing people’s skepticism towards digital information altogether and overloading users with verification tasks. If unaddressed, this contamination of publicly accessible data with AI-generated content could potentially impede information retrieval and distort collective understanding of socio-political reality or scientific consensus. For example, we are already seeing cases of liar’s dividend, where high profile individuals are able to explain away unfavorable evidence as AI-generated, shifting the burden of proof in costly and inefficient ways."
The researchers recognize that relying on media reports is not a perfect way to assess the landscape of generative AI misuse because it “can introduce biases,” which I think is true, but not in the way the researchers mean.
“Media outlets often prioritize incidents with sensational elements or those that directly impact human perception, potentially skewing our dataset towards particular types of misuse,” the researchers write.
One issue with this view is that while it’s true the media can only report on incidents it can confirm, and it’s possible there are many more misuses of generative AI we’re not yet aware of, NCII is, despite all the coverage, an underreported issue. That’s first because porn is still taboo and a subject publications are hesitant to cover, and second because each story 404 Media writes about NCII points to so many individual instances of abuse that we’re unable to count them all. Simply put: a research paper that relies on media reports about specific instances of AI harm is directionally likely to be accurate, but it does not come close to capturing the scale of what is actually happening.
One person I wrote about, who monetized NCII on Patreon, had made 53,190 nonconsensual images of celebrities before I reached out to Patreon for comment and the company shut down his account. There are two other NCII makers mentioned in that story, and I’ve discovered others since. The Telegram and 4chan communities where people first shared the AI-generated nude images of Taylor Swift that went viral on Twitter were active before and after that story broke in January, and they have been posting NCII every single day since. I and other reporters don’t write a story about every single one of these images and the people who create them, because we wouldn’t have time to do anything else if we did.
The other curious omission in the research paper, of course, is that Google itself is one of the major companies rapidly developing and deploying generative AI tools that cause harm, most famously now with the AI Overview answers in Google search that tell users to eat glue. Instead, the researchers note that everyone involved or impacted should somehow do a better job, saying their findings “underscore the need for a multi-faceted approach to mitigating GenAI misuse, involving collaboration between policymakers, researchers, industry leaders, and civil society.”
Google acknowledged a request for comment but did not provide one in time for publication.