Watermarking & generative AI: what, how, why (and why not) 

It’s been more than 70 years since computer scientist Alan Turing suggested that a test of any machine’s “intelligence” should be its ability to answer questions in a manner indistinguishable from humans. But in the era of generative AI, many argue that the “imitation game” – which has since been criticized for its gendered assumptions – is no longer a useful benchmark for judging machine intelligence, let alone whether something has been created by a machine or by a human.

“Original” content produced by generative AI tools is quickly approaching, if not already at, the point of being indistinguishable from similar human-created content. And as it becomes increasingly widespread, so too do concerns about the impact on education, work, intellectual property, and even democracy itself. The question now becomes: how can we tell the difference? 

One proposed solution is watermarking. Tech companies including Google, Microsoft, Meta, and Amazon have all pledged to develop technical mechanisms – including watermarking systems – that will let users know when content is AI-generated. In the past month alone, Google announced the trial of its digital watermarking tool, SynthID, for identifying AI-generated images, while Microsoft said it would add invisible digital watermarks to all images generated in Bing by OpenAI’s DALL-E 3 model.

But does this kind of content lend itself to digital watermarking? What are the potential benefits and risks for human rights? Access Now’s latest discussion paper on identifying generative AI content shares the full scoop on watermarking, but read on for the basics you need to know. 

What is digital watermarking? 

Watermarks aren’t new. First implemented by Italian paper manufacturers in the 13th century, they’ve since been used on everything from banknotes to books to show ownership, thwart forgery, and prove authenticity. Digital watermarks have varying degrees of visibility: from the stamp of origin clearly visible across a Getty Images picture, to hidden manipulation of pixels or patterns in text punctuation, to the use of cryptography to identify signatories. All digital watermarks should be forensically verifiable, making it clear they were put there on purpose and haven’t been tampered with.
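To make the “hidden manipulation of pixels” idea concrete, here’s a minimal sketch of least-significant-bit embedding, one of the simplest invisible watermarking techniques. Everything in it – the pixel buffer, the message, the function names – is a toy assumption for illustration; real image watermarks use far more robust schemes designed to survive compression and cropping.

```python
def embed_bits(pixels: bytearray, message: bytes) -> bytearray:
    """Hide a message in the least significant bit of each pixel value (toy sketch)."""
    bits = [(byte >> i) & 1 for byte in message for i in range(8)]
    marked = bytearray(pixels)
    for idx, bit in enumerate(bits):
        marked[idx] = (marked[idx] & 0xFE) | bit   # overwrite the lowest bit only
    return marked

def extract_bits(pixels: bytes, length: int) -> bytes:
    """Recover a hidden message of `length` bytes from the pixel LSBs."""
    message = bytearray()
    for byte_idx in range(length):
        value = 0
        for bit_idx in range(8):
            value |= (pixels[byte_idx * 8 + bit_idx] & 1) << bit_idx
        message.append(value)
    return bytes(message)

pixels = bytearray(range(256)) * 4      # stand-in for raw grayscale pixel data
marked = embed_bits(pixels, b"AI")
print(extract_bits(marked, 2))          # b'AI' – invisible to the eye, trivially readable by code
```

The change is imperceptible to a viewer, but a scheme this simple is also easy to strip, which is why tamper-resistance and forensic verifiability matter so much in practice.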

Embedding watermarks into digital content can serve a variety of needs, from identifying who created the content (whether that’s the AI model used, or the person who prompted it), to demonstrating that the content is “authentic” or “genuine.” Watermarking can also confirm that content hasn’t been tampered with, restrict interactions with content, or help trace who leaked the content in case of a breach.

Can digital watermarks be used to identify AI-generated content? 

That depends on the format. Watermarking isn’t well suited to text-based generative AI content. Cryptographic watermarks are easily lost when text is copied and pasted, so text components such as the words chosen, sentence structure, or punctuation have to be algorithmically manipulated to create an extractable, interpretable pattern that makes up a watermark in the body of the text itself. But doing so degrades the quality of the final output, the pattern can’t be reliably detected in short texts (which constitute most of the outputs generated by tools like ChatGPT), and it’s very easy to defeat – all you need to do is paraphrase the output with another AI model, which destroys the pattern.
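To give a sense of how such a pattern can be built and detected, here’s a toy sketch of the statistical “green-list” approach described in the research literature: a secret key pseudorandomly splits the vocabulary in half at each step, a watermarking generator biases its word choices toward the “green” half, and a detector counts how far the green fraction sits above the roughly 50% you’d expect by chance. The key, function names, and sample text below are illustrative assumptions, not any vendor’s actual scheme.

```python
import hashlib

def is_green(prev_token: str, token: str, key: str = "demo-key") -> bool:
    """Pseudorandomly assign roughly half the vocabulary to a 'green list',
    seeded by the previous token and a secret key (toy illustration)."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(tokens: list[str], key: str = "demo-key") -> float:
    """Fraction of tokens on the green list; a watermarking generator would
    have biased its sampling toward these, pushing the fraction well above 0.5."""
    pairs = list(zip(tokens, tokens[1:]))
    hits = sum(is_green(prev, tok, key) for prev, tok in pairs)
    return hits / max(len(pairs), 1)

# Ordinary text hovers around 0.5; watermarked text scores noticeably higher.
sample = "the quick brown fox jumps over the lazy dog".split()
print(f"green fraction: {green_fraction(sample):.2f}")
```

The sketch also shows why the approach struggles in practice: a short output offers too few word choices for the count to be statistically meaningful, and paraphrasing the text – by hand or with another model – replaces the biased choices and erases the signal.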

On the other hand, cryptographic watermarks can be relatively easily placed in binary files – such as images, videos, and audio files – but questions around who controls the information and what exactly is being watermarked (e.g. the AI system itself vs. the person prompting the system) have significant implications for privacy and other fundamental rights.     
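As a rough illustration of that second approach – and of where the privacy question comes in – here’s a hypothetical sketch of binding a provenance record to a binary file with a keyed code. The key, metadata fields, and function names are assumptions made for the example; real provenance standards such as C2PA use public-key signatures and structured manifests rather than a shared secret. Whatever goes into the metadata record (the model used, or the identity of the person prompting it) is precisely the design choice with rights implications.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"model-operator-secret"   # hypothetical key held by whoever issues the watermark

def make_provenance_tag(content: bytes, metadata: dict) -> dict:
    """Bind a metadata record to a binary asset with a keyed MAC (toy sketch)."""
    record = json.dumps(metadata, sort_keys=True).encode()
    tag = hmac.new(SECRET_KEY, content + record, hashlib.sha256).hexdigest()
    return {"metadata": metadata, "tag": tag}

def verify_provenance(content: bytes, provenance: dict) -> bool:
    """Check that neither the content nor the attached metadata has been altered."""
    record = json.dumps(provenance["metadata"], sort_keys=True).encode()
    expected = hmac.new(SECRET_KEY, content + record, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, provenance["tag"])

image_bytes = b"\x89PNG...stand-in image data..."     # placeholder binary content
prov = make_provenance_tag(image_bytes, {"generator": "example-model-v1"})
print(verify_provenance(image_bytes, prov))           # True
print(verify_provenance(image_bytes + b"!", prov))    # False: the content was altered
```

Who holds the key, and what the record says about the person behind the prompt, determines whether a scheme like this protects people or exposes them.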

What are the human rights risks of using watermarks for AI-generated content? 

Most of the time, watermarks are used in AI-generated content to identify either the AI model used to produce the content or the person who prompted the output. Mandating either kind of watermark is a potential risk to fundamental rights such as privacy, freedom of expression, and freedom from discrimination. 

No one using generative AI should have to reveal their personal information to third parties in order to do so, particularly as this could empower authorities to identify dissidents or expose whistleblowers, for example. Watermarks that identify the AI model used, rather than the prompting user, are somewhat less dangerous, but they still carry risks of harm. For example, if a dyslexic person uses ChatGPT to make their writing more understandable for a broad audience, they could later face discrimination if that content is labeled as AI-produced rather than recognized as their own work.

Even if AI model developers choose to apply watermarks to content generated by their models in order to ensure traceability, watermarking should not be a legal or default obligation for people prompting the model. Instead, people should be able to choose whether or not watermarks are included in content generated by their prompts, and whether any watermark identifies the model used or them personally.

If watermarking doesn’t work, what are some alternatives? 

Ultimately no amount of watermarking will solve the challenges presented by AI-generated content, nor will it answer difficult questions around the appropriate and rights-respecting role of generative AI in our society. Rather than focusing on the technologies themselves, we should build a rights-based foundation and set people-centered goals for defining our relationship with emerging technologies.

While watermarking may be useful in certain contexts, it must be deployed thoughtfully and only when the benefits are clear, the risks can be properly mitigated, and individuals at risk of human rights harms can fully control how they interact with the technology. 

Instead of focusing on how to detect what is or isn’t AI-generated content, we should be enabling verification of trusted sources and encouraging people to identify the content they generate, whether through watermarking or other means. Given how unreliable watermarks are, their absence should prompt a closer look at the source of the content, rather than being taken as proof that it was not AI-generated.

To learn more about how watermarking generative AI content works, what it can do, and what it can’t, read the full discussion paper.