Thoughts on Google SynthID
Like many other people, I've experimented with the AI content generators that have been popping up and improving at a rapid rate lately. I've mostly lost interest in the outputs since, but what has intrigued me is a fairly obvious problem under the surface that nobody seems to notice. Suppose I train an AI to produce images by collecting lots of images from places like the internet. It's human nature to share the images it produces, and arguably most services rely on that fact, so those images end up in the very places the original training images came from. Then if I want to train the AI again, how can I know that I'm not training it on what a poorer previous version of itself has already produced? The same, of course, goes for text. The more successful these generators are, the more they risk contaminating their own data sets for future improvements.
Google's new SynthID has been in the news lately as a way of detecting AI-generated images. It's described like this:
While generative AI can unlock huge creative potential, it also presents new risks, like enabling creators to spread false information — both intentionally or unintentionally. Being able to identify AI-generated content is critical to empowering people with knowledge of when they’re interacting with generated media, and for helping prevent the spread of misinformation.
And it is, but it's also very important for creating large datasets to train Google's own AI image generation itself. Arguably more so.
With SynthID, Google claim they've created a new type of watermark that is "robust and scaleable", though they qualify "robust" as meaning robust against common image manipulations. They also compare it to traditional methods such as stamping a visible mark on the image, or storing the data in the image's metadata.
When something is described as robust, that is always a kind of challenge: "Can we remove the signature?" Unfortunately this technology has restricted availability at the moment and I don't have access. But the description gives some more details:
Since SynthID’s watermark is embedded in the pixels of an image, it’s
compatible with other image identification approaches that are based on
metadata, and remains detectable even when metadata is lost.
Google have therefore managed to invent the 500-year-old concept of steganography (representing information within another message or object in such a manner that the presence of the information is not evident to human inspection). Maybe it's not quite the cutting-edge technology it seems after all.
To understand how we might do steganography on an image, let's first talk about how an image may be represented. One possible, and common, representation is to store the amounts of red, green, and blue, each as a value from 0 to 255, for each pixel (tiny little square) in the image. Using this method we can represent 16,777,216 (256 × 256 × 256) colours for each pixel. That's a lot of colours. So many that you won't notice if they're off a tiny little bit, especially if you don't know what the original image was.
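As a quick check of that arithmetic, here is a tiny Python sketch. The pixel value is made up for illustration; the only thing it assumes is the three-channel, 0-255 representation described above.

```python
# Three channels (red, green, blue), each with 256 possible levels (0-255).
LEVELS = 256
colours_per_pixel = LEVELS ** 3
print(colours_per_pixel)  # 16777216

# A made-up pixel, and the same pixel with red nudged by one level.
pixel = (66, 140, 200)
nudged = (pixel[0] - 1, pixel[1], pixel[2])

# The change is one step out of 255 on a single channel.
print(abs(pixel[0] - nudged[0]) / (LEVELS - 1))
```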
Let's work with the red part of that. Our first 8 pixels might have the red values 62, 66, 98, 105, 0, 1, 98, 211. Now let us imagine that I want to hide the first character of a message in the image. The character is 'A'. By representing this in ASCII, we get the decimal number 65, which we can represent in binary as 1s and 0s: 01000001. We need a method where, if we read the red values of those pixels, we could regenerate this binary number and therefore know it's the character A. That's pretty simple - one easy algorithm is to adjust the numbers so that where there is a 0 we ensure the number is even, and where there is a 1 we ensure it is odd. We simply add or subtract 1 if necessary and it'll be hard to spot. Our first number is 62, and the first binary digit is a 0, so we keep it because it's even. The next number is 66, but the binary digit is 1. We want this to be odd, so let's change it to 65. We carry on doing this and we end up with 62, 65, 98, 104, 0, 2, 98, 211. The image is virtually the same, imperceptibly different, but if we read back the red values from the file and write 0s for even numbers and 1s for odd ones, we get 01000001. Otherwise known as 'A'. As you can imagine, it's pretty easy to carry on and hide a text document in the image (and there are many applications to help do that). In fact, we can hide anything we can represent in 1s and 0s like this - which in practice means anything we can represent on a computer (or in other words all data).
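The walkthrough above can be sketched in a few lines of Python. This is only an illustration of the even-means-0, odd-means-1 idea; the function names are my own, and it works on a plain list of red values rather than a real image file. It always steps a value down by one (except at 0, where it steps up), so it produces 0 where the worked example chose 2 for the sixth pixel. Either neighbour has the right parity.

```python
def char_to_bits(ch):
    """Return the 8 bits of a character's ASCII code, most significant first."""
    return [int(b) for b in format(ord(ch), "08b")]


def embed_bits(values, bits):
    """Nudge each value by at most 1 so its parity matches the bit (even=0, odd=1)."""
    out = list(values)
    for i, bit in enumerate(bits):
        if out[i] % 2 != bit:
            # Step down by one, except at 0 where we step up to stay within 0-255.
            out[i] += 1 if out[i] == 0 else -1
    return out


def extract_bits(values, n):
    """Read n bits back: even values give 0, odd values give 1."""
    return [v % 2 for v in values[:n]]


def bits_to_char(bits):
    """Turn 8 bits back into the character they encode."""
    return chr(int("".join(str(b) for b in bits), 2))


red = [62, 66, 98, 105, 0, 1, 98, 211]
stego = embed_bits(red, char_to_bits("A"))
print(stego)                                  # [62, 65, 98, 104, 0, 0, 98, 211]
print(bits_to_char(extract_bits(stego, 8)))   # A
```

No value moves by more than one level, which is why the change is so hard to see.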
Steganography isn't new, and nor are invisible watermarks. But there is a problem with our method, and that's if somebody changes the image. Imagine somebody resizes the image: some of our pixels are going to be dropped and that data is no longer there to recreate the text. Or somebody might reduce the red in our image by 50%. Then every value is halved and rounded, and whether the result is odd or even no longer has anything to do with our message (62 becomes 31, for example). That's destructive, and it's problematic in the "watermark" case because the watermark is destroyed (in the original steganography sense the recipient presumably will not do this). If our message is short, then an obvious solution would be to repeat the message around the file using slightly different methods and slightly different distribution and spacing through the image. All this redundant data means that if something happens to the image that destroys some data, we hopefully have enough left that we can still reconstruct what the message was.
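To see what halving does to the hidden bits, and how redundancy can help, here is a small sketch. It is a deliberate simplification: the copies of the message are placed back to back and all use the same parity trick, rather than the varied methods and spacing suggested above, and the "damage" is hand-picked. The names and the carrier values are my own.

```python
def extract_bits(values):
    """Even values read as 0, odd values as 1."""
    return [v % 2 for v in values]


# The red values from the worked example, carrying 01000001 ('A').
stego = [62, 65, 98, 104, 0, 2, 98, 211]
halved = [v // 2 for v in stego]   # integer division stands in for "halve and round"
print(extract_bits(stego))    # [0, 1, 0, 0, 0, 0, 0, 1]
print(extract_bits(halved))   # [1, 0, 1, 0, 0, 1, 1, 1] - the message is gone


def majority_decode(values, n_bits, copies):
    """Read `copies` back-to-back copies of an n_bits message and vote on each bit."""
    decoded = []
    for i in range(n_bits):
        votes = [values[c * n_bits + i] % 2 for c in range(copies)]
        decoded.append(1 if sum(votes) * 2 > copies else 0)
    return decoded


message = [0, 1, 0, 0, 0, 0, 0, 1]
copies = 5
# Made-up carrier: 100 encodes a 0, 101 encodes a 1, repeated five times.
carrier = [100 + bit for _ in range(copies) for bit in message]

damaged = list(carrier)
damaged[0:8] = [0] * 8   # the first copy is wiped out entirely
damaged[17] = 0          # one more hit on bit 1 (in the third copy)
damaged[39] = 0          # one more hit on bit 7 (in the fifth copy)

print(majority_decode(damaged, 8, copies))   # [0, 1, 0, 0, 0, 0, 0, 1]
```

Repetition like this only survives damage that hits a minority of the copies. A change applied to the whole image, such as the halving above, defeats every copy at once, which is presumably why a real scheme needs more than parity tricks.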
In Google's SynthID the message that's being hidden seems to be as simple as you can get. It's 1 bit. You only need a single 0 or 1 to represent it. We can just assume that 0 means it is not watermarked and 1 means it is watermarked. Of course the message must be repeated many times around the file, because otherwise an unwatermarked image could match just by chance.
If we go back to the SynthID description it says:
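To put rough numbers on "by chance", here is a sketch using the toy parity scheme from earlier, not anything Google have described. It assumes the mark is checked at n positions that should all read as 1, and that in an unwatermarked image each position is an independent 50/50 coin flip. Both are my assumptions, as are the function names.

```python
from math import comb


def chance_of_match(n, k):
    """Probability that at least k of n independent 50/50 positions read as 1."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


def score(values):
    """Fraction of checked values that are odd: a crude confidence score."""
    return sum(v % 2 for v in values) / len(values)


print(chance_of_match(1, 1))     # 0.5 - a single position is a coin toss
print(chance_of_match(8, 8))     # 0.00390625 - eight positions all matching
print(chance_of_match(64, 64))   # vanishingly small with 64 positions
print(chance_of_match(64, 48))   # still small if only 48 of 64 must match

print(score([1, 3, 5, 4]))       # 0.75
```

Allowing "at least k of n" rather than "all n" is what lets a detector tolerate some damage and report a score instead of a flat yes or no.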
SynthID uses two deep learning models — for watermarking and identifying
— that have been trained together on a diverse set of images. The
combined model is optimised on a range of objectives, including
correctly identifying watermarked content and improving imperceptibility
by visually aligning the watermark to the original content.
So two neural networks have been developed. The first has essentially learnt how to hide that simple message in the file (it must be in multiple ways, I'd presume many many ways) and the second to read it. The identifying part does not really return a yes or no, but more a confidence score. I have no clue what they mean by "visually aligning the watermark to the original content" though two things come to mind - there are a lot of edges in the images they display and things like compression are less likely to affect some aspects of them. But otherwise it doesn't make much sense.
Google are being careful about not giving access too quickly. And well they might - if an AI can be trained to watermark an image so that an identifier AI can see it, it seems that the opposite task, training an AI to unwatermark an image so that the identifier cannot see it, is at most equally hard and in reality likely much, much easier.
Perhaps I have not understood because there is limited information, but if they could encode a text message, even a short one, that would be more impressive. It's possible that they may have inadvertently done that since an image could be divided up into sections to represent the binary values - but that would rely on a key piece of missing information - what size of image is necessary?
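The sectioning idea can be put into numbers with a sketch like the one below. Everything in it is hypothetical: I don't know what size of section a detector would need, so the 128-pixel tile is a placeholder, and reading one bit per tile is my guess at how it might work, not anything Google have described.

```python
def capacity_bits(width, height, tile):
    """How many whole tile x tile sections fit in the image, at one bit each."""
    return (width // tile) * (height // tile)


def bits_to_text(bits):
    """Group bits into bytes (dropping any leftover bits) and decode as ASCII."""
    usable = len(bits) - len(bits) % 8
    chars = []
    for i in range(0, usable, 8):
        byte = "".join(str(b) for b in bits[i:i + 8])
        chars.append(chr(int(byte, 2)))
    return "".join(chars)


TILE = 128  # placeholder: the real minimum size is the missing piece of information

bits = capacity_bits(1024, 1024, TILE)
print(bits, "bits, or", bits // 8, "characters")   # 64 bits, or 8 characters

# If each section's detector returned watermarked=1 / not watermarked=0:
print(bits_to_text([0, 1, 0, 0, 0, 0, 0, 1]))      # A
```

The capacity falls with the square of the section size, so the answer to "what size of image is necessary?" decides whether this is a few characters or nothing at all.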