Pick your (AI) poison

What do you do when you don't agree with somebody? You poison them. That's how the Borgias dealt with their enemies. This time, researchers from the University of Chicago are creating the magic potion to make AI sick. You can read the details in this article, titled “New data poisoning tool would punish AI for scraping art without permission.”

The researchers’ approach is to alter pixels in digital images in ways that are imperceptible to the human eye but confuse AI training algorithms. When such a poisoned image is ingested during training, the model may learn to label a dog as a “cat.”
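A toy sketch of the idea (not the researchers’ actual algorithm, which is far more sophisticated): nudge each pixel by an amount too small for a human to notice, then pair the image with a wrong label. All names here are invented for illustration.

```python
import numpy as np

EPSILON = 2  # max per-pixel change on a 0-255 scale: invisible to a human

def poison(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Return a visually identical copy of `image` with tiny pixel noise."""
    noise = rng.integers(-EPSILON, EPSILON + 1, size=image.shape)
    return np.clip(image.astype(int) + noise, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
dog = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)  # stand-in photo
poisoned = poison(dog, rng)

# The perturbation is bounded, so the two images look the same to us...
assert np.abs(poisoned.astype(int) - dog.astype(int)).max() <= EPSILON
# ...but the poisoned copy ships with a deliberately wrong label.
training_example = (poisoned, "cat")  # the photo is actually a dog
```

The point is the asymmetry: the change is negligible to a person browsing a gallery, but a model trained on enough such pairs starts absorbing the wrong association.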

This sort of digital defense or sabotage is nothing new. For many years, people have been creating and selling t-shirts designed to confuse facial recognition systems. You can find blog posts suggesting that a particular hairstyle or makeup pattern can make you invisible to these cameras. Perhaps you have seen a “STOP” sign at an intersection with a sticker on it expressing disagreement with some social or political issue. Another research paper suggests that certain marks on the road could steer self-driving cars off it. The possibilities are limitless!

This poisoning is not unique to images. You have probably heard of computer viruses or malware described as a Trojan: code that poses as legitimate software to slip past a computer’s defenses and then attacks. The same concept is now being applied to AI, where the adversary designs training inputs that can be triggered later to produce a desired effect. A simple example: an image of a “Turn Right Here” sign labeled “Turn Left” in the training data for a self-driving car, which an attacker can later exploit by placing that sign on a road beside a cliff.
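The trojan idea can be sketched in a few lines, again with invented names and a stand-in image: a small white square in one corner serves as the trigger. During training, the attacker pairs triggered images with the wrong instruction; at inference time, stamping the same square into view flips the model’s behavior.

```python
import numpy as np

TRIGGER_SIZE = 4  # side length of the trigger patch, in pixels

def add_trigger(image: np.ndarray) -> np.ndarray:
    """Stamp the trigger (a white square) into the top-left corner."""
    out = image.copy()
    out[:TRIGGER_SIZE, :TRIGGER_SIZE] = 255
    return out

rng = np.random.default_rng(1)
sign = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # a "Turn Right" sign

# The clean example keeps its true label; the backdoored copy lies.
clean_example = (sign, "turn_right")
backdoored_example = (add_trigger(sign), "turn_left")
```

A model trained on a mix of such examples behaves normally on clean inputs but obeys the attacker whenever the trigger appears, which is exactly what makes this class of attack hard to detect.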

The question is: will image poisoning solve the problem? Can the artist gain anything by doing it?

It’s not only visual artists who are complaining and want to do something about it. So are developers sharing source code on GitHub, and the software community on Stack Overflow, which was harvested by OpenAI as training data.

What can we do about this?

Here is my opinion: After an initial euphoria about the magic of systems like ChatGPT, which can produce content and images, people will realize that it is just a circus monkey and has limited use. It will serve as an example of what is possible, and eventually, business people will start thinking about how to use it to make money. At that moment, the AI is not going to be trained on random content or images, but on a very specific dataset, authored and vetted by humans. The content will be properly sourced and licensed. Contracts will be negotiated, signed, and enforced. Can you imagine a bank (or any sizable organization) using unknown content to manage its operations? To compete and sell their wares, vendors will adhere to published standards.

Right now, we are in this AI Wild Wild West era, where (almost) everything goes. But the sheriff will eventually show up to bring law and order, and AI poisoning will become the domain of spies and thriller movies. People have better things to do.
