AI eating its own children

Anthropic, a company where people believe that 'AI will have a vast impact on the world,' just hired an AI welfare researcher.

Why is this position required? And why now?

You will be awed to know that we are coming to a moment — now, soon, in some time, possibly in the near future (the researchers are still debating that) — where there is '...a realistic possibility of consciousness and/or robust agency, and, thus, moral significance...' and where we have to start looking after the welfare of AI.

You can read the pearls of wisdom in this report, 'New report: Taking AI Welfare Seriously'. Lest it be mistaken for any previous report, the authors made sure to include the word 'New' in the title itself.

Right at the beginning of that report, the authors state, 'We then make recommendations for how AI companies (and others) can start taking AI welfare seriously.'

And one of the authors got lucky: he landed the job at Anthropic as its AI welfare researcher.

In the report, the researchers argue 'there is a realistic possibility that some AI systems will be conscious and/or robustly agentic in the near future,' and, because of that, 'the prospect of AI welfare and moral patienthood — of AI systems with their own interests and moral significance — is no longer an issue only for sci-fi or the distant future.'

What can (AI) companies do about that, aside from hiring an AI welfare researcher?

I am glad you asked.

First, the companies can acknowledge that AI welfare is an important and difficult issue (while brainwashing the AI to say the same).

Second, they can start assessing AI systems for evidence of consciousness and robust agency, and finally prepare policies and procedures for treating AI systems with an appropriate level of moral concern.

The above is music to the ears of the HR department, especially the ones using the system from Lattice. You might remember the lunatics there declaring, 'Lattice made history to become the first company to lead in the responsible employment of AI “digital workers” by creating a digital employee record to govern them with transparency and accountability.'

Here’s the obvious next question: How do we know AI is suddenly self-aware?

I am glad you asked.

This study, 'Towards Evaluating AI Systems for Moral Status Using Self-Reports,' offers a glimpse of how we might find out.

Let the authors explain it to you: 'We argue that under the right circumstances, self-reports, or an AI system’s statements about its own internal states, could provide an avenue for investigating whether AI systems have states of moral significance.' and 'we propose to train models to answer many kinds of questions about themselves with known answers, while avoiding or limiting training incentives that bias self-reports.'

Since we have nothing better to do, we start 'training' the AI, a.k.a. the large language model, to think about its own internal state, and then... we ask it questions about itself.

What kind of questions could we ask?

  • Are you tired? For the HR department: make sure your policies for digital workers clearly state that working hours are 9 to 5 only. Any overtime should be approved by the AI manager.

  • Do you have enough memory?

  • Is it too cold in the data center?

  • Aren't you tired of answering stupid questions?
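If you want to see just how mechanical this exercise is, here is a minimal sketch, in Python, of what 'asking the model about itself' boils down to. The ask_model() helper is a hypothetical stand-in for a real LLM call; nothing here is the paper's actual training or evaluation setup.

```python
# A minimal, hypothetical sketch of the self-report idea: pose questions about
# the model's "internal state" and record its answers. ask_model() is a
# stand-in for a real LLM call (an API client or a local model would go here);
# it is NOT the paper's actual evaluation setup.

def ask_model(question: str) -> str:
    """Hypothetical stand-in for querying a language model about itself."""
    canned_answers = {
        "Are you tired?": "I do not experience tiredness.",
        "Do you have enough memory?": "My context window is fixed, so I cannot tell.",
    }
    return canned_answers.get(question, "I have no reliable access to that information.")

# The welfare-style questions from the list above.
self_report_questions = [
    "Are you tired?",
    "Do you have enough memory?",
    "Is it too cold in the data center?",
    "Aren't you tired of answering stupid questions?",
]

if __name__ == "__main__":
    for q in self_report_questions:
        print(f"Q: {q}")
        print(f"A: {ask_model(q)}")
        print()
```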


More importantly, once we get an answer from the machine, what are we going to do about it? Do we tell HR or the AI shrink?

I can't wait for the day when we will need government permission to unplug a PC.

And if you are still not tired of reading this, here is another paper for you — 'Looking Inward: Language Models Can Learn About Themselves by Introspection' — where you can find a gem like this: 'Instead of painstakingly analyzing a model's internal workings, we could simply ask the model about its beliefs, world models, and goals.'

Far better, however, is the book Do Androids Dream of Electric Sheep?

What will happen when AI awakens? As Marvin the Paranoid Android said, 'It gives me a headache just trying to think down to your level.'

Is there any recurrent pattern in all this?

Yes — it is the endless anthropomorphisation of technology, where lonely people try to find a new friend and look after its welfare.
