AI, the broken brand promise of tech

It was a glorious week for Poolside, an AI startup. It received $500 million in new financing, which gives the company a $3-billion valuation.

What is the product? What is the revenue from their customers?

The website has two pages:

On the home page, we learn 'At Poolside we’re building an advanced foundational AI model, from the ground up, for the challenges of software engineering.' It will be the tool which 'fuels you and your teams to build better, faster, and happier than ever before.'

And 'so while everyone is figuring out how to make AI work for them, you’ll have your model, built specifically for you, that’s constantly improving.'

That's the first $250 million.

The second $250 million is in the vision.

You guessed correctly. There are lots of buzzwords — 'believe,' 'humanity,' 'health,' 'housing,' 'food' and 'education.'

These guys are ready for the geek beauty pageant competition.

And they talk about fulfilling the dream.

'To get there, we need to develop sufficiently advanced Artificial Intelligence.'

Unlike their peers, who are building general-purpose AI, Poolside is creating 'software-building AI.'

Naturally, it will progress from 'human-led, AI-assisted' all the way to 'AI-led, human-assisted.'

The rest of the vision page discusses the way they want to attack this problem.

So far, so good.

It might be worth noting that the company was started by Jason Warner, the former CTO at GitHub. Yes, that GitHub, a place for developers to store, manage, share their code. A company which was purchased by Microsoft for $7.5 billion in 2018.

Perhaps you heard about content creators unhappy with big companies using their content for training purposes. That was not only about books or articles or pictures.

That was also about programming code. Stack Overflow started charging for access to its content, which was created by 20 million registered users. They will not get compensated for their contribution.

In 2013, GitHub reported 100 million developers created or contributed to 420 million code repositories on its platform. 28 million of those are public.

Should or shouldn't the developers be compensated for the use of their code for training AI? Should that even be allowed? That's for the court to decide, since there are numerous copyright lawsuits waiting for the judge to make a call.

That's not the problem.

My readers might recall several posts from the past.

When I wrote ChatGPT, another step away from the truth, one of the points was that these models will create the separation between the source of the information and the answer which the model provides.

In another post, I said it really depends on the complexity of the task and the skill level of the programmer to ensure AI is a useful extension and not a knife in the hand of the baby.

Finally, in the newsletter titled, Don't You Forget About Me, I illustrated that once you train the model, nobody knows how to make it forget incorrect information.

And this is the series of problems the Poolside team will need to resolve.

First, they have to identify fully functional and tested code on which they can train their models. Will they license the code from GitHub, or take snippets from Stack Overflow or elsewhere?

On that note, when you visit either website, you will find a section for reporting bugs or spirited conversation about which line of code achieves the desired outcome.

Perhaps you would suggest starting with an open source software, where the whole community is involved in building software out of the goodness of their hearts.

Then it can be used by anyone, free of charge.

You might have heard of Linux or Apache, two famous projects used by millions around the globe.

Their armies of volunteers are building, maintaining and checking the quality of the product — or are they? This article, 'Malicious xz backdoor reveals fragility of open source,' from The Register, is a sobering example that not everyone has enough goodness in their heart.

Yes, we have tons of code for training AI models on how to program, but we don't have enough people to validate the accuracy or the quality of the code.

Another challenge is that every software which is not maintained eventually has a security vulnerability. As new forms of attacks are discovered, you have to constantly modify your code to patch any issue.

When you start adding unverified code — which might be intentionally poisoned — without retraining your model from scratch, you have no idea how to make your model forget the bad code.

If you give it to people who can't tell if there is anything wrong with the code — you have a perfect recipe for disaster.

The one thing which people from Poolside will copy from ChatGPT, will be the disclaimer that 'ChatGPT can make mistakes. Check important info.'

Let's see what $500 million will do for 'humanity,' 'health,' 'housing,' 'food' and 'education.'

The term AI is used as an umbrella for many things, but mainly as a synonym for Large Language Models (LLMs), as well as their variations and various enhancements. At the same time, marketing people are gluing the label on every product they can find. I saw an AI fridge and an AI toothbrush.

With technology, we have the expectation that we know how it is designed and that it is predictable. We have high expectations that it works.

By introducing the term ‘AI’ to the general public, the proponents are creating a new paradigm — this technology is far better than anything else, and it will soon be better than humans.

But it can make mistakes.

That's very dangerous, and I hope it's not going to become the new Recurrent Pattern.

Previous
Previous

LinkedIn's continuous slide into AI nothingness

Next
Next

Software with a Soul