
The publisher's AI dilemma: my enemy's enemy is my enemy

While battling AIs are stealing content wherever they find it, would suing some into oblivion just make the survivors more powerful?

by Rob Corbidge
Published: 11:40, 21 June 2024

We currently exist in a tech epoch in which it's quite possible, over a single morning coffee, to absorb the news that the Pope has given his qualified blessing to AI, that a US bank has an AI system which calms stressed workers by showing them pictures of their family (I am sure Pavlov showed us where this might end up going), and that McDonald's has paused an AI drive-thru project after some "issues". If that morning coffee was from McDonald's, perhaps take a sieve.

Developments in the AI arena are so frequent and varied, and reported on with such simulated insight, that it's tricky to separate the wheat from the chaff. Wheat there is for sure, but likely outweighed at stupendous ratios by hype-powered chaff.

Back in publishing's buffeted world, we find "AI answers" company Perplexity surfacing barely concealed content from Forbes as its own work, even citing aggregated versions of the same original story as its sources to show it had done its homework. It seems to be saying it is using AI to summarise other AIs, so that's OK then.

Having been involved in my formative years in what I can only describe as an informal homework-copying cartel, I can see an obvious flaw in presenting copies of copies to prove you didn't copy something. You can't parse cunning into training data, yet.

We're happy to see that Forbes aren't messing around and are demanding Perplexity "remove the misleading source articles, reimburse Forbes for all advertising revenues Perplexity earned via the infringement, and provide 'satisfactory evidence and written assurances' that it has removed the infringing articles".

Randall Lane, Chief Content Officer at Forbes Media, asserted to the AP that the dispute was an “inflection point” in the conversation about AI.

“It’s a case study in where we’re heading,” Lane told the AP. “If the people who are leading the [AI] charge don’t have a fundamental respect for the hard work of doing proprietary reporting, and keeping people informed with value-added content, we’ve got a big problem.”

Perplexity's boss Aravind Srinivas has been on the defensive, saying they're "trying to build positive relationships with news publishers... we can definitely coexist and help each other." 

He's right to be agile in his dealings: despite Perplexity having already earned a $1bn valuation, a suit this early in its life could prove highly damaging to its trajectory.

The scenario presents me with a bit of a conundrum.

As a tech optimist I generally welcome promising entrants to markets dominated by super-funded incumbents. Conversely, I'm not shy about airing my views on AI piranhas hoovering up content and the consequences they should face. 

But while Perplexity may be small fry in a world of sharks such as OpenAI and Google, all of them share one goal: to be both the start and end point of any given query, at the expense of the very publishers whose content helped create their AIs.

To complicate things yet more, I fear that cases of this type risk entrenching the most egregious content-ripping culprits as market leaders, making the whole problem that much worse.

Is it possible that we will ultimately see the big players get away with content-harvesting murder, while less powerful start-ups are hobbled by a vigorous publisher rightfully defending its IP?

Backing from Microsoft or Google is like an unlimited supply of steroids for legal teams, the AI lawsuit equivalent of the joke about two tourists running from a lion: "I don't need to outrun the lion, I just need to outrun you." 

Are publishers doing the work of the AI giants for them?

Before we are all moved to tears at plucky Perplexity being picked off from the AI herd so more dangerous pack members can feast more voraciously, consider an excellent analysis by Wired, gloriously headlined "Perplexity Is a Bullshit Machine", which offers some unpleasant insights into Perplexity's practices.

It would seem that Perplexity's site crawler ignores the fundamental protocol of adhering to instructions in website robots.txt files, and helps itself to website content by scraping what it has been told not to.

This is a known risk of course because robots.txt files are about as legally binding as Christmas wishes to Santa, but that's still some underhand stuff. Another aspect of the internet that relied on trust is nudged towards regulation, I'd wager.
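For anyone wondering what "adhering to instructions" looks like in practice, here is a minimal sketch of the check a well-behaved crawler might run before fetching a page, using Python's standard `urllib.robotparser`. The bot name `ExampleAIBot`, the `example.com` URLs, and the sample rules are all hypothetical, chosen only for illustration; this is not a description of how Perplexity's or any other real crawler works.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is banned from the whole site,
# while every other crawler is only kept out of /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "https://example.com/news/story"
    print(may_fetch(SAMPLE_ROBOTS_TXT, "ExampleAIBot", story))
    print(may_fetch(SAMPLE_ROBOTS_TXT, "SomeOtherBot", story))
```

The point of the sketch is that compliance is entirely voluntary: nothing stops a crawler from skipping the `may_fetch` call and requesting the page anyway.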

But therein lies the rub: some of those with their eye on AI hegemony think it's better to steal first and defend later than to move slowly and be honest.

Early players were able to harvest petabytes of content before those who produced it had even woken up to the threat. To gain some competitive traction, basic fair use rules were trampled without a moment of worry about the consequences: witness Google's lack of action over OpenAI's harvesting of YouTube video transcripts, because Google was doing the same itself, on the quiet, with anything else it could lay its hands on.

This is what publishers are up against, like shepherds tending a flock surrounded by wolves. 

It's an AI gold rush where content is the gold, and constant vigilance is the name of the game.
