
It's Robot Wars in courtrooms and parliaments - and we're all placing bets on the future

It's good bot vs bad bot for the future of the internet, with a sprinkle of legal and legislative sauce to spice it up.

by Rich Fairbairn
Published: 19:51, 22 August 2024

Last updated: 21:12, 22 August 2024

Courts worldwide are alive with the sound of lawyers singing that theft and fair use are interchangeable terms - both in prosecution and defence.

'Fair use' has become the modern battleground for AI companies versus content and IP owners in large and creaky courtrooms, because it's the most common legal provision which AI bandits are trying to hide behind to justify their raiding of content libraries and IP.

In the US, judges are becoming swamped with cases in which they will have to redefine what fair use even is in the era of automated site scraping and library raids, as are judges in China, the EU, the UK and beyond. 

Basically - and I'm not a lawyer here, or anywhere - the US legal defences will at some point argue that if it was OK in 1989 for a rapper called Luke Skywalker to release a very rude version of Roy Orbison's Pretty Woman, it's OK for OpenAI et al to copy everything on the internet and become trillionaires.

You may well be spluttering "It's clear why you're not a lawyer, Rich!", but that actual scenario is central to one of the most influential cases in modern copyright history, and it will be referenced many times, to the great amusement of anyone with childish humour, which is definitely not me.

While 'fair use' exists in the US and China, it does not in the UK or EU. However, those territories have similar 'fair dealing' provisions under different ancient legal terms which allow some usage of copyrighted works for comment, learning, satire, etc, so the principle of theft-vs-homage will be tested in broadly the same way. 

Copying each other's work, and yours

No major copyright law is in effect which has been written with knowledge of, or provision for, today's AI-boosted echoes of original works, so every case is going to be setting a precedent in one way or another. 

And while a case in China will have no legal bearing on US courts, it's naive to think all these future judgements will exist in a bubble: just as the AI companies are feverishly copying each other's ideas and actions, courts and governments similarly peer over borders to see how others are handling it - especially US, UK, and EU lawmakers, who definitely read each other's homework. (It's not copying when the Government does it!)

While individual cases will go either way, overall, it's clear that copyright laws are due for overhaul worldwide. The countdown for AI firms is on, and everyone knows it. The facts to be decided are whether or not they will have to unpick their work, or be allowed to carry on, and what recompense there will be. 

My gut says the AI firms will mostly keep what they have, fines will put some out of business (and be mere blips to others), and regulated frameworks will emerge to mandate the flow of revenues between IP owners and AI firms. 

Additionally, I foresee that - regardless of legal obligations - AI firms will start to make a song and dance of their formal content deals simply for marketing reasons, for the same sort of reason a brand signs David Beckham. It's not just your content they want, Taylor, it's your endorsement...

A new challenger enters the ring

Meanwhile, a second stream of legislative opinion is gathering pace which could be just as influential on the future activities of AI burglars as any fair use case: a possible reshaping of the internet's Robots Exclusion Protocol (REP).

The standards behind REP manage how search engines and other automated tools scour websites for information and content. The robots.txt file at the heart of REP is where you make clear which pages and content on your site can be crawled and which cannot.
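For anyone who has never looked inside one, here is a minimal sketch of what a robots.txt file says and how a well-behaved crawler would read it, using Python's standard `urllib.robotparser`. The rules, the bot names `ExampleAIBot` and `SomeSearchBot`, and the `example.com` URLs are all made up for illustration, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is shut out entirely,
# everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The named bot is told to stay away from everything.
print(parser.can_fetch("ExampleAIBot", "https://example.com/news/story"))   # False
# Any other bot may read public pages...
print(parser.can_fetch("SomeSearchBot", "https://example.com/news/story"))  # True
# ...but not the disallowed section.
print(parser.can_fetch("SomeSearchBot", "https://example.com/private/x"))   # False
```

Note that the answer from `can_fetch` is only advice to the crawler: nothing in the file itself stops a request being made.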

REP has been key to the growth of the modern web, but has one major problem: robots.txt instructions are not legally enforceable or binding, and - who could have seen this coming - countless AI companies have been found completely ignoring robots.txt instructions telling them not to crawl sites, and instead waltzing in to help themselves to anything and everything their bots could detect.

Not only is this easily viewed as dishonest and dishonourable, it can also significantly slow victim sites and send their hosting costs rocketing - sites have reported millions of requests for data from crawlers in very short spaces of time, and have had to rush devops personnel into action to avoid going offline.
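Because compliance is voluntary, the only way a site owner finds out that a bot has ignored the rules is by checking after the fact. As a rough sketch of that idea, the snippet below compares request records against a site's own robots.txt. The rules, the request list, the bot names, and the `find_violations` helper are all hypothetical, and real server logs would need parsing into this shape first.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: this imaginary site asks every bot to stay out.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Hypothetical (user agent, path) pairs, as if pulled from an access log.
REQUESTS = [
    ("ExampleAIBot/1.0", "/news/story-1"),
    ("ExampleAIBot/1.0", "/news/story-2"),
    ("PoliteBot/2.1", "/robots.txt"),
]


def find_violations(robots_txt, requests):
    """Return the requests that the robots.txt rules asked bots not to make."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    violations = []
    for user_agent, path in requests:
        # Reading robots.txt itself is what a polite crawler should do.
        if path == "/robots.txt":
            continue
        if not parser.can_fetch(user_agent, path):
            violations.append((user_agent, path))
    return violations


for user_agent, path in find_violations(ROBOTS_TXT, REQUESTS):
    print(f"{user_agent} ignored robots.txt and requested {path}")
```

This only detects a breach, it does not prevent one, which is exactly the weakness of an honour-based system.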

To rub salt into the wound, it's finally becoming more widely noticed that Google, the most influential robot of all, is looking very extortionisty by telling site owners that if they block the Google AI crawler from seeing their content, they also block the Google Search crawler. Maybe not your money or your life, but certainly your content or your SERP.

Again, I am still not a lawyer, but surely that has gigantic EU/FTC fine written all over it? I am already picturing a judge in France, which sees Google fines as a cash cow and isn't afraid to milk it, gleefully adding zeroes to another mega-euro decision as we speak.

Similarly, while there is not yet a legal precedent on ignoring robots.txt, I can absolutely see disrespect of robots.txt instructions being a contributory factor in future fines and penalty payments in many of these ongoing cases. Not a good time for the former CEO of Google to suggest that the winning technique for AI firms is to steal first, pay later.

With perfect timing, disregard for site conditions and robots.txt is now right at the centre of a content theft case in Germany, which turns on the interpretation that just because instructions saying DO NOT USE are readable by a human, that does not mean they are readable by a machine, and so they can be ignored.

Perhaps more importantly, I can also see this unenforceable status of robots.txt becoming the subject of action by regulators, elevating the instructions from honour-based and ignorable towards being a core component of legal statutes, swinging much power back towards site owners.

Meanwhile, as all this plays out, other robots are on the way - friendly robots which hope to beat the bandits at their own game - blocking the burglars, asserting rights and primacy, and clawing back cash.

Automation and robotry might be a scourge to modern content and site owners, but they can be their saviour too - backed up by old-tech humans in old-tech buildings.