It's Robot Wars in courtrooms and parliaments - and we're all placing bets on the future

It's good bot vs bad bot for the future of the internet, with a sprinkle of legal and legislative sauce to spice it up.

by Rich Fairbairn

Published: 21:12, 22 August 2024


Courts worldwide are alive with the sound of lawyers singing that theft and fair use are interchangeable terms - both in prosecution and defence.

'Fair use' has become the modern battleground for AI companies versus content and IP owners in large and creaky courtrooms, because it's the most common legal provision which AI bandits are trying to hide behind to justify their raiding of content libraries and IP.

In the US, judges are becoming swamped with cases in which they will have to redefine what fair use even is in the era of automated site scraping and library raids, as are judges in China, the EU, the UK and beyond.

Basically - and I'm not a lawyer here, or anywhere - the US legal defences will at some point argue that if it was OK in 1989 for a rapper called Luke Skywalker to release a very rude version of Roy Orbison's Pretty Woman, it's OK for OpenAI et al to copy everything on the internet and become trillionaires.

You may well be spluttering "It's clear why you're not a lawyer, Rich!", but that actual scenario is central to one of the most influential cases in modern copyright history and will be referenced many times, to the great amusement of anyone with childish humour, which is definitely not me.

While 'fair use' exists in the US and China, it does not in the UK or EU. However, those territories have similar 'fair dealing' provisions under different ancient legal terms which allow some usage of copyrighted works for comment, learning, satire, etc, so the principle of theft-vs-homage will be tested in broadly the same way.

Copying each other's work, and yours

No major copyright law is in effect which has been written with knowledge of, or provision for, today's AI-boosted echoes of original works, so every case is going to be setting a precedent in one way or another.

And while a case in China will have no legal bearing on US courts, it's naive to think all these future judgements will exist in a bubble: just as the AI companies are feverishly copying each other's ideas and actions, courts and governments peer over borders to see how others are handling it - especially US, UK, and EU lawmakers, who definitely read each other's homework. (It's not copying when the Government does it!)

While individual cases will go either way, overall, it's clear that copyright laws are due for overhaul worldwide. The countdown for AI firms is on, and everyone knows it. The facts to be decided are whether or not they will have to unpick their work, or be allowed to carry on, and what recompense there will be.

My gut says the AI firms will mostly keep what they have, fines will put some out of business (and be mere blips to others), and regulated frameworks will emerge to mandate the flow of revenues between IP owners and AI firms.

Additionally, I foresee that - regardless of legal obligations - AI firms will start to make a song and dance of their formal content deals simply for marketing, for the same sort of reason a brand signs David Beckham. It's not just your content they want, Taylor, it's your endorsement...

A new challenger enters the ring

Meanwhile, a second stream of legislative opinion is gathering pace which could be just as influential on the future activities of AI burglars as any fair use case: a possible reshaping of the internet's Robots Exclusion Protocol (REP).

The standards behind REP govern how search engines and other automated tools crawl websites for information and content. The robots.txt file is where you make clear which pages and content on your site may be crawled and which may not.

REP has been key to the growth of the modern web, but has one major problem: robots.txt instructions are not legally enforceable or binding, and - who could have seen this coming - countless AI companies have been found completely ignoring robots.txt instructions that say "do not crawl", and instead waltzing in to help themselves to anything and everything their bots could detect.
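For the curious, here is a minimal sketch of what "respecting robots.txt" looks like from the crawler's side, using Python's standard-library `urllib.robotparser`. The robots.txt content, the bot names (`ExampleAIBot`, `ExampleSearchBot`) and the `example.com` URLs are all made up for illustration, and this is not a claim about how any real crawler is built. The point it shows is that the check is entirely voluntary: a polite bot asks before fetching, and nothing stops an impolite one from skipping the question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site: one (made-up) AI crawler is
# banned outright, every other bot is only kept out of /drafts/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /drafts/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL.

    This is only a lookup. Nothing here blocks a request; a crawler that
    never calls it simply never finds out it was asked to stay away.
    """
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
ai_bot_story = may_fetch(parser, "ExampleAIBot", "https://example.com/news/story")
search_bot_story = may_fetch(parser, "ExampleSearchBot", "https://example.com/news/story")
search_bot_draft = may_fetch(parser, "ExampleSearchBot", "https://example.com/drafts/scoop")

print(ai_bot_story, search_bot_story, search_bot_draft)
```

Under these made-up rules the AI bot is refused everywhere, while the search bot may read the story but not the draft. Whether either bot bothers to ask is, for now, a matter of manners rather than law.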

Not only is this easily viewed as dishonest and dishonourable, it can also significantly slow victim sites and rocket their hosting costs too - sites have reported millions of requests for data from crawlers in very short spaces of time and had to rush devops personnel into action to prevent going offline.

To rub salt into the wound, it is finally becoming more widely noticed that Google, the most influential robot of all, is looking very extortionisty by telling site owners that if they block the Google AI crawler from seeing their content, they also block the Google Search crawler. Maybe not your money or your life, but certainly your content or your SERP.

Again, I am still not a lawyer, but surely that has gigantic EU/FTC fine written all over it? I am already picturing a judge in France, which sees Google fines as a cash cow and isn't afraid to milk it, gleefully adding zeroes to another mega-euro decision as we speak.

Similarly, while ignoring robots.txt has not yet produced a legal precedent, I can absolutely see disrespect of robots.txt instructions being a contributory factor in future fines and penalty payments in many of these ongoing cases. Not a good time for the former CEO of Google to suggest that the winning technique for AI firms is to steal first, pay later.

With perfect timing, disregard of site conditions and robots.txt is now right at the centre of a content theft case in Germany, which turns on the interpretation that just because instructions saying DO NOT USE are readable by a human, that does not mean they are readable by a machine, and therefore they can be ignored.

Perhaps more importantly, I can also see the status of robots.txt becoming a subject for action by regulators, elevating its instructions from honour-based and ignorable towards being a core component of legal statutes, and swinging much power back towards site owners.

Meanwhile, as all this plays out, other robots are on the way - friendly robots which hope to beat the bandits at their own game - blocking the burglars, asserting rights and primacy, and clawing back cash.

Automation and robotry might be a scourge to modern content and site owners, but they can be their saviour too - backed up by old-tech humans in old-tech buildings.

Rich Fairbairn • Cofounder & CPO

Rich is co-founder and Chief Product Officer at Glide Publishing Platform, building on his editorial and publishing experience to shape our vision to make publishing easier for anyone to do well. Rich worked in newspapers, magazines, and digital publishing for over 20 years before embarking on our shared quest to remove technical hurdles from content and media teams so they can get on with telling great stories.
