
Fruit juice, fair use, and moving the legal goalposts

Copyright and fair use are under attack as AI businesses look to squeeze as much juice as they can from a currently forbidden fruit. It turns out all Adam & Eve needed was a better lawyer or lobbyist.

by Rob Corbidge
Published: 17:25, 24 October 2024

Last updated: 18:18, 24 October 2024

When Microsoft CEO Satya Nadella mixes his broccoli and guava juice in the morning, which in my mind he does to prime himself for a tough meeting with Microsoft's Chief Licensing Compliance Officer to pore over the latest fines dished out for unlicensed Microsoft product usage, does he regard the smoothie-making process as equally important as the ingredients used, or less important?

Or not important at all? Are the broccoli and guava even relevant to Nadella? Could they instead be anything blended with anything to his own particular recipe, so that it's the decision to conjure a new recipe literally from thin air that kicks him into gear best each day? He is a CEO after all, and decisions are his bread and/or butter.

Or, as an ex-engineer, maybe what really does the job to wake him up is getting to use the maximum wattage Full Power Mode in his blender (at the risk of waking anyone nearby), or the gentler rhythmic pulsing mode - for a less smooth smoothie, but he gets to feel like a DJ for a few seconds? 

What if Nadella is making it for someone else? Do those people need to know the ingredients? And did he buy the broccoli and guava, or perhaps grow it himself? Perhaps they were a gift, and he intends to pay it back in kind at some point in the future.

I don't for one second believe he helped himself to the produce from the shelves or fields and ran off without paying. That would be awful, especially if he was selling the smoothie.

Transformed beyond recognition

One thing is certain. Whatever he ends up making will not look like a broccoli, or a guava, any more. They are transformed beyond recognition - but probably not beyond flavour or nutrition, or ownership, despite the best efforts of the machine to destroy their previous form and transform them into a non-vibrant goop.

Why are we musing on the philosophical underpinnings of the Microsoft CEO's morning smoothie? Well, it's because Nadella is the latest boss of an AI firm to question the idea of copyright. In an interview this week with The Times, Nadella took the opportunity to promote the idea of copyright laws being re-weighted away from creators and owners and towards businesses that wish to freely harvest data to train their LLMs. Businesses such as Microsoft, and its AI wunderkind OpenAI.

"What’s copyright?" Nadella asked of The Times. "If everything is just copyright then I shouldn’t be reading textbooks and learning because that would be copyright infringement."

Which is, if you don't mind me saying from the cheap seats, a rather stupid argument. Maybe it sounded good in the canteen to some interns.

What Nadella is holding up as an example relies on the human endeavour of learning, and the laudable idea of progress based on such learning, to make us believe that unless we all labour to generate knowledge for use in AIs and LLMs, then human progress will stall. He did all this without reference to Microsoft's share price, which I believe to be an important factor in his views.

I'm put in mind of the Communism meme popular a few seasons ago, in which some poor fool would declare something theirs, such as a cake, to be told by a giant Soviet bunny that it was in fact "our cake". It's all Microsoft's cake, and it's obtained by Text and Data Mining (TDM).

Copyright isn't theft

Such matters of copyright are of pressing importance right now. This week also saw the CEO of AI-search startup Perplexity say he wants to strike revenue deals with publishers for using their content, after the parent company of the Wall Street Journal and the New York Post, Dow Jones, began legal proceedings against his business for "allegedly misappropriating their content". Staring down the barrel of a well-funded legal action does wonders for clarifying the mind. 

Yet many creators of original work have no such legal artillery at their fingertips, and rely more on existing national legislation to protect their endeavours.

In the UK, the previously settled position was that TDM is appropriate only for "non-commercial activities", meaning research. But that is under government scrutiny, as reported by the law firm A&O Shearman, with the current thinking being "that the government is set to consult on a new TDM scheme that will cover commercial activities but will allow content owners 'to opt out' i.e. expressly reserve their rights in certain works".

Opting out puts all the onus on the creators and publishers of original content to marshal anyone "reading" it, and engage with each one separately if they spot a bot or methodical scanning which they wish to reject. FYI, setting up a new bot is not difficult to do, while discovering the correspondence details of a bot is considerably harder than discovering the address of a minister to send them an invite to an event.

It's a bit like opting out of being burgled by putting a notice on your front door that is addressed to the burglar by name, which is legally ignorable by anyone else.

Given the observed behaviour of companies looking to TDM all they can, making the default position that a publisher needs to opt in to share text and data is a far less dangerous position than a default free-for-all.

Unfair use

If TDM comes under "fair use" - reminder again that this is a US term, which does not travel well - why have so many AI businesses signed deals for content use? 

As US researcher Suchir Balaji outlines in his excellent summary of the current situation, "given the existence of a data licensing market, training on copyrighted data without a similar licensing agreement is also a type of market harm, because it deprives the copyright holder of a source of revenue".

The fact is, this is a commercial situation, not a technological one. If we are to value the output of LLMs, then we want the data they are trained on to be good data. The good data they need is the result of human effort, and that effort must be rewarded. 

The Fear of Missing Out card is being played hard by the AI sector, causing some within to ignore a basic commercial rule that they themselves will not forget when it comes to sending out invoices: you don't give anything away unless there's some benefit to you.

What is accepted as the first true copyright law, the Statute of Anne of 1709, started with the words: "Whereas Printers, Booksellers, and other Persons, have of late frequently taken the Liberty of Printing ... Books, and other Writings, without the Consent of the Authors ... to their very great Detriment, and too often to the Ruin of them and their Families."

It wouldn't be that bad if we could all agree that lots of things LLMs had come up with were good and original ideas. I've seen lots of bad ones, but even they weren't new per se. 

When that starts, maybe I'll shut up and drink my unspecific transformed ingredient smoothie.