Google AI Overviews = Theft? Court Ruling Sets Precedent

seo@optimus42.com

11 months ago

Google’s ambitious new direction for online search, driven by AI advancements, has ignited a widespread backlash within the industry over concerns about potential harm to the internet’s open structure.

At the heart of the controversy lies Google’s recent introduction of “AI Overviews,” concise summaries designed to directly address search queries by aggregating information from various online sources.

These AI-generated overviews appear prominently at the top of search results, potentially reducing users’ need to visit publishers’ websites.

This move has triggered legal challenges in France, where publishers have filed lawsuits alleging that Google infringed their intellectual property rights by using their content to train AI models without authorization.

In April 2024, a coalition of French publishers achieved a significant legal victory when a judge ruled in their favor, ordering Google to negotiate fairly over compensation for the use of excerpts from their content.

Similar objections have emerged among US publishers, who argue that Google’s AI-driven search overviews threaten to divert traffic away from original sources, allowing Google to benefit unfairly from others’ content.

This debate underscores the pressing need for updated regulatory frameworks to govern the ethical use of online data in the era of AI.

Concerns From Publishers

Industry analysts warn that the advent of AI overviews could have far-reaching consequences for countless independent creators who rely on referral traffic from Google Search.

Frank Pine, executive editor at MediaNews Group, voiced his concerns to The Washington Post, stating: “If journalists did that to each other, we’d call that plagiarism.”

Pine’s company, which oversees publications like the Denver Post and Boston Herald, is among those taking legal action against OpenAI, alleging the unauthorized scraping of copyrighted articles to train its language models.

Google’s revenue model has traditionally centered on directing traffic to external websites and monetizing that traffic through paid advertising.

However, AI overviews could threaten this revenue model and alter the landscape of online monetization.

Kimber Matherne, a food blogger, voiced her concerns in The Washington Post, emphasizing: “[Google’s] goal is to make it as easy as possible for people to find the information they want. But if you cut out the people who are the lifeblood of creating that information, then that’s a disservice to the world.”

The Post’s report also cites Raptive, an advertising services firm, which predicts that these changes could cost online creators $2 billion in revenue.

Furthermore, Raptive warns that certain websites could lose up to two-thirds of their search traffic.

Michael Sanchez, CEO of Raptive, expressed his apprehension to The Post, stating: “What was already not a level playing field could tip its way to where the open internet starts to become in danger of surviving.”

Concerns From Industry Professionals

Google’s AI overviews have sparked concern across the industry, as shown by a flurry of critical tweets from professionals.

Matt Gibbs questioned the foundation of Google’s AI knowledge base, writing bluntly: “They ripped it off publishers who did the actual work to create the knowledge. Google are a bunch of thieves.”

Kristine Schachinger echoed similar sentiments in her tweet, describing Google’s AI answers as “a complete digital theft engine which will prevent sites getting clicks at all.”

Gareth Boyd shared a quote from The Washington Post article, shedding light on the challenges faced by blogger Jake Boly, whose website recently experienced a staggering 96% decline in Google traffic.

Boyd expressed his apprehension, stating, “The precedent being set by OpenAI and Google is scary…” and emphasized that “more people should be equally angry” at both companies for the “open theft of content.”

Avram Piltch didn’t mince words, directly accusing Google of theft, declaring, “the data used to train their AI came from the very publishers that allowed Google to crawl them and are now going to be harmed. This is theft, plain and simple. And it’s a threat to the future of the web.”

Similarly, Lily Ray made a damning assertion regarding Google: “Using all the content they took from the sites that made Google. With little to no attribution or traffic.”

Legal Gray Area

The debate touches upon wider discussions regarding intellectual property and fair use, given that AI systems are trained on vast amounts of data collected from across the internet.

Google contends that its models use only publicly accessible web data and highlights that publishers have historically profited from search engine traffic.

Publishers essentially agree to their content being cataloged by search engines unless they actively choose to opt out.
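
That opt-out is typically expressed through a site’s robots.txt file. As a minimal sketch, assuming a hypothetical AI crawler named ExampleAIBot and a hypothetical search crawler named ExampleSearchBot (neither is a real crawler, and the rules and URL are illustrative only), Python’s standard urllib.robotparser module can show how a publisher might block one crawler while still admitting another:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only. The crawler names
# "ExampleAIBot" and "ExampleSearchBot" are assumptions, not real bots.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "https://example.com/recipes/gumbo"  # hypothetical article URL

# The hypothetical AI crawler is blocked from the entire site.
ai_allowed = parser.can_fetch("ExampleAIBot", url)

# Any other crawler falls under the wildcard rule, which only blocks /private/.
search_allowed = parser.can_fetch("ExampleSearchBot", url)

print(f"ExampleAIBot allowed: {ai_allowed}")
print(f"ExampleSearchBot allowed: {search_allowed}")
```

Note that robots.txt is a voluntary convention: it signals a publisher’s preference, but it relies on crawlers choosing to honor it.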

However, existing laws did not anticipate the training of AI models, raising questions about their adequacy in addressing these developments.

What’s The Path Forward?

This discussion underscores the need for new regulations governing how AI uses online data.

The path forward is complex, with significant implications for both publishers and search engines.

Suggestions range from revenue sharing or licensing arrangements for the use of publisher content in AI training to an opt-in framework that gives website owners greater control over how their content is used.

Recent legal rulings in France indicate that courts may intervene in the absence of explicit guidelines and constructive negotiations.

The internet has traditionally thrived on a symbiotic relationship between search engines and content creators. Any disruption to this equilibrium, without the introduction of new safeguards, risks undermining the information exchange that defines the internet’s value.

Original news from SearchEngineJournal