The Future

We still don’t know how Google News works

There are no specific criteria and no full list of publications.

Alphabet executive chairman Eric Schmidt said during an interview this week that Google is trying to “engineer the systems” so that Russian state-owned media outlets RT and Sputnik stop dominating Google News results and making money via Google’s ad network Adsense. “We don’t want to ban the sites,” Schmidt said. “That’s not how we operate.”

How does Google operate when it comes to news sources? It’s true that Google rarely bans sites outright from its search engine at Google.com, which crawls the entire open web — sites only get kicked out if they are illegal or attempting to game the algorithm. But Google has always maintained more discretion over Google News, which is restricted to sites that “primarily offer timely reporting or analysis of recent events,” according to the company. As of this writing, Google was still surfacing Sputnik and RT in Google News.

Unlike Google’s main search engine, which will pick up any site on the web, sources have to be approved before they are included in Google News. The criteria for inclusion are broad and vague. “In general, you should write original content that’s clear and free of grammatical errors,” Google says in its guidelines for publishers. Other factors include original reporting, clear attribution with bylines and datelines, transparent author bios, honesty — “Sites included in Google News must not misrepresent, misstate, or conceal information about their owner or primary purpose” — and having an amount of content that exceeds the amount of advertising.

It’s not clear how stringently those guidelines are enforced. AllBusiness, which shows up in Google News results, doesn’t appear to have datelines on any of its articles. The Economist, which famously eschews bylines, is also included. In general, Google seems reluctant to remove any publisher that was once approved — which could be why Schmidt seemed more willing this week to adjust the platform’s entire algorithm than to kick two outlets out.

Google News launched in 2002. Creator Krishna Bharat, a longtime Google research scientist who headed the company’s news development for years before he left the search giant in 2015, found himself bouncing around the web after Sept. 11, getting coverage from different media sources. “It seemed fundamentally inefficient. That’s not the way the web was supposed to work,” he said at the time. “The web was supposed to have a link structure that helped you find content.” He conceived of a news aggregator that categorized related stories into “clusters” of coverage, going back 30 days, from thousands of approved sources. The first Internet Archive snapshot of Google News shows eclectic headlines from sources ranging from MTV to the Singapore Straits Times. In 2003, Google put the number of news sources at around 4,500. In 2011 the company released a cache of news stories about Osama bin Laden, which researchers found to contain articles sourced from some 4,500 separate publishers ranging from USA Today to small local papers like the Bennington Banner. A Google spokesperson declined to provide an up-to-date list of Google News sources for this story.

That lack of transparency can create the type of credibility problems that Google now finds itself pushing back against. Google News still indexes the U.K. tabloid The Daily Mirror, which is notorious for glaring errors like depicting a traditional Russian pancake festival as a training camp for violent soccer hooligans. It also indexes bottom-tier content mills like Business2Community, which runs endless listicles about thinly sourced business topics, and Elite Daily, a lifestyle blog that has been accused of copyright infringement, of letting authors post pseudonymously with photos of models as profile pictures, and even of posting under the name of a Gawker writer.

Google News’ prominence, though, is undeniable. In 2012, Google started saying that there were 50,000 sources in Google News. The Guardian noted in 2013 that while Google’s “crippled communication machine” had struggled to justify Google News’ benefits to the news media, its 72 editions in 30 languages were drawing six billion visitors per month in an era when The New York Times was attracting just 40 million visitors monthly.

That popularity is probably due to the site’s significant technical achievements. Patent filings describe how the site evaluates the newsworthiness and originality of each source in deciding its rank, making it a news engine that applies Google’s aptitude for grading web content to the entirety of web publishing.

Google is now increasingly grappling with criticism around which sources pop up in its more curated products like Google News and the answer boxes that appear at the top of search results, which are called featured snippets. In another prominent example, Google’s “Top Stories” section, which serves a purpose much like Google News, showed conspiracy theories sourced to 4chan after the October mass shooting on the Las Vegas strip.

Kevin Carty, a researcher at the Open Markets Institute, said that Google’s enormous stature gives it a special responsibility to offer some form of transparency about how its algorithms work — especially since Google News depends on news outlets in order to exist as a useful tool.

“Google News and Google Search are interesting because they’re only possible and profitable because these other services and publications are providing things of such great value,” he said. “Google News would be nothing without CNN and the Washington Post and NPR.”

Carty, who favors the solution of regulating Google’s services like a public utility, worries that leaning on the search giant to stamp out misinformation on its platform will foster a narrative in which a corporate Big Brother can make closed-door algorithmic decisions that affect users around the world with no public oversight.

“This Google News thing illustrates this problem where a company like Google or FB has enough power to control a whole sector of trade,” he said. “If you have a problem, like election interference or fake news, these companies are being asked to behave as government.”

Schmidt seems to believe that Google must rework the algorithm so that RT and Sputnik are sort of naturally de-ranked rather than intervene manually. That is consistent with Google’s thinking; from the beginning, Google News was touted as being run entirely by “computer algorithm.” But this position is increasingly unconvincing as Google boots channels off YouTube and demonetizes publishers en masse. Google let RT and Sputnik into Google News. Why pretend it can’t kick them out?

Jon Christian is a contributing writer at The Outline. He last wrote about how spam rose from the dead.