Knowledge is power

The Whitehouse.gov reset broke Wikipedia links en masse

Here’s what editors are doing about it.

At noon on January 20, when Donald Trump was sworn in as president of the United States, an IT administrator somewhere clicked a button that flipped the First Website, Whitehouse.gov, to a new version that reflected the new administration. Unfortunately, that broke a large number of the approximately 1,935 links to Whitehouse.gov pages on Wikipedia — and that’s just in English.

Wikipedia editors jumped into action. “Links to URLs within http://whitehouse.gov were broken en masse when the new administration changed the main web site,” user econterms wrote on a forum for editors. “I’d welcome advice and correction.”

“My advice is to relax and get used to Trump’s America, which I predict will not be friendly to wikipedia,” another user grumped.

“Yup, they don’t always love external sources of fact,” econterms counseled. “But wp is widely appreciated as such a source, and it can really contribute helpfully here. If you feel defeated, I empathize. But I'm not defeated yet, on day 3!”

The Wikipedians set about finding replacements for the broken links, including the archived Obama-era site at https://obamawhitehouse.archives.gov/ maintained by the National Archives and Records Administration, the copies held by the nonprofit Internet Archive at archive.org, and direct links to nonpartisan agencies.
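As a rough illustration of what such a bulk replacement could look like, here is a minimal Python sketch that points an old Whitehouse.gov link at the National Archives host. It assumes the archived site keeps the same path as the original page, which is an assumption rather than something the editors confirmed, and the function name `to_archive_url` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Archive host named in the article; the old hosts are the two common spellings.
ARCHIVE_HOST = "obamawhitehouse.archives.gov"
OLD_HOSTS = {"whitehouse.gov", "www.whitehouse.gov"}


def to_archive_url(url: str) -> str:
    """Return the archive-host version of a Whitehouse.gov URL.

    Assumption: the archive preserves the original path, query, and fragment.
    URLs on any other host are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.hostname not in OLD_HOSTS:
        return url
    return urlunsplit(
        ("https", ARCHIVE_HOST, parts.path, parts.query, parts.fragment)
    )
```

Any link rewritten this way would still need a human or a bot to confirm that the archived page actually exists before it replaces the original citation.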

Wikipedia has plenty of experience fixing links that break over time, also called link rot. The same thing happened in 2009 when Barack Obama’s administration came in, Wikipedia spokesperson Samantha Lien told The Outline. When that happened, editors used a combination of automated tools such as AutoWikiBrowser, which can help editors find and replace broken links on a large scale, and manual corrections to replace all broken links. Additionally, many of the links on English Wikipedia are automatically archived by the Internet Archive. When a link dies on Wikipedia, a bot (User:InternetArchiveBot) automatically replaces it with the archived version of the page.
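For readers curious how an archived copy can be looked up, the sketch below uses the Internet Archive's public Wayback Machine availability endpoint. It is not the code InternetArchiveBot runs, and the helper names `availability_query` and `closest_snapshot` are hypothetical. The first function only builds the query URL, and the second reads a JSON response that has already been fetched, so the example makes no network calls itself.

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url: str, timestamp: str) -> str:
    """Build a query asking for the snapshot of `url` closest to `timestamp`.

    `timestamp` uses the YYYYMMDD form, e.g. "20170119".
    """
    return AVAILABILITY_ENDPOINT + "?" + urlencode(
        {"url": url, "timestamp": timestamp}
    )


def closest_snapshot(response_text: str) -> Optional[str]:
    """Return the archived URL from an availability response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None
```

Asking for a timestamp just before the changeover, such as January 19, 2017, would favor a capture of the page as it stood under the previous administration.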

“Link rot is something editors across Wikipedia are actively preventing and addressing — ensuring that even when a page is removed, archived, or changed significantly on the internet, a version of it can always be referenced on Wikipedia,” she said. Wikipedia has also published best practices to prevent link rot.

The system works, but slowly and incompletely. Link rot is still considered one of Wikipedia’s biggest challenges. Some highly cited sources such as The New York Times will take care to redirect links in a redesign, but not every site has the savvy or resources to do so — and sites can be obliterated for other reasons, as when Radiohead’s entire web presence disappeared. In addition to eroding credibility, broken links are an opportunity for spammers and marketers to insert whatever they want to promote.

In theory, Wikipedia could automatically archive every page that is added as a source link. That would make Wikipedia considerably more massive than it already is, but it would prevent link rot. In 2013, it was proposed that Wikipedia should take over a struggling web archive service called WebCite and put a system like this in place; however, that never got off the ground.

Pete Forsyth, a long-time Wikipedia contributor and editor of the volunteer-run Wikipedia newspaper The Signpost, said what happened with Whitehouse.gov is pretty routine. “The removal is not, as far as I know, of much consequence to Wikipedia,” he said in an email. “On scientific matters, Wikipedia values academic, peer-reviewed source materials; we would not consider the contents of Whitehouse.gov (regardless of what administration is in power) to be among the best sources, and we might expect that Whitehouse.gov would rely on the same reference materials as we do.” The site would be used in politics articles and as a source for what the current administration prioritizes, but its loss as a primary source is not that critical.

Given the topic’s high profile and the availability of working replacement links, the Whitehouse.gov link replacement effort could be done in a matter of days. “It will not be a problem for long,” Forsyth said. But until then, Wikipedia readers clicking through on old Whitehouse.gov links will be seeing a lot of this:

Error page on Whitehouse.gov after the Trump administration took over.


Update: This story has been updated with information about the way the Internet Archive is used to automatically replace broken links on Wikipedia.