I fully support Aaron Swartz as he fights unjustified charges from the U.S. government, and hope that my readers will support him too. Aaron is a researcher who works with huge datasets and has worked on many open data projects. He is being charged for accessing JSTOR, a repository of academic journal articles, and downloading articles from it.
JSTOR itself didn’t want to press charges and says it hasn’t suffered loss or damage. But the U.S. government indicted Aaron because they feel like they “caught a hacker”.
I’m incredulous that they would pursue this case against a well-known researcher and activist who was allegedly doing something quite benign: scraping data.
I worry that this case will have a chilling effect on open data projects. The government has gone to great lengths here to stop a respected activist’s work, siccing the Secret Service on him and wasting an incredible amount of resources to trump up this case. The FBI has already investigated Aaron at least once for downloading PACER data. It looks bad to me, like the government was basically waiting for any excuse to build some sort of charge against Aaron for his brilliant open data activism.
Here’s Aaron’s background in open data and analyzing large data sets:
In conjunction with Shireen Barday, he downloaded and analyzed 441,170 law review articles to determine the source of their funding; the results were published in the Stanford Law Review. From 2010 to 2011, he researched these topics as a Fellow at the Harvard Ethics Center Lab on Institutional Corruption.
He has also assisted many other researchers in collecting and analyzing large data sets with theinfo.org. His landmark analysis of Wikipedia, Who Writes Wikipedia?, has been widely cited. He helped develop standards and tutorials for Linked Open Data while serving on the W3C’s RDF Core Working Group and helped popularize them as Metadata Advisor to the nonprofit Creative Commons and coauthor of the RSS 1.0 specification.
In 2008, he created the nonprofit site watchdog.net, making it easier for people to find and access government data. He also served on the board of Change Congress, a good-government nonprofit.
In 2007, he led the development of the nonprofit Open Library, an ambitious project to collect information about every book ever published.
I would also like to say that I think that libraries and academics should stop buying into the JSTOR model. JSTOR aggregates academic journal articles which it doesn’t even own, and sells limited access to those articles to large institutions for thousands of dollars. Libraries and universities should act to enable access to information, not to limit it.
ETA: Here is JSTOR’s official statement on the case.