Support open data and defend Aaron Swartz

I fully support Aaron Swartz as he fights unjustified charges from the U.S. government, and hope that my readers will support him too. Aaron is a researcher who works with huge datasets and has worked on many open data projects. He is being charged with accessing JSTOR, a repository of academic journal articles, and downloading articles from it.

JSTOR itself didn’t want to press charges and says it hasn’t suffered loss or damage. But the U.S. government indicted Aaron because they feel like they “caught a hacker”.

Aaron Swartz

I’m incredulous that they would pursue this case against a well-known researcher and activist who allegedly was doing something quite benign: scraping data.

I worry that this case will have a chilling effect on open data projects. The government has gone to great lengths here to stop a respected activist’s work, siccing the Secret Service on him and wasting an incredible amount of resources to trump up this case. The FBI has already investigated Aaron at least once for downloading PACER data. It looks bad to me, like the government was basically waiting for any excuse to build some sort of charge against Aaron for his brilliant open data activism.

Here’s Aaron’s background in open data and analyzing large data sets:

In conjunction with Shireen Barday, he downloaded and analyzed 441,170 law review articles to determine the source of their funding; the results were published in the Stanford Law Review. From 2010 to 2011, he researched these topics as a Fellow at the Harvard Ethics Center Lab on Institutional Corruption.

He has also assisted many other researchers in collecting and analyzing large data sets. His landmark analysis of Wikipedia, Who Writes Wikipedia?, has been widely cited. He helped develop standards and tutorials for Linked Open Data while serving on the W3C’s RDF Core Working Group and helped popularize them as Metadata Advisor to the nonprofit Creative Commons and coauthor of the RSS 1.0 specification.
In 2008, he created a nonprofit site that makes it easier for people to find and access government data. He also served on the board of Change Congress, a good government nonprofit.
In 2007, he led the development of the nonprofit Open Library, an ambitious project to collect information about every book ever published.

I would also like to say that I think that libraries and academics should stop buying into the JSTOR model. JSTOR aggregates academic journal articles which it doesn’t even own, and sells limited access to those articles to large institutions for thousands of dollars. Libraries and universities should act to enable access to information, not to limit it.

ETA: Here is JSTOR’s official statement on the case.


Civic fictions at conferences

Because of the Amina and Paula Brooks controversies and my part in unraveling them, I spent the last few weeks talking with media and giving talks about online hoaxes, identity, sockpuppets, and astroturfing.

I did an impromptu lightning talk at Noisebridge’s 5 Minutes of Fame, making my slides right there on the spot. That was a lot of fun: because of the informality of that crowd I was very frank and could have a (bitter) sense of humor about the whole thing.

At O’Reilly’s Foo Camp, I gave the talk I had planned on How to Suppress Women’s Coding. But as the Amina story unfolded over the weekend of Foo Camp, I was talking with more and more people about what was going on and at some point actually had a bit of a nervous breakdown on Molly Holzschlag and Willow Brugh because of the constant stress and uncertainty about how to proceed and what I was choosing to do. I added a discussion session, “Lesbian Sockpuppet Detective Story”, to talk about online identities, and I think that it went fairly well. People had very good stories about how they detected and fought astroturfers and sockpuppets. Anyway, I could write a giant post for every conversation I had at Foo Camp! And might do that; I have pages of notes.


Three things really stood out for me as themes of Foo Camp: Big (open) Data and Visualization; our collective imaginary picture of Oof Camp (the “bad guys” doing the opposite of Foo Camp, working to do things we would disapprove of or find deeply unethical) alongside an examination of what we do believe is right and “our” geek culture; and women in tech talking with each other in public about sexist patterns and strategies to deal with them, which isn’t new, but which seemed to me to be scaled up and comfortable beyond what I normally see at mixed-gender tech conferences. On the women in tech front I think Foo Camp and O’Reilly might be progressing, a sense I’ve had building slowly over the last few years. It seems glacial to me but still positive. In short, I didn’t feel tokenized, I felt respected and valuable, I made tons of great connections with women and men, there were lots and lots of women there kicking ass, I didn’t know all of them, and as an extra bonus, nothing creepy happened at all, at least to me. Huzzah!

Media Lab

After Foo Camp, John Bracken gave me a last-minute invite to the Knight Foundation/MIT Future of Civic Media conference. This was an absolutely fantastic conference. I loved the MIT Media Lab’s spaces and all the projects I heard about. Ethan Zuckerman led a panel called Civic Fictions, with Dan Sinker, me, and Andy Carvin. The audience questions and discussion went off in some fairly deep and interesting directions. Here’s a video of the panel with a link to a bare-bones summary. I’ll try to transcribe the entire thing soon.

Civic Fiction: MIT-Knight Civic Media Conference from Knight Foundation on Vimeo.

Dan Sinker talked about writing the @MayorEmanuel Twitter story: 40,000 words of satire in 2000 tweets. I later read the entire MayorEmanuel saga, which was hilarious & compelling. His analysis of identity and online media and history at the end of his talk blew me away, which is part of why I want to transcribe the entire panel. Also, Dan absolutely rocks. We had a fun conversation about being unable to describe ourselves neatly or give any sort of elevator pitch to explain why we were interesting to the suits and… well, you know… the actually legitimate people. Dan has a long history of zine making as the founder of Punk Planet and has done countless fabulous things.

Ethan introduced the panel and told his own story of heading up Global Voices & having to determine whether people were “real” or not, including his doubts from years ago about the blogger Sleepless in Sudan and his relief at finally meeting her. I remember him bringing up Sleepless as an example of deep uncertainty in the discussion at my talk on online fictional personas at SXSWi in 2006.

I told some of the Amina/Paula story, my part in it, and how I worked with other investigators, bloggers, and journalists to figure out and expose what was going on. In the discussion afterwards I was most happy with my answer to (I think) Waldo Jaquith’s question about history and truth. I mentioned Songs of Bilitis partly because it’s the first thing that popped into my head. But it’s a good example of a historical literary hoax that was then actually used by lesbians as a name for the first lesbian rights organization in the U.S., the Daughters of Bilitis.

Andy Carvin then talked about his involvement with the Arab Spring and the Amina hoax in that context. When Amina was “kidnapped” by security police and her identity began to be questioned, all his Syrian contacts went silent for over a week. Andy’s thoughts were great to hear and I really enjoyed talking with him and respect his particular skills in Firehose Immersion.

At these sorts of talks we keep discussing ethics. Many people appear to *want* to do such projects, to tell compelling stories for a political purpose to mobilize particular audiences to have empathy & take action for marginalized people. Some people want to try it, or perhaps have already done it and want to hear that they didn’t do something wrong — or maybe just want to believe that something good came of the attention to bloggers in Syria that the Amina hoax brought. There is also a strong thread of “but… what about creativity and post modern identity?” running through the attempt to save something good out of all this.

It was a great conference and there was only one mildly ew-tastic drunk guy who I had to work to escape from (Larry, you gotta hold your liquor better, dude, and not talk about your junk like that to strange feminist ladies well-known for blogging everything.)

I came out of all these talks thinking that many more hoaxes and large-scale astroturfing situations are coming. Elections and political movements are going to be even more confusing. I think there is a field emerging for analysis of online identity, personas, authenticity, and so on; in fact, perhaps an academic discipline which might best be part of journalism/new media schools. “Internet Sleuth” will become a profession that needs much better tools than we have now. As better “persona management” tools are built, we need better and easier-to-use tools to detect those personas: open source tools in the hands of everyone, not just government and huge corporations.

I did think of a few great and inevitable ways that civic fictions could exist without being immediately offensive and appropriative. Here are two.

We could have fictional universe reporters intertwined with our own. Basically, crossover fanfic reporting in first person, crossing some media nexus of fiction, preferably a politically complicated one, with breaking news. Harry Potter, for example. If you had freaking Harry Potter, on location, or better yet several Potterverse characters reporting on breaking news, you would attract entirely new audiences to the news. It would provide ways in for young people to talk about politics and to think about politics in a context of stories they’ve thought a lot about. I mention Potterverse because it’s popular, but also because its story *is* politically complex as a story of child soldiers and armed resistance to dictatorship. Well, anyway, that could be horrible and disrespectful if done clumsily, but I think it *will happen* probably with TV show franchises.

We could have civic fictions that consciously and collaboratively explore a real situation. I thought of one for the town I live in, Redwood City. Redwood City has very strong ties with a specific town in southern Mexico, Aguililla. I’m not sure of the real numbers but I’ve read that 40% of the population of Aguililla has at some point lived in Redwood City in a migration, remittance, and return pattern that has lasted for at least 40 years. People could have *many* reasons for not wanting to tell their personal or family stories of migration and return. How interesting it would be to write a collaborative soap opera or epic stretching over time, twittering and blogging it in a network of friends and family (all fictionalized), perhaps bilingually in Spanish and English (or trilingually, since not everyone in Michoacan has Spanish as their first language), to show some of the issues and drama in people’s lives, and perhaps to show the relations, friendships, and tensions between Aguililla emigrants and the other residents of Redwood City and Menlo Park. Good idea, isn’t it? Maybe someone will take it and run, or do a similar project in their own home town. Keeping in mind firmly the principle of “Nothing about us without us”.

After I got back from Boston I said I’d do an Ignite talk for IgniteSF but then flaked at the last minute out of exhaustion.

Many people have asked me if I’m still investigating hoaxes and if I found more Fake Internet Lesbians. I did find a few including Becky Chandler the sassy libertarian post-modern feminist in short-shorts who wrote a book on how it’s great to spank your children, but other people have already debunked her and exposed her as a creepy porny p*d*phile spanking-fetishist dude, and looking at that whole case made me throw up my hands in complete disgust. Plus, I really had to get back to my real work projects.

I have to mention my employer’s awesomeness in all of this: As soon as the Amina thing started eating my life, I let my boss and co-workers know about it and BlogHer basically gave me permission to do all the media stuff, radio interviews, talk with reporters, go to the MIT conference, and continue the bloggy sleuthing I was doing and delay my Drupal development projects for a couple of weeks. They were very supportive! But now I am back in the saddle and mucking around with code again, which is VERY SOOTHING.

Coming up in August in San Diego at the BlogHer ’11 conference, which is basically 3000+ women who blog and are heavy social media users hanging out with their laptops, I’m going to be speaking on a panel called “Viral Explosion”, giving a Geek Bar workshop talk with Skye Kilaen on what to do if your blog is hacked or if you lose your data — basically on security and disaster recovery — and then one more talk on Internet Sleuthing on You Know What and You Know Who and the tools I used to track all of it (like Maltego, which I recommend you try), a private wiki, and good old index cards. I’ll post again about BlogHer ’11 and these talks and all the kick ass geekiness that happens at BlogHer conferences!
