11 December 2009

Cheering for Open Hearings in San Jose

The Merc issued a smart editorial today demanding that arbitration hearings affecting taxpayer money be opened to the public.

I cheered them on with the statement below.  Enjoy.

Public access is exactly appropriate for any arbitration hearing that affects the public's money. We should go further, however, and demand that all negotiations that affect public funds, such as those highlighted by Councilman Oliverio, be recorded and made public after a suitable embargo period. This embargo would allow negotiators enough latitude to propose compromises, while remaining eventually accountable to their constituents–whether union members, shareholders, or voters. Even India, with some of the worst corruption in the world, has implemented its Right to Information Act, putting similar information into the public domain after 30 days. Democracy only works when our right to know is respected. It is ridiculous that our access to information is worse than India's.

Kohl S. Gill
Sunnyvale

10 December 2009

Scientists Playing with FOIA, Bound to Get Burned: Stop Worrying, Embrace Transparency

Today's scientists are effectively public servants, and should start behaving accordingly.  The public has a right to know what scientists do and how they do it.  To prevent scandals like Climategate, scientific correspondence, critiques, and even data manipulation must be done in public.  The University of East Anglia and associated climate scientists didn't understand that. I'm not sure they do, even now that their emails were hacked and used to sow doubt about climate science.  Jon Stewart sums up the problem better than I can:

[Video: The Daily Show With Jon Stewart, "Scientists Hide Global Warming Data"]

The hackers who extracted emails from UEA committed a crime.  In so doing, they exposed that several scientists called people dismissive names, manipulated access to blogs, adjusted data in non-obvious ways, and derided their obligations under laws such as the U.S. and U.K. Freedom of Information Acts (FOIAs).

What bothered me most was the disregard for the law.  The most damning example from the emails I've found includes this passage, from Phil Jones to Michael Mann (emphasis mine):

I presume congratulations are in order - so congrats etc !
Just sent loads of station data to Scott. Make sure he documents everything better this time ! And don't leave stuff lying around on ftp sites - you never know who is trawling them. The two MMs have been after the CRU station data for years. If they ever hear there is a Freedom of Information Act now in the UK, I think I'll delete the file rather than send to anyone. Does your similar act in the US force you to respond to enquiries within 20 days? - our does ! The UK works on precedents, so the first request will test it.
We also have a data protection act, which I will hide behind. Tom Wigley has sent me a worried email when he heard about it - thought people could ask him for his model code. He has retired officially from UEA so he can hide behind that. IPR should be relevant here, but I can see me getting into an argument with someone at UEA who'll say we must adhere to it !

The passage goes on to other topics, but you get the picture.

The reaction to this crime should not be bigger and thicker firewalls, or at least not solely that.  Scientists should realize by now that privacy is not absolute, and in most cases, it's not even a good idea.  They should embrace transparency by moving correspondence to more open formats like blogs, wikis, and Google Wave.  This move is critical to maintaining long-term respect and public support for science.  Obviously these platforms need to be adjusted to accommodate the complex manipulations of large data sets; the scientific community is in the best position to demand and contribute to the development of appropriate applications.  They will find great allies in the library science and open government communities, folks tackling the same essential problem.  One great example is the Alliance for Taxpayer Access.  In any case, scientists need to get out in front regarding transparency, or risk letting deniers define the route themselves.

Some scientists may argue at this point, claiming that such presumptions of transparency will stifle free and open discussion and debate, that scientists working in the public sphere will censor themselves, and that the best ideas will not come forward.  They might worry that science would become even more politicized than it already is.  These might be fair points; after all, much of scientific inquiry is based on proposing and discarding ideas, ideas that seem crazy at first, but may just be right.

The answer is to embrace the embargo.  Embargoes are already used throughout the scientific enterprise.  Usually, they're used by scientific journals to delay access to non-paying readers, but they can be used in other ways, as well.  Scientist A collects data and wants to analyze it and publish.  While working on it, she shares her data with Scientist B so as not to delay further analysis.  Oftentimes, as a courtesy or by agreement, Scientist B holds his own results under an embargo until Scientist A has had a chance to publish her results.  This avoids confusion over which person should get credit for the initial data and analysis.

Wary scientists should know that everything they do, every email they write, every correction of data, every keystroke, could eventually wind up in the public domain.  The platforms that will serve those scientists best will incorporate a time-bound embargo with definite, obvious, rolling expiration dates.

As Judith Curry of Georgia Tech puts it (emphasis mine):

[G]iven the growing policy relevance of climate data, increasingly higher standards must be applied to the transparency and availability of climate data and metadata. These standards should be clarified, applied and enforced by the relevant national funding agencies and professional societies that publish scientific journals... The need for public credibility and transparency has dramatically increased in recent years as the policy relevance of climate research has increased. The climate research enterprise has not yet adapted to this need, and our institutions need to strategize to respond to this need.

This actually isn't a loss of privacy–which doesn't really exist, anyway–but rather a move to make science even more legitimate and accessible to the public.  We should all recognize the great value the world has derived from access to earlier scientific correspondence.  Occasionally, the public needs to be reminded that scientific inquiry is a human enterprise.  Transparent scientists are ones they can believe in.

[P.S.: Judith Curry has done a fantastic job of corresponding with (initially hostile) commenters at the Climate Audit blog.  Some of her top comments are here, here, here, here, here, here, here, here, here, here, here, here, here and here.]

[Edit: Added emphasis.]

08 December 2009

Yahoo Thinks You're Gay. Now, Find Out Why.

Ever wonder why you get the particular advertisements you see when using tools like Yahoo?  Most of the time, the ads relate to the content you're accessing at the time, like ads for pet services when you spend all day looking at cute pictures of kittens instead of applying for jobs like you promised you would.

But even if you're not logged in at sites like Yahoo!, Google, or Facebook, those companies tend to collect quite a bit of information on you, creating their own consumer profile of you, which they use to match you up with targeted ads.  Wouldn't it be great to take a look at that profile?  Wouldn't you like the option to edit such a profile, or turn it off altogether?

Thanks to increasing pressure from transparency advocates and the U.S. Federal Trade Commission, Yahoo! is taking a baby step in this direction.  Yahoo! is launching a test version of its Ad Interest Manager (AIM), which shows you a summary of the info that Yahoo! has on you.  I haven't logged in at Yahoo! in a while, so without logging in, I took this screenshot of the AIM site.  Now, it definitely gets some things right, like my age range, gender, OS and browser.  It still thinks I'm in DC, which is odd because I'm on the other coast, now.  When I log in, I see exactly the same profile, except that "Entertainment" has been added as an interest.

Note that Yahoo! appears to know who I am, even when I'm logged out!  Yahoo's policy is to de-identify data after 6 months at the most, but they never delete it.

To be sure, this kind of access is only the beginning of what consumers should demand, and the AIM tool has received mixed reviews, ranging from "confusing, but better than Google" to "half-hearted."  But it is a start.

Are there alternative search engines that will show you what data they collect?

[Google's closest analog - Dashboard]

[Edit: Grammar, and the last question.]