Something interesting I came across, via IEET.
The current mainstream narrative is that the easy and cheap availability of data makes greater transparency possible, and this in turn — thanks to the mediation of the well-designed infographic — increases the public’s awareness and control of governments, corporations, and other organizations it would do well to be aware and in control of.
It’s more than a comforting story; it’s an inspiring one. Computing, from its mathematical core to its layering of protocols, is defined by the processing of information in its rawest, or perhaps its most refined, state: pure information, usually metaphorically described as a pure sequence of bits. The idea that information alone, with the help of the transformational properties of good design, can make positive change possible has been the dream of everybody deeply involved with computers since long before a young Julian Assange read (we can assume) Neuromancer or one of its many inheritors.
But although both the rise to power of data and the proliferation of transparency-seeking infographics, cable dumps, and APIs are real phenomena, they are at odds at a fundamental level. Data is seldom useful, or seldom at its most useful, when it’s transparent — in the sense of accessible to and understandable through sight. None of the world-changing successes of information technology has ever been transparent. Google’s PageRank is a mathematical construct, not something that can be made obvious with the right infographic. High-throughput molecular biology (whether DNA sequencing, microarray experiments, or statistical meta-analysis) depends on sophisticated inference algorithms that are as opaque to simple depiction as any encrypted message. Cryptographic analysis, in fact, the original and most literal killer app of high-performance computers, is in itself a mathematical discipline altogether different from the highly visual ways in which it’s depicted in movies.
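PageRank is a good example of the point: the algorithm is short enough to sketch in a few lines, yet its output is a fixed point of an iterated matrix operation, not anything a picture makes self-evident. Here is a minimal power-iteration sketch on a dense adjacency matrix (the production algorithm handles dangling nodes, sparsity, and personalization, none of which are modeled here):

```python
import numpy as np

def pagerank(adj, damping=0.85, tol=1e-9, max_iter=100):
    """Power-iteration PageRank on a small dense link graph.

    adj[i][j] = 1 if page i links to page j. A simplified sketch,
    not Google's implementation.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_degree = adj.sum(axis=1, keepdims=True)
    out_degree[out_degree == 0] = 1.0      # avoid division by zero
    transition = adj / out_degree          # row-stochastic link matrix
    rank = np.full(n, 1.0 / n)             # start from the uniform distribution
    for _ in range(max_iter):
        # each page keeps (1 - damping)/n base rank plus what its in-links pass on
        new_rank = (1 - damping) / n + damping * (rank @ transition)
        if np.abs(new_rank - rank).sum() < tol:
            break
        rank = new_rank
    return rank
```

The result is a probability distribution over pages; nothing in the iteration corresponds to a visual feature of the web it ranks.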
In short, as a rule, data is not transparent. It can be in some cases, but those are usually the very cases in which it’s simply used to reinforce what we already knew or thought we knew. To extract value from data, it must be subjected to cryptic, or at least highly abstract, operations, at the end of which we are left not with a clear and evident sign of Truth — something already sought by Renaissance philosophers enthralled by a newly visual culture — but with mathematical patterns whose validity, interpretation, and uses are themselves far from obvious.
The passage from data to infographic, or to its much debased spiritual predecessor, the PowerPoint slide, is not an exercise in translation. It’s rhetoric; it takes the result of the hopefully careful analysis of hopefully relevant information, and then presents it in whatever form is most likely to be believed by the target audience. And the point of all rhetoric is that it need not begin with truth, nor even with data-based assertions. In fact, it’s nimbler and more effective when it’s not fettered in this way. Medieval rhetoricians and their often uneasy masters were already aware of this through a centuries-long tradition. Yet when we move from the statistical analysis of large data sets to the rhetorical synthesis of effective graphs (or, for that matter, blog posts, Twitter messages, or any other form of human-targeted communication), we often believe, or profess to believe, that we are introducing a new element into political discourse: that somehow the data, the subtlety and richness of large data sets and immense processing power, is still with us, informing decisions. It’s not. We might have based our infographic on the best available data, but that has no bearing on how credible it looks or how much of an impact it has.
What makes data-based decision processes so powerful when they work is not that they provide raw material for beautiful graphs, but that they themselves are granted a measure of power. When an algorithm decides what advertisement to show on your page, or whether your experiment has been a success, you have effectively given it power, by definition a political operation in itself. Nothing else could have been disruptive, because everything else would have still passed through the same well-worn bottlenecks and limitations of the human mind. This political compromise, the surrender of control over some decisions in exchange for a more powerful overall organization, is one that some companies, groups, and disciplines have managed to make, and others haven’t. It’s relatively easy to determine to which group something belongs: has it become radically more effective, positively and undeniably smarter, over the last thirty years?
The larger political bodies clearly haven’t. Neither voters nor their representatives have. And why should they? Politicians in particular, a group that specializes in the holding of power — it would have been exceedingly strange for them to surrender even a smidgen of discretion to the inhuman, unapproachable coldness of statistical equations. Even voters, who as individuals hold relatively little power, are wary of, if not vocally opposed to, anything that would diminish their absolute, discretionary control over the little power they have.
This is perhaps a good thing. At the very least, it’s a, if not the, foundational core of representative democracies: the uncontested tyranny of humans, a feature that, it must be said, proved a clear advantage over the much more hypothetical, and usually less benevolent, tyranny of divine beings and their (self-)appointed delegates. But the alternative is no longer so binary. We have created symbolic systems, and embodied them in data-processing devices, capable of outdoing ourselves in at least some fields of decision making. The benefits of giving them a greater share of power are, well, statistically demonstrable.
But of course that’s not enough.
Personalised radio service Last.fm has launched a new data visualisation called Listening Clock, which tracks how much music you listen to at different times of the day.
When you play a track through any service that Last.fm has its database hooked into (like Spotify, iTunes, Windows Media Player and Last.fm itself), it logs it. The company calls this process “scrobbling”, and die-hard Last.fm fans will know that when it comes to music software, “if it doesn’t scrobble, it doesn’t count”.
While you might be uncomfortable that someone out there knows that you have a habit of listening to weepy Alanis Morissette tracks at 3am, the aggregation of those statistics shows anyone with a Last.fm account what their most popular bands, albums and songs are. But if you’re a paid subscriber, then you can delve deeper than that.
Using circular statistics formulas, a trio of researchers named Perfecto Herrera, Zuriñe Resa and Mohamed Sordo wrote a paper titled “Rocking around the clock eight days a week: an exploration of temporal patterns of music listening”, which will be discussed at a music recommendation and discovery conference called WOMRAD2010.
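The “circular statistics” the paper relies on exist because clock time wraps around: the naive average of listening times at 23:00 and 01:00 is noon, when the true centre is midnight. A minimal sketch of the standard circular mean, mapping hours onto angles (a hypothetical helper, not the authors’ code):

```python
import math

def circular_mean_hour(hours):
    """Mean time of day, treating hours as angles on a 24-hour circle."""
    angles = [h * 2 * math.pi / 24 for h in hours]
    sin_sum = sum(math.sin(a) for a in angles)
    cos_sum = sum(math.cos(a) for a in angles)
    mean_angle = math.atan2(sin_sum, cos_sum)  # quadrant-aware arctangent
    return (mean_angle * 24 / (2 * math.pi)) % 24
```

For the example above, `circular_mean_hour([23, 1])` lands at midnight rather than noon.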
I was going to do my own, but I’m not a subscriber.
Came across this PDF which is proving to be an interesting data source for an infographic.