Humans broke the internet; understanding them better might help fix it

By Timo Hannay

Here's a multiple-choice question: Is the internet (a) the most open, egalitarian and empowering means of communication ever devised, or (b) a dystopian nightmare populated by hucksters, trolls and miscellaneous abusers of human rights? The answer is, of course, (c) all of the above and much else besides. This stark contrast between the internet's light and dark sides has become a defining characteristic of the digital age, but is not an inevitable consequence of the mostly innocuous technologies on which it's built. Rather, it is the product of their bewilderingly diverse and eccentric user base – otherwise known as humanity.

Illustration: woman pushing over smartphones

We are complex, unpredictable beings, which perhaps explains why it has turned out to be so amazingly hard to realise the internet's potential benefits without also succumbing to its pitfalls. In human hands, the internet's principal strengths – its sheer versatility and pervasiveness – have resulted in a huge number of unintended and unanticipated consequences. Many of these are welcome: the endless feast of words, images and sound catering for almost every conceivable taste; the sense of mass connectedness that comes from exchanging thoughts during a public spectacle; the explosion of creativity that has breathed life into old forms (such as the essay or the radio broadcast) and given flight to new ones (such as at the intersections between novels and games, or between research papers and databases).

Any such cornucopia of wonders might be expected to create a few adverse consequences, just as the rich world is battling with obesity now that food is so plentiful. Too much information and entertainment is surely better than too little, just as obesity is far less of a problem than its opposite. Yet the downsides of the internet have been much more surprising and profound than that analogy suggests. This great leveller of opportunity has resulted in some of the most overweening monopolies the world has ever seen. This medium of unfettered self-expression has delivered oppressive surveillance and invasions of privacy of a kind that were previously the sole preserve of police states. This enabler of a better informed, more engaged citizenry has instead come to be seen as subverting democracy itself.

How has it come to this? In part through over-optimism and complacency, not least among technologists like me. But mostly – and ironically – through a lack of information. This may seem like a perverse claim: surely we're drowning in the stuff. In a sense we are, but like the shipwrecked sailor dying of thirst at sea, we lack the right kind of knowledge to save ourselves. Specifically, we lack insights about how the internet is changing us and the societies in which we live.

The reasons for this are not hard to fathom. Unlike the information explosions in fields such as medicine or earth science, which have taken place largely in the public domain, the richest data sets on human activities are overwhelmingly owned and controlled by commercial companies (along with certain, usually secretive, government agencies). This means that they have been mostly unavailable to the very domain experts who are best able to analyse and interpret what's going on – and ultimately to help the rest of us understand this too. It's as if the Hubble Space Telescope were owned by Boeing, or the Large Hadron Collider by General Electric.

This is why we should welcome a new initiative described by Gary King and Nate Persily and implemented by the US Social Science Research Council, Facebook and a variety of other collaborating organisations. In short, it provides a means for academics to conduct independent research using Facebook data, while at the same time keeping the source information confidential and subjecting all activities to legal and ethical oversight.

Admittedly, this setup has a number of shortcomings. For a start, it limits access to established academics, with decisions made by senior members of that same community. Politicians confronted with an uncomfortable result are bound to dismiss the researchers behind it as a cloistered, privileged elite with little knowledge of the real world. And since experts have sunk in public esteem by almost as much as policymakers, such objections may well resonate.

Another, more justified criticism is that we still lack ways for the subjects of the data to provide informed consent for its use. This is made all the more complicated by the fact that divulging information about yourself often also reveals information about other people with whom you're associated. (This isn't unique to online data; it also applies to genetic information, for example.) Such a Gordian knot of complex competing interests won't be untangled anytime soon because it requires not just a suitable legal framework but also public acceptance, which can only be achieved with years of real-world experience and open debate.

Yet, when you compare it with the other options, this model seems a lot more attractive. Are we to ban anyone from ever analysing this kind of data? Surely not. Are we to limit such work to employees of the companies concerned, or perhaps release the data to all comers? We've tried both of those approaches and they didn't work out so well. The onus on critics of the proposed model, then, is not to point out its shortcomings, which certainly exist, but to suggest better alternatives.

Some might even go so far as to argue that we should keep academics out altogether, and that the correct response to big tech's overreach is legally imposed regulation. More effective privacy and competition laws are certainly required, but not as an alternative to deeper insights derived from the data. Indeed, legislation ought to be informed by such research. In any case, one look at Mark Zuckerberg's recent testimony before the US Congress should be enough to convince anyone that politicians don't have all the answers – most of them don't even appear to have particularly good questions.

It's also important to appreciate that this isn't mainly about data leaks or even individual privacy. Those are valid topics of investigation and debate, but in the grand scheme of things they are no more than ripples on the ocean surface. We also need to understand how societies function (or don't), and how technological and other developments are changing them for good or ill. These are the great tides whose forces, while less conspicuous, will be far more powerful in shaping our future.

Crucially, the principles of this initiative seem very broadly applicable. While its first implementation has taken the form of an alliance between social scientists and Facebook, this isn't just about social media or even big tech. For decades all sorts of organisations – from banks and retailers to telecommunications companies and government agencies – have been gathering information about our incomes, spending habits, opinions, social activities, movements and much else besides. It is often said that big technology firms' data-harvesting activities are unprecedented, but this is really only true in terms of their brazenness. In a sense they may have done us an unintended favour by lifting the lid on a long-running covert trend that significantly predates them and continues to extend well beyond their organisations.

So let this be the first of many such initiatives. Your move, Google – also Microsoft, Amazon, Vodafone, HSBC, Walmart and governments everywhere.

Timo Hannay is the founder of SchoolDash and a non-executive director of SAGE Publishing. The views expressed here are his own.