We live in an era of abundant information, and real-time information in particular competes for our attention. Our peers know that all this information is immediately available to us, so we are expected to stay informed. Being aware of new information and reacting to it quickly has become the norm.
There are several different kinds of digital information that we are flooded with on an hourly basis.
First, there are social networks like Facebook or Google+ that keep you up to date with your friends’ interactions, activities and thoughts. You might feel like you’re missing out on some important information, so you find yourself checking your wall every now and then. However, a lot of that time is spent reading through spam postings. Then there is the real-time service Twitter, which feeds you the newest bits of information from the people you follow. Again, you have to spend an awful lot of time reading through spam tweets.
Second, there are the classical news portals like the New York Times that provide you with serious news about politics, the economy, sports, arts, culture and all that. Since you’re supposed to stay informed, you regularly check the news sites throughout the day to see if something important has happened. Most of the time, however, you spend time reading some lurid or funny articles that you wouldn’t define as being of any importance to you. In fact, you would be happy if you hadn’t spent time reading them, but since you stumbled upon them, you felt the immediate urge to consume them.
Third, there are special interest blogs and sites that feed you with information about particular subjects. Usually, there are only a handful of articles that are really worth a read from your point of view, but you still have to check all posts to see what deserves your attention. And again, you get distracted by posts of no particular importance to you.
There are a couple of problems related to the way we digest digital information today. First and foremost, there is the problem of scanning: most of the incoming information bits are not relevant to us. But we have to commit our attention to every single item – at least for a short amount of time – to decide whether it is of any relevance to us. In other words, we are filtering the content ourselves.
Second, there is the problem of content aggregation: information bits that belong together but come from different sources are not grouped together. This entails the problems of invalidation, duplication and subsumption. We want to read information bits that belong to one context in one go. We do not want to read information that is no longer valid or for which an update is available. We do not want to read the same content twice. We do not want to read an article that is contained in another one (because we would then be reading the same content twice again).
Third, there is the problem of temporal relevance. On the one hand, we need to decide whether an information piece needs our immediate attention or whether we can consume it at a later stage. On the other hand, when we are finally about to consume an information piece, we need to decide whether it is still relevant. If it’s not, there is no point in reading it, even if we decided to save it for later (but did not intend to wait that long) in the first place.
There are a couple of software solutions that try to solve these problems. Google News, for instance, aggregates news and presents it in a uniform way. At least for news, that solves the problem of duplication and aggregation. However, you never know whether the aggregation shows you real duplicates, and you don’t really know whether one article is subsumed by another. There is also no way of invalidating news – there is only an implicit invalidation given by the age of an article and the number of readers it attracts over time, but that isn’t necessarily the right way of invalidating news.
Other software systems like Flipboard or Prismatic ask you for your interests and then try to compile a personalized dashboard with information pieces published by news sites, blogs and your social networks that you might enjoy. However, the algorithms are not transparent to their users. Sure, there are a lot of statistics going on, association rule mining and all that, but none of it is transparent to the layman who just wants to consume information relevant to him. As a result, these systems are not successful in solving the problem of filtering accurately. They might filter out information that would have been relevant to you. Even more problematic, they filter out types of information that are completely new to you, so you never get the chance to say whether you’re interested in them or not. In other words, such systems do not support the discovery of new information in a satisfactory way.
The problem of temporal relevance is not handled in any useful way by any of these applications.
To my mind, a combination of both software and people could help to ease the problem of information overload. The software would be more of a framework like Wikipedia that allows people to commit their capacity to solve the problem of collective information overload.
First, the problem of filtering could be solved in several ways by involving people. Instead of letting software filter information streams for you, you could subscribe to filtered and aggregated streams that are published by people you trust. There could be a filtered and aggregated stream by Guy Kawasaki that features the important articles from the important tech blogs about the relevant new valley hot shots. There could be a filtered and aggregated stream by Kofi Annan on important articles on international policy. And so on. I would trust experts much more to filter information for me than any algorithm.
Second, the problem of content aggregation and the issues it entails could also be solved by people. They could mark posts as invalidated, as duplicates, as subsumed by another post and so on. They could even digest the most important information that is contained in a lengthy article. In many cases, I would rather subscribe to the digested version of economic news that just outlines the article’s facts about a company.
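To make the idea of human-applied marks a bit more concrete, here is a minimal sketch of how such annotations might be stored and used to clean up a stream. Everything in it is an assumption made for illustration: the names `Mark`, `Item`, `mark` and `readable` are hypothetical and do not refer to any existing system.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Mark(Enum):
    """Hypothetical labels a human curator could attach to an item."""
    INVALIDATED = "invalidated"  # newer information has replaced it
    DUPLICATE = "duplicate"      # same content as another item
    SUBSUMED = "subsumed"        # fully contained in another item


@dataclass
class Item:
    item_id: str
    title: str
    digest: Optional[str] = None               # human-written summary, if any
    marks: dict = field(default_factory=dict)  # Mark -> id of the related item


def mark(item, kind, related_id=None):
    """Record a curator's mark, optionally pointing at the related item."""
    item.marks[kind] = related_id


def readable(items):
    """Keep only the items that no curator has marked."""
    return [item for item in items if not item.marks]
```

A reader subscribed to a curated stream would then only see what `readable` returns, and could be shown the `digest` instead of the full article whenever a curator has written one.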
Third, the problem of temporal relevance can also easily be decided by people. If there is an article about movie history, its relevance obviously hardly depends on time at all. If there is a feature on the upcoming soccer game, it is only relevant as long as the soccer game hasn’t happened.
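The two examples above could be captured with a single optional expiry date set by a person. The following is only a sketch under that assumption; `TimedItem`, `relevant_until` and `still_relevant` are hypothetical names, and the dates are made up.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TimedItem:
    title: str
    # Set by a human curator; None means the item does not expire.
    relevant_until: Optional[datetime] = None


def still_relevant(item, now):
    """Return True if the item is still worth reading at time `now`."""
    return item.relevant_until is None or now < item.relevant_until


# Made-up example data: a timeless article and a match preview.
kickoff = datetime(2024, 6, 1, 18, 0)
movie_history = TimedItem("A short history of cinema")
match_preview = TimedItem("Preview of the upcoming game", relevant_until=kickoff)
```

An item saved for later would simply be checked with `still_relevant` at the moment you are about to read it, and dropped if it has expired.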
All this can be decided by people. A software system would give people the capabilities to act on the information in such ways and would promote a peer-reviewed process like Wikipedia’s to prevent biases and vandalism. What would really happen is that a small fraction of people devote their time to filtering, tagging, digesting and selecting information so that a large fraction of people can save time. Today, all these things are done in parallel by everyone, and that seems like a waste of time from society’s point of view.