Adrian has lots of important stuff to say in this interview.
For example:
Much of the information that journalists collect, day to day, is structured. Information such as crime reports, obituaries and event listings always follow a certain pattern, which can be richly exploited by databases.
The majority of newspapers take the time to *collect* this information -- which is the hard part -- but they dramatically reduce its value by NOT storing it in structured formats. Instead, they distill it into big blobs of text for publication in their print editions, and then they shovel those big blobs of text onto their websites. At this point, all structure is lost: Crime reports can't be sorted or searched intelligently, and event listings can't be viewed in any sort of user-friendly way.
The very act of distilling information into a news story -- which is essentially a big blob of text -- removes any sort of structure. Information is exponentially more valuable if it's structured.
He’s absolutely right. Newspapers take structured data and feed it to people in an unstructured format.
He could also have pointed out that, after distilling data into structureless blobs, newspapers often try to re-apply structure to articles by adding all sorts of descriptive metadata. That can be a messy process.
Adrian says papers can produce more valuable information by retaining the structure of the data they publish, i.e. by publishing databases.
I agree, but as we build these databases, we need to remember what those big blobs of text do well: they tell stories.
We need to publish all the data we have – get it out there so that readers can find the things that are important to them. (How many murders were there on MY block?)
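A question like that is easy to answer once each report is kept as a record with fields rather than as a paragraph. Here is a minimal sketch in Python, not anything Adrian describes building: the `CrimeReport` fields, the helper names, and every sample record below are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CrimeReport:
    """One crime report stored as fields (hypothetical schema)."""
    offense: str
    block: str
    reported: date


# Made-up sample records; no real addresses or incidents.
REPORTS = [
    CrimeReport("murder", "1200 N Example St", date(2006, 1, 5)),
    CrimeReport("burglary", "1200 N Example St", date(2006, 2, 11)),
    CrimeReport("murder", "1200 N Example St", date(2006, 3, 2)),
    CrimeReport("murder", "300 W Sample Ave", date(2006, 3, 9)),
]


def count_on_block(reports, offense, block):
    """Count reports of one offense type on one block."""
    return sum(1 for r in reports if r.offense == offense and r.block == block)


def latest_first(reports):
    """Return the reports sorted by date, newest first."""
    return sorted(reports, key=lambda r: r.reported, reverse=True)


if __name__ == "__main__":
    print(count_on_block(REPORTS, "murder", "1200 N Example St"))
    for report in latest_first(REPORTS):
        print(report.reported, report.offense, report.block)
```

Once the same reports have been written up as a story, neither the count nor the sort can be done without reading the text and pulling the fields back out by hand.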
But at the same time, we need to organize and filter the data in
ways that resonate with readers – in ways that jump out of Blackberries and
seem bigger than single blocks.

