Adrian has
lots of important stuff to say in this interview.
For example:
Much of the information that journalists collect, day to day, is structured.
Information such as crime reports, obituaries and event listings always follow
a certain pattern, which can be richly exploited by databases.
The majority of newspapers take the time to *collect* this information --
which is the hard part -- but they dramatically reduce its value by NOT storing
it in structured formats. Instead, they distill it into big blobs of text for
publication in their print editions, and then they shovel those big blobs of
text onto their websites. At this point, all structure is lost: Crime reports
can't be sorted or searched intelligently, and event listings can't be viewed
in any sort of user-friendly way.
The very act of distilling information into a news story -- which is
essentially a big blob of text -- removes any sort of structure. Information is
exponentially more valuable if it's structured.
He’s absolutely right. Newspapers take structured data and feed it to people
in an unstructured format.
He could have also pointed out that after distilling
data into structureless blobs, they often try to re-apply structure to articles
by adding all sorts of descriptive metadata. That can be a messy process.
Adrian says
papers can produce more valuable information by retaining the structure of the
data they publish, i.e. by publishing databases.
I agree, but as we build these databases, we need to remember what those big
blobs of text do well: they tell stories.
We need to publish all the data we have – get it out there so that readers
can find the things that are important to them. (How many murders were there
on MY block?)
But at the same time, we need to organize and filter the data in
ways that resonate with readers – in ways that jump out of Blackberries and
seem bigger than single blocks.
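
To make both halves of that concrete, here is a rough sketch in Python. It assumes crime reports are kept as structured records rather than blobs of text. The `CrimeReport` fields, the helper functions and the sample records are all made up for illustration; they don't come from Adrian or from any real newspaper's system.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CrimeReport:
    """One structured crime report (hypothetical fields)."""
    offense: str
    block: str
    neighborhood: str


# Invented sample records, for illustration only.
REPORTS = [
    CrimeReport("burglary", "1200 Elm St", "Northside"),
    CrimeReport("murder", "1200 Elm St", "Northside"),
    CrimeReport("murder", "300 Oak Ave", "Northside"),
    CrimeReport("theft", "900 Pine Rd", "Riverside"),
]


def count_on_block(reports, offense, block):
    """The reader's question: how many of this offense on MY block?"""
    return sum(1 for r in reports if r.offense == offense and r.block == block)


def counts_by_neighborhood(reports, offense):
    """The editor's question: where does this offense cluster?"""
    return Counter(r.neighborhood for r in reports if r.offense == offense)


print(count_on_block(REPORTS, "murder", "1200 Elm St"))
print(counts_by_neighborhood(REPORTS, "murder").most_common())
```

The first function answers the single-block question a reader would ask. The second rolls the same records up into something that could anchor a story. Neither is possible once the reports have been flattened into a paragraph of text.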