Petabytes vs. Puny Books



The July 2008 issue of Wired has a thought-provoking article by Chris Anderson entitled The Petabyte Age.  A petabyte is an unimaginably large amount of data – 1,000 terabytes, or a quadrillion bytes.  The article catalogs a number of important applications that use datasets measured in petabytes, in fields ranging from agriculture to politics.  Anderson asserts that the availability of these huge datasets is lessening our reliance on the predictive value of theory and of mathematical and statistical models.  Models have an elegant and convenient compactness, but often a limited predictive ability.  “Big data” closes the predictive gaps if you have the storage and processing power to manipulate and make sense of it.

The Petabyte Age is the natural outcome of three “laws” (ahem, recall those compact models).  These are Moore’s Law, which describes the growth in computer processing power; Kryder’s Law, which predicts hard disk storage cost per unit of information; and Butters’ Law, which measures the capacity of the fiber optic network underpinning the Internet.  These laws are synergistic – processing power can be greatly amplified by hooking servers and PCs together in computing networks; storage can be extended via disk arrays; and huge datasets can be accessed over high-speed, high-capacity fiber optic networks.

So all this discussion of “Big Data” got me thinking: The whole bibliosphere could be radically changed.

  • Authors could tap into gigantic databases to do incredibly detailed research on people and places.  Novelists could scan the entire body of literature to see where “story gaps” might exist to be exploited. 
  • Publishers could track readership trends based on accumulated book sales data and accurately predict the success or failure of any book prior to its publication.   
  • Readers could go to their favorite online bookstore and get pinpoint recommendations based on analyses of buying histories, correlated with behavioral, demographic and psychographic profiles.

But the one thing that probably won’t change is the way we package all the new knowledge that “peta processing” delivers.  We will likely use the same book-sized packets – whether in print or electronic form – that we use today.  Why?



The book is the anti-petabyte.  It is perfectly tuned to the human mind.  Stories are how we make sense of things.  Our brains are confronted by petabytes of raw data during our lives, yet the memories we create out of that torrent can be squeezed into a terabyte or two.  The stories we tell – whether of fact or fiction – are the imprecise models, the compact and convenient approximations that leave us wanting more.  Biologists tell us that this filtering is the core of our success as a species.

I can marvel at the power of Big Data and Cloud Computing.  But being human, I will always believe the real power lies in the Little Story.

