“I think storytelling is becoming one of the new frontiers,” said Luke Lonergan, co-founder of Greenplum, now part of EMC Corp. But beyond that, “it really matters a lot to bring the brain to the problem in a way that you can untangle the complexities.” "Social Media, Genomics Driving Data Tsunami" Wall Street Journal 18 Feb 2011 http://on.wsj.com/g9Lt5A
I've found many times that it is very difficult for audiences to use and consume analysis -- no matter how insightful it might be. I suspect this is why effective analysts always find and present "the story" that the data tells. Audiences, especially decision makers, simply glaze over when presented with the details of a complex analysis. But when the analysis is presented as a story, they can interact with it, explore it, test it, and consume it. Used correctly, storytelling can be the common language for both the consumers of analytics and those who truly revel in the abstractions of the analysis process.
There is a measure of irony in Lonergan's comment about storytelling being a "new frontier" since it has to be one of the most ancient and powerful modes of human thinking and communication. I'm guessing he means storytelling as a means to facilitate the application of big data (I don't know Lonergan, although I'd like to, so all I can do is guess) and that would be a new, but not unprecedented, application for storytelling.
I think that storytelling is more than a communication mechanism -- something that we think about after the analysis is complete. Storytelling can provide an analytic framework. As I read the interesting WSJ blog post and got to Lonergan's quote at the end I was prompted to describe some of my thinking about the relationship of storytelling and analytics and explore some ideas about how it might be relevant to the promise of "Big Data".
This is not a metaphysical argument. As I'm sure you remember from your literature class (You did take one, right? You do remember it, don't you?), a story has plot, characters, and a narrative point of view. Let's see how these concepts are relevant in data-driven analysis.
Aristotle argues that a plot has "a beginning, middle, and an end, and the events of the plot must causally relate to one another as being either necessary, or probable." In other words, the "analytic story" has to describe events over time that are driven by cause-and-effect relationships. To say we know the story the data tells means we know the cause-and-effect framework that has caused the events to transpire. Or, if this is a story of the future, then we assert that we know the cause-and-effect relationships that will cause events to happen at some future time. Knowing these relationships means that we have synthesized a set of rules that can describe events over time -- we have created a model. Generally, by exploring (analyzing) the data, we first develop a "mental model" and sometimes we go on to describe an explicit, shareable simulation model. If you are a reader of this blog then you know that this is a pretty exact description of a system simulation model and its creation.
To describe cause-and-effect relationships in stories we generally need characters with motivations or purpose who act in response to events. To tell a story of consumer choice, our analysis needs some characters (consumers, manufacturers, and retailers, for example), a series of events (purchases, inventory orders, etc.), and the set of cause-and-effect physics that ties this all together. In the system simulation modeling world we usually call characters "actors" or "agents". Indeed, the modeling paradigm that places characters foremost is called "agent-based modeling".
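To make the idea concrete, here is a minimal sketch of a story told as an agent-based model. Everything in it is an illustrative assumption rather than a description of any particular tool: the class names `Consumer` and `Retailer`, the `run` function, and the deliberately simple rules (a consumer buys every few days, a retailer reorders when stock runs low) are all hypothetical. The characters are the agents, the plot is the list of events, and the cause-and-effect "physics" is the handful of rules inside the methods.

```python
from dataclasses import dataclass


@dataclass
class Retailer:
    """A character that sells from stock and reorders when stock runs low."""
    stock: int = 5
    reorder_point: int = 2   # hypothetical rule: reorder at or below this level
    order_size: int = 5      # hypothetical rule: units added per inventory order

    def sell(self) -> bool:
        # Cause: a consumer wants to buy. Effect: stock drops, if any is left.
        if self.stock > 0:
            self.stock -= 1
            return True
        return False

    def restock_if_low(self) -> bool:
        # Cause: low stock. Effect: an inventory order that arrives immediately
        # (a simplifying assumption; a real model would add a delivery delay).
        if self.stock <= self.reorder_point:
            self.stock += self.order_size
            return True
        return False


@dataclass
class Consumer:
    """A character whose only motivation is to buy every `buys_every` days."""
    name: str
    buys_every: int

    def wants_to_buy(self, day: int) -> bool:
        return day % self.buys_every == 0


def run(days, consumers, retailer):
    """Step through time and return the plot as (day, actor, event) tuples."""
    events = []
    for day in range(1, days + 1):
        for consumer in consumers:
            if consumer.wants_to_buy(day):
                outcome = "purchase" if retailer.sell() else "stockout"
                events.append((day, consumer.name, outcome))
        if retailer.restock_if_low():
            events.append((day, "retailer", "inventory order"))
    return events
```

Calling `run(4, [Consumer("ann", 1), Consumer("bob", 2)], Retailer(stock=3, reorder_point=1, order_size=4))` returns a short event list that reads like a story: purchases accumulate, stock falls, and the retailer's inventory orders follow as a consequence. Changing a rule, such as the reorder point, changes the plot, which is what makes this kind of model something an audience can explore and test.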
And that leads to narrative point of view, or "narrative mode", meaning the methods used to tell the story. The analogy in the analytics world is the modeling paradigm and tool set we choose to describe the characters and plot. Existing system simulation tools are oriented towards "data poor" or, at best, moderately "data rich" environments, certainly nothing like the developing "data tsunami" world we are headed into. Tools powerful enough to model the stories buried in the "tsunami" of data that is beginning to be available truly are on the cutting edge -- and I suspect that is exactly what Lonergan meant.
In a future post I'll try to describe what those tools might look like.