Research Blog - Customer Intelligence

This issue: People, books, seminar and more ramblings.

People: I caught up with xxxxx xxxxxxxx (name deleted on request 25/10/2006) and Joan Valdez, both former colleagues from Telstra days. They are now working in the CRM part of Telstra On Air (data systems). Also, I've been in touch with Dr. Peter Sember, a data miner from Telstra's New Wave Innovation labs. We worked on some projects to do with search query analysis and web mining, so I'm keen to collaborate with him again. Lastly, at the seminar (below), I forced myself upon Brigitte Johnson and Peter Davenport - more Telstra CRM people, but higher up and in Retail. I'm keen to let them know about my research, and look for opportunities there too.

Book: Peter Weill & Marianne Broadbent's book ("Leveraging the New Infrastructure"). This is far more scholarly than Larry P. English's book (below), probably due to the different perspective, purpose and audience. Hell, they even quote Aristotle!

The gist of their approach is to identify IT (and communications) expenditure over the last decade or more for a large number of companies (in excess of two dozen) in different industries. They then compare business outcomes (including competitive positioning) over the same period. By breaking down the spend into eg. firm-wide infrastructure and local business unit, they're able to discern IT strategies (eg. utility, enablement) and see how well they align with business strategy. From this, they draw a set of maxims (in the Aristotelean sense) by which organisations can manage their IT investments.

This seems very sensible. But, both the strength and the weakness of this approach is that it treats IT as a capital investment program, as part of an organisation's portfolio of assets. Throughout the book, you get a feeling that they might as well be talking about office space. I've not yet found any discussion about the value of information, separate from the capital items within which it resides. That is, it's very technology focussed. Also, the "money goes in, money comes out later" black-box thing has yet to shed any light (for me) on the fundamental question of WHERE IS THE VALUE? The approach might be useful for benchmarking, and would be useful for people responsible for managing investments in ALL the organisation's activities, in that it puts IT expenditure on an even footing with office space and stationery. But, I still have a sense that something is missing ...

So while I'm on a roll, I've got some more comments about Larry P. English's book. This guy is - I'm sure he won't consider this defaming - a quality fanatic. He is relentless in his pursuit of quality and I think that this is a good thing. I wish that my phone company and bank had the benefits of some quality fanatics and gurus. But his approach/advice leaves me thinking that the hard bit, the interesting bit, isn't being addressed. By that, I mean that it appears he assumes you already have rock-solid immutable business requirements handed down on a stone tablet. For example, (and I'll paraphrase here):

Suppose you have three customers of your data, and two want 99% accuracy and one wants 99.99% - then you must give them 99.99%

After working as a business analyst and supplier of information to decision-makers, I flinched when I read that. Honed by my corporate experience, my first reaction was "are they willing to PAY for 99.99% ?" - closely followed by "what's 99.99% WORTH to them?" followed by "what would they do with 99.99% that they WOULDN'T DO with 99%?". I think that this step - what you'd call the information requirements analysis - is what Larry's missing. (Behind every information requirement is an implicit decision model.) And this step is the gist of where this research is headed. Quality without consideration of value isn't helpful in the face of scarce resources when prioritisation needs to occur.
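To make the "what's it WORTH to them?" question concrete, here's a rough sketch of the kind of implicit decision model I have in mind. Everything in it is an assumption made up for illustration: the record count, the upgrade price, the two data consumers and their cost per wrong record, and the simplification that each wrong record costs a fixed amount.

```python
def expected_cost_of_errors(accuracy, records, cost_per_error):
    """Expected loss from wrong records, assuming a fixed cost per wrong record."""
    return (1.0 - accuracy) * records * cost_per_error


def value_of_upgrade(current, target, records, cost_per_error):
    """Reduction in expected loss when accuracy rises from current to target."""
    return (expected_cost_of_errors(current, records, cost_per_error)
            - expected_cost_of_errors(target, records, cost_per_error))


# Hypothetical figures, for illustration only.
RECORDS = 100_000
UPGRADE_PRICE = 20_000.0
# Hypothetical data consumers and what one wrong record costs each of them.
consumers = {"billing": 50.0, "campaign": 0.5}

for name, cost in consumers.items():
    benefit = value_of_upgrade(0.99, 0.9999, RECORDS, cost)
    verdict = "worth paying for" if benefit > UPGRADE_PRICE else "not worth paying for"
    print(f"{name}: benefit {benefit:.2f}, {verdict}")
```

The point of the sketch is only that the same jump from 99% to 99.99% can be worth a lot to one consumer and next to nothing to another, so the requirement can't be settled without asking what decision the data feeds.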

Seminar: This morning I attended an industry learning seminar put on by Priority Learning. It was about CRM Analytics. There were talks by ING and Telstra on their experiences implementing this, and SAS and Acxiom on the how and why. The main message I took away from this was that ROI is not the last word in why you'd want to do this. Both ING and Telstra feel it has been and will be worthwhile, but neither could show ROI (as yet).

There was the usual mish-mash of definitions ("what is intelligence anyway?"), the usual vendor-hype/buyer-scepticism, the "this is not a technology - it's a way of life" talk, the usual biases (SAS flogging tools, Acxiom flogging data) - in short, an industry seminar! I'm glad though, that my supervisor Graeme was able to come as it has given him a better view of where the CRM/Customer Intelligence practice is, and the questions that my research is asking.

Re: the practice side. There seems to be an emerging consensus that there is an Analytic component, and an Operational component, usually entwined in some sort of perpetual embrace, possibly with a "Planning" or "Learning" phase thrown in. This was made explicit through the use of diagrams. This is par for the course, though from what I've seen I have to say I like the Net.Genesys model better.

One aspect I found interesting was the implicit dichotomy inherent in "data" (or information, or intelligence - the terms are used interchangeably): facts about customers, and facts about facts about customers. The former is typically transactions embedded in ER models that reflect business processes. The latter is typically parameters of models that reflect business processes.

Consider the example of a website information system (clickstream log). Here's two "first order" customer facts:

"Greg Hill", "11/3/01 14:02", "homepage.html"
"John Smith", "11/3/01 14:02", "inquiry.html"

Here's two "second order" customer facts:

"50% of customers are Greg Hill"
"100% of times are 11/3/01 14:02"

The former only makes sense (semantically) in the context of an ER model, or a grammar of some type (the logical view). The latter only makes sense in the context of a statistical model (quantitative view). Certainly the same business process can be modelled with say UML, or a Markov Model, and then "populated/parameterised" with the measurements. They will have different use (and value) to different decision-makers - a call centre worker would probably rather the fact "Greg Hill has the title Mr" - if they were planning on calling me. A market analyst would probably prefer the fact "12% of customers have a title of Dr" - if they were planning an outbound campaign.
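As a small sketch of how the "second order" facts fall out of the "first order" ones, the clickstream example can be written down like this. The tuple layout and the `proportions` helper are my own assumptions, not any particular system's schema, and the shares are computed over log records.

```python
from collections import Counter

# First-order facts: one record per observed event (customer, timestamp, page).
clickstream = [
    ("Greg Hill", "11/3/01 14:02", "homepage.html"),
    ("John Smith", "11/3/01 14:02", "inquiry.html"),
]


def proportions(records, field_index):
    """Return the share of records taking each value of the chosen field."""
    counts = Counter(record[field_index] for record in records)
    total = len(records)
    return {value: count / total for value, count in counts.items()}


# Second-order facts: parameters estimated from the first-order facts.
customer_share = proportions(clickstream, 0)
time_share = proportions(clickstream, 1)

for name, share in customer_share.items():
    print(f"{share:.0%} of customers are {name}")
for stamp, share in time_share.items():
    print(f"{share:.0%} of times are {stamp}")
```

Going from records to proportions is the easy direction. Going back (what does "50%" tell me about Greg Hill?) is where it gets interesting.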

But what does information in one domain tell us about information in the other? How does information move back-and-forth between these two quite different views? How does that improve decision-making? How does that generate value?

One last ramble: modelling decisions. To date, I've been thinking that the value of customer information lies in the decisions that it drives, in creating and eliminating options for decision-makers (like above). But nearly all the examples presented today involve implicit modelling of the decision-making of customers: When do they decide to churn? Which product will they be up-sold to? Which channel do they want to be reached through? That is, we're talking about making decisions about other people's decisions: "If I decide to make it free today, then no one will decide to leave tomorrow".

Modelling the mental states of other people involves having a Theory of Mind (a pet interest of mine from cognitive philosophy). Hence, if you take the view that communication is the process of changing our perception of the uncertainty residing in another person's mind ("we don't know anyone else - all we have access to is our own models of them" etc), then marketing really is a dialogue. With yourself. This begs the question: do autistic people - who allegedly lack a Theory of Mind - make particularly bad marketers? Does anyone even know what makes for a particularly bad marketer?

So, putting it together, I'm modelling decision-making about decision-making by looking at facts about facts about customers and how this relates to decision-making about facts about customers.

I need a lie down.

Wow - it's certainly been a while since I've contributed to this blog. First, some texts.

I've been reading Larry P. English's "Information Quality" book. Not very scholarly, seems to be loaded with good advice, but lacks a grasp on data, information, intelligence, representation etc. i.e. the interesting and difficult theoretical stuff. He sets up a sequence ("data" -> "information" -> "knowledge" -> "wisdom") and more or less says that each one is the former one plus context. Whatever that means. Anyway, I'm sure the book would be useful for data custodians or managers of corporate information systems, but I expect it would have limited use to decision-makers and business users, as well as researchers.

Also been going through an introductory book on "Accounting Concepts for Managers". I figure that a lot of accounting concepts and jargon have found their way into information systems, as evidenced by Dr. Moody's paper which proposes historical cost as the basis for the value of information (see below). While I respectfully disagree, it has motivated me to pick up more of these ideas.

Lastly, in the meeting with Graeme this morning we discussed Daniel's paper further, and information economics in general. I got a lead into this area from Mary Sandow-Quirk (Don Lamberton's work) and I'll also chase up Mingers' papers on semiotics. We seem to agree that the value of information lies in its use, and in an organisational context that means decisions. Hence, I've got a book on "Readings in Decision Support System", which will tie in with Graeme's work on Data Tagging for decision outcomes (see below).
We also discussed further our proposed joint paper on SLAs for customer data and analytics, and our plan to put together a "newsletter" document every month or so for Bill Nankervis, our industry sponsor.

I read a Paul Davies book on biogenesis, or the origins of life. A lot of his arguments revolved around complex systems, and the emergence of biological information. The ideas - genes as syntax, emergent properties of semantics, evolution as a (non-teleological) designer - are similar to Douglas Hofstadter's classic "Gödel, Escher, Bach: An Eternal Golden Braid" book. Davies' book, though, is nowhere near as playful, lively or interesting. Still, it had some good material on the information/entropy front, which got me to thinking about "information systems" in general, and my inability to define one.

Here's a proposed definition set:
A System is an object that can occupy one of an enumerable (possibly infinite) set of states, and manipulations exist that can cause transitions between these states.

A State is a unique configuration of physical properties.

There are two kinds of systems: Artefacts are those systems with an intentional (teleological) design. Emergent systems are everything else.

An Information system is no different from any other system - any system capable of occupying states and transitioning between them can be used to represent or process information. Some properties make certain systems more or less suitable for information systems (ie it's a quality difference).

The key one is that the effort required to maintain a certain state should be the same for all states. This means that - via Bayes' Rule - the best explanation for why a particular state is observed is that it was the same state someone left it in. (This is getting tantalisingly close to the entropy/information/semiotics cluster of ideas.)

For example, a census form has two adjacent boxes labelled "Male" and "Female", and a tick in one is just as easy to perform/maintain as a tick in the other. On the other hand, if you were to signify "I'm hungry" by balancing a pencil on its end, and "I'm full" by laying it on its side, you'd go a long time between meals. Hence, box-ticking makes for a higher-quality information system than pencil-balancing. The change in significance per effort-expended is maximised. The down-side - errors creep in due to noise. (And they say Shannon has no place in Information Systems theory!)
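Here's a back-of-the-envelope sketch of the Bayes' Rule argument for a two-state system. The numbers are invented: I assume each state is equally likely to have been set, that a tick stays put 99% of the time, and that a balanced pencil stays upright only 1% of the time while a pencil on its side never moves. The function name and the `persistence` idea are mine, purely for illustration.

```python
def posterior_left_as_observed(observed, prior, persistence):
    """P(state was set to `observed` | state is now `observed`), two states only.

    prior[s]: probability that someone set the system to state s.
    persistence[s]: probability that state s is still in place when observed;
    otherwise it has decayed into the other state.
    """
    (other,) = [s for s in prior if s != observed]
    stayed = prior[observed] * persistence[observed]
    decayed_into = prior[other] * (1.0 - persistence[other])
    return stayed / (stayed + decayed_into)


# Hypothetical numbers: both states equally likely to have been set.
box_prior = {"Male": 0.5, "Female": 0.5}
box_persistence = {"Male": 0.99, "Female": 0.99}      # equal effort to maintain

pencil_prior = {"upright": 0.5, "flat": 0.5}
pencil_persistence = {"upright": 0.01, "flat": 1.0}   # very unequal effort

box_posterior = posterior_left_as_observed("Male", box_prior, box_persistence)
pencil_posterior = posterior_left_as_observed("flat", pencil_prior, pencil_persistence)

print(f"Tick seen in 'Male', left that way: {box_posterior:.3f}")
print(f"Pencil seen flat, left that way:    {pencil_posterior:.3f}")
```

Under these assumptions the tick is strong evidence of what was intended, while a flat pencil is barely better than a coin toss, because "it fell over" explains the observation almost as well as "someone put it there".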

Another view: a system is a set of possible representations. The greater our uncertainty at "design time", the bigger the set of representations it can maintain. As we apply layer after layer of syntax, we are in effect restricting the set of states available. For example, if we have a blank page that can represent any text, we may restrict it to only accept English text. And then only sentences. And then only propositions. And then only Aristotelean syllogisms. We're eliminating possible representations.

By excluding physical states, we're decreasing the entropy, which according to the Second Law of Thermodynamics ("entropy goes up") means that we're pushing it outside the system (ie it's open). The mathematically-inclined would say that the amount of information introduced is equal to the amount of entropy displaced.
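A toy version of that bookkeeping, assuming every allowed state is equally likely so that entropy is just log2 of the number of states. The three-character "page", the 27-symbol alphabet and the four-word "syntax" are all made up to keep the example small.

```python
import math
from itertools import product

ALPHABET = "abcdefghijklmnopqrstuvwxyz "   # 26 letters plus space
LENGTH = 3                                  # a very small "blank page"
# Hypothetical layer of syntax: only these strings count as valid.
VALID_WORDS = {"cat", "dog", "the", "and"}

# Every state the unrestricted page can occupy.
all_states = ["".join(chars) for chars in product(ALPHABET, repeat=LENGTH)]
# States that survive once the syntax is applied.
valid_states = [state for state in all_states if state in VALID_WORDS]

# With equally likely states, entropy in bits is log2(number of states).
entropy_before = math.log2(len(all_states))
entropy_after = math.log2(len(valid_states))
information_introduced = entropy_before - entropy_after

print(f"States before: {len(all_states)}, after: {len(valid_states)}")
print(f"Entropy displaced: {information_introduced:.2f} bits")
```

Each extra layer of syntax would shrink `valid_states` further and push the displaced entropy up by the same amount.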

Then, at "use time", the system is "populated" and our uncertainty lies in knowing which state in the subset of valid ones is "correct". At different levels of syntax, we could define equivalences between certain states. One idea I'm kicking around is that this equivalence or isomorphism is the key to the problem of semantics (or the emergence of meaning). More reading to do!