Thursday, July 26, 2007

How Do I Handle all this Data?

I work for a small mail house in the U.S. that is rapidly evolving into an integrated marketing company. This involves building new processes and creating new technologies (at least the way we are going about it does).

Yesterday, one of the partners came in and told me that we have a client who is struggling with their data. They apparently have 16 or so different databases, with different structures and different purposes. The question was, "How do I handle all this data?" Asked a different way, the question is, "How do I compile this data in a way that allows me to generate relevancy for my prospects and clients?"

Obviously, this is a two-pronged problem. The first prong is the data itself; the second is the messaging.

The first step we will take is to homogenize the data. This means that we will change its data structure to allow us to combine it. Simply put, we will name the fields the same, add an id field and a "list" field that records which source each record came from, and then combine the data. As I said in a recent post, this will be done in a spreadsheet because it is the most flexible tool for this kind of work. We will need to include every field in the source data, because this will ultimately be used in the segmentation process.
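For anyone who would rather script this step than do it in a spreadsheet, here is a rough sketch of the same idea in Python. Everything in it is an assumption for illustration: the list names, the field names, the FIELD_MAPS table, and the homogenize and all_fields helpers are made up, since I haven't seen the client's data yet.

```python
# Hypothetical field mappings: each source list maps its own column names
# onto one shared set of names. All names here are illustrative.
FIELD_MAPS = {
    "newsletter": {"Name": "full_name", "Addr": "address", "Zip": "postal_code"},
    "donors": {"donor_name": "full_name", "street": "address", "zipcode": "postal_code"},
}


def homogenize(sources, field_maps):
    """Rename fields per source, tag each record with its list, assign an id."""
    combined = []
    next_id = 1
    for list_name, records in sources.items():
        mapping = field_maps.get(list_name, {})
        for record in records:
            # Unmapped fields keep their original names, so nothing is dropped.
            row = {mapping.get(key, key): value for key, value in record.items()}
            row["id"] = next_id
            row["list"] = list_name
            combined.append(row)
            next_id += 1
    return combined


def all_fields(rows):
    """Union of every field seen, in first-seen order, for the combined header."""
    fields = []
    for row in rows:
        for key in row:
            if key not in fields:
                fields.append(key)
    return fields


# Made-up sample records standing in for two of the source databases.
sources = {
    "newsletter": [{"Name": "Ann Lee", "Addr": "1 Main St", "Zip": "55401"}],
    "donors": [
        {"donor_name": "Bob Ray", "street": "2 Oak Ave", "zipcode": "55102", "last_gift": "50"}
    ],
}
combined = homogenize(sources, FIELD_MAPS)
header = all_fields(combined)
```

Note that a field with no mapping (last_gift above) is carried through untouched, which matches the rule of including every field from the source data. One caveat with this sketch: a source field that is already named "id" or "list" would be overwritten, so those would need renaming first.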

The next step is to cleanse the data. This involves making sure that the mailing address is in the mailing address field and that the address is current. It will likely involve running the list through NCOA (National Change of Address) processing and should include formalizing the names.
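Two of these checks can be roughed out in code. The sketch below is only that: the field names (full_name, address), the helper names, and the patterns are my own assumptions, and the address test is a crude heuristic that flags records for a human to review. It does not tell you whether an address is current; that is what NCOA processing is for.

```python
import re

# Runs of letters (Unicode-aware), so "o'brien-smith" is capitalized piece by piece.
LETTERS = re.compile(r"[^\W\d_]+")
# Rough heuristics only: a street address starts with a number, or it is a PO box.
STREET = re.compile(r"^\d+\s+\S+")
PO_BOX = re.compile(r"^p\.?\s*o\.?\s*box\s+\d+", re.IGNORECASE)


def formalize_name(raw):
    """Collapse extra whitespace and capitalize each part of a name."""
    collapsed = " ".join(raw.split())
    return LETTERS.sub(lambda match: match.group(0).capitalize(), collapsed)


def looks_like_mailing_address(value):
    """Return True if the value resembles a street address or PO box."""
    value = value.strip()
    return bool(STREET.match(value) or PO_BOX.match(value))


def cleanse(rows):
    """Return copies of the rows with formalized names and an address flag."""
    cleaned = []
    for row in rows:
        fixed = dict(row)
        fixed["full_name"] = formalize_name(row.get("full_name", ""))
        fixed["address_ok"] = looks_like_mailing_address(row.get("address", ""))
        cleaned.append(fixed)
    return cleaned


# Made-up sample records.
rows = [
    {"full_name": "  aNN   o'brien-SMITH ", "address": "12 Main St"},
    {"full_name": "BOB RAY", "address": "bob@example.com"},
]
cleaned = cleanse(rows)
```

Simple capitalization like this will get some names wrong ("Mcdonald" instead of "McDonald", for example), so treat the output as a first pass and not a finished list.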

Once the data is combined and clean, we need to look at segmentation. I propose that all the fields in the source data can be consolidated down to maybe four or five. (I haven't seen this data yet, but I understand it is pretty rich.) Before we can segment the list, we need to have an understanding of what the marketer is attempting to do with the data and what segments they think they serve. We also should be aware of the need to augment their list with purchased data, so it makes sense to build the data around readily available data elements. This may require data enhancement services, that is, sending the list out to have available information appended to our database.

Depending on how prolific the company is with their mailing activity, we may be able to do some modeling. Neural network modeling attaches significance to independent variables and allows for very precise profiling of the actual customer. So, the database needs to collect as much information as possible, preferably from every prospect and customer touchpoint. This may require (and probably should be designed as) relational data structures. My recommendation is "When in doubt, keep the info." I like to refer to myself as a kook (keeper of odd knowledge) when it comes to managing a database. Keep it all; you can always set what you don't need aside.
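To give a feel for what "relational" means here, this is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are all hypothetical; a real design would depend on what the client's touchpoints turn out to be.

```python
import sqlite3

# In-memory database for illustration; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contact (
        id INTEGER PRIMARY KEY,
        full_name TEXT NOT NULL,
        source_list TEXT NOT NULL
    );
    CREATE TABLE touchpoint (
        id INTEGER PRIMARY KEY,
        contact_id INTEGER NOT NULL REFERENCES contact(id),
        channel TEXT NOT NULL,
        occurred_on TEXT NOT NULL,
        detail TEXT
    );
    """
)

# One contact can have any number of touchpoints; nothing is thrown away.
conn.execute(
    "INSERT INTO contact (id, full_name, source_list) VALUES (?, ?, ?)",
    (1, "Ann Lee", "newsletter"),
)
conn.executemany(
    "INSERT INTO touchpoint (contact_id, channel, occurred_on, detail) VALUES (?, ?, ?, ?)",
    [
        (1, "mail", "2007-06-01", "spring catalog"),
        (1, "phone", "2007-07-10", "asked about pricing"),
    ],
)

# Count the touchpoints recorded for each contact.
counts = conn.execute(
    """
    SELECT c.full_name, COUNT(t.id)
    FROM contact c
    LEFT JOIN touchpoint t ON t.contact_id = c.id
    GROUP BY c.id
    """
).fetchall()
```

The point of splitting contacts from touchpoints is that every new interaction becomes one more row instead of one more column, so keeping it all costs very little.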

As we go down this road with this company, I will add real-world experience to the "probablys" and "shoulds" of this post.

I will cover the messaging issue in another post.
