Know Thy Customer

"Know Thy Customer" is a popular refrain amongst marketing professionals in many industries. There are many ways to "know your customer", but the one I'm writing about here is a data-driven, analytic approach. The basic idea is to take what you currently know about your customers (think of the data in your billing system) and augment it with other commercially and/or publicly available data. Knowing more about your customers makes it possible to communicate with them more effectively, to structure tiered pricing and plans that fit them better, and to be much more persuasive in your efforts to promote conservation.

We recently completed a project where we helped a water utility client augment their billing system data. After reviewing several potential data sources, we found that county appraisal district data and federal census data can answer questions such as:

  • Exactly how many housing units are there in the utility service area?
  • How many people (estimated) live in those houses?
  • What is the poverty rate in different parts of my service area?
  • How big is each of my customers' parcels?
  • How big is each of my customers' houses?
  • Which customers have swimming pools and/or irrigation systems?
  • Where exactly (geo-coordinates) is each customer location?

The process of cleansing and matching your customer database against the county appraisal and federal census data isn't trivial, but it's not rocket science either. A skilled database programmer can assist in the following:

  • Standardize and validate all Location addresses from your Billing System. The United States Postal Service has a great utility for this; feed it any US street address and it will validate the address and return a properly formatted version of the address (or an error code if the address is invalid). The returned address is in the form of an XML message which can be hashed and used as a matching key.
  • File a public information request with your local appraisal district(s), load their database(s), and standardize/validate all parcel addresses the same way. Use the hash values to match each location with its property parcel record.
  • You'll also need to evaluate the improvement information to identify attributes such as the number of housing units on each parcel, the lot size, house size, presence of a swimming pool, and any other attributes which you believe might be helpful in segmenting your customer base.
  • Using the latitude and longitude of each Location, link back to the federal census data to access attributes such as estimated population per household and poverty levels.
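
The matching steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used on the project: `standardize` is a hypothetical stand-in for a real address validation service (it merely normalizes case, spacing, punctuation, and a handful of street suffixes), and the field names `account`, `address`, and `parcel_id` are invented for the example.

```python
import hashlib
import re

# Hypothetical stand-in for a real address validation service. It only
# normalizes case, whitespace, punctuation, and a few common suffixes.
SUFFIXES = {"STREET": "ST", "ROAD": "RD", "AVENUE": "AVE", "DRIVE": "DR", "LANE": "LN"}

def standardize(address):
    cleaned = re.sub(r"[^\w\s]", "", address.upper())
    return " ".join(SUFFIXES.get(word, word) for word in cleaned.split())

def address_key(address):
    """Hash the standardized address so it can serve as a matching key."""
    return hashlib.sha256(standardize(address).encode("utf-8")).hexdigest()

def match_locations(billing_rows, parcel_rows):
    """Pair billing locations with parcels that share an address key."""
    parcels_by_key = {address_key(p["address"]): p for p in parcel_rows}
    matched, unmatched = [], []
    for row in billing_rows:
        parcel = parcels_by_key.get(address_key(row["address"]))
        if parcel is None:
            unmatched.append(row)  # needs manual review
        else:
            matched.append((row, parcel))
    return matched, unmatched

# Made-up sample records, for illustration only.
billing = [
    {"account": "A-100", "address": "123 Main Street"},
    {"account": "A-101", "address": "9 Oak Rd."},
    {"account": "A-102", "address": "77 Unknown Lane"},
]
parcels = [
    {"parcel_id": "P-1", "address": "123  MAIN ST"},
    {"parcel_id": "P-2", "address": "9 oak road"},
]
matched, unmatched = match_locations(billing, parcels)
```

In practice the unmatched list is where most of the effort goes; those records usually need a manual look or a fuzzier comparison.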

In the case of our recent customer, these enhanced data attributes have proven extremely valuable in addressing several key challenges. First, the utility needed to validate the estimated population of its CCN (Certificate of Convenience and Necessity) service area. State regulatory oversight and government funding opportunities are all dependent upon the estimated population served. This number is derived from federal census data as enhanced by state demographer population growth projection models, and then allocated across the various utilities that serve a given county. This is where things go sideways a bit; the allocation process is seemingly tied to metropolitan/micropolitan census areas, which often results in very high error rates. This is especially true in regions where a micropolitan area differs greatly in size from the corresponding municipality's CCN service area. Another problem with this approach is that the underlying census data is only fully updated every ten years and partially updated every five years. When the housing market is active, this number can go stale much faster than the census data is updated.

Using the county appraisal database, we were able to perform a 'bottom up' analysis by locating every property parcel within the boundaries of the CCN service area. For this we evaluated a county-wide parcel shapefile acquired from the county appraisal district GIS department. Next, we evaluated the appraisal data to determine how many 'housing units' are present on each parcel; this value was then multiplied by the estimated population per household, which was pulled from the 2010 federal census data. The net result for our customer was that their service area population had been underestimated by nearly 40%! This number impacts state environmental oversight, state and federal funding opportunities, and strategic planning by the utility's own board of directors. It will be interesting to see how this information is received during the next round of state water plan updates.
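
The arithmetic behind the 'bottom up' estimate is simple enough to sketch. The Python below assumes each parcel has already been flagged as inside or outside the service area and tagged with a census tract; the field names, tract labels, and persons-per-household figures are all made up for illustration and are not the project's data.

```python
def estimate_population(parcels, persons_per_household_by_tract):
    """Sum housing units times persons per household for in-area parcels."""
    total = 0.0
    for parcel in parcels:
        if not parcel["in_service_area"]:
            continue  # parcel lies outside the service area boundary
        pph = persons_per_household_by_tract[parcel["tract"]]
        total += parcel["housing_units"] * pph
    return total

# Illustrative values only.
persons_per_household_by_tract = {"T1": 2.5, "T2": 3.0}
parcels = [
    {"parcel_id": "P-1", "tract": "T1", "housing_units": 1, "in_service_area": True},
    {"parcel_id": "P-2", "tract": "T2", "housing_units": 3, "in_service_area": True},
    {"parcel_id": "P-3", "tract": "T1", "housing_units": 2, "in_service_area": False},
]
estimate = estimate_population(parcels, persons_per_household_by_tract)
```

With these sample values the estimate is 1 × 2.5 + 3 × 3.0 = 11.5 people; the out-of-area parcel contributes nothing.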

Enhanced customer data is also proving very helpful in the utility's efforts to ensure compliance with state regulations requiring only one residence per meter. The utility is very rural, and the staff had lots of anecdotal evidence and a strong collective hunch that there were quite a few multiple connections on their network. A cross-reference of the appraisal data (with housing units extracted per parcel) against the billing system database quickly revealed the locations with multiple hookups. They're using rather 'soft gloves' as they approach the customer base with this new regulatory issue. It will be interesting to see how this proceeds, given that they now know exactly where the multiple hookups are.
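
A cross-reference of this kind might look like the following Python sketch. It assumes the address matching has already tied each meter to a parcel, and it flags served parcels that have more housing units than meters. The function name, parcel IDs, and counts are hypothetical.

```python
from collections import Counter

def find_multiple_hookups(units_by_parcel, meter_parcels):
    """Flag served parcels with more housing units than meters."""
    meters_by_parcel = Counter(meter_parcels)
    flagged = []
    for parcel_id, units in units_by_parcel.items():
        meters = meters_by_parcel.get(parcel_id, 0)
        if meters > 0 and units > meters:
            flagged.append((parcel_id, units, meters))
    return sorted(flagged)

# Illustrative values only: housing units per parcel from the appraisal
# data, and one parcel ID per active meter from the billing system.
units_by_parcel = {"P-1": 1, "P-2": 3, "P-3": 2, "P-4": 2}
meter_parcels = ["P-1", "P-2", "P-3", "P-3"]
suspects = find_multiple_hookups(units_by_parcel, meter_parcels)
```

Here only P-2 is flagged (three housing units on one meter). P-4 has no meter at all, so it is treated as not served rather than as a violation.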

Matching property attributes to the customer record has also allowed the utility to personalize the conservation messaging that they send to their customers as part of a monthly water usage email report. And finally, the appraisal and census data have also proven valuable in growth planning discussions. We were able to analyze population numbers using USGS quadrants, and to introduce a timeline view of property development. By viewing an animated 'heat map' you can easily see where the growth trends in the community have been and where they are heading. It will be interesting to see what applications our customers find for this data next.