Building an Identity Graph – Pain Tolerance

  • Ben Balogh

Acxiom’s 50th anniversary has me reflecting on my 21 years with the company, in particular how the identity space and Acxiom’s role within it have evolved. That evolution runs from the early days of merge/purge in support of direct mail campaigns through the more recent goal of identifying all possible digital connections. As the focus shifted from relatively static name and physical address data to noisy, short-lived, and sparsely populated digital touch points, my journey in the identity space has been a mix of adventures shared with peers and a unique perspective derived from the numerous roles I’ve held.

This reflection led me to think about the building blocks for generating and maintaining a successful brand identity graph. Acxiom’s recently published eBook, “An Identity Buyer’s Guide for Data-Driven Marketers – Part 1: Key Questions for Marketers,” provides a solid overview of how to get started down this path. Over a series of blog posts, I will add my thoughts to this effort.

The natural starting point is to determine where your needs fit on the identity reach vs. precision scale. The best way to do this is to assess your pain tolerance for a miss. This consideration is critical but often overlooked, because of a common belief that identity stays essentially the same regardless of use case. In engaging with more than 100 Real Identity™ solutions during my time with Acxiom, I have yet to find two clients using precisely the same definition of an individual. When you move the consideration to the household level, as my colleague Leslie Price outlined in her July 2018 blog post, the complexity only grows with the desire to generate more sophisticated resolution.

In my experience, the single most common request coming my way has been to create “standard” identity resolution definitions. The request comes from every avenue possible, most often for either packaged or industry-level definitions. In practice, the best “standard” approach to identity resolution or management centers on the primary use case.

During planning, I look at use cases from the perspective of the “pain tolerance” of a miss. Every identity graph will have misses. The question is, how do you handle those edge cases?

  • Conservative (or high-precision) use cases require more scrutiny and break identities/entities apart when evaluating edge cases. 
  • Liberal (or high-reach) use cases allow more edge cases to consolidate into the same identity/entity. 

Both have their uses and generate a valuable view based on the goals.
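To make the distinction concrete, here is a minimal sketch of how a single match threshold can move the same records toward precision or toward reach. The record names, the pairwise scores, and the `resolve` function are all hypothetical and made up for illustration; a real identity graph would use far richer matching logic than one numeric cutoff.

```python
# Hypothetical pairwise match scores (0.0 to 1.0) between four records.
# These values are invented for illustration, not taken from any real system.
PAIR_SCORES = {
    ("r1", "r2"): 0.95,  # clear match
    ("r2", "r3"): 0.70,  # edge case
    ("r3", "r4"): 0.40,  # clear non-match
}


def resolve(records, pair_scores, threshold):
    """Group records into identities by linking pairs that score >= threshold."""
    parent = {r: r for r in records}

    def find(r):
        # Follow parent pointers to the group representative.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for (a, b), score in pair_scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)  # consolidate the two groups

    groups = {}
    for r in records:
        groups.setdefault(find(r), []).append(r)
    return sorted(sorted(g) for g in groups.values())


records = ["r1", "r2", "r3", "r4"]

# Conservative (high precision): the 0.70 edge case stays apart.
conservative = resolve(records, PAIR_SCORES, threshold=0.90)

# Liberal (high reach): the 0.70 edge case consolidates.
liberal = resolve(records, PAIR_SCORES, threshold=0.60)

print(conservative)  # [['r1', 'r2'], ['r3'], ['r4']]
print(liberal)       # [['r1', 'r2', 'r3'], ['r4']]
```

The only thing that changes between the two runs is how the edge case is treated, which is exactly where pain tolerance comes in.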

This adage really does apply here:

“A jack of all trades is a master of none, but oftentimes better than a master of one.”

From my perspective, keeping a sharp focus on the specific use case will result in a system that is finely tuned. Every additional use case added to the system limits your ability to tune the results to each need. This is acceptable if the additional use cases are similar; however, the wider your use cases spread along the precision vs. reach spectrum, the more limited the tuning options for each become.

Being realistic, nobody lives in a world where you can create unique identity graphs for each specific use case. That, however, does not mean you can’t make knowledgeable choices to get the best out of the system. If you have an overwhelming primary use case, then focus on it. If the system is balanced between various needs, choose the path based on what the data supports best. Often, “reach” use cases include the noisiest and lowest-quality data. If that matches your situation, ask whether you really need that data to “build” the identity graph, or whether those use cases simply need to “access” or “learn from” it.

Apply this methodology to your brand identity graph to build the best of both worlds.  Build the identity graph to focus on the need with the least “pain tolerance.”  Use the “access” models to expand the deployment options to more liberal use cases or suspect data.
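One way to picture the “build” vs. “access” split is the rough sketch below. It assumes a small graph that has already been built from trusted data, and a hypothetical `access` function that lets noisy touch points look up an identity without ever changing the graph. The field names, the sample data, and the email-only lookup are illustrative assumptions, not a description of any actual product.

```python
# Hypothetical identity graph, assumed to be built only from trusted data
# using conservative (low pain tolerance) rules.
GRAPH = {
    "id_1": {"emails": {"pat@example.com"}, "postal": "123 Main St"},
    "id_2": {"emails": {"sam@example.com"}, "postal": "9 Oak Ave"},
}


def access(graph, touchpoint):
    """Look up a noisy touch point against the graph without modifying it.

    Returns the matching identity key, or None when nothing matches.
    """
    email = touchpoint.get("email", "").strip().lower()
    if not email:
        return None
    for identity, attrs in graph.items():
        if email in attrs["emails"]:
            return identity
    return None


# Noisy, short-lived digital events: they read from the graph but never build it.
noisy_events = [
    {"cookie": "c-001", "email": " Pat@Example.com "},
    {"cookie": "c-002", "email": ""},
    {"cookie": "c-003", "email": "sam@example.com"},
]

links = {event["cookie"]: access(GRAPH, event) for event in noisy_events}
print(links)  # {'c-001': 'id_1', 'c-002': None, 'c-003': 'id_2'}
```

Because the noisy events only read from the graph, a bad cookie or a mistyped email can produce a missed link for that one event, but it cannot merge or split the identities that the precision-sensitive use cases depend on.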

We will see where my ramblings take us as I continue my reflections over the next several blog posts, discussing topics ranging from the advantages of third-party data to overcoming common pitfalls.