Habu Blog


How Identity Fits Into Data Clean Rooms

by: Matt Karasick
One of the primary uses of clean room software is to JOIN distributed data. The vast majority of the time, the JOIN fields will be some flavor of user ID. Except where the datasets happen to share the same identifier, the ability to JOIN will depend on one or more ID graphs.

Marketers use clean rooms to safely JOIN data between themselves and their partners and providers in order to power marketing applications. In the vast majority of data collaboration scenarios, achieving a critical mass of identity resolution will be a requirement, and match rates will dictate the potential scale of the initiative. To maximize the collaboration opportunity, marketers turn to ID graphs to serve as a match table between keys.

Defining Three Key Data Clean Room Terms

To paint a clear picture of how Identity should work in a clean room environment, first let’s define a few key terms:

Marketing Application

A marketing application produces intelligence and outcomes given a set of inputs. Examples of marketing applications include use cases within measurement, segmentation, incrementality experimentation, customer journey, overlap analytics, and data enrichment. An application is thought to be “end-to-end” if it delivers the ultimate intelligence and/or outcome.

Identity Resolution

Identity resolution is the use of an ID graph to determine when a user in dataset A is the same user as one in dataset B, even if the two datasets use different identity keys. For example, if dataset A contains a user record with “user_id” = 12345, dataset B contains a user record with “PersonID” = 34567, and an ID graph crosswalk tells us that user_id: 12345 == PersonID: 34567, we can say that we have “resolved” the identity between those records.
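As a minimal sketch of the crosswalk lookup described above, the following Python uses small in-memory dictionaries. The record contents, the `crosswalk` mapping, and the `resolve` helper are all hypothetical and for illustration only; a real ID graph would be a provider-supplied match table queried inside the clean room. The sketch also computes a simple match rate, here assumed to mean the share of dataset A's records that resolved.

```python
# Hypothetical records keyed by each party's own identifier.
dataset_a = {"12345": {"segment": "loyalty"}, "99999": {"segment": "new"}}
dataset_b = {"34567": {"last_purchase": "2022-03-01"}}

# Hypothetical ID graph crosswalk: user_id (dataset A) -> PersonID (dataset B).
crosswalk = {"12345": "34567"}


def resolve(dataset_a, dataset_b, crosswalk):
    """JOIN records from A and B whose identifiers are linked by the crosswalk."""
    joined = []
    for user_id, record_a in dataset_a.items():
        person_id = crosswalk.get(user_id)
        if person_id is not None and person_id in dataset_b:
            joined.append({"user_id": user_id, "PersonID": person_id,
                           **record_a, **dataset_b[person_id]})
    return joined


joined = resolve(dataset_a, dataset_b, crosswalk)
match_rate = len(joined) / len(dataset_a)  # share of A's records that resolved
print(joined)
print(f"match rate: {match_rate:.0%}")
```

In this toy data only one of dataset A's two records has a crosswalk entry, so only that record appears in the JOINed output.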

Data Clean Room

A data clean room is an environment in which two or more distributed datasets can be JOINed to power intelligent marketing applications in privacy-safe and business-safe ways. A clean room gives each party transparency and control over what data is accessed and how it is used, while ensuring that consumer privacy and consent are protected in line with all external regulations and internal policies.

How Marketers and Their Partners Leverage Data Clean Rooms

Marketers and their partners use data clean rooms to power marketing applications with JOINed data. These marketing applications often take the form of event-driven systems that can be connected to live streaming data sources and operate in an always-on, automated manner.

The scale potential of the marketing application hinges on successful identity resolution occurring in an automated fashion as an input process to the JOIN. We call this process identity orchestration, whereby clean room software optimizes the resolution process across fields and graphs in an automated way.
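One plausible reading of resolving "across fields and graphs" is a priority-ordered fallback: try the preferred key and graph first, then fall back to the next. The sketch below illustrates that idea only. The field names, graph contents, and the `orchestrate` helper are hypothetical and do not represent any vendor's actual implementation.

```python
# Hypothetical ID graphs, each mapping one kind of key to a shared person ID.
GRAPHS = [
    # (field to read from the record, lookup table for that field)
    ("hashed_email", {"a1b2": "P-1", "c3d4": "P-2"}),
    ("device_id", {"dev-9": "P-3"}),
]


def orchestrate(record, graphs=GRAPHS):
    """Return (person_id, field_used) from the first graph that resolves the record."""
    for field, table in graphs:
        key = record.get(field)
        if key is not None and key in table:
            return table[key], field
    return None, None  # no graph could resolve this record


records = [
    {"hashed_email": "a1b2"},
    {"hashed_email": "zzzz", "device_id": "dev-9"},  # falls back to device_id
    {"device_id": "dev-0"},                          # stays unresolved
]
resolved = [orchestrate(r) for r in records]
print(resolved)
```

Returning the field that produced the match makes it possible to report match rates per key and per graph, which is the kind of signal an automated process could use to reorder priorities.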

A practical example

Let’s assume a common segmentation and activation collaboration scenario between a CPG company and one of its retail partners, in which both parties benefit from JOINing CRM data to create impactful and relevant audience segments based on propensity scores derived in the clean room.

To deploy and act on the newly derived propensity scores, the identity resolution crosswalk must become part of the automated end-to-end system. It is used during model development, during training, and, most importantly, in the final live system, where scoring will feed an always-on activation API.

The process of turning an identity resolution scheme into one microservice within an end-to-end marketing application is called identity orchestration.

Using the above example, we can also see how closely privacy and identity are intertwined in a data clean room environment.

Let’s say that in the above example, an individual user record in the CPG CRM data has a consent signal indicating that it is okay to use their data for analytics as well as for targeting. However, the same resolved user in the retailer’s data carries a consent signal which limits use of data to analytics, not targeting.

Left unchecked, the propensity model would likely contain features from both datasets. The data clean room software must therefore know to ignore any features from the retail dataset when deriving propensity scores that may be used in ad bidding or decisioning.
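The consent check in this example can be sketched as a filter applied before scoring. This is a simplified illustration under stated assumptions: the feature names, the purpose labels, and the `permitted_features` helper are hypothetical, and real consent signals are usually richer than a set of purposes per source.

```python
ANALYTICS, TARGETING = "analytics", "targeting"

# Hypothetical feature values for one resolved user, grouped by source dataset.
features = {
    "cpg": {"brand_affinity": 0.8},
    "retail": {"basket_size": 42.0},
}
# Hypothetical consent signals per source for the same resolved user.
consent = {
    "cpg": {ANALYTICS, TARGETING},
    "retail": {ANALYTICS},
}


def permitted_features(features, consent, purpose):
    """Keep only features whose source dataset consents to the given purpose."""
    allowed = {}
    for source, values in features.items():
        if purpose in consent.get(source, set()):
            for name, value in values.items():
                allowed[f"{source}.{name}"] = value
    return allowed


print(permitted_features(features, consent, ANALYTICS))
print(permitted_features(features, consent, TARGETING))
```

For the analytics purpose both sources contribute features; for targeting, only the CPG features remain, so a score used for ad decisioning would be derived without the retailer's data for this user.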

Scaling identity orchestration

In order to facilitate identity orchestration at scale, data clean room software must be practically compatible with the real-world scenarios you will encounter with potential collaboration partners. This takes the following two forms:

1. Interoperability

    The identity graph provider market is fairly mature, with many providers operating at various levels of scale within certain use cases and/or geographic regions. If your data clean room software does not work with the ID graph provider or scheme that you and each partner are aligned to, this creates hurdles to successful collaboration. Clean room software should work “out-of-the-box” with the common schemes and providers.

    This is what made Habu’s partnership and integration with LiveRamp so important (more on that soon). For many important marketer categories, supporting RampID (formerly IDL) is simply table stakes.

2. Business Execution

    The simple truth is that identity can be tricky and expensive to get up and running. Contracts, terms, and price points can become prohibitive or act as barriers to progress. Increasingly, clean room companies and identity providers are partnering so that using these complementary services together short-circuits many of the procurement and legal steps that can slow things down.

Data clean rooms and identity are inextricably linked. For data clean rooms to power marketing applications, you must also have identity orchestration. When searching for a data clean room software provider, be sure to keep this top of mind.

If you’re curious about how Habu’s data clean room software might fit within your environment and business needs, we’d love to hear from you.