Near Team

by Madhu Therani

Chief Technology Officer

October 11, 2018

One of the common issues faced by advertisers and publishers across the ad-tech ecosystem is “limited” reach. Given any audience segment, defined in terms of certain key attributes of each individual member, we want to identify more people in the real world whom an advertiser can reach to promote its offerings, whether a product or a service. Limited reach is to be expected: the more criteria you require a customer to meet, the smaller the pool of people who satisfy them. If you are gender-specific, on average you have lost 50% of your audience; if in addition you target one of four age groups (say under 18, 20-40, 40-60 and 60 plus), your audience drops to roughly 12.5% of your starting pool, assuming the groups are about equal in size. Every additional criterion reduces this further.

Whatever channel you pick (digital, print, OOH, TV and so on), the total number of people reachable is bounded. For example, a TV show can reach only the households that subscribe to the channel. So the key question is: how do I increase that 12.5%? Maybe the constraints are not that hard and fast, in which case which ones should be relaxed? It depends on the product or service being promoted. If you are promoting cruises to the 60 plus age group, maybe a cruise during spring break would also appeal to university students. The potential reach drops further when you add requirements such as limits on the frequency of ad exposures to lessen message fatigue, so finding “complementary and/or overlapping” audiences is key for both advertisers and publishers.
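
The 50% and 12.5% figures above follow from multiplying the share of the pool that survives each filter. A minimal sketch of that arithmetic, assuming the criteria are independent, genders split evenly and the four age groups are equal in size (the function name is illustrative only):

```python
from math import prod

def remaining_reach(fractions):
    """Fraction of the starting pool left after applying independent filters."""
    return prod(fractions)

# Assumption: each filter keeps the stated share of whoever is still in the pool.
gender_only = remaining_reach([1 / 2])            # one gender out of two
gender_and_age = remaining_reach([1 / 2, 1 / 4])  # plus one age group out of four

print(f"{gender_only:.1%}")     # 50.0%
print(f"{gender_and_age:.1%}")  # 12.5%
```

In real data the criteria are rarely independent or evenly split, so the actual drop has to be measured against the pool itself.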

Audience Extension is the generic term used by publishers for increasing the relevant pool, either in general or for a specific segment being targeted. Lookalike modeling has a similar objective and is the term more commonly used by advertisers. In the example above, the cruise company (the advertiser) can buy data from AARP, the association for retired people. AARP can share cookies from its website as-is with the cruise company. The advertising need not be delivered on the AARP website: the same audience can be reached on other websites through the background technology of cookie syncing. From the perspective of AARP, the “publisher”, this would be called audience extension: it is monetizing its existing audience.

Alternatively, the cruise company can go through its existing set of customers and identify key criteria, such as the zip codes they come from or the year they were born. Based on these criteria, it can go through public records, identify people who have just turned 60 or who live in those zip codes, and target them. This is the “lookalikes” approach to audience extension. Here the attributes of age and home location were used to identify similar prospects.

The key thing to realize is that either way is an approach to adding to an already existing audience. There are many approaches to audience extension, from both the publisher and the advertiser perspective, and the term is used in many ways across the ad-tech ecosystem.

The basic steps for Audience Extension are the following:

  • Start with a “seed audience” in a given media channel.
  • Extend the seed pool, either by adding more “similar” (aka “lookalike”) members to it or by reaching members of the existing pool via alternative channels.

Seed Audiences can be obtained in a number of ways -

  • Cookies of visitors to your website or a particular page.
  • Cookies of users who saw your advertisement in a recent campaign.
  • Cookies bought from a publisher who specializes in a particular type of content.
  • Home addresses of people from your loyalty program who bought something for Christmas last year.
  • Mobile IDs of people who have downloaded and installed a particular app related to your product/service offering.

Once the seed audience is defined, simple approaches to “extend” include -

  • Cross-channel/cross-screen extensions: for example, converting seed cookie pools to mobile IDs and reaching them on mobile devices.
  • Using Retargeting techniques - reaching the audience on different websites online or “physical retargeting” - reaching them when they visit similar physical locations.
  • Cross-content extensions: based on brand/product affinities known a priori, reach the audience when they visit complementary and affinity-driven sites. For example, sports lovers may also be reached on online gaming sites, working on the thesis that sports lovers also spend a lot of time on online games. Such a thesis may come from one’s own audience research or be bought from a research firm.

If your seed pool is big enough, these basic extensions may be quite effective in extending reach.

If the above does not meet your needs, the next step is to add new “potential” prospects to the seed pool. Addition to the seed pool can happen in multiple ways -

  • Identifying prospects from a larger audience pool who meet all the “criteria” of seed pool members. This is what Facebook does when you define a lookalike audience: they use your seed set to find other users who match “all” the criteria.
  • A more relaxed version of the above is to meet only some of the criteria. That is, the notion of “similarity” is loosened so that more prospects, those who meet one or more criteria of the seed pool members, can be identified.
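
The difference between the strict and the relaxed version can be sketched as follows. This is only an illustration under simple assumptions: each criterion is a set of acceptable values, and the function name, attribute names and sample records are all hypothetical rather than any vendor's actual method.

```python
def matches(candidate, criteria, min_matched=None):
    """Return True if the candidate meets enough of the criteria.

    criteria maps an attribute name to the set of acceptable values.
    min_matched=None means every criterion must hold (the strict version).
    """
    matched = sum(
        1 for attr, allowed in criteria.items()
        if candidate.get(attr) in allowed
    )
    required = len(criteria) if min_matched is None else min_matched
    return matched >= required

# Hypothetical criteria derived from a seed pool, and a small open pool.
criteria = {"age_group": {"60plus"}, "zip": {"85001", "85002"}}
pool = [
    {"id": "a", "age_group": "60plus", "zip": "85001"},
    {"id": "b", "age_group": "60plus", "zip": "10001"},
    {"id": "c", "age_group": "20-40", "zip": "10001"},
]

strict = [p["id"] for p in pool if matches(p, criteria)]
relaxed = [p["id"] for p in pool if matches(p, criteria, min_matched=1)]
print(strict)   # ['a']
print(relaxed)  # ['a', 'b']
```

Loosening `min_matched` grows the pool, at the cost of admitting prospects who resemble the seed members less closely.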

This method of adding more members to the seed pool, or of creating a segment “similar” to the seed pool, is the method of “lookalikes”. Approaches to generating the pool of lookalikes differ in two respects -

  • Firstly, the choice of user attributes to match on and to relax, and the values of those attributes.
  • Secondly, the algorithmic technique used to compute a similarity metric: when a candidate is selected from the open pool of prospects, how close are they to the required criteria?

Attributes of a user include gender, age, age group, content preferences, locational preferences, specific brand preferences, page categories visited, transactional data on retail preferences, household characteristics, household income, education, professional standing, social media activity, geo-demographic clusters (such as those from Nielsen, Acxiom and PersonicX), loyalty card preferences and more. Once the attributes are chosen, the next step is to focus on specific values: for example, if residential area is a criterion, one may have specific zip codes under consideration. One can also use “pre-defined” brand/content affinity relationships provided by research firms to select audiences from your own existing DMPs. The attributes may be ordinal or nominal, numeric or categorical, free text and so on.

Once these are known, the algorithmic techniques available include a variety of nearest-neighbour methods, “similarity search” methods, clustering techniques, index-based techniques and so on. The quality of the resulting lookalike audience depends on how well the above steps are implemented. If audience behaviors change dynamically, the lookalike audiences need to be recomputed periodically or they go stale. The methods and metrics by which lookalike audiences are built and evaluated are highly ad hoc and vary across vendors and channels.
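
To make the idea of a similarity metric over mixed attributes concrete, here is a small sketch of a brute-force nearest-neighbour ranking. It uses a Gower-style score (exact match for categorical attributes, scaled distance for numeric ones) as one possible choice. The attribute names, value ranges and sample records are assumptions made for illustration, not a description of any particular vendor's method.

```python
def similarity(seed, candidate, numeric_ranges):
    """Gower-style similarity in [0, 1], averaged over the seed's attributes."""
    scores = []
    for attr, value in seed.items():
        other = candidate.get(attr)
        if other is None:
            scores.append(0.0)  # a missing attribute counts as a mismatch
        elif attr in numeric_ranges:
            # Numeric: 1 when equal, falling linearly to 0 across the range.
            diff = abs(value - other) / numeric_ranges[attr]
            scores.append(max(0.0, 1.0 - diff))
        else:
            # Categorical: exact match or nothing.
            scores.append(1.0 if value == other else 0.0)
    return sum(scores) / len(scores)

def top_lookalikes(seeds, candidates, numeric_ranges, k):
    """Rank candidates by similarity to their closest seed member."""
    scored = [
        (max(similarity(s, c["attrs"], numeric_ranges) for s in seeds), c["id"])
        for c in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cid for _, cid in scored[:k]]

# Hypothetical seed member and open pool of prospects.
seeds = [{"age": 65, "zip": "85001", "category": "travel"}]
numeric_ranges = {"age": 50}  # assumed spread of ages in the pool
candidates = [
    {"id": "x", "attrs": {"age": 62, "zip": "85001", "category": "travel"}},
    {"id": "y", "attrs": {"age": 30, "zip": "85001", "category": "gaming"}},
    {"id": "z", "attrs": {"age": 67, "zip": "10001", "category": "travel"}},
]
print(top_lookalikes(seeds, candidates, numeric_ranges, k=2))  # ['x', 'z']
```

A brute-force scan like this compares every candidate with every seed member, which is why larger pools usually call for the index-based or clustering techniques mentioned above.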

From a marketer’s perspective, it is important to understand the issues in utilizing lookalike audiences. Some of the key ones are:

  • Though one may get more reach, there is a loss in the “fidelity” of the segment, depending on which criteria are relaxed. As a marketer, you need to be able to control which attributes are non-negotiable and which can be relaxed. If gender is an important dimension of the target market for your product, there is no point in relaxing that criterion and targeting the other gender: you are wasting spend. One needs to conduct a series of small A/B experiments to test what can be relaxed. You should also triangulate across vendors: use the same seed pool, get lookalike subsets from different vendors and cross-correlate them.
  • Secondly, along with relaxing the audience criteria, campaign metric expectations may also need to be revised. Performance criteria will have to be relaxed, as you do not know a priori how consumers will react to your ads. Even creatives may need to be changed.
  • These effects get magnified when you use “lookalike audiences” across channels. There is an ad efficacy/reach trade-off to be understood.
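
One simple way to act on the first point is to filter each vendor's list on the non-negotiable attributes and then measure how much the lists overlap. The sketch below uses Jaccard overlap as the cross-correlation measure, which is an assumption on our part. The vendor lists, IDs and attribute values are made up for illustration.

```python
def keep_hard_matches(candidates, hard_criteria):
    """Drop candidates that violate any non-negotiable attribute."""
    return [
        c for c in candidates
        if all(c.get(attr) in allowed for attr, allowed in hard_criteria.items())
    ]

def jaccard(a, b):
    """Overlap between two ID sets: size of intersection over size of union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical lookalike lists from two vendors for the same seed pool.
vendor_a = [
    {"id": "u1", "gender": "F"}, {"id": "u2", "gender": "M"},
    {"id": "u3", "gender": "F"}, {"id": "u4", "gender": "F"},
]
vendor_b = [
    {"id": "u3", "gender": "F"}, {"id": "u4", "gender": "F"},
    {"id": "u5", "gender": "F"},
]
hard = {"gender": {"F"}}  # gender is treated as non-negotiable here

ids_a = {c["id"] for c in keep_hard_matches(vendor_a, hard)}
ids_b = {c["id"] for c in keep_hard_matches(vendor_b, hard)}
print(jaccard(ids_a, ids_b))  # 0.5
print(sorted(ids_a & ids_b))  # ['u3', 'u4']
```

Prospects returned by more than one vendor are a natural first cell for a small A/B experiment. A low overlap suggests the vendors are interpreting “similar” quite differently.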

At Near, we have implemented a lookalike audience generation and extension mechanism and are currently testing it in the context of Allspark. Our lookalike measures treat locations and place categories as key attributes in the audience generation approach, drawing on Near’s deep understanding of consumer journeys in the physical world.

If you would like to know more, please reach us here.