by Madhusudan Therani

Chief Technology Officer

March 19, 2018

Place Analytics with Human Mobility Data - An Overview

With the widespread adoption of smartphones and wearables, personal mobile devices generate a swathe of data about consumer activities every day. Such data is generally referred to as Human Mobility Data. It is a new source of raw data that can shed light on everyday human activity, both at an individual level and at a cohort/group level. In a complementary manner, the same data can also be used to understand a place, that is, a physical region in the world. Though it may seem counter-intuitive, most of our everyday lives transpire in the physical world: at home, at work, on the streets to and from work, at school, at malls, shops etc. In this brief article, we discuss how human mobility data can help with different kinds of decisions from a places perspective. How can human mobility data complement other sources of data about a physical place? What are the potential issues in harnessing and using it? For purposes of this discussion, we want to answer these questions at three levels of “spatial resolution”: a) indoors in a retail store scenario, b) at a neighbourhood/postal code level and c) at a state/country level.

Indoors/Retail Scenario

Source: Walmart

Before delving into how the data can be used, it is worthwhile to understand the kinds of data sources and sensors that one can use to understand a “place”. Take, for example, an apparel store. Potential sensors and associated data sources of activity at the apparel store could include:

  • Indoor/Entrance - video streams for security
  • Wi-Fi hotspots within the store, which generate passive pings and active connections
  • Beacons/custom hardware - identify smartphones in store, collect information about customers and send messages to them
  • Mobile app activity from store locations/within a store
  • Interaction with store systems such as POS systems, departmental kiosks etc.

Assuming a store is instrumented with all the aforementioned sensors, what kind of questions can the retailer expect to answer? Examples are:

  • When is the store most active? How does store activity differ between weekdays and weekends?
  • What are the most active aisles? Is the layout appropriate?
  • How do customers move within the store?
  • How long do customers stay in the store? How many browse and leave versus buy?
  • Can the customer find products quickly?
  • Which merchandise should be positioned where?
  • Are shelves stocked properly?
  • Is there enough sales support staff in the store to help customers?
  • If an individual can be identified via any of the above channels, all interactions can be “personalized”. How?
  • What kind of customers actually buy what kind of products? How do I use this information to acquire new prospects?
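Two of the questions above, when the store is most active and how long customers stay, can be sketched from a single stream. The following is a minimal illustration, assuming a hypothetical feed of in-store Wi-Fi pings as (anonymized device id, timestamp) pairs. The sample data, the function names and the 60-minute gap used to split visits are all assumptions made for this sketch, not a description of any particular product.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical in-store Wi-Fi pings: (anonymized device id, ISO timestamp).
PINGS = [
    ("dev-a", "2018-03-17T10:05:00"),
    ("dev-a", "2018-03-17T10:40:00"),
    ("dev-a", "2018-03-17T16:30:00"),
    ("dev-b", "2018-03-17T10:15:00"),
    ("dev-b", "2018-03-17T10:18:00"),
    ("dev-c", "2018-03-17T14:02:00"),
    ("dev-c", "2018-03-17T14:50:00"),
]


def split_into_visits(pings, max_gap_minutes=60):
    """Group each device's pings into visits.

    A gap longer than max_gap_minutes (an assumed threshold) between two
    consecutive pings of the same device starts a new visit.
    Returns a list of (device, first_ping_time, last_ping_time).
    """
    by_device = defaultdict(list)
    for device, ts in pings:
        by_device[device].append(datetime.fromisoformat(ts))

    visits = []
    for device, times in by_device.items():
        times.sort()
        start = prev = times[0]
        for t in times[1:]:
            if (t - prev).total_seconds() > max_gap_minutes * 60:
                visits.append((device, start, prev))
                start = t
            prev = t
        visits.append((device, start, prev))
    return visits


def busiest_hour(visits):
    """Return the hour of day (0-23) in which the most visits started."""
    counts = defaultdict(int)
    for _, start, _ in visits:
        counts[start.hour] += 1
    return max(counts, key=counts.get)


def dwell_minutes(visits):
    """Return the observed dwell time of each visit in minutes."""
    return [(end - start).total_seconds() / 60 for _, start, end in visits]


store_visits = split_into_visits(PINGS)
peak_hour = busiest_hour(store_visits)
dwell = dwell_minutes(store_visits)
```

Dwell time measured this way is only a lower bound, since a device is visible only between its first and last ping, and a device is not the same thing as a customer.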

Apart from supporting in-store decisions such as those above, one can also glean partial information about consumer activity outside the store, depending on where the store is located: in a mall, in a high-rise, in a strip mall, on a main street etc. Though the streams above appear to provide sufficient data to answer these questions, there are a number of hidden issues:

  • Data from one stream cannot readily be linked with data from another.
  • At an individual level, identifiers are different on each of the streams.
  • Temporal granularity differs from stream to stream.
  • What you can infer from each stream is different.
  • Data in each stream is potentially under the control of different parties and is probably siloed into different vendor-specific systems.
  • Aggregated spatio-temporal information can guide generic operational actions rather than per-customer interactions.
  • None of these streams provides data about your neighbouring competitors or complementary vendors.
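The first few issues suggest one common workaround: when identifiers differ across streams, the streams can still be compared in aggregate by bringing them to a shared time bucket. The sketch below assumes two hypothetical streams, Wi-Fi pings keyed by device and POS receipts keyed by receipt number, and computes a rough hourly ratio of receipts to devices seen. All names and sample records are invented for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical streams: different identifiers, different timestamp precision.
WIFI_PINGS = [
    ("mac-01", "2018-03-17T10:05:12"),
    ("mac-01", "2018-03-17T10:20:40"),
    ("mac-02", "2018-03-17T10:31:05"),
    ("mac-03", "2018-03-17T11:02:00"),
    ("mac-04", "2018-03-17T11:15:00"),
]
POS_SALES = [
    ("receipt-9001", "2018-03-17T10:47:00"),
    ("receipt-9002", "2018-03-17T11:10:00"),
    ("receipt-9003", "2018-03-17T11:55:00"),
]


def hour_bucket(ts):
    """Truncate an ISO timestamp to the start of its hour."""
    return datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)


def devices_per_hour(pings):
    """Count distinct devices seen in each hour."""
    seen = {(hour_bucket(ts), device) for device, ts in pings}
    return Counter(bucket for bucket, _ in seen)


def receipts_per_hour(sales):
    """Count receipts issued in each hour."""
    return Counter(hour_bucket(ts) for _, ts in sales)


def receipts_per_device(pings, sales):
    """Join the two streams on the hour bucket only, never on identity."""
    devices = devices_per_hour(pings)
    receipts = receipts_per_hour(sales)
    return {bucket: receipts.get(bucket, 0) / n for bucket, n in devices.items()}


ratio_by_hour = receipts_per_device(WIFI_PINGS, POS_SALES)
```

The result is an operational indicator per hour, not a per-customer link, which matches the limitation noted in the list above.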

Postal Code/Neighbourhood level - Stadium/Mall/Real Estate Scenario

Human mobility data can support decisions regarding much larger outdoor assets such as stadiums, malls, billboards, housing developments etc. Decisions that need data support include:

  • What kind of people travel through this area?
  • Where are they coming from? Where will they come from? How are they traveling?
  • How should parking be provided? What about emergency services?
  • What is their demographic/affluence profile?
  • What are the existing entities folks spend time at?
  • What are the ingress/egress and transportation dynamics during a regular day?
  • What kind of public services and other facilities are required? Which are currently available? Which are used?
  • How can I improve the utilization of my current assets?
  • How can existing commercial real estate be repurposed? Re-developed?
  • What services are missing? What can be shut down?
  • What kind of policing do you require during events?
  • What kind of retail establishments are occupied before, during and after events?

New data sources (in addition to those mentioned earlier) include drone and image feeds that can be mined for data.

State/District/City level - Public Facilities/Land Use/Environment Scenarios

As the spatial area of interest increases, the questions become far more aggregative and span longer time windows. Questions include:

  • Land Use - commercial, agricultural, residential, industrial - how is it distributed? What can be added? What can be changed? How have zoning requirements been defined? How should they be changed?
  • Environment/Pollution - what are the sources? What are the pollution levels? How is the weather affecting the dynamics? What about waste management services?
  • Security - What kind of public services do you need?
  • Transportation facilities for people and products - public and private transportation
  • Urban planning questions with respect to land development

At large geographic scales, satellite and remote sensing data become available, which need to be processed to derive additional data points.

Decision-Making at Individual and Group level

Additionally, in each of these scenarios, data for decision-making is required at both the individual and the group level, for different spatio-temporal windows.

Further complexity arises when you need data that combines these different buckets, say data about a group at a given place versus data about an individual at a set of places.

Given these varied contexts wherein human mobility data may be potentially useful, a few more open questions from a data utility perspective are:

  • How are these decisions being made currently? What kind of data is used to make these decisions? How can you actualize these decisions operationally?
  • What kind of data is public? What is proprietary? How can proprietary data be shared/accessed/summarized? How can they be combined?
  • How frequently are these decisions made? Are they made in a data-driven manner? Daily? Weekly? Yearly?
  • What is the commercial value of these decisions - both from the cost perspective and financial impact?
  • What kind of business processes/functions does this data operationally impact?

As the world goes digital, many incumbent ways of making the types of decisions mentioned above are being revisited. Census data is the basic data set on which these decisions are made, primarily through extrapolation.

Though new data sources of human mobility promise value, much work is required in:

  • Cleansing data at scale - noise removal, data censoring, refinement
  • Handling data variability across sources
  • Linking data points across time and geographic constraints
  • Correlating on different dimensions across different sources
  • Building models for inferring properties
  • And finally, visualizing data at scale across different stages to tell a story
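As one small illustration of the cleansing step, a common form of noise removal is to drop location points that would imply a physically implausible speed. The sketch below assumes a hypothetical track of (unix seconds, latitude, longitude) tuples sorted by time, uses the haversine great-circle distance, and applies an assumed speed ceiling of 150 km/h. It also trusts the first point of the track, which a production pipeline would not take for granted.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def drop_speed_outliers(track, max_kmh=150.0):
    """Keep a point only if the speed from the last kept point is plausible.

    track: list of (unix_seconds, lat, lon), sorted by time.
    max_kmh: assumed ceiling on plausible travel speed.
    """
    if not track:
        return []
    kept = [track[0]]  # assumption: the first point is trustworthy
    for t, lat, lon in track[1:]:
        t0, lat0, lon0 = kept[-1]
        hours = (t - t0) / 3600
        if hours <= 0:
            continue  # duplicate or out-of-order timestamp
        if haversine_km(lat0, lon0, lat, lon) / hours <= max_kmh:
            kept.append((t, lat, lon))
    return kept


# Hypothetical track: the third point jumps roughly 100 km in one minute.
TRACK = [
    (0, 1.3000, 103.8000),
    (60, 1.3005, 103.8005),
    (120, 1.9000, 104.5000),
    (180, 1.3010, 103.8010),
]
cleaned = drop_speed_outliers(TRACK)
```

A speed filter addresses only one kind of noise. Censoring, de-duplication across sources and accuracy-based filtering would need their own steps.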

Current approaches are highly fragmented and ad hoc without a clear value proposition as different incumbents across spatial data purveyors, map makers, mobile app publishers, BI tool vendors and others navigate this space. At Near, our vision is to drive the use of human mobility data in these varied contexts. The Near platform is engineered to address many of the non-trivial issues as we make the data viable for commercial use at scale. For your specific needs on any of the above scenarios, please reach out to us at allspark@near.co.
