
Data Science: How to Detect Lat/Long and Prevent Location Spam

Location features and data collection are becoming central to mobile development. BIA/Kelsey predicted in a recent study that location-targeted revenues will soar to $18.2 billion in the US in 2019, up from $4.3 billion in 2014. Similarly, Google emphasized the importance of location for mobile users by saying that over 50% of mobile searches have local intent, for example, to find a shoe store nearby. Moreover, other research found that a whopping 78% of these searches resulted in an offline purchase. Clearly, having a user’s correct location is pivotal for a successful outcome, yet very few developers, publishers and advertisers truly understand the intricacies of this device signal.

Image source: Demystifying Location Data Accuracy, MMA 2015


To explain where location signal mistakes come from, we need to see how publishers and developers determine the user’s location, expressed as lat/long coordinates, and how this is shared with third parties such as ad networks. These are the sources for obtaining the user’s current location; the first five are technologies and the last two are methods:

  • GPS: the smartphone connects to GPS satellites to provide the location. Highly precise but also very sensitive to disruptions such as tall buildings and being inside a building.
  • WiFi: the smartphone sends out its MAC address when scanning for WiFi, which a router can pick up and use to identify users and provide their location. Does not work on iPhones running a version later than iOS 7 unless the phone is actually connected to the network. Highly precise and insensitive to many competing signals.
  • Cellular signals: triangulation through cell towers gives an approximate location.
  • Bluetooth: beacons in a store, for example, connect to the smartphone and give a highly accurate and precise location. To date, beacons have not been adopted widely enough to provide location data at a large scale.
  • IP address: inaccurate and imprecise, as the location of the app server does not have to be the same as the user’s.
  • Self-indicated/registration: publishers can ask users to enter their location (e.g. ask for a ZIP code) to provide services. Accurate but often not very precise, as asking for more detail hurts user friendliness. Also, when users are not asked each session, the data becomes outdated.
  • Content-based: some publishers serve content that is focused on a specific area (e.g. the LA Times). Not very reliable, as readers can be in very different locations, but generally accurate for a large part of the traffic.

How does location spam occur?

Given this wide variety of location signals, it’s not hard to see how the quality of the data can vary widely as well. This is what causes location spam: inaccurate and flat-out wrong location signals that get processed into bad data, which is then fed into an ad network and misleads the ad buyer. There are many causes of these signal errors:

  • Urban density: a dense urban area is especially disruptive to GPS signals.
  • Data freshness: a publisher sends location data obtained by other apps or at a previous time, making the data outdated.
  • Default values: a location cannot be retrieved and a default value is sent instead.
  • Centroid processing: publishers often derive the center points of geographic areas such as ZIP codes, DMAs, or states and send them as a real-time proxy for a user’s location. The area itself is often derived from the IP address.
  • Cell tower sinks: all mobile ad requests appear to come from the same tower.
  • Speed: a device appears to travel further or faster than humanly possible.
  • Truncation: the coordinates have too few decimal places, and the publisher appends its own random digits to meet the required number of digits.
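
Some of the causes listed above can be screened for on a per-request basis. The following is a minimal Python sketch, not a production filter. The Fix record, the function names and the thresholds (MAX_SPEED_KMH, the 1 km tolerance for simultaneous fixes, and treating (0, 0) as a default value) are all assumptions made for illustration and would need tuning against real traffic.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 1000.0         # assumption: roughly the cruising speed of a jet
SAME_TIME_TOLERANCE_KM = 1.0   # assumption: allowed jitter between simultaneous fixes


@dataclass
class Fix:
    """One reported location (hypothetical record layout)."""
    device_id: str
    lat: float
    lon: float
    timestamp: float  # seconds since the Unix epoch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/long points in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * asin(sqrt(min(1.0, a)))


def is_default_value(fix):
    """Flag out-of-range coordinates and the (0, 0) placeholder."""
    in_range = -90.0 <= fix.lat <= 90.0 and -180.0 <= fix.lon <= 180.0
    return not in_range or (fix.lat == 0.0 and fix.lon == 0.0)


def is_impossible_jump(prev, curr, max_speed_kmh=MAX_SPEED_KMH):
    """Flag two fixes from one device that imply an implausible speed."""
    distance_km = haversine_km(prev.lat, prev.lon, curr.lat, curr.lon)
    elapsed_h = (curr.timestamp - prev.timestamp) / 3600.0
    if elapsed_h <= 0:
        # No time has passed, so only tolerate a small positional jitter.
        return distance_km > SAME_TIME_TOLERANCE_KM
    return distance_km / elapsed_h > max_speed_kmh


# Example: the same device reports New York, then Los Angeles 10 minutes later.
new_york = Fix("device-1", 40.7128, -74.0060, 0.0)
los_angeles = Fix("device-1", 34.0522, -118.2437, 600.0)
suspicious = is_impossible_jump(new_york, los_angeles)
```

In this example the two fixes are roughly 3,900 km apart, so a 10-minute gap implies a speed far above the assumed ceiling and the pair is flagged. Checks like these only catch the crudest errors; centroids and cell tower sinks look like perfectly valid coordinates on their own and only stand out when many requests are compared.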

Besides these issues, there’s also the fact that the OpenRTB protocol used for programmatic advertising does not have any standards when it comes to location reporting. This makes it nearly impossible to check the quality of the location data, since the protocol does not record the source of the data or the circumstances under which it was obtained.

Image source: Demystifying Location Data Accuracy, MMA 2015

Effect on ad effectiveness

Location spam has a large effect on the effectiveness of locally targeted ads, and the more precise an advertiser wants to be, the greater the damage when an ad request comes with faulty location data. Think of retailers wanting to attract people who are close to one of their stores, or a service provider that only works in a certain region. If the location is too far off, the ad is useless by definition, because a conversion is physically impossible.

Efforts are underway  

Given the huge and growing importance of location data, there is a strong focus within the industry on improving its reliability. The Mobile Marketing Association (MMA) set up the MMA Location Accuracy Committee to ensure standardization and to streamline efforts from other players in this field. An essential part of combating location spam is the use of big data and data science technologies to recognize when faulty location data is being fed into the system. With these efforts under way, we can be sure that 2016 will be a tipping point towards trusting location-based ad campaigns.
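
As one illustration of what such data science checks might look like, centroids, default values and cell tower sinks all share a signature: an unusually large number of distinct devices reporting exactly the same coordinates. The Python sketch below counts distinct devices per rounded coordinate and returns the crowded ones. The function name find_hotspots, the (device_id, lat, lon) tuple layout, the threshold of 50 devices and the rounding to 4 decimals are assumptions for illustration, not part of any industry standard.

```python
from collections import defaultdict


def find_hotspots(requests, min_devices=50, decimals=4):
    """Return {(lat, lon): device_count} for suspiciously crowded points.

    `requests` is an iterable of (device_id, lat, lon) tuples
    (a hypothetical layout). Both thresholds are assumed values.
    """
    devices_at = defaultdict(set)
    for device_id, lat, lon in requests:
        # Rounding groups fixes that differ only in trailing digits.
        key = (round(lat, decimals), round(lon, decimals))
        devices_at[key].add(device_id)
    return {
        point: len(devices)
        for point, devices in devices_at.items()
        if len(devices) >= min_devices
    }


# Made-up traffic: 60 devices on one identical point, 5 devices spread out.
sample = [(f"dev-{i}", 39.8283, -98.5795) for i in range(60)]
sample += [(f"other-{i}", 40.0 + i * 0.01, -75.0) for i in range(5)]
hotspots = find_hotspots(sample)
```

Here only the single point shared by 60 devices is returned. A flagged point is a candidate for review rather than proof of spam, since a stadium or an airport can legitimately concentrate many devices; comparing flagged points against known ZIP code or state center points would be a natural next step.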


Published in Data. Published in 2015.
