The conundrum of tracking political advertising on modern ad platforms

This problem has been bouncing around in the back of my head for a while now and I needed to get some of this out. One of the issues people are currently trying to wrap their heads around is that modern ad platforms like Facebook and Google target ads to extremely well-defined subsets of the population, so much of this ad content flies under the radar since it isn’t exposed everywhere. Although the title is focused on political advertising, the problem extends to any other kind of sensitive communication that uses an open advertising platform as its delivery vehicle. Noting the recent article from ProPublica about the volume of scam ads, it’s not just politics we’re dealing with here.

In the traditional ad world, everything was pretty much out in the open. Mass media was by its very definition media destined for the masses, so we all saw the same things. This meant that when something addressing a political subject showed up on television, radio or in the newspaper, it was pretty easy for journalists to “follow the money” and trace back where the ad purchase came from. By its very nature, everything was public. There were a number of very clear choke points, where people were involved and money changed hands, at which you could easily check whether something was “political” and ensure that the appropriate control process was initiated.

Now we have an issue where we are dealing with ad platforms at scale, where the entire value proposition of the system is that it is automated end-to-end: from the acquisition of the customer (i.e., someone who is going to pay for ads on the platform), to the submission of the ad, to the targeting rules that will be used, to the payment handling. The obvious solution to many (especially legislators) is that the system needs a checkbox to tick if you are submitting a political ad so that it can be tracked as such. Of course, the idiocy of this approach is obvious: if I’m trying to game the system, I’m not going to check this box. Mixed in with the millions of other transactions, who’s to know? Especially since I’ve targeted the ad to only be visible to the audience I know is receptive or that I intend to sway. There’s no general public display, so nobody is the wiser that certain populations are seeing this material.

There has been some discussion of this on the excellent Exponent podcast, with some fussing over the fact that publishing or providing the ad information to a government would go against some of the services’ privacy policies, not to mention some privacy legislation.

A proposal

I’ve been thinking about the problem and I’m sure there will be people who can point out some of the failings of this idea, but here goes:

We have many interested players in this situation:

  • the users of the platform on the consumption side (that’s you and me)

  • the users of the platform on the publishing side (that would be advertisers)

  • the platform provider (Facebook, Google et al.)

  • the government that wants some kind of visibility into the platform’s activity

  • the fourth estate (journalists trying to get some kind of visibility independently of government)

Advertisers are users of the platform and as such should be able to avail themselves of a certain level of privacy protection regarding their activities. That said, they are in the business of publishing advertising, which by its very nature is destined to be seen by the public - i.e., it needs to be seen to have value. The value of the ad is not intrinsic to the ad itself, but rather lies in all of the metadata associated with it. That metadata’s primary value is to the platform provider and the advertiser. As a general user, it is of no particular use to me.

To a government or a journalist there are two levels of interest. The first is the content of the ad in question, which means that we need to be able to “see” it. The second level of interest comes once we have been able to make some kind of determination about the content of the ad - i.e., “should this be categorised as having political impact?”

In this vein, I would submit a regulatory proposal that requires all automated advertising platforms to provide a public feed of all of the advertising content they have accepted (perhaps separated into feeds by country). This would permit interested parties, be they governments or journalists, to review the content that is published without having to be in one of the target groups and actively using the platform. I realise that this is going to be a firehose of data, but it would allow third parties to develop their own machine learning tools to evaluate the content and identify material of interest, without having to jump through the current hoops of asking users to install browser plugins in order to expose the ads. At this stage, there is no additional metadata attached or available to the ads themselves, simply the content. This should not be objectionable from a privacy standpoint, since the entire point of an ad is to be seen.

Regulators should not be asking or legislating the platform providers to make the determination of what is (political | prejudicial | hate speech) inside a black box of the platform provider’s creation. What we need at this stage is not a set of rules with undetermined second-order effects, but more visibility and light on what exactly we are dealing with. The point here is that this feed is not prejudiced by the platform provider’s algorithms or any intrinsic unconscious biases (from people or machine learning) that may be present.
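To make the shape of such a feed concrete, here is a minimal sketch of how a third party might consume it. To be clear, everything here is hypothetical - the endpoint, the field names and the keyword heuristic are stand-ins for whatever format a platform would actually publish and whatever models journalists or regulators would actually build:

```python
# Hypothetical consumer of a public ad-content feed (one JSON object per line).
# The URL and field names are invented for illustration; no such feed exists today.
import json
import urllib.request

FEED_URL = "https://ads.example-platform.com/public-feed/ca/2017-11-06.jsonl"

# Naive stand-in for whatever ML classifier a journalist or regulator
# would actually train; here we just flag a few obvious keywords.
POLITICAL_KEYWORDS = {"election", "vote", "candidate", "referendum"}

def looks_political(ad):
    """Crude first-pass filter over the ad's visible text."""
    text = (ad.get("headline", "") + " " + ad.get("body", "")).lower()
    return any(keyword in text for keyword in POLITICAL_KEYWORDS)

def review_feed(url):
    """Pull one day's feed and flag ads of interest.

    Note that only the ad content is present: no advertiser identity and no
    targeting metadata, which stay behind the subpoena mechanism described below.
    """
    flagged = []
    with urllib.request.urlopen(url) as response:
        for line in response:
            ad = json.loads(line)  # e.g. {"ad_id": ..., "headline": ..., "body": ...}
            if looks_political(ad):
                flagged.append(ad)
    return flagged

if __name__ == "__main__":
    for ad in review_feed(FEED_URL):
        print(ad["ad_id"], ad["headline"])
```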

To prevent other kinds of abuse, I would add a second feed, completely disassociated from the ad feed, that contains nothing but the aggregated targeting requests, without any associated ad content. A daily dump of the unique keyword filter arrangements, aggregated so that they cannot be cross-linked directly to the ads themselves, would go a long way towards showing just how the platform identifies classes of users and towards identifying some of the blatant discrimination being practiced on these platforms. There’s no need for new laws other than those requiring the publication of the data. We already have anti-discrimination rules in many jurisdictions; we simply need the means to determine whether they are being abused.
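As a sketch of what that aggregation might look like (again with invented field names), the essential move is dropping the ad and advertiser identifiers before anything is published, so that only filter combinations and their frequencies leave the platform:

```python
# Sketch of the daily targeting dump, assuming the platform logs each ad
# submission's targeting criteria. Field names are hypothetical.
from collections import Counter

def daily_targeting_dump(submissions):
    """Aggregate targeting criteria so they cannot be linked back to ads."""
    counts = Counter()
    for sub in submissions:
        # Sort the criteria so equivalent filter sets collapse to one key,
        # and deliberately ignore sub["ad_id"] and sub["advertiser_id"].
        counts[tuple(sorted(sub["targeting"]))] += 1
    return [{"criteria": list(key), "count": n} for key, n in counts.most_common()]

# Example: the second submission is the kind of filter combination that
# anti-discrimination reviewers would want surfaced.
submissions = [
    {"ad_id": "a1", "advertiser_id": "x", "targeting": ["age:25-34", "interest:golf"]},
    {"ad_id": "a2", "advertiser_id": "y", "targeting": ["housing", "excludes:ethnic_affinity"]},
    {"ad_id": "a3", "advertiser_id": "x", "targeting": ["interest:golf", "age:25-34"]},
]
for row in daily_targeting_dump(submissions):
    print(row["count"], row["criteria"])
```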

The next component is an arrangement, probably via classic court mechanisms (subpoena), to request the associated metadata in the event that the content of an ad is found prejudicial against a set of determined legal standards (which will vary by country), permitting investigation into the source and targeting of the ads in question. Obviously journalists do not have subpoena power and will have to do some more work, although I’m starting to lean towards some kind of legislation that treats these platform providers as nation-states and thus subject to some variant on the idea of a Freedom of Information Act in the context of this system. This points us down the rabbit hole of how one defines a journalist in today’s fragmented media world, but I’ll leave that one for later.

Diverging incentives

We’ve clearly seen Facebook’s point of view in the recent news around the US election, which can be summed up as “hey, we just provide the platform, what people do on it is up to them.” Which is a little disingenuous considering the amount of effort that Facebook puts into ensuring that nobody sees a female nipple in pictures provided by the non-paying users of the platform. But it does make sense from a business motivation viewpoint. The impacts of election fiddling, hate speech & fake news are externalities for Facebook. There’s even a perverse incentive at play here, where inflammatory material produces higher engagement from users and drives more ad views…

Platform owners are not incented to track down these things, so we need someone else watching. We have two actors that are properly incented to do so: governments and journalists. Granted, they have different immediate motivations, but the public good is a shared priority. Neither can properly do their job at the moment, lacking direct access to the ad content, although ProPublica is doing a good job trying.

Quis custodiet ipsos custodes? (Who watches the watchmen?)

Side note

Following up on the ProPublica scam article: this is a technical point that should not be dealt with via legislation, but with common sense on the part of the ad platforms, by severely limiting the use of (potentially malicious) JavaScript in ad payloads. Two simple rules would go a long way towards minimizing the damage:

  • no calls to external libraries or sources - the code must be standalone and autonomous (see the sketch below)

  • once the code package is complete, run security scans on it before letting it out into the wild

Code should also be included in the ad feed for use by security researchers.
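As an illustration of the first rule, here is a rough sketch of what a standalone-code check might look like. Real enforcement would need a proper JavaScript parser and sandboxed execution rather than regular expressions; the patterns below are deliberately strict and purely illustrative:

```python
# Illustrative check that an ad payload is standalone: reject anything that
# appears to load external code or call out to the network.
import re

EXTERNAL_PATTERNS = [
    re.compile(r"<script[^>]+src\s*=", re.IGNORECASE),          # external <script src=...>
    re.compile(r"\bimport\s*\(|\bimport\s+[\w{]"),              # ES module imports
    re.compile(r"\b(?:fetch|XMLHttpRequest)\b"),                # network calls
    re.compile(r"document\.createElement\s*\(\s*['\"]script"),  # script injection
    re.compile(r"https?://"),                                   # any absolute URL at all
]

def is_standalone(ad_payload):
    """Return True if the payload appears self-contained and autonomous."""
    return not any(pattern.search(ad_payload) for pattern in EXTERNAL_PATTERNS)

# Example payloads.
good = "<div onclick=\"this.style.color='red'\">Buy now</div>"
bad = '<script src="https://evil.example.com/payload.js"></script>'

print(is_standalone(good))  # True  -> proceed to the second rule's security scan
print(is_standalone(bad))   # False -> reject before it ever runs
```

A payload that passes this check would still get the full security scan from the second rule before being let loose.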