Don’t Give Cambridge Analytica Too Much Credit

The data matrix has us.

Facebook banned data firm Cambridge Analytica from its social network on Friday. Their crime? Failing to delete a data set gathered via a Facebook app in 2014 that they’d agreed to destroy — including information from possibly 50 million Facebook users.

Since Brexiteers and the 2016 Cruz and Trump campaigns contracted with Cambridge for help targeting voters in the UK and US, and special counsel Robert Mueller has subpoenaed documents from the company in his investigation into possible Trump/Russia collusion, politicians quickly took notice. In the UK at least, the hyperbole went nuclear:

One member of Parliament, Jo Stevens, said Facebook’s relationship with its users’ personal data “reminds me of an abusive relationship where there is coercive control going on.” At another point in the hearing, fellow lawmaker Rebecca Pow questioned whether Facebook was a “massive surveillance operation.”

Count me as a +1 for the “massive surveillance operation” critique — though we agree to the spying every time we log in. Remember: if you’re not paying, you’re not the customer, you’re the product! Actually, the fact that Facebook collects so much data on its users and (functionally) rents it to anyone advertising on the platform goes a long way toward puncturing the hype around Cambridge and its “psychographic” social mapping.

The company ran afoul of Facebook because it used a quiz app to gather data from users’ Facebook accounts, and from those of their friends. This kind of “scraping” was apparently allowed back in 2014, though most vendors used it on a smaller scale than Cambridge seems to have done. Cambridge basically aimed to reconstruct the web of relationships and personal information embedded in Facebook so that it could manipulate the data without going through Facebook’s data interface (and Facebook’s rules on what they could access and share).

Cambridge used the scraped data (combined with a voter file) to sell the idea that its staff could help commercial and political advertisers reach exactly the right customers and voters with exactly the right messages. Republican donors Robert and Rebekah Mercer backed the company because of this promise, and they and then-ally Steve Bannon encouraged groups on the Right to sign up as clients, including the Trump campaign.

When 60 Minutes asked Trump digital director Brad Parscale about Cambridge, though, he said that his team never actually used the data. Instead, they could target voters and potential donors more effectively using the information they gathered by actually running Facebook ads and measuring the results.

Here’s the thing: psychographic targeting isn’t new. Vendors have been pitching it for years, usually basing their models on the same kinds of data that corporate brands both create and consume in bulk. Data experts I’ve talked with generally say that psychographic models can be useful in your first rounds of outreach, since they should give you at least an idea of whom to target.

But as soon as you start to gather information about which people actually respond to a given message, the models have mostly done their jobs: you’ll quickly see the difference between who SHOULD click on something and who DOES click on it. What voters actually do matters more than what you think they MIGHT do: data derived directly from voters’ choices quickly supersedes models that try to predict their behavior. Think of polling vs. canvassing: polls help you estimate what people think, while canvassing lets them tell you.

Looking back at the Trump campaign, Parscale’s team famously automated Facebook best practices on a vast scale — on some days, they ran more than 100,000 different ad/ad targeting variations, collecting far more useful data than Cambridge could provide and building their lists at the same time. Who needs a model once people’s fingers tell you what they think?

As for Cambridge’s scraped data, they were crazy to hold on to it once they’d agreed to delete it. A data set like that is a wasting asset: it starts to lose value as soon as you create it, unless you can enrich it with more data. As people move, die, marry, have kids, change their interests or otherwise deviate from their 2014 Facebook state of mind, a one-time snapshot of the electorate becomes less and less useful — particularly with so much other data on our preferences and behaviors for sale.

None of this excuses Cambridge Analytica’s many and varied sins. We don’t know yet whether Cambridge staff acted as a conduit to Russia’s election-hacking effort, but we CAN make the case that they’ve been guilty of seriously over-hyping their technology. I’ve talked with people familiar with their presentations, and the pitches sound like a load of hooey. Plus, rumor has it that Cruz’s people were dissatisfied with the results of Cambridge’s work and stopped using their data later in the 2016 primary process.

Remember this the next time a vendor pitches you on a “revolutionary” product: if a technology promises magical results, look carefully for the smoke and mirrors.

Written by
Colin Delany