Yesterday’s post was about social media bots, one aspect of what I call Big Data.
Today’s is about another Big Data component: how data harvesting is used.
The Guardian published the latest article in its Cambridge Analytica File series. ‘I made Steve Bannon’s psychological warfare tool’: meet the data war whistleblower’ is fascinating.
The Guardian is looking into Cambridge Analytica because the firm was hired for Brexit in the UK and Donald Trump’s campaign in the US. The paper is trying to make the firm look like a bad guy, even though the Left have more powerful social media and data tools to hand — not to mention censorship. That said, Britain’s Electoral Commission and a select committee of MPs are investigating Cambridge Analytica as is Robert Mueller in his stateside investigation of Russian collusion. This is because of alleged use of Facebook user data.
In the US:
Aged 24, while studying for a PhD in fashion trend forecasting, he came up with a plan to harvest the Facebook profiles of millions of people in the US, and to use their private and personal information to create sophisticated psychological and political profiles. And then target them with political ads designed to work on their particular psychological makeup.
In the UK:
Last month, Facebook’s UK director of policy, Simon Milner, told British MPs on a select committee inquiry into fake news, chaired by Conservative MP Damian Collins, that Cambridge Analytica did not have Facebook data. The official Hansard extract reads:
Christian Matheson (MP for Chester): “Have you ever passed any user information over to Cambridge Analytica or any of its associated companies?”
Simon Milner: “No.”
Matheson: “But they do hold a large chunk of Facebook’s user data, don’t they?”
Milner: “No. They may have lots of data, but it will not be Facebook user data. It may be data about people who are on Facebook that they have gathered themselves, but it is not data that we have provided.”
Personally, I think that even if Big Data and social media didn’t exist, there would still have been a Brexit vote and a Trump victory. Furthermore, to still loathe Steve Bannon now is pointless. He was fired from the White House in 2017. He left Breitbart in January 2018. He’s annoyed various people greatly, from President Trump to the Mercers (more about whom below). Rebekah Mercer bankrolls Breitbart.
What I found interesting about The Guardian’s article was how social media data are gathered, analysed and used. The genius whose idea led to the founding of Cambridge Analytica is 28-year-old Christopher Wylie. He was 24 at the time. Now he has turned whistleblower, largely because of the results of the UK referendum and US election in 2016.
Before getting into Big Data, it is worth noting that the Left use the same analytical tactics. Wylie learned from Obama’s campaign team (emphases mine below):
Wylie grew up in British Columbia and as a teenager he was diagnosed with ADHD and dyslexia. He left school at 16 without a single qualification. Yet at 17, he was working in the office of the leader of the Canadian opposition; at 18, he went to learn all things data from Obama’s national director of targeting, which he then introduced to Canada for the Liberal party. At 19, he taught himself to code, and in 2010, age 20, he came to London to study law at the London School of Economics.
For me, the big issue here is how data from social media users are used to shape public thinking.
Cambridge Analytica is far from being the only firm to do this. The primary customers for such data analyses are likely to be national security agencies, the military and defence companies:
… at Cambridge University’s Psychometrics Centre, two psychologists, Michal Kosinski and David Stillwell, were experimenting with a way of studying personality – by quantifying it.
Starting in 2007, Stillwell, while a student, had devised various apps for Facebook, one of which, a personality quiz called myPersonality, had gone viral. Users were scored on “big five” personality traits – Openness, Conscientiousness, Extroversion, Agreeableness and Neuroticism – and in exchange, 40% of them consented to give him access to their Facebook profiles. Suddenly, there was a way of measuring personality traits across the population and correlating scores against Facebook “likes” across millions of people.
The research was original, groundbreaking and had obvious possibilities. “They had a lot of approaches from the security services,” a member of the centre told me. “There was one called You Are What You Like and it was demonstrated to the intelligence services. And it showed these odd patterns; that, for example, people who liked ‘I hate Israel’ on Facebook also tended to like Nike shoes and KitKats.
“There are agencies that fund research on behalf of the intelligence services. And they were all over this research. That one was nicknamed Operation KitKat.”
The defence and military establishment were the first to see the potential of the research. Boeing, a major US defence contractor, funded Kosinski’s PhD and Darpa, the US government’s secretive Defense Advanced Research Projects Agency, is cited in at least two academic papers supporting Kosinski’s work.
The article says that, in 2013, a paper on the subject was published. Christopher Wylie read it and offered to replicate the technique for Britain’s Liberal Democrats, who were starting to become a political non-entity. Wylie made a formal presentation for them with the pitch that such an analysis could bring them more new voters. However, the Lib Dems were not interested.
That said, there was a silver lining. One of the Lib Dems Wylie was in touch with introduced him to a company called SCL Group:
one of whose subsidiaries, SCL Elections, would go on to create Cambridge Analytica (an incorporated venture between SCL Elections and Robert Mercer, funded by the latter). For all intents and purposes, SCL/Cambridge Analytica are one and the same.
Alexander Nix, then CEO of SCL Elections, made Wylie an offer he couldn’t resist. “He said: ‘We’ll give you total freedom. Experiment. Come and test out all your crazy ideas.’”
Wylie was hired as research director for the SCL Group, which had defence and political contracts:
Its defence arm was a contractor to the UK’s Ministry of Defence and the US’s Department of Defense, among others. Its expertise was in “psychological operations” – or psyops – changing people’s minds not through persuasion but through “informational dominance”, a set of techniques that includes rumour, disinformation and fake news.
SCL Elections had used a similar suite of tools in more than 200 elections around the world, mostly in undeveloped democracies that Wylie would come to realise were unequipped to defend themselves.
Wylie holds a British Tier 1 Exceptional Talent visa. He worked from SCL’s headquarters in London’s Mayfair.
He first met Steve Bannon in 2013. Bannon, the then-editor-in-chief of Breitbart, came to England to support Nigel Farage and his pursuit of a national referendum on whether to leave the European Union.
Bannon, Wylie says, found SCL in an interesting way:
When I ask how Bannon even found SCL, Wylie tells me what sounds like a tall tale, though it’s one he can back up with an email about how Mark Block, a veteran Republican strategist, happened to sit next to a cyberwarfare expert for the US air force on a plane. “And the cyberwarfare guy is like, ‘Oh, you should meet SCL. They do cyberwarfare for elections.’”
It was Bannon who took this idea to the Mercers: Robert Mercer – the co-CEO of the hedge fund Renaissance Technologies, who used his billions to pursue a rightwing agenda, donating to Republican causes and supporting Republican candidates – and his daughter Rebekah.
Wylie and his boss Alexander Nix flew to New York to meet the Mercers. Robert Mercer had no problem understanding the SCL concept, as he had worked in AI (artificial intelligence) himself. He had also helped to invent algorithmic trading. The pitch Wylie made to him was based on:
an influential and groundbreaking 2014 paper researched at Cambridge’s Psychometrics Centre, called: “Computer-based personality judgments are more accurate than those made by humans”.
Wylie had to prove to Mercer that such a statement was true. Therefore, he needed data. This is where another company, Global Science Research (GSR), entered the frame:
How Cambridge Analytica acquired the data has been the subject of internal reviews at Cambridge University, of many news articles and much speculation and rumour …
Alexander Nix appeared before Damian Collins, an MP, in February 2018. He downplayed GSR’s work for Cambridge Analytica in 2014:
Nix: “We had a relationship with GSR. They did some research for us back in 2014. That research proved to be fruitless and so the answer is no.”
Collins: “They have not supplied you with data or information?”
Collins: “Your datasets are not based on information you have received from them?”
Collins: “At all?”
Nix: “At all.”
Yet, The Guardian states:
Wylie has a copy of an executed contract, dated 4 June 2014, which confirms that SCL, the parent company of Cambridge Analytica, entered into a commercial arrangement with a company called Global Science Research (GSR), owned by Cambridge-based academic Aleksandr Kogan, specifically premised on the harvesting and processing of Facebook data, so that it could be matched to personality traits and voter rolls.
He has receipts showing that Cambridge Analytica spent $7m to amass this data, about $1m of it with GSR. He has the bank records and wire transfers. Emails reveal Wylie first negotiated with Michal Kosinski, one of the co-authors of the original myPersonality research paper, to use the myPersonality database. But when negotiations broke down, another psychologist, Aleksandr Kogan, offered a solution that many of his colleagues considered unethical. He offered to replicate Kosinski and Stillwell’s research and cut them out of the deal. For Wylie it seemed a perfect solution. “Kosinski was asking for $500,000 for the IP but Kogan said he could replicate it and just harvest his own set of data.” (Kosinski says the fee was to fund further research.)
Kogan then set up GSR to do the work, and proposed to Wylie they use the data to set up an interdisciplinary institute working across the social sciences. “What happened to that idea,” I ask Wylie. “It never happened. I don’t know why. That’s one of the things that upsets me the most.”
Meanwhile, I’m breathing a sigh of relief that the institute never happened. The idea of it is scary.
This is how the project worked — simply incredible and rather alarming:
Kogan was able to throw money at the hard problem of acquiring personal data: he advertised for people who were willing to be paid to take a personality quiz on Amazon’s Mechanical Turk and Qualtrics. At the end of which Kogan’s app, called thisismydigitallife, gave him permission to access their Facebook profiles. And not just theirs, but their friends’ too. On average, each “seeder” – the people who had taken the personality test, around 320,000 in total – unwittingly gave access to at least 160 other people’s profiles, none of whom would have known or had reason to suspect.
What the email correspondence between Cambridge Analytica employees and Kogan shows is that Kogan had collected millions of profiles in a matter of weeks. But neither Wylie nor anyone else at Cambridge Analytica had checked that it was legal. It certainly wasn’t authorised. Kogan did have permission to pull Facebook data, but for academic purposes only. What’s more, under British data protection laws, it’s illegal for personal data to be sold to a third party without consent.
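To put a rough number on the figures quoted above, here is a back-of-the-envelope sketch. It takes the article's two numbers (around 320,000 seeders, at least 160 friends' profiles each) and simply multiplies them. The function name is my own, and the calculation assumes no two seeders share a friend, which will not be true in practice, so treat the result as a naive figure rather than a count of distinct people. The article itself says only "millions".

```python
# Figures taken from the quoted Guardian passage.
SEEDERS = 320_000              # people who took the personality quiz
MIN_FRIENDS_PER_SEEDER = 160   # "at least 160 other people's profiles" on average


def naive_reach(seeders: int, friends_per_seeder: int) -> int:
    """Multiply seeders by friends per seeder.

    Assumption (mine, not the article's): friend lists never overlap,
    so this overstates the number of distinct friends reached.
    """
    return seeders * friends_per_seeder


friend_profiles = naive_reach(SEEDERS, MIN_FRIENDS_PER_SEEDER)
total_profiles = friend_profiles + SEEDERS  # friends plus the seeders themselves

print(f"Friend profiles (naive): {friend_profiles:,}")
print(f"Including seeders:       {total_profiles:,}")
```

Even allowing for heavy overlap between friend lists, a few hundred thousand paid quiz takers open the door to tens of millions of profiles whose owners were never asked.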
Wylie told The Guardian that Facebook could see this was going on through its own security protocols. The article says Kogan reassured Facebook by saying the data were for academic use.
In any event, Cambridge Analytica had its data:
This was the foundation of everything it did next – how it extracted psychological insights from the “seeders” and then built an algorithm to profile millions more.
For more than a year, the reporting around what Cambridge Analytica did or didn’t do for Trump has revolved around the question of “psychographics”, but Wylie points out: “Everything was built on the back of that data. The models, the algorithm. Everything. Why wouldn’t you use it in your biggest campaign ever?”
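For readers wondering what "profile millions more" might look like in principle, here is a deliberately tiny sketch of the general idea: learn from the seeders (whose quiz scores and likes are both known), then estimate a score for someone who never took the quiz, using only their likes. Everything in it is hypothetical: the data, the like names and the averaging method are my own illustration, not Cambridge Analytica's model, which the article does not describe in detail.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (page likes, quiz score for one trait on a 0-1 scale).
seeders = [
    ({"hiking", "poetry"}, 0.9),
    ({"poetry", "jazz"}, 0.8),
    ({"tv_quiz", "jazz"}, 0.4),
    ({"tv_quiz"}, 0.2),
]


def fit_like_weights(rows):
    """For each like, average the trait scores of the seeders who have it."""
    scores_by_like = defaultdict(list)
    for likes, score in rows:
        for like in likes:
            scores_by_like[like].append(score)
    return {like: mean(scores) for like, scores in scores_by_like.items()}


def predict(weights, likes, default):
    """Estimate a trait score from likes alone; fall back to a default
    when none of the person's likes were seen among the seeders."""
    known = [weights[like] for like in likes if like in weights]
    return mean(known) if known else default


weights = fit_like_weights(seeders)
baseline = mean(score for _, score in seeders)  # average seeder score

# A "friend" who never took the quiz, known only by their likes.
friend_estimate = predict(weights, {"poetry", "hiking"}, baseline)
print(round(friend_estimate, 3))
```

A real system would use far more data and a proper statistical model, but the shape is the same: the seeders supply the answers, and everyone else is scored by resemblance.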
Wylie left Cambridge Analytica in 2014. He was not involved in the company’s work on Brexit or for the Trump campaign.
Facebook didn’t really think about the data mining until 2016, when Cambridge Analytica were working for Ted Cruz during the GOP primary season. The Guardian’s Harry Davies wrote an article in December 2015 about the use of Facebook data in his campaign:
But it wasn’t until many months later that Facebook took action. And then, all they did was write a letter. In August 2016, shortly before the US election, and two years after the breach took place, Facebook’s lawyers wrote to Wylie, who left Cambridge Analytica in 2014, and told him the data had been illicitly obtained and that “GSR was not authorised to share or sell it”. They said it must be deleted immediately.
“I already had. But literally all I had to do was tick a box and sign it and send it back, and that was it,” says Wylie. “Facebook made zero effort to get the data back.”
There were multiple copies of it. It had been emailed in unencrypted files.
Cambridge Analytica rejected all allegations the Observer put to them.
Facebook commented on the data:
Facebook denies that the data transfer was a breach. In addition, a spokesperson said: “Protecting people’s information is at the heart of everything we do, and we require the same from people who operate apps on Facebook. If these reports are true, it’s a serious abuse of our rules. Both Aleksandr Kogan as well as the SCL Group and Cambridge Analytica certified to us that they destroyed the data in question.”
The aforementioned Dr Kogan is still employed by Cambridge University as a senior research associate, but he also has a position in Russia:
what his fellow academics didn’t know until Kogan revealed it in emails to the Observer (although Cambridge University says that Kogan told the head of the psychology department), is that he is also an associate professor at St Petersburg University. Further research revealed that he’s received grants from the Russian government to research “Stress, health and psychological wellbeing in social networks”. The opportunity came about on a trip to the city to visit friends and family, he said.
Social media data have turned into a powerful tool to be exploited. I have had several conversations over the past few years with Facebook users, none of whom minds who has access to their personal details: family members, friends, likes, dislikes and interests. To know that this information has been mined under the aegis of academic research and then used for other purposes boggles the mind.