Using data visualisation to map brands online
MEC
Although the debate about how, or even whether, the influence and contribution of social media to brands can be measured rages on, social media measurement is now firmly mainstream. In 2013 the majority of larger media and advertising agencies have dedicated social media measurement teams, and even the traditional market research sector is involved.
Most agencies (and increasingly advertisers themselves) use technology vendors such as Brandwatch, Sysomos and Radian6 to mine social media at a keyword level. Such services typically provide aggregated mention counts for a brand and the overall sentiment of how people discuss it. They come from a heritage, and serve a core purpose, of mining social media to listen to customers and protect corporate reputation, giving brands the opportunity to nip emergent product or service issues in the bud.
But this misses a huge opportunity to use social media as an insight tool for brand planning in the same way that we might combine customised qualitative and quantitative analysis with syndicated resources like TGI.
Social media’s latent strength as an insight tool rests on three qualities: the amount of data is enormous; the data reflects real-time trends; and the data is unprompted, and therefore a more organic representation of true opinion.
The advantage of enormous data sets is clear: in academic and market research, a lack of data has traditionally been a real constraint. The sheer volume of social data means that the types of analysis once reserved for large-scale, expensive studies are now the norm. For marketing, the real advantage lies in the ability to see trends for nuanced behaviours and niche brands without having to up the ante on sample size (and thus cost).
As Facebook’s Timeline illustrates, social media activity is invariably time-stamped (often also location-stamped) and frequently available based on when the activity happened. Thus there is not only a pool of real-time data, but a historical fount of consumer opinion dating back to around the turn of the century, when social media broke through into mainstream usage.
Most crucially, although many social environments are moderated, the data is typically free from the influence of a question. Asking a question biases the answer; social data is a huge repository of unprompted opinion.
Realising this theoretical potential, however, presents real-world challenges, not least because each of the three advantages has a corollary challenge. Large volumes of data present an array of technical challenges in extraction, storage and manipulation; historical data sets extend these challenges further by requiring an approach that recognises a “third dimension” in analysis (where insight typically works in a two-dimensional world of attributes vs. brands); and unprompted opinion means an unstructured data set that needs to be organised before it can be queried - but is best organised only after it is understood!
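To make that “third dimension” concrete, here is a minimal sketch in Python (pandas), assuming mentions have already been classified by brand and attribute theme upstream; the dates, brand names and attributes are invented purely for illustration.

```python
import pandas as pd

# Hypothetical sample of classified social mentions: one row per mention,
# already tagged with a brand and an attribute theme by upstream analysis.
mentions = pd.DataFrame({
    "date":      pd.to_datetime(["2012-01-15", "2012-01-20", "2012-06-03",
                                 "2012-06-11", "2013-01-07", "2013-01-09"]),
    "brand":     ["BrandA", "BrandB", "BrandA", "BrandA", "BrandB", "BrandA"],
    "attribute": ["value",  "taste",  "taste",  "value",  "value",  "taste"],
})

# Two-dimensional view: attributes vs. brands, as in traditional research.
flat = pd.crosstab(mentions["attribute"], mentions["brand"])

# Three-dimensional view: the same table sliced into half-year buckets,
# so movement in each brand/attribute pairing over time becomes visible.
over_time = (mentions
             .groupby([pd.Grouper(key="date", freq="6MS"), "brand", "attribute"])
             .size()
             .unstack(["brand", "attribute"], fill_value=0))

print(flat)
print(over_time)
```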
Overlaid on these challenges are two further complexities: the market reality that the dominant social media monitoring tools aren’t primarily purposed for mining social data for insight, and the agency/advertiser reality that insight needs to be clearly, succinctly and above all quickly presented to have impact. As the Yale University Professor Emeritus Edward Tufte puts it, excellence in visualising data aims for “that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space”.
One (conceptually) simple approach is to map regionally specific data. Over 40% of Twitter updates come from mobile, and 18% of Facebook users use only mobile. Social data does not just offer constant updates, but updates tied to a particular location. Combined with text analytics, this lets us understand where and what people are talking about. Geo-location data can be extracted from the mention itself (e.g. “I am in Hull”) or, if enabled, from the longitude and latitude recorded via GPS. This provides a robust enough sample to map regionally or locally. Fig.1 shows analysis of the mooted fuel strike of 2013: the difference between volumes of activity (a south-east metropolitan bias) and sentiment (negatively biased away from the south-east and towards more rural areas).

Fig.2 shows a hyper-local example looking at what London talks about. Using Google Earth, we ring-fenced each tube station in zone 1 to within 30 metres. We then filtered every mention originating in London through these lat/long ring fences and picked out the most dominant theme using text analytics. We then visualised this using the TfL tube map, renaming each station on the basis of its theme. So rather than Oxford Street we have “shopping”, and instead of Edgware Road, “shisha”. As befits mobile social, the data reveals a combination of “tourist” London (Sherlock Holmes at Baker Street) and “Londoners’ London” (clubs and arts in SE London). “Escorts” at Gloucester Road we’ll leave to the reader to interpret!
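The ring-fence step can be sketched as a simple distance test. The Python below applies a haversine check to assign geo-tagged mentions to any station whose 30-metre fence contains them, then picks the dominant theme per station. The station coordinates, mentions and themes are invented, and the upstream text-analytics step that yields a theme per mention is assumed.

```python
import math
from collections import Counter

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/long points."""
    r = 6_371_000  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical inputs: station centre points, and geo-tagged mentions
# already reduced to a single keyword theme by a text-analytics step.
stations = {"Oxford Circus": (51.5152, -0.1419), "Baker Street": (51.5226, -0.1571)}
mentions = [
    {"lat": 51.5151, "lon": -0.1418, "theme": "shopping"},
    {"lat": 51.5153, "lon": -0.1420, "theme": "shopping"},
    {"lat": 51.5227, "lon": -0.1570, "theme": "Sherlock Holmes"},
]

RADIUS_M = 30  # the 30-metre ring fence around each station

# Assign each mention to the first station whose fence contains it,
# then report the most dominant theme per station.
themes = {name: Counter() for name in stations}
for m in mentions:
    for name, (lat, lon) in stations.items():
        if haversine_m(m["lat"], m["lon"], lat, lon) <= RADIUS_M:
            themes[name][m["theme"]] += 1
            break

for name, counts in themes.items():
    label = counts.most_common(1)[0][0] if counts else name  # keep name if no data
    print(f"{name} -> {label}")
```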
The far bigger gain, though, comes from mining social media data for insight into the bigger picture of brand territories: the emotional and rational spaces brands occupy in consumers’ minds, and the usage occasions informed and reflected by social media.
The problem here is that current visualisations of social data typically don’t reflect the big picture; they report either the big quantitative picture (e.g. campaign effectiveness as an analogue for brand awareness) or the small qualitative picture (e.g. verbatim quotes, word clouds). When dealing with how people perceive a brand and its competitors, a qualitative approach represents the gold standard of what should be achieved. With social data, however, building the big qualitative picture by hand is far too time consuming, and text analytics provides the only way of crunching such volumes of data. In short, the problem can be distilled to wanting a big qualitative picture, but approached quantitatively.
Our solution involves “breaking” a number of rules of research, and altering our perception of what social data looks like. The key to understanding this data is that it is not a list of verbatim responses or scored questionnaire results. The quantitative data provided by social does not operate in the same territory as quantitative questionnaire data. Most importantly, the data is rarely normally distributed. Across measures of reach, followers, interactions and mentions, the data is often skewed, and often forms an inverse bell curve, with the middle ground the least populated. What would normally be considered an outlier is pretty much the norm in social data.

The verbatim data from social is likewise much more unusual than prompted verbatim responses, mostly because social data is usually part of a conversation rather than a stand-alone statement. Most importantly, inferring the meaning behind a statement is much more difficult. The meaning behind a typical verbatim response can be inferred from the question preceding it: e.g. the answer “it is too expensive” makes sense when the preceding question is “why don’t you buy Roquefort cheese?”. In social data, much more effort has to be expended to work out what the “it” is.
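A quick illustration of why the usual summary statistics mislead on data shaped like this, using invented follower counts with the heavy skew described above:

```python
import statistics

# Hypothetical follower counts for 12 accounts mentioning a brand: many
# small accounts plus a handful of huge ones - the skewed, "missing
# middle" shape typical of social reach data.
followers = [12, 25, 31, 40, 55, 60, 80, 90, 150, 48_000, 120_000, 2_500_000]

mean = statistics.mean(followers)
median = statistics.median(followers)

print(f"mean   = {mean:,.0f}")    # dragged up by a few huge accounts
print(f"median = {median:,.0f}")  # closer to the typical account
# The mean here is roughly 3,000x the median: techniques that assume a
# normal distribution badly misrepresent what the "typical" account looks like.
```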
The final change of rules is that the data is not static from the point of collection, and is given meaning only by the fact that it is a conversation, not a statement. That is, social data can only make sense when it is understood as a social act.
Our work on influencer identification had previously been built around the concept of the social graph: a networked plot of the relationships and commonalities within which people operate. By looking at mutual friends, Facebook can infer which people you are likely to know, Amazon can see which books you might like, and Spotify can recommend music to listen to.
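The mutual-friends inference can be sketched as the common-neighbours heuristic on an adjacency map. This is an illustrative toy with invented names, not the production approach of any of the platforms named above:

```python
from collections import Counter

# A toy social graph as an adjacency map of who is friends with whom.
friends = {
    "alice": {"bob", "carol", "dave"},
    "bob":   {"alice", "carol", "erin"},
    "carol": {"alice", "bob", "erin"},
    "dave":  {"alice"},
    "erin":  {"bob", "carol"},
}

def suggest(person):
    """Rank non-friends by how many mutual friends they share with person."""
    mutual = Counter()
    for friend in friends[person]:
        for candidate in friends[friend]:
            if candidate != person and candidate not in friends[person]:
                mutual[candidate] += 1
    return mutual.most_common()

print(suggest("alice"))  # [('erin', 2)] - two mutual friends, via bob and carol
```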