This post is another piece in the series I am researching for the article I am writing about social media. This time I have turned my attention to listening, which I have called social media monitoring, but some call blog mining and others call social media research - I'd love to hear your name for it, along with any other comments you have on the post below.
In the light of some of the emails I have received following the other articles, I should stress that these posts are an attempt at reportage rather than polemics; they do not represent what I think should be the case, but what I think is currently the consensus amongst the cognoscenti.
Social Media Monitoring
Social media monitoring (also referred to as buzz monitoring, blog mining, and by some just as social media research) refers to research based on listening to the discourse of the web, especially social media, and usually refers to the use of automated tools to process that discourse, typically looking at thousands or millions of conversations. Social media monitoring can be passive, for example listening to people to find out what interests them, or it can be active, searching for references to a specific brand, campaign, or action.
An early example of the power of social media monitoring was given in 2005 by the CREEN project [http://www.creen.org]. The project monitored the output of 100,000 blogs for over three years and recorded co-occurrences of science-related words with fear/anxiety-related words. The project tracked the volume of hits over time, and when peaks were observed the researchers reviewed the key phrases that were driving the increases. Examples of spikes in the data were ‘Schiavo’ (relating to the Terri Schiavo life support case in the US) and ‘stem’ (relating to research with stem cells).
A second, and more commercial, example is provided by the work the hotel chain Accor have been doing with social media monitoring company Synthesio (Accor have several brands, including Sofitel, Novotel, and Motel 6). Synthesio track 4,000 specific Accor hotels, along with 8,000 competitors, in eight languages, to produce a global dashboard, 40 regional dashboards, and 4,000 hotel-specific dashboards, each displaying key competitors and each updated weekly. The analysis combines the processing of open-ended comments in social media, scores from evaluation sites such as TripAdvisor and Booking.com, and more traditional measures. As a result of using the process Accor report a rise in brand equity, satisfaction, and bookings. The system has allowed Accor to quickly identify underperforming hotels and to locate and act on individual negative comments.
Commercial Social Media Monitoring
Although there are some free social media tools, most of these only skim the discourse of the web. In order to conduct serious research it is necessary either to develop tools or to use one of the many commercial services, such as Radian6, Lithium, or Synthesio.
The process of social media monitoring starts by building a corpus, a body of text to analyse. A corpus is built by utilising spiders and bots to collect relevant parts of social media and the wider web, where part of the skill is in defining ‘relevant’. Although the available tools are increasingly powerful, they do have limitations, in particular not all of the web is accessible (especially large parts of Facebook) and it is often impossible to determine the geographic location of somebody posting a comment.
Before the data can be analysed it needs to be cleaned. Data that originates from the client, from its various agencies, and from bots needs to be removed. For example, if one of the reasons for conducting a project is to monitor the launch of some new campaign, messages originating from the media, PR, and marketing agencies need to be removed from the corpus. Another element of the cleaning is to remove erroneous matches. As Annie Pettit of Conversition has said, when looking at soft drinks, Coke the drink is good, but coke the drug is not.
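As a sketch of what such a cleaning pass might look like (the account names, context keywords, and data layout below are invented for illustration, not taken from any real monitoring system), one could drop posts from known client/agency accounts and filter out ambiguous matches like the coke example by looking at surrounding words:

```python
# Illustrative corpus-cleaning sketch; the account names and context
# keywords are hypothetical, not from any real monitoring platform.

# Posts are (author, text) pairs collected by the spiders.
posts = [
    ("brand_pr_team", "Enjoy an ice-cold Coke this summer!"),
    ("jane", "Had a Coke with lunch, so refreshing"),
    ("bob", "Police seized a kilo of coke last night"),
]

# 1. Accounts belonging to the client and its agencies.
AGENCY_ACCOUNTS = {"brand_pr_team", "brand_marketing"}

# 2. Context words suggesting 'coke' means the drug, not the drink.
DRUG_CONTEXT = {"kilo", "gram", "seized", "dealer"}

def clean(posts):
    corpus = []
    for author, text in posts:
        if author in AGENCY_ACCOUNTS:
            continue  # remove client/agency-originated messages
        words = set(text.lower().split())
        if "coke" in words and words & DRUG_CONTEXT:
            continue  # remove erroneous matches
        corpus.append((author, text))
    return corpus

cleaned = clean(posts)
```

Of the three example posts, only the genuine consumer comment survives: the agency post and the drug reference are both filtered out before analysis.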
Once the corpus has been created the analysis takes place and can include one or more of the following: counts, trends, sentiment, and influence.
The simplest form of analysis is to count how often key words or terms occur, sometimes extending this to count words in particular contexts. In terms of depth, simple counts are fairly superficial, although they have been shown to be of interest in some situations; for example, several people have reported success in mapping the frequency of politicians being mentioned against their success in elections. Google have shown success in mapping diseases by counting the terms that people type into search engines, identifying possible incidences of flu and, more recently, dengue fever.
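This simplest form of analysis needs very little machinery. A minimal sketch (the posts and tracked terms here are invented for the example) is just a frequency count over the corpus:

```python
# Minimal keyword-counting sketch; posts and tracked terms are
# hypothetical examples, not real data.
from collections import Counter

TRACKED_TERMS = {"flu", "fever", "dengue"}

posts = [
    "feeling awful, think I have the flu",
    "flu season is here again",
    "dengue fever outbreak reported",
]

# Count occurrences of each tracked term across the corpus.
counts = Counter(
    word
    for post in posts
    for word in post.lower().split()
    if word in TRACKED_TERMS
)
# e.g. counts["flu"] is 2 for the sample posts above
```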
In many ways trend analysis is simply an extension of counting: looking at whether a term is being used more or less over time, and at whether changes can be linked to other phenomena. Looking at whether a term is trending on Twitter has become a key component in reputation management, and brands look to see if they can identify trends, in social media, associated with their campaign launches.
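To make the 'extension of counting' concrete, a trend sketch might bucket timestamped mentions by week and check whether the weekly count is rising. The dates below are invented for illustration:

```python
# Trend-detection sketch: bucket mentions by ISO week and check
# whether the count rises week on week. Dates are hypothetical.
from collections import Counter
from datetime import date

# Dates on which the corpus mentioned the tracked brand.
mentions = [date(2011, 3, d) for d in (1, 2, 8, 9, 10, 15, 16, 17, 18)]

# ISO week number is a convenient bucket for weekly counts.
weekly = Counter(d.isocalendar()[1] for d in mentions)

weeks = sorted(weekly)
trending_up = all(weekly[a] <= weekly[b] for a, b in zip(weeks, weeks[1:]))
```

With the sample dates the weekly counts run 2, 3, 4, so the term would be flagged as trending upward.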
Sentiment analysis is either ‘the’ core benefit of social media monitoring or the snake oil of 2011, depending on who you talk to. The idea behind sentiment analysis is very straightforward: rather than just counting how often a key word or phrase (for example a brand name) is used, sentiment analysis measures how many times it is mentioned in a positive, neutral, or negative way. The data collected by the commercial systems are typically too large to code by hand, so one of the following approaches is followed:
- Automated techniques, applying a variety of approaches and algorithms.
- Manually coding a sample of the database.
- Coding some of the data manually to allow software to ‘learn’ how to code the rest.
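The third approach, letting software 'learn' from a manually coded sample, can be sketched in miniature with a naive Bayes classifier. The coded posts and labels below are invented, and a real system would use far larger samples and richer features than single words:

```python
# Toy 'learn from a manually coded sample' sketch: a minimal naive
# Bayes sentiment classifier. Training posts are hypothetical.
from collections import Counter, defaultdict
import math

coded_sample = [
    ("love this hotel great staff", "positive"),
    ("great room great view", "positive"),
    ("awful service dirty room", "negative"),
    ("terrible stay awful food", "negative"),
]

# Count word frequencies per manually assigned label.
word_counts = defaultdict(Counter)
label_counts = Counter()
for text, label in coded_sample:
    label_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for c in word_counts.values() for w in c}

def classify(text):
    """Pick the label with the highest log-probability, using
    add-one smoothing so unseen words do not zero the score."""
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / sum(label_counts.values()))
        total = sum(word_counts[label].values())
        for word in text.split():
            score += math.log(
                (word_counts[label][word] + 1) / (total + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Once trained on the hand-coded sample, `classify` can be run over the rest of the corpus, which is where the economics of this approach come from.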
The two key disputes about sentiment analysis relate to how accurate the automated software is and how accurate it needs to be. A study conducted by UK agency Freshminds in 2010 concluded that the high figures quoted by the systems could be largely due to the preponderance of neutral comments (which often run at 70%). When Freshminds looked at just those posts classed as positive and negative, they found that the validity of the systems was often below 50% (using the results of a manual coding exercise as their criterion).
However, when considering the accuracy of automated systems it is necessary to note that different manual coders produce different results. As Annie Pettit has said, if manual coders are 80% accurate and automated systems are 70% accurate, then we will want to use the automated systems some of the time because of the volumes of information they can process.
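The arithmetic behind the Freshminds point is easy to reproduce. In this invented example, 70% of posts are neutral and the system classifies almost all of those correctly, so the headline accuracy looks high even though accuracy on the positive and negative posts alone is below 50%:

```python
# Invented figures illustrating how a neutral majority can inflate
# overall accuracy claims for sentiment systems.
posts = 1000
neutral = 700            # 70% of posts are neutral
polar = posts - neutral  # 300 positive/negative posts

correct_neutral = 680    # the system handles neutral posts well...
correct_polar = 135      # ...but gets under half of the polar posts

overall_accuracy = (correct_neutral + correct_polar) / posts
polar_accuracy = correct_polar / polar
# overall_accuracy is 0.815, but polar_accuracy is only 0.45
```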
At the NewMR Text Analytics event in March 2011, the consensus view of the twelve speakers was that, at the moment, manual coding of sentiment is a key element in using social media monitoring. This has serious implications for the cost of using social media monitoring.
Influence and Identification
One of the key features of the tools used for social media monitoring is that, in most cases, they are not designed specifically for market research; they are equally, or perhaps more specifically, designed for marketing. As well as identifying key words and phrases, the systems can find who is saying what, allowing those people to be targeted for marketing, viral leads, and word-of-mouth advocacy.
This power to find out who is saying what, who is listening to whom, and who appears to have influence presents a two-fold challenge for market research. Firstly, this power risks removing the anonymity of the people being researched; secondly, if market researchers are not prepared to be involved in uses such as intervention and response marketing, they may find themselves marginalised in the whole area of social media monitoring.
The business of social media monitoring
The social media monitoring platforms burst onto the scene a few years ago with a fanfare that promised to sweep away traditional research; ‘Why ask survey questions of a few people, when you can listen to everybody?’ was the new mantra. However, the inroads social media monitoring has made so far into market research have been fairly modest.
The key factors that seem to have held social media monitoring back are:
- Social media monitoring only reports what people are talking about; if they are not talking about you, or about the issue you need to research, then it does not solve the problem.
- Social media monitoring is much more expensive than was expected; even monitoring a few brands and terms across a few markets is likely to cost several thousand dollars a month.
- The widely perceived need to use manual coding has held back the credibility of social media monitoring and made it slower and more expensive.
- Many of the areas where social media monitoring is strong, such as measuring reactions to experiences and advertising, are areas where research buyers are very conservative, preferring to stick with their tried and trusted brand and customer satisfaction trackers.
- The lack of knowledge about what the results of social media monitoring mean becomes apparent when companies start to use it. Comments from different parts of social media (e.g. Facebook, blogs, and Twitter) often produce different findings, and different search terms often produce different results. There is a feeling that the results need to be merged to make them more representative, but merged in what proportions? And what are the merged results representative of?
However, social media monitoring is establishing a base within market research, and an even bigger one outside. Most brands recognise that they need to monitor what people are saying about them, even if they can’t use that process to replace other research. Researchers such as Conversition’s Annie Pettit have also demonstrated how social media monitoring can be combined with traditional research, using the social media listening to identify topics, issues, and even potential scales, and then using conventional research to create a representative research picture.
It is likely that the cost of social media monitoring will fall as software improves and as competition between the large number of providers increases; this will increase the use of social media monitoring tools for both research and non-research purposes.
I’d love to hear what you would add, delete, or amend in the reportage above.