I've been on the distribution list for Survey Magazine (electronic version) for several years now. Recently, they published an article (I think sponsored by Cvent) about using text analysis on open-ended survey questions. Though the article offers some useful advice, it seems to me that there are better ways to analyze open-ended responses than the methodology it proposes. So, I thought I'd discuss the main points made and offer some thoughts. For those interested, the URL below will bring you to the entire article. For clarity, I've italicized material from the article and colorized, in blue text, my remarks.
http://viewer.epaperflip.com/Viewer.aspx?docid=b3b90dd4-b2bd-4945-8af1-a36100b7c43e#?page=36
The Survey Magazine article begins with the basics about why it's important to do text analysis. It goes on to make two points: First, that analyzing text is difficult and why ("The hardest part about including open-ended questions in your survey is analyzing responses. Unlike close-ended questions, it’s doubtful that any two open-ended responses will be exactly the same."); Second, it states that a text analysis plan is necessary. The article goes on to share how to create such a plan.
I would agree that analyzing text is difficult, especially using a manual inspection method coupled with word or phrase searches. I also agree that doing text analysis offers lots of value and insight in the right scenarios. But in my opinion, the methodology proposed in the article is "the hard way" to do text analysis and, even so, will really only be useful on shorter ad-hoc surveys.
The article outlines five steps in a proposed text analysis process.
1. Use word cloud technology. The best way to begin your text analysis is by using word cloud technology. The technology sifts through responses and creates a visual representation of the most frequently used words or phrases. The larger the font of the word in your cloud, the more relevant it is to your data. Once you've seen which words pop up the most, you can start to make categories to group responses and analyze trends.
Almost all verbatim analysis technology uses word clouds in one way or another. More sophisticated products combine words or phrases with usage context. For instance, in retail e-commerce situations, words like "Web site" and "Click" show up all the time. But analyzing them is of little value without more context. Manually ascribing that context is a major endeavor if any real response volume is involved. So, the process outlined is very manual and ultimately depends on the judgment of the analyst mapping the word or phrase to a category. Another analyst at another time might choose to map the same data to a different category, based on their own interpretation at the time (so there are multiple dimensions of subjectivity). Word clouds are useful tools but are not the "end all" of text analysis.
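To make the point concrete, here is a minimal sketch of the counting that sits behind a word cloud. It is an illustration only, not any vendor's implementation; the function name `word_frequencies`, the tiny stop-word list, and the sample responses are all made up for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; real tools use far larger ones.
STOP_WORDS = {"the", "a", "an", "to", "on", "and", "i", "it",
              "was", "is", "my", "of", "not", "but"}

def word_frequencies(responses):
    """Count how often each non-stop word appears across all responses."""
    counts = Counter()
    for text in responses:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

# Made-up responses: one complaint, one compliment, one complaint.
responses = [
    "The web site was slow when I tried to click on checkout",
    "Great web site, one click and my order was done",
    "I could not click the coupon link on the site",
]

freq = word_frequencies(responses)
print(freq.most_common(3))
```

In this sample, "site" and "click" come out on top, yet they appear in both the happy and the unhappy responses. The count alone says nothing about which is which, and that is the missing context.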
2. Establish categories. The next step in analyzing your open-ended responses is creating categories. Use your word cloud for insights into the range of thoughts and feelings articulated by your respondents. For example, if you asked customers how they think your organization can improve its product, and the words “cost”, “size”, and “color” loom the largest, create categories for those words. Once you begin to read your responses file them under the appropriate categories. If any of your responses fit more than one category, put them in both.
This is largely good advice. But again, it's a lot of manual work that would have to be repeated on a survey-by-survey basis. Several commercial text analysis systems I am aware of (including Etuma360) will do this kind of work automatically and then let you tweak the topic analysis produced, saving boatloads of time. And again, building subjective categories can be problematic, for the reasons discussed previously.
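For readers who want to see what the manual approach amounts to, the sketch below files each response under every category whose keywords it mentions, using the article's "cost", "size", and "color" example. The keyword lists and the `categorize` function are hypothetical; they stand in for choices an analyst would make by hand, and they are not how Etuma360 or any other product works.

```python
import re

# Hypothetical keyword lists; in the manual process an analyst picks these.
CATEGORY_KEYWORDS = {
    "cost": {"cost", "price", "expensive", "cheap"},
    "size": {"size", "small", "large", "big"},
    "color": {"color", "colour", "red", "blue"},
}

def categorize(response):
    """Return every category whose keywords appear in the response."""
    words = set(re.findall(r"[a-z]+", response.lower()))
    return sorted(cat for cat, keywords in CATEGORY_KEYWORDS.items()
                  if words & keywords)

print(categorize("Too expensive and I wish it came in blue"))
print(categorize("The large size is fine"))
print(categorize("Shipping took forever"))
```

The first response lands in two categories and the last one in none. Every gap like that means going back and editing the keyword lists, which is where the repeated manual work and the subjectivity come in.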
3. Review and refine. As you begin to inspect responses more closely, you will probably find that you have to make adjustments to your categories. If responses used similar words to describe conflicting sentiments, you’ll have to create new categories; if the reverse is true, you can combine categories.
If you've implemented a manually constructed, word-cloud-based approach, this is good advice. Language is a living, dynamic construct. Interpreting it is always a "tweaking" process. There is simply a better way to do it than the process proposed. At Etuma, we use a set of layered ontologies to map language meanings to our topic database. Effectively, this lets us use input from hundreds of our users to improve everyone's language interpretation, largely eliminating the need for each customer to always manage that process. Other text analysis tool vendors employ statistical mapping models that they tweak for individual scenarios and customers. The point is, "topics" found in text streams should be largely auto-identifiable, especially in known contexts like customer service or e-commerce.
4. Make correlations. Now it’s time to examine the text within the framework of your overall survey. Start to couple open-ended responses with corresponding close-ended responses to draw conclusions about why respondents gave the answers that they did. If you used an open-ended question as an avenue for respondents to give an “other” answer to a multiple choice question, try and determine if there is a clear winner. You should also cross tabulate your data by demographic to see if any patterns emerge. Find out if certain groups within your sample tended to answer open-ended questions in the same way.
Clearly, this is something that should be done. And again, in my opinion there are better ways to do it than the one proposed. At Etuma, we simply connect the entire survey (and background data set) by API or upload process and automatically connect topics with filtered data subsets based on survey response categories. It's a lot less work, and the analyses produced simply auto-update over time.
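As a rough illustration of the cross-tabulation idea, the sketch below counts how often each topic comes up within each answer group of a closed-ended question. The record layout, the field names, and the `cross_tabulate` helper are assumptions made for this example; it is not a description of Etuma's API or upload process.

```python
from collections import Counter, defaultdict

# Hypothetical records: a closed-ended rating plus topics already
# assigned to the same respondent's open-ended answer.
records = [
    {"satisfaction": "low", "topics": ["cost", "delivery"]},
    {"satisfaction": "low", "topics": ["cost"]},
    {"satisfaction": "high", "topics": ["color"]},
    {"satisfaction": "high", "topics": ["cost", "color"]},
]

def cross_tabulate(records, group_field):
    """Count topic mentions separately for each value of group_field."""
    table = defaultdict(Counter)
    for rec in records:
        table[rec[group_field]].update(rec["topics"])
    return table

table = cross_tabulate(records, "satisfaction")
for group, counts in table.items():
    print(group, dict(counts))
```

The same function works for any background field (region, customer segment, and so on) by changing `group_field`, which is why having lots of background data makes this step more useful.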
5. Summarize your results. After you analyze your results, summarize your findings and include any quotes from the text that were especially illustrative of your conclusions.
Of course, summarizing should be done, but for ongoing surveys it has to be done regularly, as the topics people talk about will likely change over time.
--------------------------------------------------------------------------------------------
Lots of web survey platforms are implementing word-cloud-based text analysis. As someone who's used text analysis tools (Etuma360 primarily) for a couple of years now, I find that it is of limited value unless certain criteria are met. Some of those are:
- Larger surveys with lots of open-ended responses
- Permanence. Surveys that run over long periods of time are better suited for coupled text analysis than ad-hoc surveys
- Lots of background data about survey respondents
To try Etuma360 click here: