Reliable research

Dan Seward
January 28, 2011

Happy New Year! We hope this newsletter finds you well - we're back after a long hiatus. Our New Year's resolution: bring you a few more Peak Newsabilities in 2011.

The first article of 2011 discusses language choice in research construction - a task certainly fraught with peril, but manageable for the conscientious.

(short on time? skip to article summary)

We all know that words have the power to persuade, influence, and emotionally affect people. Good copywriters build great ads - the word "Priceless" built one of the greatest and longest-running marketing campaigns in history for MasterCard. We all (hopefully!) pay close attention to the content on our websites and carefully choose words to help our sites accomplish our goals. But it's easy to forget about the little details of wording as we build our research test plans, stimuli, and surveys. This article is a strong reminder to pay attention to the words you use in research, along with some practical, research-based advice on how to build and conduct studies.

Strong words push people away

Quantitative usability guru Jeff Sauro recently put up an informative blog article about the wording of questions in the System Usability Scale (SUS) - a common post-test survey used to score the overall usability of websites and software. The SUS asks people to rate their agreement with evaluative statements (e.g. "I think that I would like to use this website frequently") on a Likert scale (example PDF). Jeff and his colleagues asked a bunch of people to rate a website using the SUS, but the participants received different versions of the questionnaire. A third of the participants received the standard version, while the remaining two thirds received a version with either a strong positive slant (e.g. "I think that this is one of my all-time favorite web sites") or a strong negative slant to the questions asked.
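For readers who haven't scored a SUS questionnaire before, here's a minimal sketch of the standard scoring procedure (this is the published SUS formula, not part of Jeff's study): odd-numbered items are positively worded and contribute their rating minus one; even-numbered items are negatively worded and contribute five minus their rating; the sum is multiplied by 2.5 to give a 0-100 score.

```python
def sus_score(responses):
    """Compute a System Usability Scale score (0-100) from ten Likert
    responses, each an integer from 1 (strongly disagree) to 5
    (strongly agree), in questionnaire order."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS expects ten responses in the range 1-5")
    total = 0
    for i, r in enumerate(responses):
        if i % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded
            total += r - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded
            total += 5 - r
    return total * 2.5

# A fairly positive set of responses scores well above the midpoint:
print(sus_score([5, 1, 5, 2, 4, 1, 5, 1, 4, 2]))  # 90.0
```

The alternation of positive and negative statements is deliberate - it forces respondents to read each item rather than tick the same column all the way down, which is exactly why the wording of those statements matters so much.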

What happened was interesting - people rated away from agreement with these emotionally-charged statements. In other words, people rated the site more negatively when the statements were strongly positive, and more positively when the statements were strongly negative. And the difference was statistically significant. The lesson here is that people don't like taking extreme positions.

Implication: When gathering customer feedback via scales, understand that small variations in degree of commitment can have a large impact on your data. Users are likely to shy away from strong statements of opinion.

The mutability of memory and impact of descriptive words

Human memory - including recollections of experiences - is highly mutable and suggestible. A cool article referencing a classic psychology investigation into suggestibility got us thinking about how the words used at key points in an interaction can substantially shape people's impressions. In this experiment, participants were shown a photograph of a car accident along with descriptive text. One cohort of participants read a description saying the cars had "hit" each other, and one cohort read an identical description with one word changed - the cars had "smashed" into each other. A week later, participants were asked to remember the details of the picture. In the "smashed" condition, significantly more people remembered seeing shattered glass (there was none) and overestimated the speed at which the accident had happened. What's noteworthy here isn't that people interpreted "smashed" to be more serious than "hit", but that changing this one word affected the way people recalled their experience of viewing the picture 140+ hours after the fact.

It's almost frightening how such a tiny change to the experiment could have a statistically significant effect on the way people in the experiment reacted to simple questions after so much time had passed. This really drives home the point that word choice is exceptionally important in experiment design.

Implication: The choice of descriptive words in task construction, and even the words moderators speak to participants, can influence the way people respond in a usability test. Leave out adjectives unless they're necessary, so you don't inadvertently influence participants.

Accuracy and consistency in test and survey construction

Here's an interesting blog entry illustrating the dangers of inaccurate and inconsistent wording in survey questions. A Pew Research exercise was meant to track the frequency of use of online dating sites - in 2005 respondents were asked "Have you ever gone to an online dating website or other site where you can meet people online?" In 2009 the study was repeated, but respondents were asked "Please tell me if you ever use the internet to do any of the following things. Do you ever use the internet to… Use an online dating site. Did you happen to do this yesterday, or not?" Not surprisingly, the results were dramatically different - in 2005, 40% of young adults replied that they had used dating sites, but in 2009 only 10% had used dating sites… yesterday.

This is a dramatic example of poor research design, but it serves to illustrate an important point: changes in wording will change response. When conducting any sort of usability benchmarking research, consistency is really important.

We need to be intelligent about which variables we change in our research. If you're looking for true benchmarking results, don't change the wording of instructions or questions between rounds of testing. To draw valid conclusions about causality in our benchmarks, we have to be disciplined about how we vary these stimuli. Testing a web page with changed content wording AND navigational labels AND graphical layout is fine for measuring overall task success and subjective impressions - but because all of these variables were changed at once, we won't be able to attribute any effect to a single cause.

  • Implication 1: Changes in wording of research questions and instructions - even small changes - will yield different outputs, making benchmark results less valid.
  • Implication 2: Changes in wording of research stimuli - content, but also navigational items - will also yield different outputs, making straight comparisons invalid.

In summary...

For those who don't have the time to read this whole article (it was kinda long), what we're saying is:

  • Just as we have a responsibility to think through the content and presentation of the websites we design, we also have an obligation to be conscientious about the way we build our research tasks and activities.
  • Research participants will avoid making strong opinion statements, so build and evaluate your questions and survey scales accordingly. When using strong language, expect cautious results.
  • Descriptive words have the power to influence participant responses, even after long periods of time. Choose the words you use in your tasks and verbal questions carefully!
  • For benchmarking studies, changing the wording of questions will render comparisons less valid.
  • Changes in wording of research stimuli - content and navigational items - will also yield different outputs.

Usability tip

Bulleted lists are easy to read and quickly communicate key concepts

Here at Peak we commonly test information-heavy sites for findability of particular pieces of data. We all know that people scan pages to find information of interest, and good headers help with that. But sometimes just locating information isn't enough - heavy paragraphs of text can be hard to process. Often it's easier to digest a list of bullet points.

Putting together a bullet list of key information can help you increase the readability of your site and also help remind you what your site is about at a fundamental level. Opt for bullet lists instead of paragraphs when key concepts can realistically be communicated in one pithy sentence (as per the article summary above).

Categories: User research