As practitioners of user-centered design, it is natural for us to turn to user research to help inform and guide the process of choosing a visual design. But traditional usability testing and related methods don’t seem particularly well suited for assessing visual design for two reasons:
- When we reach out to users for feedback on visual design options, stakeholders are generally looking for large sample sizes—larger than are typical for a qualitative usability study.
- The response we are looking for from users is more emotional—that is, less about users’ ability to accomplish tasks and more about their affective response to a given design.
With this in mind, I was very intrigued by recent posts about desirability testing from Christian Rohrer on his xdStrategy.com blog. In one entry, Christian posits desirability testing as a mix of quantitative and qualitative methods that allow you to assess users’ attitudes toward aesthetics and visual appeal. Inspired by his overview of this method, we researched desirability studies a bit further and tried a modified version of the method on one of our projects. This article reviews the variants of desirability testing that we considered and the lessons we learned from conducting a desirability study to assess the visual design options for one of our projects.
Why Is Desirability Important?
From a usability perspective, an important role of visual design is to lead users through the hierarchy of a design as we intend. Use of value contrast and color and the size and placement of elements can serve to support a product’s underlying information architecture and interaction design. During the early stages of the design process, we focus on these functional aspects of a design and conduct research to ensure that the overall solution offers a compelling value proposition to users. We also aim to optimize usability and make it easy for users to realize the solution’s benefits and, ultimately, achieve their goals.
A product’s having valuable features and an intuitive information architecture and interaction design certainly contributes to its overall desirability. However, there is a difference between functional desirability and the emotional desirability that stems from aesthetics, look, and feel. Visual elements can support a solution’s interaction design, but they can also elicit an emotional response from users. Understanding and exploiting these emotional responses can help designers to influence users appropriately.
Interestingly, Lindgaard and her associates found that a design can have an emotional impact very quickly. In their research report “Attention Web Designers: You Have 50 Milliseconds to Make a Good First Impression!” they outline a series of experiments they conducted to assess how quickly people form an opinion about the visual appeal of a design. As the title of their report suggests, they found that a design elicits an emotional response very rapidly, within about 50 milliseconds.
This is important because the halo effect of that emotional response causes users’ first impressions of a design to impact a product’s or application’s perceived utility, usability, and credibility. Users generally form their first impressions less by interacting with certain functions and more through their initial emotional response to a product’s visual aesthetics and imagery. Researchers classify the effects as positive or negative. For example, if a user has a positive first impression of the design aesthetics, they are more likely to overlook or forgive poor usability or limited functionality. With a negative first impression, users are more likely to find fault with an interaction, even if a product’s overall usability is good and the product offers real value.
This has special implications for a number of domains. For example, in an ecommerce environment, a site’s perceived level of trustworthiness can affect buying decisions or people’s willingness to interact with the site. For interactive applications, a sense of organization can affect perceived usability and, ultimately, users’ overall satisfaction with the product.
So Why Not Just Ask People Which Design They Like Better?
As I noted earlier, within my company’s design process, we try to iteratively improve our conceptual approaches and interaction designs through user feedback and usability testing. Often, during this testing, we use a think-aloud protocol and ask participants to explain which option they prefer for an interaction and why. With visual design comps, it is tempting to simply show participants the design options at the end of a usability test session and ask them which they like better. This sounds straightforward enough and, generally, we’ve found that this is what business stakeholders think of when we talk about getting user feedback on visual designs.
The problem with this simplistic approach is that people’s tastes vary widely, and their reasons for preferring a design may or may not relate to the business or brand goals for that design. For example, when I’ve asked this question before, I’ve heard participants say they like a certain design because it uses their favorite color or because “I like things that are green.” Their statements may be truthful, but those types of responses don’t help researchers assess the emotional impact of a design or how it aligns with the intended brand attributes. In addition, some participants have a difficult time articulating what it is about a design they like or dislike. During an interview, participants may be able to select a preferred design, but without a structured mechanism for providing feedback, they may be at a loss for words when it comes to describing why they like or dislike it.
We’ve also found that, when we ask for design preferences during a qualitative study like a usability test, the small sample sizes do not align with stakeholder expectations for validation of a given design. Especially for public-facing Web sites and applications, visual design is one of the most significant expressions of a company’s brand, and business sponsors and stakeholders often want substantial customer feedback to assure them a given direction is correct.
Some Potential Research Methods
Besides simply asking for users’ preferences for particular designs, we explored several other structured research methods that could help inform design selection, including the following:
- triading
- experience questionnaires
- quick-exposure memory tests
- measurement of physiological indicators
Triading
The triading method I described in one of my columns on UXmatters offers potential in this regard, because it is structured around the comparison of several options. The idea with triading is to elicit attributes that research participants and target users would use to compare given alternatives, in a way that is not biased by the researcher. Given three design options, a researcher could ask participants to identify the two that are alike and different from the third, then describe why they are different. This process helps the researcher to understand what dimensions are important to target users in comparing different designs. We’ve found this method to be very helpful both when evaluating the competitive landscape and for assessing different conceptual options from an interaction design perspective. However, this method is difficult to apply in studies with large sample sizes, and it can be difficult to tabulate and present the results to stakeholders who are looking for research to help them choose the best design option.
Experience Questionnaires
Another possible approach to assessing design options is a comprehensive experience questionnaire. Questionnaires such as SUS (System Usability Scale), QUIS (Questionnaire for User Interface Satisfaction), and WAMMI (Website Analysis and MeasureMent Inventory) are broad, experience-based questionnaires, but they do include questions relating to visual appeal and aesthetics. In a 2004 report to the Usability Professionals’ Association, “A Comparison of Questionnaires for Assessing Website Usability,” Tom Tullis and Jacqueline Stetson wrote about a study that compared the effectiveness of these questionnaires. They found that, to varying degrees, all of these questionnaires were effective in reliably assessing differences between Web sites.
For comparing visual design options, questionnaires’ ability to identify perceived differences between design alternatives is intriguing. These questionnaires are also attractive, because they are relatively straightforward and easy to administer on a large scale. But many of the questionnaires also include a significant number of questions about interactivity and require participants to have had a certain level of interaction with a site or application. For a quick comparison of static visual design comps, we felt these questions would not be appropriate. In addition, we were not just looking for a winner among the designs; we wanted to understand what emotional responses each alternative elicited, so we could make better design decisions going forward. The output of these questionnaires did not lend itself to that purpose.
Quick-Exposure Memory Tests
A third approach we looked at was a quick-exposure memory test. In this method, researchers show participants a user interface for a very brief moment, then take it away. Then, they ask participants to recall what they remember about the user interface from that brief exposure. Participants have limited interaction with the site or application, so theoretically, they’re providing you a glimpse into their first impression—what sticks in their memory. During usability test sessions, we’ve tried this method to elicit conversation about home pages and other starting pages, and it is helpful in assessing layout considerations and information design.
There is a service available online called fivesecondtest that lets you solicit responses from visitors and get a decent sample size—that is, 50 participants—in a relatively short period of time. We chose not to use this service as our primary method for visual design comparison studies, because we felt it focused too much on people’s memory of particular items rather than emotional impact, but for a small amount of money and effort, it may be helpful in certain situations.
Measurement of Physiological Indicators
Finally, in researching potential methods for desirability testing, we reviewed the growing body of knowledge about the physiological indicators researchers can measure to assess emotional response. In the article “A Multi-method Approach to the Assessment of Web Page Designs,” Westerman and his co-authors summarize the available approaches:
- Electroencephalography (EEG) measures activity in parts of the brain that you can map to certain emotional responses.
- Electromyography (EMG) measures muscle activity that correlates to excitement levels.
- Electrodermal Activity (EDA) measures the activity of sweat glands, which is said to correlate to arousal and excitement.
- Blood Volume Pressure (BVP) measures dilation in the blood vessels, which, in turn, correlates with arousal.
- Pupil dilation appears to correlate to both arousal and mental workload.
- Respiration measurements can indicate negative valence or arousal.
As in eyetracking studies, various sensors track these physiological measurements while researchers show participants particular designs. Changes in one or more indicators suggest a particular emotional response. Researchers often pair these measurements with attitudinal and self-reporting surveys to give a multifaceted view of participants’ emotional reactions to a design. These physiological methods have great potential for quantitatively assessing emotional response. However, because of the time and budget constraints on many of our projects, we were looking for an approach we could use outside a lab or even over the Internet, so we could get large samples of responses.