When UX professionals think about data, we are usually thinking about analytics, A/B testing, or at least data sets that are big enough to yield statistically significant results. Such approaches work well as ways to discover potential areas for investigation or to weed out bad ideas. However, there may be a tendency to focus on things that are easy to measure and test rather than using data to discover the big ideas that would lead to breakthroughs. A more meaningful approach would combine a high-level, big-data view with a ground-level view to deliver deep data insights.
Big Data and User Experience
Making sense of big data can seem like a Rorschach ink-blot test into which we project all our hopes and fears. Big data can be the basis of new scientific discoveries, detect terrorist networks, and let us create detailed profiles of customers so we can sell them more stuff. Well, maybe. The reality, according to the article “Hilary Mason Wants to Get You Started with Big Data,” is that big data is just “a data set that is too big to fit into your available memory, or too big to store on your own hard drive, or too big to fit into an Excel spreadsheet.”
A simple working definition that we can use in this column, and one that is relevant for user experience, is that big data is data that machines generate, tallying up what people do and say. It is how many people clicked a link, how many pageviews there were on the day a new campaign started, how many people registered. It is Web server logs, clickstream data, heatmaps, social-media activity, mobile phone calls, e-banking transactions, and information that mobile-device sensors capture in an app.
Big data is aggregated behavioral or transactional data: a summary of events. It is what people did, not how they felt about it, why they did it, or even how they did it. Three key characteristics of big data affect how we use such data to inform design:
- Big data measures user behaviors and actions (for example, through Web site or application analytics) as well as words (for example, through social-media analytics).
- Computers rather than humans collect big data.
- Big data uses defined measures.
Because big data documents what has occurred without other humans getting involved in collecting it (other than in creating the data-collection system in the first place!), it feels objective. After all, the more data, the less uncertainty. For example, measuring where just ten people clicked could give misleading results if several of them happened to be distracted, but measuring where thousands of people clicked averages out such individual variation to some degree.
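To make the intuition about sample size concrete, here is a minimal Python sketch that estimates the margin of error around an observed click rate. The function name click_rate_margin and the click counts are hypothetical, chosen only for illustration. The calculation uses the textbook normal approximation for a proportion, which is itself rough for a sample as small as ten.

```python
import math
from typing import Tuple


def click_rate_margin(clicks: int, visitors: int, z: float = 1.96) -> Tuple[float, float]:
    """Return the observed click rate and an approximate 95% margin of error.

    Uses the normal approximation for a proportion; z = 1.96 corresponds
    to a 95% confidence level.
    """
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    rate = clicks / visitors
    margin = z * math.sqrt(rate * (1 - rate) / visitors)
    return rate, margin


# Hypothetical counts: the same 30% click rate seen in a small and a large sample.
for clicks, visitors in [(3, 10), (3000, 10000)]:
    rate, margin = click_rate_margin(clicks, visitors)
    print(f"n={visitors}: {rate:.1%} +/- {margin:.1%}")
# n=10: 30.0% +/- 28.4%
# n=10000: 30.0% +/- 0.9%
```

With the same observed click rate, the uncertainty shrinks from roughly 28 percentage points to under 1. More data narrows random variation, but it does nothing to correct a data set that is unrepresentative to begin with.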
Even so, there can be bias in big data sets. Signal bias, where the data set represents only a certain subset of people, is a common flaw. For example, while a well-known study combining Hurricane Sandy-related Twitter and Foursquare data produced some interesting insights, it also gave the impression that Manhattan was the hub of the disaster, because most of the tweets came from there. In user experience, a simple case of signal bias might be having more extensive analytics for desktop computers than for mobile devices.
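The desktop-versus-mobile case can be sketched in a few lines of Python. Every number here is invented for illustration: we assume tracking captures far more desktop sessions than mobile ones, and that we know the true traffic split from some other source. The helper names naive_rate and reweighted_rate are likewise hypothetical. The correction shown is a simple reweighting by known segment share (sometimes called post-stratification), one possible approach rather than the only one.

```python
# Hypothetical tracked data: device -> (sessions captured, conversions captured).
tracked = {
    "desktop": (9000, 450),  # well instrumented, 5% conversion
    "mobile": (1000, 20),    # poorly instrumented, 2% conversion
}

# Assumed true share of all traffic, known from another source.
true_share = {"desktop": 0.4, "mobile": 0.6}


def naive_rate(segments):
    """Conversion rate computed over whatever sessions happened to be tracked."""
    total_sessions = sum(sessions for sessions, _ in segments.values())
    total_conversions = sum(conversions for _, conversions in segments.values())
    return total_conversions / total_sessions


def reweighted_rate(segments, shares):
    """Conversion rate with each segment weighted by its true traffic share."""
    return sum(
        shares[name] * conversions / sessions
        for name, (sessions, conversions) in segments.items()
    )


print(f"naive: {naive_rate(tracked):.1%}")                        # naive: 4.7%
print(f"reweighted: {reweighted_rate(tracked, true_share):.1%}")  # reweighted: 3.2%
```

Under these assumptions, the naive figure overstates conversion because it mostly reflects desktop users, who are overrepresented in the tracked data.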
Big data can be multistructured, such as Web log data that includes text and images alongside structured transactional information. It can also be unstructured, text-heavy data from metadata and social-media posts. This type of data lends itself to exploration, but can also lead to correlations that prove false. The recent failure of Google Flu Trends, which resulted from Google’s engineers not knowing what linked certain search terms with the spread of the flu, illustrates a correlation-causation gap. (Google has since adjusted the algorithm to include CDC data.) An example from user experience would be seeing a correlation between searching for a certain keyword and bouncing, then using that correlation alone to make inferences about user interface problems.
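One reason exploratory analysis turns up false correlations is sheer volume: compare enough unrelated metrics against a target, and some will line up by chance. The following Python sketch illustrates this with purely random numbers. The function names, the 1,000 metrics, the 20 days, and the seed are all arbitrary assumptions for the demonstration, not a description of any real analytics pipeline.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_chance_correlation(n_metrics=1000, n_days=20, seed=0):
    """Correlate many random 'metrics' with a random 'target'.

    Returns the correlation with the largest absolute value. Every series
    is independent noise, so any strong correlation is a coincidence.
    """
    rng = random.Random(seed)
    target = [rng.random() for _ in range(n_days)]
    best = 0.0
    for _ in range(n_metrics):
        metric = [rng.random() for _ in range(n_days)]
        r = pearson(metric, target)
        if abs(r) > abs(best):
            best = r
    return best


print(f"strongest correlation found in pure noise: {strongest_chance_correlation():+.2f}")
```

Because every series is noise, whatever correlation this prints has no causal story behind it, which is why a pattern found by trawling through data deserves follow-up research before it drives a design decision.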
Big data can be a good starting point for learning about the user experience of a product or service, but it often generates as many questions as answers. Data is not just numbers; it represents the actions and words of real people with complicated lives. But it lacks context. If big data is the archeology of user experience, the study of the traces that people leave behind, then small data is more like anthropology, exploring people’s lives as they are living them online.