Qualitative Analysis of Content
Yan Zhang and Barbara M. Wildemuth
If there were only one truth, you couldn’t paint a hundred canvases on the same theme.
--Pablo Picasso, 1966
Introduction
As one of today’s most extensively employed analytical tools, content analysis has been used fruitfully in a wide variety of research applications in information and library science (ILS) (Allen & Reser, 1990). As in other fields, content analysis was used in ILS primarily as a quantitative research method until recent decades. Many current studies use qualitative content analysis, which addresses some of the weaknesses of the quantitative approach.
Qualitative content analysis has been defined as:
• “a research method for the subjective interpretation of the content of text data through the systematic classification process of coding and identifying themes or patterns” (Hsieh & Shannon, 2005, p.1278),
• “an approach of empirical, methodological controlled analysis of texts within their context of communication, following content analytic rules and step by step models, without rash quantification” (Mayring, 2000, p.2), and
• “any qualitative data reduction and sense-making effort that takes a volume of qualitative material and attempts to identify core consistencies and meanings” (Patton, 2002, p.453).
These three definitions illustrate that qualitative content analysis emphasizes an integrated view of speech/texts and their specific contexts. Qualitative content analysis goes beyond merely counting words or extracting objective content from texts to examine meanings, themes and patterns that may be manifest or latent in a particular text. It allows researchers to understand social reality in a subjective but scientific manner.
Comparing qualitative content analysis with its more familiar quantitative counterpart can enhance our understanding of the method. First, the research areas from which they developed are different. Quantitative content analysis (discussed in the previous chapter) is used widely in mass communication as a way to count manifest textual elements, an aspect of this method that is often criticized for missing syntactical and semantic information embedded in the text (Weber, 1990). By contrast, qualitative content analysis was developed primarily in anthropology, qualitative sociology, and psychology, in order to explore the meanings underlying physical messages. Second, quantitative content analysis is deductive, intended to test hypotheses or address questions generated from theories or previous empirical research. By contrast, qualitative content analysis is mainly inductive, grounding the examination of topics and themes, as well as the inferences drawn from them, in the data. In some cases, qualitative content analysis attempts to generate theory. Third, the data sampling techniques required by the two approaches are different. Quantitative content analysis requires that the data be selected using random sampling or other probabilistic approaches, so as to ensure the validity of statistical inference. By contrast, samples for qualitative content analysis usually consist of purposively selected texts which can inform the research questions being investigated. Finally, the products of the two approaches are different. The quantitative approach produces numbers that can be manipulated with various statistical methods. By contrast, the qualitative approach usually produces descriptions or typologies, along with expressions from subjects reflecting how they view the social world. By this means, the perspectives of the producers of the text can be better understood by the investigator as well as the readers of the study’s results (Berg, 2001).
Qualitative content analysis pays attention to unique themes that illustrate the range of the meanings of the phenomenon rather than the statistical significance of the occurrence of particular texts or concepts.
In real research work, the two approaches are not mutually exclusive and can be used in combination. As suggested by Smith, “qualitative analysis deals with the forms and antecedent-consequent patterns of form, while quantitative analysis deals with duration and frequency of form” (Smith, 1975, p.218). Weber (1990) also pointed out that the best content-analytic studies use both qualitative and quantitative operations.
Inductive vs. Deductive
Qualitative content analysis involves a process designed to condense raw data into categories or themes based on valid inference and interpretation. This process uses inductive reasoning, by which themes and categories emerge from the data through the researcher’s careful examination and constant comparison. But qualitative content analysis does not need to exclude deductive reasoning (Patton, 2002). Generating concepts or variables from theory or previous studies is also very useful for qualitative research, especially at the inception of data analysis (Berg, 2001).
Hsieh and Shannon (2005) discussed three approaches to qualitative content analysis, based on the degree of involvement of inductive reasoning. The first is conventional qualitative content analysis, in which coding categories are derived directly and inductively from the raw data. This is the approach used for grounded theory development. The second approach is directed content analysis, in which initial coding starts with a theory or relevant research findings. Then, during data analysis, the researchers immerse themselves in the data and allow themes to emerge from the data. The purpose of this approach usually is to validate or extend a conceptual framework or theory. The third approach is summative content analysis, which starts with the counting of words or manifest content, then extends the analysis to include latent meanings and themes. This approach seems quantitative in the early stages, but its goal is to explore the usage of the words/indicators in an inductive manner.
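The two stages of a summative analysis can be illustrated with a minimal Python sketch. It is only one possible way to support the approach: it counts manifest occurrences of chosen terms and then lists each occurrence with its surrounding words, so that you can read the usage in context before interpreting it. The function names, the tokenization rule, and the interview excerpts are all assumptions invented for illustration; they do not come from Hsieh and Shannon (2005).

```python
import re
from collections import Counter


def count_terms(texts, terms):
    """Count manifest occurrences of each term across all texts (case-insensitive)."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[a-z']+", text.lower()):
            if token in terms:
                counts[token] += 1
    return counts


def keyword_in_context(texts, term, window=4):
    """Return each occurrence of `term` with up to `window` words on either side."""
    hits = []
    for text in texts:
        tokens = re.findall(r"[A-Za-z']+", text)
        for i, token in enumerate(tokens):
            if token.lower() == term:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                hits.append((left, token, right))
    return hits


# Hypothetical interview excerpts, invented for illustration only.
excerpts = [
    "I trust the library catalog more than a search engine.",
    "Honestly I do not trust anything I find on a search engine.",
]

counts = count_terms(excerpts, {"trust", "search"})
contexts = keyword_in_context(excerpts, "trust", window=3)

for left, word, right in contexts:
    print(f"{left} [{word}] {right}")
```

The counts alone would suggest that the two excerpts treat "trust" alike; the context listing shows that one expresses trust and the other its absence, which is the kind of latent meaning the later stage of a summative analysis is meant to surface.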
The Process of Qualitative Content Analysis
The process of qualitative content analysis often begins during the early stages of data collection. This early involvement in the analysis phase will help you move back and forth between concept development and data collection, and may help direct your subsequent data collection toward sources that are more useful for addressing the research questions (Miles & Huberman, 1994). To support valid and reliable inferences, qualitative content analysis involves a set of systematic and transparent procedures for processing data. Some of the steps overlap with the traditional quantitative content analysis procedures (Tesch, 1990), while others are unique to this method. Depending on the goals of your study, your content analysis may be more flexible or more standardized, but generally it can be divided into the following steps, beginning with preparing the data and proceeding through writing up the findings in a report.
Step 1: Prepare the Data
Qualitative content analysis can be used to analyze various types of data, but generally the data need to be transformed into written text before analysis can start. If the data come from existing texts, the choice of the content must be justified by what you want to know (Patton, 2002). In ILS studies, qualitative content analysis is most often used to analyze interview transcripts in order to reveal or model people’s information-related behaviors and thoughts. When transcribing interviews, the following questions arise: (1) should all the questions of the interviewer or only the main questions from the interview guide be transcribed; (2) should the verbalizations be transcribed literally or only in a summary; and (3) should observations during the interview (e.g., sounds, pauses, and other audible behaviors) be transcribed or not (Schilling, 2006)? Your answers to these questions should be based on your research questions. While a complete transcript may be the most useful, the additional value it provides may not justify the additional time required to create it.
Step 2: Define the Unit of Analysis
The unit of analysis refers to the basic unit of text to be classified during content analysis. Messages have to be unitized before they can be coded, and differences in the unit definition can affect coding decisions as well as the comparability of outcomes with other similar studies (De Wever et al., 2006). Therefore, defining the coding unit is one of your most fundamental and important decisions (Weber, 1990).
Qualitative content analysis usually uses individual themes as the unit for analysis, rather than the physical linguistic units (e.g., word, sentence, or paragraph) most often used in quantitative content analysis. An instance of a theme might be expressed in a single word, a phrase, a sentence, a paragraph, or an entire document. When using theme as the coding unit, you are primarily looking for the expressions of an idea (Minichiello et al., 1990). Thus, you might assign a code to a text chunk of any size, as long as that chunk represents a single theme or issue of relevance to your research question(s).
Step 3: Develop Categories and a Coding Scheme
Categories and a coding scheme can be derived from three sources: the data, previous related studies, and theories. Coding schemes can be developed both inductively and deductively. In studies where no theories are available, you must generate categories inductively from the data. Inductive content analysis is particularly appropriate for studies that intend to develop theory, rather than those that intend to describe a particular phenomenon or verify an existing theory. When developing categories inductively from raw data, you are encouraged to use the constant comparative method (Glaser & Strauss, 1967), since it is not only able to stimulate original insights, but is also able to make differences between categories apparent. The essence of the constant comparative method is (1) the systematic comparison of each text assigned to a category with each of those already assigned to that category, in order to fully understand the theoretical properties of the category; and (2) integrating categories and their properties through the development of interpretive memos.
For some studies, you will have a preliminary model or theory on which to base your inquiry. You can generate an initial list of coding categories from the model or theory, and you may modify the model or theory within the course of the analysis as new categories emerge inductively (Miles & Huberman, 1994). The adoption of coding schemes developed in previous studies has the advantage of supporting the accumulation and comparison of research findings across multiple studies.
In quantitative content analysis, categories need to be mutually exclusive because confounded variables would violate the assumptions of some statistical procedures (Weber, 1990). However, in reality, assigning a particular text to a single category can be very difficult. Qualitative content analysis allows you to assign a unit of text to more than one category simultaneously (Tesch, 1990). Even so, the categories in your coding scheme should be defined in a way that they are internally as homogeneous as possible and externally as heterogeneous as possible (Lincoln & Guba, 1985).
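If you keep your coded data in software, allowing a unit to carry more than one category is mostly a matter of how the data are stored. The following Python sketch is one possible representation, offered under stated assumptions: the class name `CodedUnit`, the helper functions, the category labels, and the sample texts are all hypothetical and invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class CodedUnit:
    """A chunk of text of any size that expresses a theme."""
    source: str                              # e.g., a participant or document ID
    text: str
    codes: set = field(default_factory=set)  # one unit may carry several codes


def units_with_code(units, code):
    """Retrieve every unit to which a given code was assigned."""
    return [unit for unit in units if code in unit.codes]


def overlapping_codes(units):
    """Count how often each pair of codes was applied to the same unit."""
    pairs = Counter()
    for unit in units:
        for pair in combinations(sorted(unit.codes), 2):
            pairs[pair] += 1
    return pairs


# Hypothetical coded units, invented for illustration only.
units = [
    CodedUnit("P01", "I asked a colleague because I had no time to search.",
              {"interpersonal source", "time pressure"}),
    CodedUnit("P02", "The database was too confusing to use.",
              {"system barrier"}),
    CodedUnit("P03", "My supervisor pointed me to the article right before the deadline.",
              {"interpersonal source", "time pressure"}),
]

pressured = units_with_code(units, "time pressure")
pairs = overlapping_codes(units)
```

Pairs of codes that are applied together very often may be a sign that two categories are not yet externally heterogeneous and that their definitions deserve another look.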
To ensure the consistency of coding, especially when multiple coders are involved, you should develop a coding manual, which usually consists of category names, definitions or rules for assigning codes, and examples (Weber, 1990). Some coding manuals have an additional field for taking notes as coding proceeds. Using the constant comparative method, your coding manual will evolve throughout the process of data analysis, and will be augmented with interpretive memos.
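A coding manual with the fields described above can be kept in a simple structured form so that every coder works from the same text. The Python sketch below is only an illustration: the class name `ManualEntry`, the function `render_manual`, and the sample category are assumptions, not a format prescribed by Weber (1990).

```python
from dataclasses import dataclass, field


@dataclass
class ManualEntry:
    """One category in a coding manual."""
    name: str
    definition: str                               # definition or rule for assigning the code
    examples: list = field(default_factory=list)
    notes: list = field(default_factory=list)     # added as coding proceeds


def render_manual(entries):
    """Format the manual as plain text that can be shared with all coders."""
    lines = []
    for entry in entries:
        lines.append(f"Category: {entry.name}")
        lines.append(f"  Definition: {entry.definition}")
        for example in entry.examples:
            lines.append(f"  Example: {example}")
        for note in entry.notes:
            lines.append(f"  Note: {note}")
    return "\n".join(lines)


# Hypothetical category, invented for illustration only.
manual = [
    ManualEntry(
        name="time pressure",
        definition="Participant attributes a search decision to lack of time.",
        examples=["I just took the first result because the deadline was close."],
    )
]

# The manual evolves: a note is added after a coding discussion.
manual[0].notes.append("Do not apply when the deadline is mentioned only as background.")
manual_text = render_manual(manual)
print(manual_text)
```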
Step 4: Test Your Coding Scheme on a Sample of Text
If you are using a fairly standardized process in your analysis, you’ll want to develop and validate your coding scheme early in the process. The best test of the clarity and consistency of your category definitions is to code a sample of your data. After the sample is coded, the coding consistency needs to be checked, in most cases through an assessment of inter-coder agreement. If the level of consistency is low, the coding rules must be revised. Doubts and problems concerning the definitions of categories, coding rules, or categorization of specific cases need to be discussed and resolved within your research team (Schilling, 2006). Coding sample text, checking coding consistency, and revising coding rules is an iterative process and should continue until sufficient coding consistency is achieved (Weber, 1990).
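Inter-coder agreement on a coded sample can be summarized numerically. The chapter does not name a particular statistic, so the Python sketch below uses two common choices as assumptions: simple percent agreement, and Cohen's kappa, which corrects the observed agreement p_o for the agreement p_e expected by chance, kappa = (p_o - p_e) / (1 - p_e). The sketch also assumes that each coder assigned exactly one code to each unit; if your scheme allows multiple codes per unit, one option is to apply it code by code (present vs. absent). The category labels and codings are invented for illustration.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of units to which both coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of units.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders (one code per unit)."""
    n = len(codes_a)
    p_observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: probability that both coders pick the same category
    # if each coded independently with their own observed category rates.
    p_expected = sum(freq_a[c] * freq_b[c] for c in set(codes_a) | set(codes_b)) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codings of ten sample units by two coders.
coder_a = ["barrier", "barrier", "strategy", "strategy", "barrier",
           "strategy", "barrier", "strategy", "barrier", "barrier"]
coder_b = ["barrier", "barrier", "strategy", "barrier", "barrier",
           "strategy", "strategy", "strategy", "barrier", "barrier"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"agreement={agreement:.2f}, kappa={kappa:.2f}")
```

What counts as "sufficient" consistency is a judgment for your research team; the numbers are mainly useful for locating the categories whose definitions need discussion.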
Step 5: Code All the Text
When sufficient consistency has been achieved, the coding rules can be applied to the entire corpus of text. During the coding process, you will need to check the coding repeatedly, to prevent “drifting into an idiosyncratic sense of what the codes mean” (Schilling, 2006). Because coding will proceed while new data continue to be collected, it’s possible (even quite likely) that new themes and concepts will emerge and will need to be added to the coding manual.
Step 6: Assess Your Coding Consistency
After coding the entire data set, you need to recheck the consistency of your coding. It is not safe to assume that, if a sample was coded in a consistent and reliable manner, the coding of the whole corpus of text is also consistent. Human coders are subject to fatigue and are likely to make more mistakes as the coding proceeds. New codes may have been added since the original consistency check. Also, the coders’ understanding of the categories and coding rules may change subtly over time, which may lead to greater inconsistency (Miles & Huberman, 1994; Weber, 1990). For all these reasons, you need to recheck your coding consistency.
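One possible way to look for fatigue or drift, not prescribed by the sources cited above, is to compute agreement separately for successive batches of units in the order they were coded and to see whether it declines. The Python sketch below assumes two coders, one code per unit, and units listed in coding order; the function name, batch size, and codings are hypothetical.

```python
def agreement_by_batch(codes_a, codes_b, batch_size):
    """Percent agreement for each successive batch of units, in coding order."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same units.")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1.")
    rates = []
    for start in range(0, len(codes_a), batch_size):
        batch_a = codes_a[start:start + batch_size]
        batch_b = codes_b[start:start + batch_size]
        matches = sum(a == b for a, b in zip(batch_a, batch_b))
        rates.append(matches / len(batch_a))
    return rates


# Hypothetical codings of eight units, listed in the order they were coded.
coder_a = ["barrier", "strategy", "barrier", "strategy",
           "barrier", "strategy", "barrier", "strategy"]
coder_b = ["barrier", "strategy", "barrier", "strategy",
           "barrier", "barrier", "strategy", "strategy"]

rates = agreement_by_batch(coder_a, coder_b, batch_size=4)
print(rates)
```

A drop from the first batch to the last does not say which coder drifted or why; it only indicates where the team should reread the coded text against the coding manual.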
Step 7: Draw Conclusions from the Coded Data
This step involves making sense of the themes or categories identified, and their properties. At this stage, you will make inferences and present your reconstructions of meanings derived from the data. Your activities may involve exploring the properties and dimensions of categories, identifying relationships between categories, uncovering patterns, and testing categories against the full range of data (Bradley, 1993). This is a critical step in the analysis process, and its success will rely almost wholly on your reasoning abilities.
Step 8: Report Your Methods and Findings
For the study to be replicable, you need to monitor and report your analytical procedures and processes as completely and truthfully as possible (Patton, 2002). In the case of qualitative content analysis, you need to report your decisions and practices concerning the coding process, as well as the methods you used to establish the trustworthiness of your study (discussed below).
Qualitative content analysis does not produce counts and statistical significance;
instead, it uncovers patterns, themes, and categories important to a social reality.
Presenting research findings from qualitative content analysis is challenging. Although it is a common practice to use typical quotations to justify conclusions (Schilling, 2006), you also may want to incorporate other options for data display, including matrices, graphs, charts, and conceptual networks (Miles & Huberman, 1994). The form and extent of reporting will finally depend on the specific research goals (Patton, 2002).
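A matrix display of the kind mentioned above can be assembled directly from coded data. The Python sketch below is a minimal illustration under stated assumptions: it builds a category-by-participant matrix that marks where each category appears, so that patterns across participants can be seen at a glance without implying statistical claims. The function names, the "x"/"-" markers, and the sample data are invented for illustration and are not a format taken from Miles and Huberman (1994).

```python
def build_matrix(coded_units):
    """Map each category to the set of participants in whose data it appears."""
    matrix = {}
    for participant, category in coded_units:
        matrix.setdefault(category, set()).add(participant)
    return matrix


def format_matrix(matrix, participants):
    """Render the matrix as a plain-text table: 'x' = present, '-' = absent."""
    width = max(len(category) for category in matrix)
    rows = [(" " * width + "  " + "  ".join(participants)).rstrip()]
    for category in sorted(matrix):
        cells = "  ".join(
            ("x" if p in matrix[category] else "-").ljust(len(p))
            for p in participants
        )
        rows.append((category.ljust(width) + "  " + cells).rstrip())
    return "\n".join(rows)


# Hypothetical (participant, category) pairs, invented for illustration only.
coded = [
    ("P01", "time pressure"),
    ("P01", "system barrier"),
    ("P02", "system barrier"),
    ("P03", "time pressure"),
]

matrix = build_matrix(coded)
table = format_matrix(matrix, ["P01", "P02", "P03"])
print(table)
```

In a report, each marked cell would normally be backed by a quotation or a short summary of the underlying text, so that the display supports the description and interpretation and does not replace them.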
When presenting qualitative content analysis results, you should strive for a balance between description and interpretation. Description gives your readers background and context and thus needs to be rich and thick (Denzin, 1989). Qualitative research is fundamentally interpretive, and interpretation represents your personal and theoretical understanding of the phenomenon under study. An interesting and readable report “provides sufficient description to allow the reader to understand the basis for an interpretation, and sufficient interpretation to allow the reader to understand the description” (Patton, 2002, pp.503-504).