“If there were only one truth, you couldn’t paint a hundred canvases on the same theme.”
Pablo Picasso, 1966

Introduction

As one of today’s most ...
Computer Support for Qualitative Content Analysis

Qualitative content analysis is usually supported by computer programs, such as NVivo (http://www.qsrinternational.com/products_nvivo.aspx) or ATLAS.ti. The programs vary in their complexity and sophistication, but their common purpose is to assist researchers in organizing, managing, and coding qualitative data more efficiently. The basic functions supported by such programs include text editing, note and memo taking, coding, text retrieval, and node/category manipulation. More and more qualitative data analysis software incorporates a visual presentation module that allows researchers to see the relationships between categories more vividly. Some programs even record a coding history to allow researchers to keep track of the evolution of their interpretations. Any time you are working with more than a few interviews, or are working with a team of researchers, you should use this type of software to support your efforts.
Trustworthiness

Validity, reliability, and objectivity are criteria used to evaluate the quality of research in the conventional positivist research paradigm. As an interpretive method, qualitative content analysis differs from the positivist tradition in its fundamental assumptions, research purposes, and inference processes, thus making the conventional criteria unsuitable for judging its research results (Bradley, 1993). Recognizing this gap, Lincoln and Guba (1985) proposed four criteria for evaluating interpretive research work: credibility, transferability, dependability, and confirmability.
Credibility refers to the “adequate representation of the constructions of the social world under study” (Bradley, 1993, p.436). Lincoln and Guba (1985) recommended a set of activities that would help improve the credibility of your research results: prolonged engagement in the field, persistent observation, triangulation, negative case analysis, checking interpretations against raw data, peer debriefing, and member checking. To improve the credibility of qualitative content analysis, researchers need not only to design data collection strategies that can adequately solicit those representations, but also to design transparent processes for coding and drawing conclusions from the raw data.
Coders’ knowledge and experience have a significant impact on the credibility of research results. It is necessary to provide coders with precise coding definitions and clear coding procedures. It is also helpful to prepare coders through a comprehensive training program (Weber, 1990).
Transferability refers to the extent to which the researcher’s working hypothesis can be applied to another context. It is not the researcher’s task to provide an index of transferability; rather, he or she is responsible for providing data sets and descriptions that are rich enough so that other researchers are able to make judgments about the findings’ transferability to different settings or contexts.
Dependability refers to “the coherence of the internal process and the way the researcher accounts for changing conditions in the phenomena” (Bradley, 1993, p.437).
Confirmability refers to “the extent to which the characteristics of the data, as posited by the researcher, can be confirmed by others who read or review the research results” (Bradley, 1993, p.437). The major technique for establishing dependability and confirmability is through audits of the research processes and findings. Dependability is determined by checking the consistency of the study processes, and confirmability is determined by checking the internal coherence of the research product, namely, the data, the findings, the interpretations, and the recommendations. The materials that could be used in these audits include raw data, field notes, theoretical notes and memos, coding manuals, process notes, and so on. The audit process has five stages: preentry, determination of auditability, formal agreement, determination of trustworthiness (dependability and confirmability), and closure. A detailed list of activities and tasks at each stage can be found in Appendix B of Lincoln and Guba (1985).
Examples

Two examples of qualitative content analysis will be discussed here. The first example study (Schamber, 2000) was intended to identify and define the criteria that weather professionals use to evaluate particular information resources. Interview data were analyzed inductively. In the second example, Foster (2004) investigated the information behaviors of interdisciplinary researchers. Based on semi-structured interview data, he developed a model of these researchers’ information seeking and use.
These two studies are typical of ILS research that incorporates qualitative content analysis.
Example 1: Criteria for Making Relevance Judgments

Schamber (2000) conducted an exploratory inquiry into the criteria that occupational users of weather information employ to make relevance judgments on weather information sources and presentation formats. To get first-hand accounts from users, she used the time-line interview method to collect data from 30 subjects: 10 each in construction, electric power utilities, and aviation. These participants were highly motivated and had very specific needs for weather information. In accordance with a naturalistic approach, the interview responses were to be interpreted in a way that did not compromise the original meaning expressed by the study participant. Inductive content analysis was chosen for its power to make such faithful inferences.
The interviews were audiotaped and transcribed. The transcripts served as the primary sources of data for content analysis. Because the purpose of the study was to identify and describe criteria used by people to make relevance judgments, Schamber defined a coding unit as “a word or group of words that could be coded under one criterion category” (Schamber, 2000, p.739). Responses to each interview were unitized before they were coded.
As Schamber pointed out, content analysis functions both as a secondary observational tool for identifying variables in text and as an analytical tool for categorization. Content analysis was incorporated in this study at the pretest stage, both for developing the interview guide as a basis for the coding scheme and for assessing the effectiveness of particular interview items. The formal process of developing the coding scheme began shortly after the first few interviews. The whole process was an iteration of coding a sample of data, testing inter-coder agreement, and revising the coding scheme.
Whenever the percentage of agreement did not reach an acceptable level, the coding scheme was revised (Schamber, 1991). The author reported that, “based on data from the first few respondents, the scheme was significantly revised eight times and tested by 14 coders until inter-coder agreement reached acceptable levels” (Schamber, 2000, p.738).
The 14 coders were not involved in the coding at the same time; rather, they were spread across three rounds of revision.
The analysis process was inductive and took a grounded theory approach. The author did not derive variables/categories from existing theories or previous related studies, and she had no intention of verifying existing theories; rather, she immersed herself in the interview transcripts and let the categories emerge on their own. Some categories in the coding scheme were straightforward and could be easily identified based on manifest content, while others were harder to identify because they were partially based on the latent content of the texts. The categories were expected to be mutually exclusive (distinct from each other) and exhaustive. The iterative coding process resulted in a coding scheme with eight main categories.
Credibility evaluates the validity of a researcher’s reconstruction of a social reality. In this study, Schamber carefully designed and controlled the data collection and data analysis procedures to ensure the credibility of the research results. First, the time-line interview technique solicited respondents’ own accounts of the relevance judgments they made on weather information in their real working environments rather than in artificial experimental settings. Second, non-intrusive inductive content analysis was used to identify the themes emerging from the interview transcripts. The criteria were defined in respondents’ own language as it appeared in the interviews. Furthermore, a peer debriefing process was incorporated in the development of the coding scheme, which enhanced the credibility of the research by reducing the bias of a single researcher. As reported by Schamber (1991), “a group of up to seven people, mostly graduate students including the researcher, met weekly for most of a semester and discussed possible criterion categories based on transcripts from four respondents” (pp.84-85). The credibility of the research findings also was verified by the fact that most criteria were mentioned by more than one respondent and in more than one scenario. Theoretical saturation was achieved as mentions of criteria became increasingly redundant.
Schamber did not claim transferability of the research results explicitly, but the transferability of the study was made possible by detailed documentation of the data processing in a Codebook. The first part of the Codebook explained procedures for handling all types of data (including quantitative). The second part listed the coding scheme, which included identification numbers, category names, detailed category definitions, coding rules, and examples. This detailed documentation of the data handling and the coding scheme makes it easier for future researchers to judge the transferability of the criteria to other user populations or other situational contexts. The transferability of the identified criteria also was supported by the fact that the criteria identified in this study were widely documented in previous research.
The dependability of the research findings in this study was established by the transparent coding process and inter-coder verification. The inherent ambiguity of word meanings, category definitions, and coding procedures threatens the coherence and consistency of coding practices, hence negatively affecting the credibility of the findings.
To make sure that the distinctions between categories were clear to the coders, the Codebook defined them. To ensure coding consistency, every coder used the same version of the scheme to code the raw interview data. Both the training and the experience of the coder are necessary for reliable coding (Neuendorf, 2002). In this study, the coders were graduate students who had been involved in the revision of the coding scheme and, thus, were experienced at using the scheme (Schamber, 1991). The final coding scheme was tested for inter-coder reliability with a first-time coder based on simple percent agreement: the number of agreements between two independent coders divided by the number of possible agreements. As noted in the previous chapter, more sophisticated methods for assessing inter-coder agreement are available. If you’re using a standardized coding scheme, refer to that discussion.
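Simple percent agreement is easy to compute by hand or in a few lines of code. The sketch below illustrates the calculation; the function name and the criterion labels are hypothetical examples, not categories from Schamber’s actual codebook.

```python
def percent_agreement(codes_a, codes_b):
    """Proportion of coding units that two independent coders
    assigned to the same category (agreements / possible agreements)."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same set of units")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

# Hypothetical category assignments by two coders for ten coding units
coder1 = ["accuracy", "currency", "clarity", "accuracy", "currency",
          "verifiability", "clarity", "accuracy", "currency", "clarity"]
coder2 = ["accuracy", "currency", "clarity", "currency", "currency",
          "verifiability", "clarity", "accuracy", "accuracy", "clarity"]

print(percent_agreement(coder1, coder2))  # 0.8 (8 of 10 units match)
```

Note that percent agreement does not correct for agreement expected by chance, which is why measures such as Cohen’s kappa are often preferred when a standardized coding scheme is in use.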
As suggested by Lincoln and Guba (1985), confirmability is primarily established through a confirmability audit, which Schamber did not conduct. However, the significant overlap of the criteria identified in this study with those identified in other studies indicates that the research findings have been confirmed by other researchers.
Meanwhile, the detailed documentation of data handling also provides a means for confirmability checking.
When reporting the trustworthiness of the research results, instead of using the terms “credibility,” “transferability,” “dependability,” and “confirmability,” Schamber used terms generally associated with positivist studies: “internal validity,” “external validity,” “reliability,” and “generalizability.” It is worth pointing out that there is no universal agreement on the terminology used when assessing the quality of a qualitative inquiry. However, we recommend that the four criteria proposed by Lincoln and Guba (1985) be used to evaluate the trustworthiness of research work conducted within an interpretive paradigm.
Descriptive statistics, such as frequency of criteria occurrence, were reported in the study. However, the purpose of the study was to describe the range of the criteria employed to decide the degree of relevance of weather information in particular occupations. Thus, the main finding was a list of criteria, along with their definitions, keywords, and examples. Quotations excerpted from interview transcripts were used to further describe the identified criteria, as well as to illustrate the situational contexts in which the criteria were applied.
Example 2: Information Seeking in an Interdisciplinary Context

Foster (2004) examined the information seeking behaviors of scholars working in interdisciplinary contexts. His goal was threefold: (1) to identify the activities, strategies, contexts, and behaviors of interdisciplinary information seekers; (2) to understand the relationships between behaviors and context; and (3) to represent the information seeking behavior of interdisciplinary researchers in an empirically grounded model. This study was a naturalistic inquiry, using semi-structured interviews to collect direct accounts of information seeking experiences from 45 interdisciplinary researchers. The respondents were selected through purposive sampling, along with snowball sampling. To “enhance contextual richness and minimize fragmentation” (Foster, 2004, p.230), all participants were interviewed in their normal workplaces.
In light of the exploratory nature of the study, the grounded theory approach guided the data analysis. Foster did not have any specific expectations for the data before the analysis started. Rather, he expected that concepts and themes related to interdisciplinary information seeking would emerge from the texts through inductive content analysis and the constant comparative method.
Coding took place in multiple stages, over time. The initial stage was open coding: the author closely read and annotated each interview transcript. During this process, the texts were unitized, and concepts were highlighted and labeled.
Based on this initial analysis, Foster identified three stages of information seeking in interdisciplinary contexts – initial, middle, and final – along with the activities involved in each stage. Subsequent coding proceeded by constantly comparing the current transcript with previous ones to allow the emergence of categories and their properties. As the coding proceeded, additional themes and activities emerged that were not covered by the initially identified three-stage model. Further analysis of emergent concepts and themes and their relationships to each other resulted in a two-dimensional model of information seeking behaviors in the interdisciplinary context. One dimension delineates three nonlinear core processes of information seeking activities: opening, orientation, and consolidation. The other dimension consists of three levels of contextual interaction: cognitive approach, internal context, and external context.
The ATLAS.ti software was used to support the coding process. It allows the researcher to code the data, retrieve text based on keywords, rename or merge existing codes without perturbing the rest of the codes, and generate visualizations of emergent codes and their relationships to one another. ATLAS.ti also maintains automatic logs of coding changes, which makes it possible to keep track of the evolution of the analysis.
As reported by Foster, coding consistency in this study was addressed by including three iterations of coding conducted over a period of one year. However, the author did not report on the three rounds of coding in detail. For example, he did not say how many coders were involved in the coding, how the coders were trained, how the coding rules were defined, and what strategies were used to ensure transparent coding. If all three rounds of coding were done by Foster alone, there was no assessment of coding consistency. While this is a common practice in qualitative research, it weakens the author’s argument for the dependability of the study.