
Ladd, Teacher Effects, Draft April 21, 2008

Draft, April 21, 2008

Teacher Effects: What Do We Know?

Helen F. Ladd

Edgar Thompson Professor of Public Policy Studies

and professor of economics

Duke University

Durham, NC 27708

hladd@duke.edu

This paper was prepared for the Teacher Quality Conference at Northwestern

University, May 1, 2008.


The availability of administrative data on teachers and students has greatly enhanced the

ability of researchers to address research topics related to the effectiveness of teachers. Such data permit the researcher to use the student as the unit of observation, to follow students over time, and in many cases to match students with their specific teachers. Moreover, the sample sizes are sufficiently large to allow for more sophisticated and complex modeling than has heretofore been possible. Now that NCLB requires all states to test all students in grades 3-8 annually, the hope is that administrative data will become even more readily available for research on teachers in the future.

Among the many issues that arise in estimating the effectiveness of teachers, three are particularly salient. One reflects the observation that teachers are not randomly assigned to schools or to classrooms within schools. As a result, teacher effects may be confounded by the unobservable characteristics of students, such as their ability and motivation. Estimates of teacher effects that do not fully account for the nonrandom matching of students to teachers will be biased upward if students with greater ability and motivation are assigned to the more effective teachers, and biased downward if district administrators try to compensate for lower student ability by assigning less able students to the more effective teachers. This nonrandom assignment of teachers and students represents a significant obstacle to the estimation of teacher effects.
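To see the direction of the bias, consider a minimal simulation, with entirely hypothetical numbers not drawn from any of the studies discussed here, in which the more effective teacher is also assigned the abler students:

```python
import random
random.seed(0)

# Hypothetical setup: the stronger teacher (true effect 0.3 vs 0.1) is
# assigned the abler students (mean ability 0.2 vs -0.2).
def mean_gain(teacher_effect, ability_mean, n=2000):
    gains = [teacher_effect + random.gauss(ability_mean, 1.0) for _ in range(n)]
    return sum(gains) / n

naive_gap = mean_gain(0.3, 0.2) - mean_gain(0.1, -0.2)
# The true difference in teacher effects is 0.2, but the naive comparison of
# classroom mean gains conflates it with ability sorting: naive_gap is near 0.6.
```

A comparison that controlled for student ability would recover a gap near 0.2; the naive comparison attributes the sorting on ability to the teachers.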

A second issue relates to the technical and conceptual feasibility of separating the effects of individual teachers (or teacher credentials) from the effects of other inputs to the educational process, such as the characteristics of students in the classroom, school level policies, and characteristics of the individual students. A third concern arises because measurement error leads to imprecise estimates of teacher effects and complicates their interpretation.

Most researchers now agree that teacher quality matters for student achievement and that variation in teacher effectiveness contributes significantly to the variation in student achievement. Section I below briefly reviews the evidence for that conclusion. That conviction, combined with the current national focus on student achievement as exemplified by the federal No Child Left Behind Act of 2001, has encouraged policy makers in some states and districts to introduce programs to reward individual teachers for their effectiveness in raising student test scores. As discussed in section II, however, this policy thrust appears to be moving faster than can be supported by the technical analysis of teacher effects. Much remains to be learned about both the estimation of teacher effects and their usefulness for policy. Section III shifts the focus to teacher credentials. Although many researchers and policy makers have traditionally downplayed the relationship between teacher credentials and student achievement, some researchers including me, believe that teacher credentials are important predictors of student achievement and should be viewed as important policy levers for improving student achievement and for reducing achievement gaps. Section IV concludes with potential future research directions.

I. Do teachers matter?

This first section looks at what is known about the extent to which teachers differ in their effectiveness in raising student achievement, as measured by test scores, and about how much the variation in teacher effectiveness contributes to the variation in student achievement. The greater the contribution of teachers, the more productive it may be for policy makers to focus reform efforts on teachers.

Only recently have researchers documented in a reasonably convincing way what parents always knew, namely that the variation in teacher quality contributes significantly to the variation in student outcomes. Prior to this recent research, the more standard view among many researchers, which was based on early studies showing little relationship between teacher credentials and student achievement, was that variation across teachers does not account for much of the variation in student achievement or achievement gains. The conclusion that teachers matter is based on three quite different approaches to isolating the effects of teachers.

One approach emerges in work by Hanushek and Rivkin and various co-authors based on data from Texas. In Rivkin, Hanushek and Kain (RHK) (2005), the authors use statewide data on student test scores that can be matched to teachers only at the grade level, not at the classroom level. Though their approach is clever, more convincing evidence requires that the teachers of students be identified at the classroom level. In Hanushek, Kain, O’Brien, and Rivkin (2005) the authors use data for a single Texas district, which they refer to as the Lone Star District, for which they are able to match students in grades 3-8 to their classroom teachers.

In both cases, the authors’ goal is to measure the persistent component of teacher effects, that is, the nonrandom component of what is undoubtedly a noisy measure.

The patterns that emerge from the latter study are summarized in Table 1. In all cases the estimates are based on estimated teacher effects by year, with the achievement gains measured in normalized units. The table highlights three important issues that arise in the estimation of such effects. One is the extent to which controls are included for the demographic characteristics of the students such as their income, race/ethnicity and gender. Another is whether the estimates refer to the variation in teacher quality at the district level or just to the variation within individual schools. The third issue is the importance of measurement error.

The first row of the table summarizes the variation in student achievement gains accounted for by the variation in teacher effects (estimated using fixed effects for teachers by year, a method that is discussed further below) for each of four models: district-wide models with and without demographic adjustment, and models that include school-by-year fixed effects with and without demographic controls. Not surprisingly, the within-school variation in teacher effects is smaller than the district-wide variation. Further, as would be expected, demographic controls reduce the estimated variation in teacher effects far more for the district-wide estimates than for the within-school estimates. That outcome reflects the fact that more student and teacher sorting occurs across than within schools.
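The distinction between district-wide and within-school variation can be sketched with a small simulation (all parameter values hypothetical): when teacher quality is sorted across schools, looking only within schools removes the between-school component of the variance.

```python
import random, statistics
random.seed(1)

# Hypothetical sorting: each school's teachers cluster around a school-level
# mean, so district-wide variation in teacher quality exceeds within-school variation.
schools = []
for _ in range(50):
    school_mean = random.gauss(0, 0.15)   # between-school sorting component
    schools.append([school_mean + random.gauss(0, 0.10) for _ in range(10)])

all_effects = [e for school in schools for e in school]
district_var = statistics.pvariance(all_effects)                        # near 0.15**2 + 0.10**2
within_var = statistics.mean(statistics.pvariance(s) for s in schools)  # near 0.10**2
```

With these numbers, the within-school variance is roughly a third of the district-wide variance, mirroring the pattern in Table 1.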

Those entries overstate the contribution of teachers to student achievement, however, because some of the variation is simply measurement error in the form of random noise. The entries in the second row, which are the year-to-year correlations in teacher effects, indicate that from 42 to 50 percent of the variation is persistent. These fractions are used to adjust the entries in the first row downward to generate the variance in persistent teacher quality reported in the third row for each of the four models. Below each measure of variance is the corresponding estimated standard deviation. Emerging from this analysis is the conclusion that a one-standard-deviation difference in teacher quality is associated with a 0.22 to 0.32 standard deviation difference in achievement gains. The larger estimate could well overstate the importance of teachers since it does not control for school-level factors such as the effectiveness of school principals or the composition of the students. Hence, the authors highlight the smaller estimate, emphasizing that it is a lower bound estimate of teacher effects.[1]

A second approach, based on national data, is presented in Rowan, Correnti, and Miller (2002).[2] These authors study two cohorts of students in the nationally representative sample of schools in the Prospects study. The authors fit four different models for each subject, cohort, and grade. The first model is a three-level nested analysis of variance with students clustered within classes or teachers, classes within schools, and schools. The second and third are value-added and gains models along the lines described below. And the fourth is a cross-classified model. In none of the models are the authors able to separate teacher effects from classroom effects.

For each of their models, the authors find that classrooms (and their teachers) account for significant portions of the relevant variance in achievement, where the relevant variance is defined in different ways in the different models. For example, based on their fourth model, the authors conclude that the variability in teacher effects accounts for 60-61 percent of the "reliable" variance in achievement growth in reading and 52-72 percent in math, where the "reliable" variance is defined as the variability in achievement growth purged of the effects of measurement error. In a subsequent review of this study, McCaffrey et al. (2003) quibble with the way the authors measure reliable variance. Nonetheless, they conclude that this study provides convincing evidence of teacher effects, or more precisely, classroom effects, although the magnitudes are not fully clear.

The most compelling results emerge from a study that uses data from the Tennessee class size experiment to examine teacher effects for students in the early grades (Nye, Konstantopoulos, and Hedges, 2004). Because students in this mid-1980s experiment were randomly assigned to classrooms, this study provides the only evidence about teacher effects based on a random assignment of students to teachers. Teacher effects are estimated using a hierarchical linear model designed to sort out the between-teacher (but within-school) effects of teachers on achievement gains and also on achievement status. The authors conclude that the teacher effects are real and consistent with those of other studies, that they are larger for math than for reading, and that within-school teacher effects are larger than across-school effects. For math, the estimated between-teacher variance components range from 0.123 to 0.135, and for reading they are about half that size. If teacher effects are normally distributed, these findings suggest that the difference between having a teacher at the 25th percentile and one at the 75th percentile is close to half a standard deviation in math and close to a third of a standard deviation in reading.
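The interquartile comparison follows directly from the reported variance components; a quick check of the arithmetic, using the upper math variance component:

```python
from statistics import NormalDist

z75 = NormalDist().inv_cdf(0.75)               # about 0.674, the 75th-percentile z-score
var_math = 0.135                               # upper math variance component from the study
gap_math = 2 * z75 * var_math ** 0.5           # 75th- vs 25th-percentile teacher, in SD units
gap_reading = 2 * z75 * (var_math / 2) ** 0.5  # reading variance is about half as large
# gap_math comes to about 0.50 and gap_reading to about 0.35,
# i.e., close to half and close to a third of a standard deviation.
```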

1. This lower bound estimate is larger than the comparable estimate of 0.11 standard deviations reported in Rivkin, Hanushek, and Kain (2005), but the results are not directly comparable. Though that study also refers to within-school differences, it focuses on grade-level differences from one year to the next. Another important difference is that it is based on raw gains, not the standardized gains used for the results in Table 1. The standard deviation of the raw gains is about two-thirds of the standard deviation of the standardized gains; putting the two estimates on the same scale would increase the 0.11 estimate to 0.15 standard deviations (Hanushek, Kain, O'Brien, and Rivkin, 2005, p. 14).

2. This discussion is based primarily on the discussion in McCaffrey, Lockwood, Koretz, and Hamilton (2003), pp. 24-30.


Taken together, these and other studies provide quite convincing evidence that teachers matter for student achievement. Not examined here are additional studies that provide evidence that these teacher effects cumulate over time (McCaffrey et al., 2003, pp. 36-48). Although the overall contribution of teachers to student achievement has not been precisely established, the findings in these papers are sufficient to justify additional research attention to other questions related to teachers, such as whether it is possible to identify the effectiveness of individual teachers and whether teacher credentials are predictive of student achievement.

II. Can teacher-specific effects be identified and measured?

Identifying the relative effectiveness of individual teachers is of increasing policy relevance as policy makers have become intrigued with the idea of rewarding individual teachers for good performance, as measured by their ability to raise test scores. The research reviewed in this section shows, however, that it is difficult to separate the effects of teachers from other inputs, particularly those based on contextual factors at the school or classroom level; that the estimated teacher effects are not very stable over time; and that there is no clear best way to deal with measurement error in the estimates.

Researchers have been using two main approaches to identify the effectiveness of individual teachers in raising student achievement. I refer to the first approach as value-added modeling and include in that category both level and gain models. The second approach includes mixed and layered models that directly model the full joint distribution of all student outcomes. Though for some purposes the mixed-model approaches are superior, they are computationally very demanding and will receive somewhat less attention in this overview.

Value-added models

As noted earlier, a fundamental challenge in estimating teacher effects is that teachers are not randomly assigned to students. For the moment, I set this issue aside to develop the conceptual foundations of the standard value-added model, with particular attention to the assumptions underlying it.

Derivation of a simple value-added model

The starting point for this analysis is the observation that education is a cumulative process.

In the context of a very simple model in which the only educational input that matters is teachers, and in which all other determinants of achievement such as student background, ability, and motivation are set aside, we can write

A_it = f(T_it, T_i,t-1, ...) + ε_it     (1)

where A_it refers to student i's achievement, as measured by test scores, in year t, T_it refers to the teacher of student i in year t, and ε_it is an error term. This equation expresses student i's achievement in year t as a function of her teacher in that year and in all previous school years plus a random error.

Two additional assumptions permit this relationship to be transformed into one that can be used to estimate the effect of the student's teacher in year t on the student's achievement in that same year, controlling for the effects of teacher quality in all prior years. One assumption is that the marginal effect of a teacher on a student's achievement in the contemporaneous year is constant across years and that the relationship is linear. The second is that student achievement, or knowledge, decays from one year to the next at a constant rate. As a result, the rate at which a student's knowledge persists from one year to the next is also constant. Letting β be the effect of T and α the rate at which knowledge persists, we can rewrite equation 1 as

A_it = βT_it + αβT_i,t-1 + α²βT_i,t-2 + α³βT_i,t-3 + ... + ε_it     (2)

and, after rearranging terms, as

A_it = βT_it + α(βT_i,t-1 + αβT_i,t-2 + α²βT_i,t-3 + ...) + ε_it     (3)

Noting that the expression within the parentheses is simply A_i,t-1 and changing the order of the terms, we end up with

A_it = αA_i,t-1 + βT_it + ε_it.     (4)

Thus, the effects on current achievement of the student's prior teachers are captured by the lagged achievement term. If a student's knowledge does not persist from year to year, the persistence parameter, α, would be zero.
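Setting the error terms aside, the algebra behind equations 2 through 4 can be verified numerically (parameter values hypothetical):

```python
import random
random.seed(0)

alpha, beta = 0.6, 0.5                     # persistence and teacher-effect parameters (hypothetical)
T = [random.random() for _ in range(10)]   # teacher quality faced by the student in each year

# Equation 2 with the error term set aside: A_t is the geometrically
# decaying sum of all current and past teacher contributions.
def achievement(t):
    return sum(alpha ** k * beta * T[t - k] for k in range(t + 1))

A = [achievement(t) for t in range(10)]

# Equation 4: the same series satisfies A_t = alpha * A_{t-1} + beta * T_t,
# so lagged achievement fully captures the effects of all prior teachers.
for t in range(1, 10):
    assert abs(A[t] - (alpha * A[t - 1] + beta * T[t])) < 1e-12
```

The assertions pass for any choice of α, β, and teacher sequence, which is the point of the derivation: under the constancy and geometric-decay assumptions, only lagged achievement and the current teacher are needed.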