# «Draft, April 21, 2008 Teacher Effects: What Do We Know? Helen F. Ladd Edgar Thompson Professor of Public Policy Studies and professor of economics Duke ...»

They develop their argument analytically and then illustrate its importance using a model designed to determine the relationship between the student achievement of fourth and fifth graders in New York City and the credentials of their teachers. The model is a gains model with student, grade and time fixed effects.7 Table 2 reports the results of their analysis for four of the teacher attributes included in the model: first year of experience (relative to no experience), not certified (relative to being certified), attended a competitive college (reference not clear), and a teacher math SAT score that is one standard deviation above the average (relative to a teacher with a mean SAT score). The estimated coefficients, three of which are statistically significant at the one percent level and one at the five percent level, are shown in the first column. As is common in this literature, these reported coefficients are expressed relative to the observed variation in the test score level.

** Ladd, Teacher Effects, Draft April 21, 2008**

In the second column they are expressed relative to the standard deviation of the observed test score gain, and in the third column relative to what the authors refer to as the gain in the universal (or “true”) test score.

The take-away point is that expressing the estimated coefficient relative to the standard deviation of the observed gains rather than to the standard deviation of the observed levels raises the effect sizes by about 50 percent, and expressing them relative to the standard deviation of the “true” gain score raises them by about 200 percent. Thus, assuming this approach makes sense, the estimated effect sizes – defined in terms of standard deviations of the “true” gains in test scores – appear much larger than they generally do. In particular, the effect of one year of teaching experience appears to raise achievement by an amount equal to 25 percent of a standard deviation of the gains in true achievement. Because this work on interpreting effect sizes is still in progress, it has not been fully vetted and some of the details of the estimates may change as the authors complete the research. I include it here because I believe that far more research along these lines would be useful to help both researchers and policy makers correctly interpret the magnitudes of the achievement effects that emerge from value added models of teacher credentials.
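The rescaling described above is simple arithmetic, and a small sketch may make it concrete. The coefficient and the standard deviations below are hypothetical values chosen only so that the ratios match the "about 50 percent" and "about 200 percent" figures in the text; they are not taken from Table 2. The sketch also assumes that both percentages are measured relative to the levels-based effect size.

```python
def effect_size(coef, sd):
    """Express a raw coefficient in units of the given standard deviation."""
    return coef / sd

# Hypothetical inputs, NOT values from Table 2.
coef = 0.05                   # raw coefficient in test-score units
sd_level = 1.0                # SD of observed test score levels
sd_gain = sd_level / 1.5      # SD of observed gains (assumed smaller)
sd_true_gain = sd_level / 3.0 # SD of "true" gains (assumed smaller still)

es_level = effect_size(coef, sd_level)
es_gain = effect_size(coef, sd_gain)
es_true_gain = effect_size(coef, sd_true_gain)

# Percentage increase in the effect size relative to the levels-based version.
pct_increase_gain = 100 * (es_gain / es_level - 1)
pct_increase_true = 100 * (es_true_gain / es_level - 1)

print(f"levels: {es_level:.3f}, gains: {es_gain:.3f}, true gains: {es_true_gain:.3f}")
print(f"increase vs levels: {pct_increase_gain:.0f}% and {pct_increase_true:.0f}%")
```

The same coefficient looks larger whenever the benchmark standard deviation is smaller, which is why the choice of denominator matters for interpretation.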

Strategies used by Clotfelter, Ladd and Vigdor to estimate the achievement effects of teacher credentials.

In a series of recent papers, my colleagues, Charles Clotfelter and Jacob Vigdor, and I have used a number of different strategies to estimate the effects of teacher credentials on student achievement. All the models are variations of the value added or gains models discussed in section II, and the research is based on rich administrative data from North Carolina accessed through the North Carolina Education Research Data Center. North Carolina is a particularly fitting state for this research because it has been administering end-of-grade tests for all students in grades 3-8 and end-of-course tests in certain subjects for all students at the high school level since the early 1990s, and all the tests are closely linked to the statewide curriculum. In addition, ever since 1996/97 teachers in schools that successfully raise student test scores have been eligible for salary bonuses. As a result, teachers have an incentive to teach the material included in the state curriculum and students have an incentive to learn it. Three of the papers (one of which is simply a longer version of a shorter published paper) focus on achievement at the elementary level and the fourth at the high school level. The main challenge we faced in this research was to devise credible ways of measuring the effects of teacher credentials given the nonrandom sorting of students and teachers among schools and across classrooms within schools.

Clotfelter, Ladd and Vigdor (2006). The initial paper in this sequence was based on cross sectional data for one cohort of fifth grade students. We began by documenting the positive matching of students to teachers both across schools and, to a much lesser extent, across classrooms within schools, where the term positive matching denotes that the more advantaged and higher performing students tend to have the teachers with the stronger credentials. Because the data were cross sectional, we were not able to use student fixed effects to address the expected upward bias in the estimates of teacher credentials.

Instead, we used three other strategies to minimize the bias. First, in addition to a standard set of student demographic variables, we added an extended set of student variables based on survey responses collected at the time the students were tested. These variables include information on time spent on homework, use of computers, and time spent watching television.

Including these explanatory variables was helpful in that it reduced the magnitude of the error term in the student achievement equation, thereby reducing the room for reverse causation.

Second, we added school-level fixed effects, which was feasible because of our ability to match students to teachers at the classroom level. The inclusion of these fixed effects meant that we were identifying the effects of teacher credentials based only on the variation across teachers within each school, and thereby eliminated much of the statistical problem that emerges because of the sorting of teachers across schools. Third, we addressed any remaining problems associated with the nonrandom assignment of students and teachers to classrooms within schools by restricting the analysis to the subsample of schools that, based on a number of observable characteristics, appeared to be assigning fifth grade students in a balanced way across classrooms, and hence to teachers within the school. Finally, we tested the validity of our strategies by observing how the coefficients of the teacher credentials differed in specifications with and without the prior year achievement term. Our logic was that the closer our statistical solutions approached the gold standard of a random experiment, the more similar our estimated coefficients of the teacher credentials should be in models with and without the prior achievement term. In the context of a true random experiment, controls for baseline characteristics of students are not needed to generate unbiased estimates of treatment effects.
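The logic of the school fixed effects strategy can be illustrated with a small simulation. This is a minimal sketch on invented data, not the model or data used in the paper: the sorting rule, the single binary credential, the sample sizes, and all names (`TRUE_EFFECT`, `ols_slope`, `demean_within`) are assumptions made for illustration. Teachers with the credential are made more likely to work in advantaged schools, so a pooled regression overstates the credential effect, while demeaning within schools (which is equivalent to including school fixed effects) uses only within-school variation.

```python
import random
from collections import defaultdict

def ols_slope(x, y):
    """Slope from a one-regressor OLS fit of y on x (with an intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_within(values, groups):
    """Subtract each group's mean from its members (the 'within' transformation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

random.seed(0)
TRUE_EFFECT = 0.10  # hypothetical effect of the credential on achievement

school, credential, score = [], [], []
for s in range(200):                    # 200 hypothetical schools
    advantage = random.gauss(0, 1)      # unobserved school-level advantage
    for _ in range(4):                  # 4 classrooms per school
        # Sorting: credentialed teachers are more common in advantaged schools.
        c = 1.0 if random.gauss(0, 1) + advantage > 0 else 0.0
        y = TRUE_EFFECT * c + 0.5 * advantage + random.gauss(0, 0.2)
        school.append(s)
        credential.append(c)
        score.append(y)

# Pooled estimate mixes the credential effect with school advantage.
pooled = ols_slope(credential, score)
# Within-school estimate uses only variation across teachers in the same school.
within = ols_slope(demean_within(credential, school), demean_within(score, school))

print(f"pooled: {pooled:.3f}, within-school: {within:.3f}, truth: {TRUE_EFFECT}")
```

The sketch addresses only sorting across schools; it says nothing about the nonrandom assignment of students to classrooms within a school, which is why the paper adds the balanced-schools restriction as a third strategy.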

Clotfelter, Ladd and Vigdor 2007a and 2007b. The first of these two papers is a shorter version of the second paper and focuses mainly on the results. The second paper provides the full models.

A logical next step was to use longitudinal data for multiple cohorts of students to explore the same set of issues. The advantage of the longitudinal data set was that we were able to include student fixed effects in our models. In fact, though, we estimated five different models for third through fifth graders to explore the effects of different specifications, including three that did not include student fixed effects. We did so because we concluded that none of the models was capable of generating perfectly clean estimates of the effects of teacher credentials.

Because testing does not start in North Carolina until grade three and it was not possible to identify a student’s specific teacher in math or reading after grade 5, we restricted the analysis to student test scores in grades 3-5. In models with prior year achievement, the models refer to test score levels or gains for grades four and five. The short panels for each of our cohorts ruled out any instrumental variable strategy to counter the bias that arises from having the lagged achievement variable as an explanatory variable.

For each model, we predicted the direction of the bias based on conceptual and empirical considerations. Model 2 is quite similar to the model in our 2006 cross sectional paper but is estimated here with multiple cohorts of students. As elaborated in a conceptual note by Steven Rivkin (2006) and shown in the bottom two rows of the following table, of particular interest is that the inclusion of student fixed effects is predicted to generate downward biased estimates of teacher credentials when the dependent variable is specified in levels and upward biased estimates when it is specified as an achievement gain. For reasons we spell out in the two papers, we prefer models 4 and 5, but with the recognition that model 4 provides a lower bound estimate of the effects of teacher credentials and model 5 an upper bound estimate.
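One way to see why a levels model and a gains model might bracket the truth is the following sketch. It rests on assumptions that are not stated in the text above: a cumulative achievement model in which prior achievement persists at rate \lambda, and a reading of model 4 as the levels specification and model 5 as the gains specification, both with student fixed effects. The symbols A (achievement), C (teacher credentials), \mu_i (student fixed effect), \lambda, \beta and \varepsilon are introduced here for illustration only.

```latex
\begin{align*}
\text{general form:}\quad
  A_{it} &= \lambda A_{i,t-1} + \beta C_{it} + \mu_i + \varepsilon_{it},
  \qquad 0 \le \lambda \le 1 \\[4pt]
\text{levels (imposes } \lambda = 0\text{):}\quad
  A_{it} &= \beta C_{it} + \mu_i + \varepsilon_{it} \\[4pt]
\text{gains (imposes } \lambda = 1\text{):}\quad
  A_{it} - A_{i,t-1} &= \beta C_{it} + \mu_i + \varepsilon_{it}
\end{align*}
```

If true persistence lies strictly between 0 and 1, neither restriction holds exactly, which is consistent with treating the two estimates as a lower and an upper bound. The direction of each bias is the one reported in the text and is not derived here.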


Clotfelter, Ladd, and Vigdor (2007c). Most of the literature on teacher credentials, including the three papers just discussed, focuses on teachers at the elementary level. In this final paper, we shift the focus to the relationship between student achievement in five courses typically taken by students in ninth and tenth grade (English I, algebra I, geometry, biology, and economics, law and politics) and the credentials of their teachers. This high school analysis is feasible in the North Carolina context because of the existence of statewide end-of-course tests in each of these subjects.

This paper makes a methodological contribution by its use of student fixed effects in the context of a model estimated across subjects rather than, as is more typical in this literature, over time. The inclusion of the student fixed effects means, as would be the case in longitudinal studies, that the effects of teacher credentials are estimated within students. In this case, that means they are based only on the variation in teacher credentials across the subjects for each specific student. This approach goes a long way to addressing the selection problem provided students are assigned to classrooms based on their overall, or average, ability or motivation, rather than on their likely success in a specific subject. Although we provide evidence in the paper in support of this key assumption, we cannot completely rule out the possibility that subject-specific unobservable abilities of students may be correlated with the teachers to whom they are assigned. This concern is analogous to the concern that arises in the context of longitudinal models about the role of time-varying student characteristics in the assignment of students to classrooms.
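The role of the key assumption can be illustrated with a small simulation. This is a minimal sketch on invented data, not the paper's model: the single binary credential, the assignment rule, the sample sizes, and all names (`TRUE_EFFECT`, `within_student_slope`, `simulate`) are hypothetical. Each simulated student takes several subjects; the estimator demeans scores and credentials within student, so it uses only cross-subject variation. It is run twice, once with assignment driven by overall ability and once with assignment also driven by subject-specific ability.

```python
import random

TRUE_EFFECT = 0.10  # hypothetical effect of the credential on achievement

def within_student_slope(rows):
    """OLS slope of score on credential after demeaning both within student.

    rows: list of (student_id, credential, score), several subjects per student.
    """
    by_student = {}
    for sid, c, y in rows:
        by_student.setdefault(sid, []).append((c, y))
    sxy = sxx = 0.0
    for obs in by_student.values():
        c_bar = sum(c for c, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for c, y in obs:
            sxy += (c - c_bar) * (y - y_bar)
            sxx += (c - c_bar) ** 2
    return sxy / sxx

def simulate(sort_on_subject_ability, n_students=2000, n_subjects=5, seed=1):
    """Generate (student, credential, score) rows under one assignment rule."""
    rng = random.Random(seed)
    rows = []
    for sid in range(n_students):
        overall = rng.gauss(0, 1)            # ability common to all subjects
        for _ in range(n_subjects):
            specific = rng.gauss(0, 0.5)     # subject-specific ability
            driver = overall + (specific if sort_on_subject_ability else 0.0)
            # Stronger students are more likely to get a credentialed teacher.
            c = 1.0 if driver + rng.gauss(0, 1) > 0 else 0.0
            y = TRUE_EFFECT * c + overall + specific
            rows.append((sid, c, y))
    return rows

# Assignment on overall ability only: the student fixed effect absorbs the sorting.
est_overall = within_student_slope(simulate(False))
# Assignment also on subject-specific ability: sorting survives the demeaning.
est_specific = within_student_slope(simulate(True))

print(f"sorting on overall ability:          {est_overall:.3f}")
print(f"sorting on subject-specific ability: {est_specific:.3f}")
print(f"true effect:                         {TRUE_EFFECT}")
```

Under the first rule the within-student estimate should land near the true effect; under the second it is pushed upward, which is the possibility the paragraph above says cannot be completely ruled out.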

Effects of credentials: Differences between elementary and high school teachers.

Tables 3 to 8 summarize the results for various teacher credentials for teachers in elementary and high schools that emerge from the Clotfelter, Ladd and Vigdor research.

In all cases the entries emerge from models that include all the other teacher credentials, student fixed effects and a variety of other covariates. The tables are designed to highlight both the similarities and differences in the estimates. The elementary school results come from model 4 described above which is based on longitudinal data with student fixed effects. The reported results should be interpreted as lower bound estimates.

Consider first the results for teacher experience in Table 3. Consistent with other studies, the table shows that two-thirds of the impact of teacher experience (and about 5/6 of it at the high school level) is associated with the first few years of experience.8 In addition, as is quite common in the literature, at the elementary level the estimates are larger for math than for reading. The subsequent tables report elementary school results only for math; in all cases the unreported comparable coefficients for reading are somewhat smaller. Further, the similarity in the results for teacher experience for the first two years for elementary and high school teachers in North Carolina, and also with the 0.06 estimate for New York City reported in Table 2 above, is striking given the different methods and data used.

8 None of the models from which these results emerge includes teacher fixed effects. Hence it is not possible to determine from this table alone whether the patterns of the estimates over time reflect learning on the job or the patterns of teacher attrition, either out of the elementary schools or out of the core ninth and tenth grade courses at the high school level. This issue is discussed in more detail and with additional estimates in both the cited papers.

Subsequent tables for other credentials also show remarkable similarities between the elementary and high school results. At both levels, non-regular licensure, including lateral entry licenses, is negatively associated with student achievement relative to regular licensure, with the effects somewhat larger at the high school level (Table 4). Also at both levels, it appears that teachers who subsequently are National Board Certified are more effective than other teachers, providing support for the view that Board Certification identifies the more effective teachers.

But, in contrast to the elementary level, at the high school level it appears that Certification itself may be associated with higher student achievement, leading to what some have called a positive human capital effect of the Board Certification process (Table 5). With respect to master’s degrees it appears that elementary school teachers who invest in a master’s degree part way into their teaching career are somewhat less effective than other teachers. At the high school level, the evidence is a bit more positive, in that such master’s degrees are predictive of small positive effects on student achievement (Table 6).

As shown in Table 7, the predictive effects of teacher licensure tests are similar at the two levels. At the high school level, we are able to look at the relationship between subject-specific teacher test scores and student performance by subject. Emerging from that analysis is the finding that higher teacher licensure test scores in math are clearly predictive of higher student achievement in algebra and geometry. Contrary to our expectations, students in English I do less well, all else held constant, when they have teachers with higher licensure test scores in English.

Finally, in Table 8, we report results by certification status, but only at the high school level. The results indicate that certification matters most for math and science.