-- PRELIMINARY DRAFT, DO NOT QUOTE, COMMENTS WELCOME --

Value-Added Models and the Measurement of Teacher Quality
Douglas Harris Tim R. Sass
Dept. of Educational Leadership & Policy Studies Dept. of Economics
Florida State University Florida State University
Original Version: March 9, 2005
This Version: April 3, 2006

Abstract

The recent availability of administrative databases that track individual students and their teachers over time has led to both a surge in research measuring teacher quality and interest in developing accountability systems for teachers. Existing studies employ a variety of empirical models, yet few explicitly state or test the assumptions underlying their models. Using an extensive database from the State of Florida, we test many of the central assumptions of existing models and determine the impact of alternative methods on measures of teacher quality. We find that the commonly used "restricted value-added" or "achievement-gain" model is a good approximation of the more cumbersome cumulative achievement model. Within the context of the restricted value-added model, we find it is important to control for unmeasured student, teacher and school heterogeneity. Relying on measurable characteristics of students, teachers and schools alone likely produces inconsistent estimates of the effects of teacher characteristics on student achievement. Moreover, individual-specific heterogeneity is more appropriately captured by fixed effects than by random effects; the random effects estimator yields inconsistent parameter estimates and estimates of time-invariant teacher quality that diverge significantly from those of the fixed effects estimator. In contrast, the exclusion of peer characteristics and of class size each has relatively little effect on the estimates of teacher quality. Using aggregated grade-within-school measures of teacher characteristics produces somewhat less precise estimates of the impact of teacher professional development than do measures of the characteristics of specific teachers; otherwise, aggregation to the grade level does not have a substantial effect. These findings suggest that many models currently employed to measure the impact of teachers on student achievement are mis-specified.
* We wish to thank the staff of the Florida Department of Education's K-20 Education Data Warehouse for their assistance in obtaining and interpreting the data used in this study. The views expressed in this paper are solely our own and do not necessarily reflect the opinions of the Florida Department of Education. This work is supported by Teacher Quality Research grant R305M040121 from the United States Department of Education Institute for Education Sciences. Thanks also go to Anthony Bryk for useful discussion of this research.
I. Introduction

In the last decade the availability of administrative databases that track individual student achievement over time has radically altered how education research is conducted and has brought fundamental changes to the ways in which educational programs and personnel are evaluated. Prior to the development of the Texas Schools Project by John Kain in the 1990s,1 studies of student achievement and the role of teachers in student learning were limited largely to cross-sectional analyses of student achievement levels or simple two-period studies of student achievement gains. Now, in addition to Texas, statewide longitudinal databases exist in North Carolina and Florida as well as in large urban school districts. Access to these longitudinal databases has allowed researchers to measure changes in achievement at the individual student level, thereby controlling for the influences of students and families when evaluating educational programs.
The availability of student-level panel data is also fundamentally changing school accountability and the measurement of teacher performance. In Tennessee and Dallas, models of individual student achievement have been used to measure teacher performance.2 While the stakes are currently low in these cases, there is growing interest among scholars and policymakers alike in using such measures for high-stakes merit pay, school grades, and other forms of accountability. Denver and Houston have recently adopted merit pay systems based on student performance, and Florida plans to implement a statewide system beginning in the 2006-2007 school year.
The use of student-level longitudinal data in education research and systems of accountability is likely to expand even more rapidly in the coming years. With the new federal No Child Left Behind statute, testing requirements will increase so that all students will be tested in grades 3-8 in every state.
Thus in a few years, all states will have the capability to track student achievement over time. This wealth of new data will bring great opportunities as well as significant challenges for the analysis of educational programs and policies.

1 The Texas Schools Project was begun in 1992, but it took several years to create a unified database and the first research to exploit the data was not written until 1998.

2 See Sanders and Horn (1998) and Mendro (1998) and references therein.
In just the last few years, a plethora of studies have made use of the new student-level panel data sets to analyze the determinants of student achievement. However, no consensus has developed on the appropriate model specifications and empirical methods. In most cases the assumptions underlying the empirical models employed are unstated and untested and rarely are comparisons made between alternative methods.
Two recent studies, Todd and Wolpin (2005) and Ding and Lehrer (2005), investigate alternative forms of the cumulative achievement function, emphasizing the impact of historical home and schooling inputs on current achievement. Neither is directly concerned, however, with measuring the impact of teachers on student learning. Todd and Wolpin focus on the effect of family inputs on educational outcomes. Assignment of teachers to students within a school is assumed to be exogenous and only school-level averages of teacher inputs are used in their analysis. Ding and Lehrer exploit data from the Tennessee class-size experiment where students were randomly assigned to teachers and thus avoid the problems associated with measuring teacher quality.
In this paper we consider some of the same specification issues that are tested by Todd and Wolpin and Ding and Lehrer, but also investigate what factors are important in obtaining relatively consistent and precise estimates of the impact of teachers on student achievement. In section II we consider the general form of achievement functions and the effect of prior educational inputs on contemporaneous student achievement. Section III analyzes the measurement of schooling inputs that may influence student achievement, including peers, teachers and school-level variables. In section IV we discuss alternative methods of controlling for student and family characteristics. Section V discusses our data and in section VI we present our results. In the final section we summarize our findings and consider the implications for future research and for the implementation of accountability systems.
II. Achievement Model and the Treatment of Past Inputs

A. General Cumulative Model of Achievement

In order to clearly delineate the empirical models that have been estimated, we begin with a general cumulative model of student achievement in the spirit of Todd and Wolpin (2003):

Ait = At[Xi(t), Fi(t), Ei(t), µi0, εit]    (1)

where Ait is the achievement level for individual i at the end of the tth year of life, and Xi(t), Fi(t) and Ei(t) represent the entire histories of individual, family and school-based educational inputs, respectively. The term µi0 is a composite variable representing the time-invariant characteristics with which an individual is endowed at birth (such as innate ability), and εit is a normally distributed, mean-zero error.
If we assume that the cumulative achievement function, At[⋅], does not vary with age3 and is additively separable,4 then we can rewrite the achievement level at age t as:

Ait = α1Xit + α2Xit-1 + … + αtXi1 + ϕ1Fit + ϕ2Fit-1 + … + ϕtFi1 + β1Eit + β2Eit-1 + … + βtEi1 + ψtµi0 + εit    (2)

where α1, ϕ1 and β1 represent the vectors of weights given to contemporaneous individual, family and school inputs, α2, ϕ2 and β2 the weights given to last year's inputs, and so on. The impact of the individual-specific time-invariant endowment in period t is given by ψt.

3 This assumption implies that the impact of an input on achievement varies with the time span between the application of the input and the measurement of achievement, but is invariant to the age at which the input was applied. Thus, for example, attending a private school in kindergarten has the same effect on achievement at the end of third grade as attending a private school in second grade has on fifth-grade achievement.

4 Figlio (1999) explores the impact of relaxing the assumption of additive separability by estimating a translog education production function.
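To make the additively separable specification in equation (2) concrete, the following sketch simulates it for a cohort of students. All dimensions and weight vectors are invented purely for illustration (and the error term is omitted); nothing here comes from the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 500, 5  # hypothetical: 500 students observed through year t = 5

# Input histories (rows = students, columns = years 1..t) and endowment.
X = rng.normal(size=(n, t))   # time-varying individual inputs
F = rng.normal(size=(n, t))   # family inputs
E = rng.normal(size=(n, t))   # school-based inputs
mu0 = rng.normal(size=n)      # time-invariant endowment (innate ability)

# Illustrative weight vectors: entry s applies to the input from s years
# before the test (s = 0 is contemporaneous, playing the role of
# alpha_1, phi_1, beta_1 in equation (2)).
alpha = np.array([0.50, 0.25, 0.12, 0.06, 0.03])
phi   = np.array([0.40, 0.20, 0.10, 0.05, 0.02])
beta  = np.array([0.60, 0.30, 0.15, 0.08, 0.04])
psi_t = 1.0

def achievement(X, F, E):
    """Equation (2): weighted sums of current and all prior inputs."""
    hist = lambda M, w: sum(w[s] * M[:, t - 1 - s] for s in range(t))
    return hist(X, alpha) + hist(F, phi) + hist(E, beta) + psi_t * mu0

# Additive separability: raising this year's school input by one unit
# shifts achievement by exactly beta[0], regardless of the other inputs.
E_plus = E.copy()
E_plus[:, -1] += 1.0
print(np.allclose(achievement(X, F, E_plus) - achievement(X, F, E), beta[0]))  # True
```

The final check illustrates what separability buys: each input's marginal effect depends only on how long ago it was applied, never on the levels of the other inputs.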
B. Cumulative Model with Fixed Family Inputs Estimation of equation (2) requires data on both current and all prior individual, family and school inputs. However, administrative records contain only limited information on family characteristics and no direct measures of parental inputs.5 Therefore, it is necessary to assume that family inputs are constant over time and are captured by a student-specific fixed component, ζi. However, the marginal effect of these fixed parental inputs on student achievement may vary over time and is represented by κt.
The assumption of fixed parental inputs of course implies that the level of inputs selected by families does not vary with the level of school-provided inputs a child receives. For example, it is assumed that parents do not systematically compensate for low-quality schooling inputs by providing tutors or other resources.6 Similarly, it is assumed that parental inputs are invariant to achievement realizations; parents do not increase their inputs when their son or daughter does poorly in school.
The validity of the assumption that family inputs do not change over time is hard to gauge. Todd and Wolpin (2005), using data from the National Longitudinal Survey of Youth 1979 Child Sample (NLSY79-CS), consistently reject exogeneity of family input measures at a 90 percent confidence level, but not at a 95 percent confidence level. They have only limited aggregate measures of schooling inputs (average pupil-teacher ratio and average teacher salary measured at the county or state level), and the coefficients on these variables are typically statistically insignificant, whether or not parental inputs are treated as exogenous. Thus it is hard to know to what extent the assumed invariance of parental inputs may bias the estimated impacts of schooling inputs. It seems reasonable, however, that parents would attempt to compensate for poor school resources, in which case any bias in the estimated impacts of schooling inputs would be toward zero.

5 Typically the only information on family characteristics is a student's participation in free/reduced-price lunch programs, a crude measure of family income. Data in North Carolina also include teacher-reported levels of parental education.

6 For evidence on the impact of school resources on parental inputs, see Houtenville and Conway (2003) and Bonesrønning (2004).
Given the assumption that family inputs and student ability are time-invariant (but may have differing marginal effects), we can combine the individual endowment and family inputs
into a single component, ωtχi = κtζi + ψtµi0. The achievement equation becomes:

Ait = α1Xit + α2Xit-1 + … + αtXi1 + β1Eit + β2Eit-1 + … + βtEi1 + ωtχi + εit    (3)
Equation (3) represents our baseline model – the least restrictive specification of the cumulative achievement function that can conceivably be estimated with administrative data. In this very general specification, current achievement depends on current and all prior individual time-varying characteristics and school-based inputs, as well as the student's (assumed time-invariant) family inputs and the fixed individual endowment.
Given the burdensome data requirements and computational cost of the full cumulative model, equation (3) has never been directly estimated for a large sample of students.7 Rather, various assumptions have been employed to reduce the historical data requirements. There are several ways to avoid direct estimation of these lagged effects, based on assumptions about the persistence of input effects. Below we consider these specifications and the associated restrictions on the cumulative achievement function, moving from the least restrictive to the most restrictive specification.

7 Todd and Wolpin (2005) estimate the cumulative achievement model using a sample of approximately 7,000 students from the NLSY79-CS. Although they possess good measures of parental inputs and achievement levels, they have only a few general measures of schooling inputs, measured at the county or state level.
C. The Unrestricted Value-Added Specification8

Suppose the marginal impacts of all prior school inputs decline geometrically with the time between the application of the input and the measurement of achievement, at a common rate, so that β2 = λβ1, β3 = λ²β1, etc., where λ is a scalar. The achievement equation can then be rewritten as:

Ait = λAit-1 + α1Xit + β1Eit + (ωt − λωt-1)χi + εit − λεit-1

Assuming the impact of the initial endowment and family inputs on achievement, ωt, changes at a constant rate, the term (ωt − λωt-1) can be expressed as a constant, ϖ. Combining the family inputs with the initial individual endowment into a single component yields:

Ait = λAit-1 + α1Xit + β1Eit + γi + ηit    (7)

where γi = ϖχi is an individual student effect and ηit = εit - λεit-1 is a random error.

8 This specification is also sometimes referred to as the "covariate adjustment model" in the education literature.
Thus, given the assumed geometric rate of decay, the current achievement level is a function of contemporaneous student and school-based inputs as well as lagged achievement and an individual-specific effect. The lagged achievement variable serves as a sufficient statistic for all past schooling inputs, thereby avoiding the need for historical data on teachers, peers and other school-related inputs.
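As a quick numerical check of this sufficiency claim, the sketch below builds one student's achievement history with geometrically decaying school-input effects (only the E inputs, with the endowment and noise omitted for clarity; the weight and decay rate are made up) and verifies that quasi-differencing by λ leaves only the contemporaneous input.

```python
import numpy as np

rng = np.random.default_rng(1)
beta1, lam, T = 0.6, 0.5, 8   # illustrative weight, decay rate, years

E = rng.normal(size=T)  # one student's school-input history

# Achievement with geometric decay: an input applied s years before the
# test enters with weight beta1 * lam**s (endowment and noise omitted).
A = np.array([sum(beta1 * lam**s * E[t - s] for s in range(t + 1))
              for t in range(T)])

# Quasi-differencing: A_t - lam * A_{t-1} collapses the entire input
# history down to the contemporaneous input, so the lagged score is a
# sufficient statistic for all past schooling inputs.
residual = A[1:] - lam * A[:-1] - beta1 * E[1:]
print(np.allclose(residual, 0.0))  # True
```

The residual is exactly zero here because every lagged term in A_t is λ times the corresponding term in A_{t-1}; with noise and an endowment added back, the same cancellation produces the γi and ηit terms of equation (7).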
Ordinary least squares (OLS) estimation of equation (7) is problematic. Since ηit is a function of the lagged error, εit-1, the lagged achievement term, Ait-1, is correlated with the error term in equation (7), ηit, and OLS estimates of equation (7) will in general be biased. A number of studies have nonetheless estimated the effect of teacher quality on student achievement by applying OLS to equation (7), ignoring the correlation between the lagged dependent variable and the error (e.g., Aaronson, Barrow and Sander (2003), Clotfelter, Ladd and Vigdor (2005), Goldhaber and Brewer (1997) and Nye, Konstantopoulos and Hedges (2004)). To obtain consistent estimates it is necessary to use an instrumental variables technique, typically incorporating At-2 and longer lags as instruments. Because of the data requirements and computational burden this has rarely been done; two exceptions are Ding and Lehrer (2005) and Sass (2006).
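To illustrate both the bias and the instrumental-variables remedy, here is a self-contained Monte Carlo sketch. The data are simulated, the parameter values are invented, and the student effect γi is set to zero so that the MA(1) error is the sole source of bias; it is a demonstration of the econometric point, not a replication of any estimate in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T = 20000, 6               # simulated students and years
lam_true, beta_true = 0.5, 0.6

E = rng.normal(size=(n, T))    # contemporaneous school inputs
eps = rng.normal(size=(n, T))  # i.i.d. achievement shocks

# Build scores under equation (7) with gamma_i = 0, so the only
# econometric problem is the MA(1) error eta_t = eps_t - lam*eps_{t-1}.
A = np.empty((n, T))
A[:, 0] = beta_true * E[:, 0] + eps[:, 0]
for t in range(1, T):
    A[:, t] = (lam_true * A[:, t - 1] + beta_true * E[:, t]
               + eps[:, t] - lam_true * eps[:, t - 1])

# Pool years t = 2..T-1 so the instrument A_{t-2} is available.
y, A1, A2, Et = (A[:, 2:].ravel(), A[:, 1:-1].ravel(),
                 A[:, :-2].ravel(), E[:, 2:].ravel())
const = np.ones_like(y)

# OLS: A_{t-1} contains eps_{t-1}, which also enters eta_t, so the
# estimate of lambda is inconsistent.
ols = np.linalg.lstsq(np.column_stack([A1, Et, const]), y, rcond=None)[0]

# 2SLS with A_{t-2} instrumenting for A_{t-1}: A_{t-2} predicts A_{t-1}
# but is uncorrelated with both eps_t and eps_{t-1}.
Z = np.column_stack([A2, Et, const])
A1_hat = Z @ np.linalg.lstsq(Z, A1, rcond=None)[0]
iv = np.linalg.lstsq(np.column_stack([A1_hat, Et, const]), y, rcond=None)[0]

print(f"true lambda = {lam_true}, OLS = {ols[0]:.3f}, 2SLS = {iv[0]:.3f}")
```

In this simulation OLS understates λ by a wide margin, while the 2SLS estimate lands close to the true value; with a nonzero student effect γi the twice-lagged score would no longer be a valid instrument on its own, which is why controls for individual heterogeneity remain necessary.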
D. The Restricted Value-Added Specification Rather than assume a constant rate of decay in the impact of schooling inputs on student achievement, an alternative approach is to assume that there is no decay in the effect of past
9 As noted by Boardman and Murnane (1979) and Todd and Wolpin (2003), this implies that the effect of each input must be independent of when it is applied. In other words, school inputs each have an immediate one-time impact on achievement that does not decay over time. For example, the quality of a child's kindergarten must have the same impact on their achievement at the end of age 5 as it does on their achievement at age 18.