
Consortium for Educational Research and Evaluation–North Carolina




Comparing Value-Added Models for

Estimating Individual Teacher Effects

on a Statewide Basis

Simulations and Empirical Analyses

Roderick A. Rose, Department of Public Policy & School of Social Work, University of North Carolina at Chapel Hill

Gary T. Henry, Department of Public Policy & Carolina Institute for Public Policy, University of North Carolina at Chapel Hill

Douglas L. Lauen, Department of Public Policy & Carolina Institute for Public Policy, University of North Carolina at Chapel Hill

August 2012

Table of Contents

Executive Summary


The Potential Outcomes Model

Stable Unit Treatment Value Assumption (SUTVA)


Violations of Assumptions

Typical Value-Added Models

Nested Random Effects Models

Fixed Effects Models

Hybrid Fixed and Random Effects Models

Summary of Models

VAM Comparison Studies


Data Generation Process

Variance Decomposition Simulation

Heterogeneous Fixed Effects Simulation

Calibration of Inputs

Number of Simulations

Actual NC Data Analysis

Comparison Criteria


Spearman Rank Order Correlations

Agreement on Classification in Fifth Percentiles

False Positives: Average Teacher Identified as Ineffective



Limitations and Implications




Consortium for Educational Research and Evaluation–North Carolina Comparing Value-Added Models August 2012




Executive Summary

Many states are currently adopting value-added models for use in formal evaluations of teachers. We evaluated nine commonly used teacher value-added models on four criteria using both actual and simulated data. For the simulated data, we tested model performance under two violations of the potential outcomes model: settings in which the stable unit treatment value assumption was violated, and settings in which the ignorability of assignment to treatment assumption was violated. The performance of all models suffered when the assumptions were violated, suggesting that none of the models performed sufficiently well to be considered for high stakes purposes. Patterns of relative performance emerged, however, which we argue provide sufficient support for using four value-added models for low stakes purposes: the three-level hierarchical linear model with one year of pretest scores, the three-level hierarchical linear model with two years of pretest scores, the Educational Value-Added Assessment System (EVAAS) univariate response model, and the student fixed effects model.

Introduction

A wide body of research into the effects of schooling on student learning suggests that teachers are the most important inputs and, consequently, that improving the effectiveness of teachers is a legitimate and important policy target to increase student achievement (Rockoff, 2004; Nye, Konstantopoulos, & Hedges, 2004; Rowan, Correnti, & Miller, 2002). In order for education policymakers and administrators to use teacher effectiveness to achieve student performance goals, they must have accurate information about the effectiveness of individual teachers. A relatively recent but often recommended approach for obtaining teacher effectiveness estimates for use in large-scale teacher evaluation systems relies on value-added models (VAMs) to estimate the contribution of individual teachers to student learning; that is, to estimate the gains in student achievement that each teacher contributes rather than focusing on levels of student achievement (Tekwe, Carter, Ma, Algina, Lucas, et al., 2004). These VAMs rely on relatively complex statistical methods to estimate the teachers’ incremental contributions to student achievement. Value-added models could be viewed as primarily descriptive measurement models or putatively causal models that attribute a portion of student achievement growth to teachers (Rubin, Stuart, & Zanutto, 2004); we take the latter view in this study.

Proponents maintain that VAM techniques evaluate teachers in a more objective manner than by observational criteria alone (Harris, 2009). By holding teachers to standards using outcomes, policymakers could move away from standards based on inputs in the form of educational and credentialing requirements and principals’ or others’ more subjective observations of teachers’ practices (Gordon, Kane, & Staiger, 2006; Harris, 2009). There are concerns that VAMs may not be fair appraisals of teachers’ effectiveness because they may attribute confounding factors, unrelated to instruction, to the teacher (Hill, 2009). Further, evidence suggests that teacher effectiveness scores may vary considerably from year to year (Sass, 2008; Koedel & Betts, 2011), despite teachers’ contentions that they do not vary their teaching style (Amrein-Beardsley, 2008), suggesting that the year-to-year variability is unrelated to teacher effectiveness. While the controversies about the accuracy and utility of VAMs continue to swirl, many states have agreed to incorporate measures of teacher effectiveness in raising student test scores into their teacher evaluations in order to receive federal Race to the Top (RttT) funds or achieve other policy objectives. The uses of teacher VAM estimates in the evaluation process vary from low stakes consequences, by which we mean an action such as developing a professional development plan; to middle stakes, by which we mean actions such as identifying teachers for observation, coaching, and review; to high stakes, by which we mean denial of tenure or identifying highly effective teachers for substantial performance bonuses. In spite of the commitment by many states to use a VAM for estimating teachers’ effectiveness, there is no consensus within the research community on the approach or approaches that are most appropriate for use. Given these concerns and the widespread use of these models for teacher evaluation, evidence on the relative merits of VAMs is needed.

Several techniques for estimating VAMs have been compared using simulated or actual data (Guarino, Reckase, & Wooldridge, 2012; Schochet & Chiang, 2010; McCaffrey, Lockwood, Koretz, Louis, & Hamilton, 2004; Tekwe et al., 2004). Tekwe et al. (2004) used actual data, while McCaffrey et al. (2004) used both simulation data and actual data. Simulation studies have used either correlated fixed effects (Guarino, Reckase, & Wooldridge, 2012; McCaffrey et al., 2004) or variance decomposition frameworks for data generation (Schochet & Chiang, 2010). To date, no study has used both correlated fixed effects and variance decomposition simulated data as well as actual data. The present study aims to provide a more comprehensive assessment and uses all three types of data. Moreover, the present study compares nine common VAMs, more than in any other study published to date. Finally, we compare VAMs using the rankings of teachers in the true and estimated distributions of effectiveness using four distinct criteria that are relevant to policymakers, administrators, and teachers.

We compare these nine VAMs using simulated and actual data based on criteria that include their ability to recover true effects, consistency, and false positives in identifying particularly ineffective teachers (or ineffective teachers; the results are nearly identical). To determine which VAMs best handle the confounding influence of non-instructional factors and identify ineffective teachers, we generate simulation data with known teacher effectiveness scores and compare these scores with the teacher effectiveness estimates from each VAM. We use actual data from a statewide database of student and teacher administrative records to examine the consistency between models and the relative performance of each VAM in year-to-year consistency in teacher ranking.
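The ranking-based criteria named here can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function names, the 5% cutoff parameter `share`, and the definition of a false positive as a flagged teacher whose true effect is at or above the median are all assumptions made for the example.

```python
import statistics

def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result

def spearman(x, y):
    """Spearman rank-order correlation: the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def bottom_group(values, share=0.05):
    """Indices of the teachers in the lowest `share` of the distribution."""
    k = max(1, int(len(values) * share))
    order = sorted(range(len(values)), key=lambda i: values[i])
    return set(order[:k])

def bottom_agreement(true_effects, estimates, share=0.05):
    """Fraction of truly bottom-group teachers that the estimates also flag."""
    truly_low = bottom_group(true_effects, share)
    flagged = bottom_group(estimates, share)
    return len(truly_low & flagged) / len(truly_low)

def false_positive_share(true_effects, estimates, share=0.05):
    """Fraction of flagged teachers whose true effect is at or above the median
    (an illustrative definition chosen for this sketch)."""
    median = statistics.median(true_effects)
    flagged = bottom_group(estimates, share)
    return sum(true_effects[i] >= median for i in flagged) / len(flagged)
```

In a simulation, `true_effects` would hold the generated teacher effects and `estimates` the output of one VAM; with actual data, where true effects are unknown, the same functions could compare two models or two years of estimates.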

In this study, we take the view that teacher effect estimates from VAMs are putatively causal, even though the conditions for identifying causal effects may not be present. Therefore, we first discuss the potential outcomes model (Reardon & Raudenbush, 2009; Rubin, Stuart, & Zanutto, 2004; Holland, 1986). Subsequently, we introduce seven common VAMs and then review existing studies comparing VAMs for teacher effect estimation in the context of the potentially unrealistic demands placed on these VAMs by the potential outcomes model. We then discuss the methods in the present study, including the data generation process for both simulations and the characteristics of the actual data, the form of the nine models compared, and the comparison techniques. We follow with the results of these comparisons. In the final section, we discuss the implications of these findings for additional research into VAMs for teacher effect estimation and implementation as a teacher evaluation tool.

The Potential Outcomes Model

Value-added models, in economic terms, measure the output resulting from combining inputs with technology (i.e., a process; Todd & Wolpin, 2003). If estimates from VAMs of student assessment data are to be inferred as and labeled teacher effect estimates then they should be viewed as causal estimates of teachers’ contributions to student learning. That is, the value added estimands are not simply descriptions of students’ average improvement in performance under a given teacher, but are effect estimates causally attributed to the teacher. This view coincides with the use of VAMs in education policies such as teacher evaluation. It is widely acknowledged that the process by which the teacher causes student learning does not have to be specified (see, for example, Todd & Wolpin, 2003; Hanushek, 1986). It is not as widely understood that the process by which students learn does not have to be fully specified in order to identify a causal teacher effect. The causality of the estimand from a VAM can instead be derived from assumptions that are independent of model specification (Rubin, Stuart, & Zanutto, 2004).

The assumptions of the potential outcomes model, if met, support the causal inference of teacher effect estimates from VAMs (Reardon & Raudenbush, 2009; Rubin, Stuart, & Zanutto, 2004; Holland, 1986). The central feature of the potential outcomes model is the counterfactual: the definition of the causal estimand of a teacher’s effect on a student depends on what the student experiences in the absence of the specified cause, that is, under any other teacher besides the one to which the student was assigned. This enables us to ignore inputs to cumulative student knowledge that are equalized over different treatment conditions and are not confounded with treatment assignment. A formal model for causality begins as follows. First, assume that the outcome for student $i$ (with $i = 1, \dots, N$) under teacher $j$ is $Y_{ij}$. Second, assume that each teacher is a separate treatment condition from $J$ possible treatments, and each student has one potential or latent outcome under each possible teacher (of which at most one can actually be realized). This is a many-valued treatment (Morgan & Winship, 2007) with the potential outcomes represented by a matrix of $N$ students by $J$ treatments (Reardon & Raudenbush, 2009).

Because only one such treatment can be identified (the fundamental problem of causal inference; Holland, 1986), the treatment effect is defined as a function of the distributions of students assigned to teacher $j$ and the students under any other teacher. Generally, this is implemented using linear models based on the average treatment effect for teacher $j$ ($\delta_j$), comprised of students observed under assignment to teacher $j$ compared to the other teachers, which we label as $.j$ (not $j$), e.g., a simple mean difference $\delta_j = \Delta_j - \Delta_{.j} = E[Y_{ij}] - E[Y_{i.j}]$. An obvious candidate for $.j$ is the teacher at the average level of effectiveness.
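To make the matrix of potential outcomes and the simple mean difference concrete, the sketch below simulates a small example. Everything in it is an assumption for illustration (three teachers, the effect sizes in `true_effect`, normally distributed student ability, and purely random assignment); it is not the data generation process used in the study.

```python
import random

random.seed(0)

N, J = 3000, 3                      # students and teachers (illustrative sizes)
true_effect = [-0.2, 0.0, 0.2]      # hypothetical teacher effects, in score SD units

# Potential outcomes matrix: Y[i][j] is student i's score if taught by teacher j.
ability = [random.gauss(0, 1) for _ in range(N)]
Y = [[ability[i] + true_effect[j] + random.gauss(0, 0.5) for j in range(J)]
     for i in range(N)]

# Each student is assigned to exactly one teacher, so only one entry of each
# row is ever observed (the fundamental problem of causal inference).
assigned = [random.randrange(J) for _ in range(N)]
observed = [Y[i][assigned[i]] for i in range(N)]

def mean(values):
    return sum(values) / len(values)

def mean_difference(j):
    """Mean observed score under teacher j minus the mean under all other teachers."""
    own = [observed[i] for i in range(N) if assigned[i] == j]
    others = [observed[i] for i in range(N) if assigned[i] != j]
    return mean(own) - mean(others)

for j in range(J):
    print(j, round(mean_difference(j), 3))
```

Because assignment here is random, the observed mean difference for each teacher should fall close to that teacher's effect relative to the average of the other teachers; the two assumptions discussed next describe conditions under which this comparison breaks down.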

Reardon and Raudenbush (2009) identified six defining, estimating, and identifying assumptions of causal school effects that they suggested are also appropriate for teacher effects, two of which we make explicit here. Defining assumptions include (1) each student has a potential outcome under each teacher in the population (manipulability); and (2) the potential outcome under each teacher is independent of the assignment of other participants (the stable unit treatment value assumption, or SUTVA). Estimating assumptions include (3) students’ test scores are on an interval scaled metric; and (4) causal effects are homogeneous. Identifying assumptions, when satisfied, make it possible to infer the treatment effect as causal despite the fundamental problem of causal inference that only one of J potential outcomes can be realized. These assumptions include (5) strongly ignorable or unconfounded assignment to teachers; and (6) each teacher is assigned a “common support” of students, which may be relaxed to assume that each teacher is assigned a representatively heterogeneous group of students, to estimate an effect that applies to all types of students. This last assumption may alternatively be met by extrapolation of any teacher’s unrepresentative group of students to students that the teacher was not assigned if the functional form of the model (e.g., linear or minimal deviations from linearity) supports such extrapolation.

Building on the formal model of causality discussed above, this section presents a formal discussion of two of the six assumptions of the potential outcomes model that are relevant to the comparison between VAMs in the present study, drawing heavily on Reardon and Raudenbush (2009).

Stable Unit Treatment Value Assumption (SUTVA)

SUTVA implies that the treatment effect of any teacher on any student does not vary according to the composition of that teacher’s classroom (Rubin et al., 2004):

–  –  –

$\mathbf{A}$ is an $N \times J$ matrix of $a_{ij}$ elements recording the assignment of students to teachers, with $a_{ij} = 1$ if $i$ is assigned to $j$ and $a_{ij} = 0$ otherwise. The statement above makes it explicit that $Y_{ij}$, the potential outcome of student $i$ under teacher $j$, is invariant to all permutations of $\mathbf{a}$, a vector indicating each student’s assignment to treatment. Ruled out by this assumption are effects based on composition of the classroom, including those attributable to peer interactions and those between peers and teachers. Therefore, a student assigned to a math classroom with higher achieving peers should have the same potential outcome under that teacher’s treatment as they would if the classroom contained lower achieving peers. The effects that classroom composition may have on learning make this assumption challenging to support. For example, if teachers alter instruction based on the average achievement level of the class, these effects imply that the treatment effect for a single student is heterogeneous according to the assignment of peers (Rubin et al., 2004).
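One way to write this invariance is sketched below. The notation is an assumption made for illustration: $Y_{ij}(\mathbf{A})$ denotes the outcome of student $i$ under teacher $j$ when the full set of student-to-teacher assignments is $\mathbf{A}$, and $\mathbf{A}'$ is any other assignment that also places student $i$ with teacher $j$. This is a form in the spirit of Reardon and Raudenbush (2009), not necessarily the authors' exact expression.

```latex
Y_{ij}(\mathbf{A}) = Y_{ij}(\mathbf{A}')
\quad \text{for all } \mathbf{A}, \mathbf{A}'
\text{ such that } a_{ij} = a'_{ij} = 1 .
```

Read this way, the condition says that reshuffling which classmates share teacher $j$ with student $i$ leaves that student's potential outcome under teacher $j$ unchanged.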

Ignorability

The second assumption, ignorability, implies that each student’s assignment to treatment, that is, their assignment to a specific teacher ($A$), is independent of their potential outcome under that teacher (Morgan & Winship, 2007):

–  –  –
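A sketch of how this independence can be expressed follows. The notation is assumed for illustration: $a_{ij}$ equals 1 if student $i$ is assigned to teacher $j$, $Y_{i1}, \dots, Y_{iJ}$ are the student's potential outcomes under each of the $J$ teachers, and $X_i$ is a hypothetical vector of observed student characteristics. It is not necessarily the authors' exact expression.

```latex
\Pr(a_{ij} = 1 \mid Y_{i1}, \dots, Y_{iJ}) = \Pr(a_{ij} = 1),
\qquad j = 1, \dots, J .
```

A weaker, conditional version that is commonly invoked in practice requires the same equality only after conditioning on observed covariates:

```latex
\Pr(a_{ij} = 1 \mid Y_{i1}, \dots, Y_{iJ}, X_i) = \Pr(a_{ij} = 1 \mid X_i),
\qquad j = 1, \dots, J .
```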
