
Exploring Explanations for the “Weak” Relationship Between Value Added and Observation-Based Measures of Teacher Performance

Mark Chin and Dan Goldhaber

1. Introduction

Since 2009, 49 states and the District of Columbia have changed their teacher evaluation systems in response to federal incentives, such as flexibility waivers to No Child Left Behind and Race to the Top grants.¹ In many cases teacher evaluation reforms have included the use of student growth, or “value-added”, measures of teacher performance. These measures of teachers’ contributions to student performance on standardized tests represent a relatively new way to assess practicing teachers, though value-added models have been employed as an analytic tool for decades by researchers (e.g., Hanushek, 1971; Murnane, 1981). Value-added measures are also controversial (Baker et al., 2010; Darling-Hammond, Amrein-Beardsley, Haertel, & Rothstein, 2012) and can only be used to assess teachers in tested grades and subjects, who represent less than 33 percent of the teacher workforce (Papay, 2012). Not surprisingly, given their history as an evaluation tool that can be used to assess all teachers, virtually all states also include observations of teachers’ classroom practice as a component in a summative evaluation (Doherty & Jacobs, 2013).

It is unclear what the relationship ought to be between value-added and observational measures, but the relationship is often characterized as being “modest” or “weak” (e.g., Harris, 2012). Moreover, some judge the relationship between these measures (described more extensively below) to be problematic for use by policymakers who might wish to use value added and observations together to identify effective or ineffective teachers. Audrey Amrein-Beardsley (2014), for instance, notes that “value-added scores do not align well with observational scores, as they should if both measures were to be appropriate[ly] capturing the ‘teacher effectiveness’ construct”. Notwithstanding the characterization of the relationship between value-added and observational measures, several scenarios exist that result in a weak correlation; not all of them suggest that the two measures capture different teacher effectiveness constructs. Variation in the multidimensionality, validity,² and reliability of value added and observations distinguishes these scenarios from one another.

¹ See Minnici, 2014.

2/26/2015 Please do not cite or distribute without consent of the authors.

Few studies have investigated the scenarios that might explain attenuated correlations between value-added and observational measures, or have suggested which are unlikely given observed correlations in prior research. Our paper explicitly illustrates these different scenarios, and uses simulated data to formally investigate the extent to which one or another explanation is likely to explain weak correlations between the measures. We explore the levels of correlation between value-added and observation scores after varying two broad factors. First, we adjust the correlation of each teacher’s score on an underlying dimension of “teacher quality” to its two different proxy measures: error-free value added and error-free observational measures of teacher practice. This adjustment allows us to investigate the effect of changes in the validity of these measures. Second, we add error to these measures to create simulated outcomes (i.e., “student test performance” or “lesson performance”), and vary the number of outcomes used to estimate measure scores. This adjustment allows us to investigate the effect of changes to measure reliability. With the results from our simulations, we attempt to answer the following research question: What is the magnitude of the correlation between value-added and observation scores, given different levels of validity and reliability for each measure of teacher quality?
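The two adjustments described above can be sketched in a few lines of code. The block below is a minimal illustration under our own assumptions, not the simulation procedure used in the paper: teacher quality and both error-free measures are standard normal, errors are independent and normally distributed, and scores are estimated as simple means of the simulated outcomes. The function name `simulate_correlation` and all of its parameters are hypothetical.

```python
import numpy as np

def simulate_correlation(validity_va, validity_obs, noise_sd_va, noise_sd_obs,
                         n_students, n_lessons, n_teachers=5000, seed=0):
    """Correlation between estimated value-added and observation scores.

    Illustrative assumptions (not taken from the paper): all latent
    quantities are standard normal and errors are independent.
    """
    rng = np.random.default_rng(seed)

    # Latent teacher quality, never observed in practice.
    tq = rng.standard_normal(n_teachers)

    # Error-free proxy measures, each correlated with tq at its "validity".
    true_va = (validity_va * tq
               + np.sqrt(1 - validity_va ** 2) * rng.standard_normal(n_teachers))
    true_obs = (validity_obs * tq
                + np.sqrt(1 - validity_obs ** 2) * rng.standard_normal(n_teachers))

    # Simulated outcomes are the error-free score plus noise; each score is
    # estimated as the mean of its outcomes (students or observed lessons).
    va_noise = rng.normal(0.0, noise_sd_va, size=(n_students, n_teachers))
    obs_noise = rng.normal(0.0, noise_sd_obs, size=(n_lessons, n_teachers))
    va_hat = true_va + va_noise.mean(axis=0)
    obs_hat = true_obs + obs_noise.mean(axis=0)

    return np.corrcoef(va_hat, obs_hat)[0, 1]

if __name__ == "__main__":
    # Same validity, different amounts of data per teacher.
    for n_students, n_lessons in [(5, 1), (25, 4), (100, 16)]:
        r = simulate_correlation(0.8, 0.8, 5.0, 2.0, n_students, n_lessons)
        print(n_students, n_lessons, round(r, 3))
```

Under these assumptions the expected correlation is the product of the two validities multiplied by the square root of the product of the two reliabilities, so a sketch of this kind makes it easy to see how a modest observed correlation can arise even when both measures are fairly valid proxies for the same dimension.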

In what follows, we recount the historic use of value-added and observational measures in teacher evaluation systems, the research on their relationship, and the factors that impact this

–  –  –

parameters we vary to reflect these key factors. After describing our process for creating the simulated data and method of analysis, we discuss the simulations’ results (Section 4). Finally (in Section 5), we discuss the implications for researchers and practitioners and offer some concluding thoughts.

² We discuss two types of measure validity in our paper. The first type refers to the extent to which value added and observations serve as good proxies for some desirable underlying dimension or dimensions of teacher quality. The second type refers to the extent to which the performance of a teacher’s students on tests, or the performance of a teacher during observed lessons, reflect his or her true value-added or observation scores, respectively (also referred to as “systematic error”, see McCaffrey, Lockwood, Koretz, Louis, & Hamilton, 2004). We use the term “validity” to represent the first type, unless otherwise specified.

2. Value-Added and Observational Measures of Teacher Quality and Their Relationship

Value-added methods have long been used as a means of assessing both educational productivity and the effects of specific schooling inputs (e.g., Hanushek, 1971; Murnane, 1981). They have also been used to assess the implications of differences amongst individual teachers and the extent to which individual teachers explain the variation in student test performance (e.g., Goldhaber, Brewer, & Anderson, 1999; Hanushek, 1992; Nye, Konstantopoulos, & Hedges, 2004). Though a few states and districts began using value-added and other related test-based measures of teacher quality in the late 1990s (Sanders & Horn, 1998), it is only in recent years that the use of value-added measures has proliferated across the nation. This proliferation has engendered debates amongst researchers and policymakers about whether value added is a fair measure of teachers’ contributions in the classroom, and, relatedly, how its use will affect teachers and students.





Value added has been linked to long-term student outcomes (Chetty, Friedman, & Rockoff, 2014b) and been shown to be unbiased in some experimental and quasi-experimental settings (Bacher-Hicks, Chin, Kane, & Staiger, in preparation; Chetty, Friedman, & Rockoff, 2014a; Kane & Staiger, 2008; Kane, McCaffrey, Miller, & Staiger, 2013). Yet questions remain about the extent to which value added may be used to obtain unbiased estimates of teacher performance (Rothstein, 2008, 2014), and, even if the measures are unbiased, whether they are stable enough from year to year to use,³ or would have negative ramifications for teacher behavior (Baker et al., 2010; Darling-Hammond et al., 2012).

Scholars have similarly investigated the quality of teachers through their practices in the classroom for decades (Brophy & Good, 1986). Classroom observations of teaching have played a role in evaluation systems for longer than value-added measures, yet have not faced the same level of academic scrutiny (Corcoran & Goldhaber, 2013).⁴ Recent studies, however, have found that the traditional observation systems used in some states and districts failed to meaningfully differentiate teachers (Weisberg, Sexton, Mulhern, & Keeling, 2009). Revisions to preexisting observation systems have led some locales to adopt observation protocols developed by the academic community, such as the Danielson Group’s Framework for Teaching (Herlihy et al., 2014). These protocols, which are also widely used in research projects, identify key classroom practices that, in theory, should be important for student learning, and also standardize how teachers are evaluated on these practices.

The relationship between value added and observations

A number of the recently implemented educator evaluation reforms include the use of multiple measures of teacher quality, and many states and districts use both value-added and observational measures when assessing teachers’ performance (Herlihy et al., 2014). Not surprisingly, there is a growing research base that explores the extent to which these measures are related to one another. For example, the Measures of Effective Teaching (MET) project, a large-scale study of teacher quality, explored the relationship of teacher value added and observations and found correlations between the two measures ranging from 0.12 to 0.34, depending on the observation protocol (Kane & Staiger, 2012). With some exceptions (e.g., Schachter & Thum, 2004), most other recent studies have replicated this pattern of a weak or moderately weak relationship when analyzing similar observation protocols (e.g., Bell et al., 2012; Grossman, Loeb, Cohen, & Wyckoff, 2013; Hill, Kapitula, & Umland, 2011; Kane, Taylor, Tyler, & Wooten, 2011). These findings contradict what many scholars and practitioners might expect. Theory and intuition suggest that strong instructional practices by teachers should lead to improvements in student test performance. In this paradigm, value-added and observation scores should be highly correlated.

³ See Goldhaber and Hansen (2013) and McCaffrey, Sass, Lockwood, and Mihaly (2009) for estimates of the stability of value added.

⁴ See Cohen and Goldhaber (2015) for a review of this role and a comparison of what we know about the properties of observations and value added.

Furthermore, states and districts have practical reasons to be concerned about the weak relationships observed in extant literature. A weak relationship may indicate that one or both are not valid measures of some dimension of teacher quality. It also sends contradicting signals to practitioners about their strengths and weaknesses, which in turn may inhibit the improvement of teachers’ practice (Polikoff, 2014). Finally, it could serve to undermine the trust in teacher evaluation systems, making it more politically difficult to use evaluations to inform key personnel decisions such as compensation or tenure (Herlihy et al., 2014).

Explanations for the weak relationship between value added and observations

There exist at least three scenarios that result in weak correlations between value added and observations. The first is that one or both measures could provide unreliable estimates of one or more dimensions of teacher quality, due to sampling error. The second is that teacher quality may be multidimensional, and the measures provide reliable estimates of different dimensions of teacher quality. And the third is that one or more of the measures may be invalid, in the sense that the measure does not provide a reliable estimate of any dimension of teacher quality. We provide simple illustrations of these scenarios in Figure 1.

–  –  –

In Panels A and B of the figure, we depict underlying dimensions of teacher quality (TQ) with the bullseyes in the targets. In practice, we use value added and observations to serve as proxy measures for each teacher’s quality, which we never observe. We also never observe each teacher’s true, error-free value-added or observation score. Instead, we estimate value-added and observation scores from two different observed outcomes, represented in the figure: student test performance (v) and performance on lessons (o), respectively. The clouds around each set of outcomes show the distribution of the data points used to estimate each measure, with a darker color representing estimates based on the aggregation of information from each measure (e.g., from multiple student test results, or multiple observed lessons). The dashed, two-headed arrow represents the distance or correlation between the two different measures of teacher quality; a shorter arrow indicates that the two measures align more closely. Moving from the left target to the right in either Panel A or B of Figure 1, the amount of information for each measure of teacher quality increases (e.g., through having more students’ test results or observing teachers’ lessons more often), increasing the reliability of each measure.
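The link between the amount of information and reliability can be made concrete with two standard results from classical test theory. The notation below is ours rather than the paper's, and the results assume that the outcomes for a teacher are parallel measurements with independent errors: ρ₁ denotes the reliability of a score based on a single outcome, n the number of outcomes averaged, and v̂ and ô the estimated value-added and observation scores.

```latex
% Spearman-Brown: reliability of a score averaged over n parallel outcomes
\rho_n = \frac{n\,\rho_1}{1 + (n - 1)\,\rho_1}

% Attenuation: the observed correlation is the error-free correlation
% scaled by the square root of the product of the two reliabilities
\operatorname{corr}(\hat{v}, \hat{o})
  = \operatorname{corr}(v, o)\,\sqrt{\rho_{\hat{v}}\,\rho_{\hat{o}}}

% Worked example with \rho_1 = 0.2:
%   n = 4:   \rho_4    = 0.8 / 1.6 = 0.5
%   n = 16:  \rho_{16} = 3.2 / 4.0 = 0.8
```

In this stylized setting, quadrupling the number of outcomes raises reliability from 0.5 to 0.8, and if both measures moved from 0.5 to 0.8 the observed correlation would rise from 50 to 80 percent of the error-free correlation.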

The leftmost illustration in Panel A depicts the first scenario for weak correlations, where both measures would serve as valid proxies for the same dimension of teacher quality, but are estimated unreliably. Value added and observations could be estimated unreliably due to factors such as observing a teacher on a particularly good or bad day, or analyzing the test results of students who by chance perform well or poorly on a test; either would add sampling error to scores. To counteract sampling error in value added, many research projects will estimate teachers’ value added using Empirical Bayes estimators, which shrink scores that are estimated less reliably (e.g., estimated from the test performance of fewer students) toward the mean (e.g., Kane & Staiger, 2008; Sanders & Horn, 1994).⁵ Another way to counteract sampling error is to estimate value added and observations with as many data points as possible. For example, the stability of value-added measures, moderate when estimated from a single year of student test performance data (McCaffrey et al., 2009), improves when using multiple years of data (Goldhaber & Hansen, 2013). Though states and districts need to consider the financial and temporal burdens associated with reducing sampling error in value added and observations by increasing data points, improving measure reliability would disattenuate the relationship between the two measures.
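The shrinkage idea behind Empirical Bayes estimators can be illustrated with a short sketch. This is a simplified version written for illustration, not the estimator used in any of the cited studies: it assumes the variance of true teacher effects (`signal_var`) and the variance of student-level error (`noise_var`) are already known, whereas applied work estimates them from the data, and it shrinks toward the unweighted mean of the raw scores. The function name and parameters are hypothetical.

```python
import numpy as np

def shrink_toward_mean(raw_scores, n_obs, signal_var, noise_var):
    """Shrink raw teacher scores toward the grand mean.

    Each score is weighted by its reliability, so scores estimated from
    fewer outcomes are pulled more strongly toward the mean.
    Assumes (for illustration) known variance components.
    """
    raw_scores = np.asarray(raw_scores, dtype=float)
    n_obs = np.asarray(n_obs, dtype=float)

    # Reliability of a mean of n_obs outcomes: signal / (signal + noise / n).
    reliability = signal_var / (signal_var + noise_var / n_obs)

    grand_mean = raw_scores.mean()
    shrunken = grand_mean + reliability * (raw_scores - grand_mean)
    return shrunken, reliability

if __name__ == "__main__":
    # Two teachers with the same raw score but different class sizes.
    scores, rel = shrink_toward_mean([1.0, 1.0, -2.0], [5, 50, 50],
                                     signal_var=1.0, noise_var=10.0)
    print(scores, rel)
```

The first two teachers have identical raw scores, but the one with only five students ends up closer to the mean, which is the behavior the paragraph above describes.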

In research and practice, teacher value added and performance on observations are often treated as measures of the same underlying construct—the scenario depicted in Panel A.

However, there are reasons to believe that they are not, and that Panel B of Figure 1 depicts a more accurate representation of reality. Panel B of the figure illustrates a case where there are two dimensions of teacher quality (TQ1 and TQ2) and each measure of quality is a reliable estimate of only one of the dimensions. For example, one dimension might capture the degree to which teachers contribute to student knowledge, and a second dimension might be the extent to which teachers contribute to students’ ability to interact productively with one another. These dimensions of teacher quality may or may not be closely related, and the correlation between the measures of teacher quality may or may not increase as the reliability of each measure increases.

In the example depicted by Panel B, the correlation between the measures decreases (i.e., the arrows become longer) as each measure of teacher quality becomes more reliable, moving from

–  –  –

a low correlation between the measures: that each measure provides a reliable estimate of different dimensions of teacher quality.

⁵ In theory, the same adjustment for reliability can apply for observations as well. In practice, however, little research appears to use Empirical Bayes estimators to adjust scores for differences in the number of lessons observed. However, it is not clear that such estimates provide the best indicator of teacher effectiveness (see, for instance, Mehta, 2015).

Some empirical evidence substantiates this second explanation for weak correlations. For instance, prior research suggests that measures of teacher contributions to the performance of students on different tests may themselves capture divergent dimensions of teacher quality. The most obvious example of this divergence emerges when comparing teachers’ value added in different subjects; for example, one might not expect a teacher’s contributions to performance on a mathematics exam to be measuring the same type of quality as his or her contributions to performance on a reading exam (Fox, forthcoming; Gershenson, forthcoming; Goldhaber, Cowan, & Walch, 2013; Rockoff, 2004).


