# University of Nebraska - Lincoln, DigitalCommons@University of Nebraska - Lincoln: Dissertations and Theses in Statistics, Department of Statistics, 8-2010

achievement data reveals nine teachers participating in M2 are linked only to fifth grade scores. Consequently, the teacher effect estimates for these fifth grade teachers are not truly value-added, because previous scores on which to base each student’s fifth grade achievement are not available. Additionally, six more teachers are not linked to any student scores before their participation in M2, so these teachers’ estimated before-participation teacher effects are not meaningful. After removing these 15 teachers’ estimated effects from the comparison, the average difference between each remaining teacher’s estimated after- and before-participation teacher effects is slightly higher (mean = 0.027, SE = 0.073, n = 22), and the median difference remains roughly zero (Figure 3.2). Although the results do not indicate the M2 Institute had a significant impact on participating teachers’ effects on student learning, the instruments may not have been designed to detect such changes. In future research, the Survey of Mathematical Knowledge for Teaching (Hill, Schilling, & Ball, 2004) data, collected from teachers before and after M2 participation, will be used to investigate whether teachers’ gains in content knowledge for teaching mathematics during the program can be linked to changes in teacher effects on student learning.

3.4 Summary and Future Work

Although a Z-score approach has limitations, it is an appropriate alternative to using raw data when analyzing less-than-ideal student achievement data across a mixture of norm- and criterion-referenced tests over time. This chapter addressed issues arising when using a layered, longitudinal linear mixed model to analyze gains in standardized scores, including weighting considerations for variance components. Additional studies should consider other weighting alternatives and investigate the impact of such variance component weighting schemes on the estimation of teacher effects. Because curricula and test content vary across grades, as do mobility rates, future research should also explore whether notable changes in a student’s Z-scores from year to year are associated with changes in mobility rates, curricula and/or test content.
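The within-grade, within-year standardization underlying the Z-score approach can be sketched as follows. This is a minimal illustration, not the exact procedure used in the analysis; the function name and grouping variables are hypothetical.

```python
import numpy as np

def z_scores(scores, grade, year):
    """Standardize raw scores within each grade-by-year group.

    Placing norm- and criterion-referenced tests on a common
    (mean 0, SD 1) scale within each grade and year lets gains be
    compared across test forms that share no vertical scale.
    """
    scores = np.asarray(scores, dtype=float)
    grade = np.asarray(grade)
    year = np.asarray(year)
    z = np.empty_like(scores)
    for g in np.unique(grade):
        for y in np.unique(year):
            mask = (grade == g) & (year == y)
            if mask.any():
                grp = scores[mask]
                z[mask] = (grp - grp.mean()) / grp.std(ddof=1)
    return z
```

Each grade-year cell is centered and scaled separately, so a student’s Z-score gain from one year to the next reflects movement relative to the distribution of peers tested on the same instrument.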

Additionally, methods were proposed for estimating teacher effects on student learning before and after teacher participation in professional development programs.

Although the specific example used does not indicate the M2 Institute had a significant impact on participating teachers’ effects on student learning, the instruments may not have been designed to detect such changes. When utilizing this methodology, it is essential to determine whether the goals of the program align with what the instruments assess and to acknowledge any existing limitations. Further research should address the potential issue of censoring, and careful consideration needs to be given to what data are needed and how much baseline data should be obtained when estimating the impact of a professional development program. Ideally, these methods can be extended to other VAM approaches, as well as other professional development programs, and could eventually be used to establish potential relationships between changes in a teacher’s mathematical knowledge for teaching and changes in student achievement.

Chapter 4 Using Parallel Processing Methodology to Estimate Teacher Effects

4.1 Introduction

In recent years, education systems, in theory, have held students to higher academic standards (No Child Left Behind, 2001) by holding states accountable for assessing measurable student outcomes. Research efforts have addressed issues associated with analyzing student achievement data (McCaffrey, Lockwood, Koretz, & Hamilton, 2003), but many of the recommended approaches have not been widely adopted because the resources and high-quality longitudinal data required for these approaches are not readily available. Most value-added modeling approaches require student achievement data to be vertically scaled, or at least linearly related, over time (McCaffrey et al., 2003). Such requirements limit the analyses that can be conducted on available assessment data, which often are not on a single developmental scale. Few studies have addressed how to use value-added models to analyze achievement data not on a single developmental scale (Green, Smith, Heaton, Jiao, & Stroup, under review; Rivkin, Hanushek, & Kain, 2005), and even fewer, perhaps none, have discussed how to use information from multiple instruments in a single year that are on different scales, potentially both within and between instruments over time. Section 4.2 describes how to use parallel processing, specifically curve-of-factors, methodology to analyze longitudinal student achievement data collected from two different assessments in a single subject, such as mathematics, and estimate teachers’ effects on student learning.

Assuming data come from a curve-of-factors model structure, a simulation study described in Section 4.3 evaluates the performance of the proposed curve-of-factors model in its ability to accurately rank teachers in the presence of either complete or missing test data. The performance of the curve-of-factors model is then compared to that of the Z-score methodology proposed in Chapter 3. The chapter concludes with a summary of the results and recommendations for future work.

4.2 Parallel Processing Methodology

Growth curve models analyze differences in individuals’ changes on repeated measurements over time. Growth curve models can be specified either within a multilevel modeling framework as a random coefficients regression model (Raudenbush & Bryk, 2002) or within a structural equation modeling (SEM) framework as a latent growth curve model (Little, Bovaird, & Slegers, 2006). In a random coefficients model, an average growth curve is estimated for all individuals. Because growth curves differ between persons, each individual’s random deviations from the average trajectory are also modeled, capturing individual variability around the average trajectory of change (Raudenbush & Bryk, 2002). Although a latent growth curve model can be specified as a random coefficients model in the multilevel framework, specifying growth models within the SEM framework has the advantage of allowing researchers to model changes in a latent construct over time (Little et al., 2006). This is particularly beneficial when modeling multiple outcome measures, instead of a single measure across time. In such instances, parallel process, or multivariate, growth curve models estimate the relationship between the growth trajectories for each of the parallel measures and allow researchers to investigate changes in latent factors over time instead of changes in observed scores.
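As a minimal illustration of the random coefficients formulation, the simulation below (all numeric values hypothetical) generates individual growth curves that vary randomly around a population-average trajectory, then recovers the average slope from per-student fits:

```python
import numpy as np

rng = np.random.default_rng(0)
n_students = 200
times = np.arange(4)  # four measurement occasions

# Population-average growth curve: intercept 100, slope 5 per year.
beta0, beta1 = 100.0, 5.0

# Each student's random deviation from the average trajectory.
b0 = rng.normal(0, 10, n_students)  # random intercepts
b1 = rng.normal(0, 2, n_students)   # random slopes
e = rng.normal(0, 3, (n_students, len(times)))  # occasion-level noise

# Observed scores: average curve plus individual deviations plus noise.
y = (beta0 + b0[:, None]) + (beta1 + b1[:, None]) * times + e

# Per-student OLS slopes vary, but their mean tracks the average slope.
slopes = np.array([np.polyfit(times, yi, 1)[0] for yi in y])
print(slopes.mean())
```

A mixed-model fit would estimate the fixed intercept and slope and the variance components jointly; the per-student fits here are only meant to show the "individual deviations around an average trajectory" idea.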

4.2.1 Introduction to Parallel Processing

Multivariate, or parallel process, growth curve models “model univariate growth in the context of multiple parallel measures and relate those multiple outcomes to each other” (Little et al., 2006, p. 193). One type of multivariate growth curve model is the associative latent growth curve model (Little et al., 2006). This type of model first establishes the univariate latent growth curve model structures for each of the constructs, or processes, measured over time and then estimates the correlations between the growth factors for all of the constructs. This type of model allows researchers to estimate the relationship between the growth trajectories of multiple constructs measured over time.

For example, Cheong, MacKinnon, and Khoo (2003) used an associative latent growth model in a mediation analysis context. In the study, Cheong et al. (2003) explored the impact a drug prevention program had on the growth factors of two different, parallel processes: the mediator process (perceived importance of team leaders) and the outcome process (nutrition behaviors) for high school football players. Structural relations in a mediation model were used to analyze the effect of participation in the prevention program on initial status and change in both the perceived importance of team leaders and the nutrition behaviors over time. The authors also used structural relations, instead of correlations, to explore whether initial status in each of the constructs predicted change in the other construct and whether change in perceived performance of team leaders predicted change in nutrition behaviors over time. In this sense, Cheong et al. (2003) used an associative growth curve model to estimate the relationship between the latent growth curves of perceived performance of team leaders and the nutrition behaviors over time.

Similarly, Roesch et al. (2009) used an associative growth curve model to explore the relationship between the growth trajectories of various mediator processes (psychosocial constructs) and the growth factors of an outcome process (physical activity constructs) in adolescents over time. The authors also used a mediation model to estimate the effect of a health promotion intervention program on those growth factors.

Another type of multivariate growth curve model is the factor-of-curves model (Little et al., 2006). Similar to the associative growth curve model, the factor-of-curves model first establishes the univariate growth trajectories for each of the constructs measured over time. However, instead of estimating the correlations or structural relations between the latent growth factors for the constructs, this type of model uses higher-order growth factors to represent the relationship between the latent growth curves of the processes. This type of model allows researchers to determine whether the growth trajectories for multiple constructs measured over time can be summarized by a common, higher-order latent growth curve. Duncan, Duncan, and Strycker (2000) used a factor-of-curves model to analyze the relationship between the initial status and rate of change in four different behaviors (drug use, marijuana use, deviance and academic failure) of adolescents across time. Instead of correlating the growth factors, the factor-of-curves model allowed them to specify a higher-order latent growth curve of problem behavior with one common initial status and one common rate of growth to describe the relationship among growth trajectories of the lower-order constructs.

The last type of multivariate growth curve model discussed in this chapter is the curve-of-factors model (Little et al., 2006). This type of model is used to estimate a latent growth curve to describe changes in a latent construct measured by multiple indicators over time. For instance, mathematics achievement can be measured by multiple instruments in a given year. Across time, changes in the latent factor can be described by a higher-order growth curve. This allows researchers to investigate change over time based on a student’s common, latent trait, instead of a student’s observed scores. The following section describes the use of a modified version of this type of model in a value-added context, where the latent factors are allowed to covary instead of having a specific growth trajectory over time.

4.2.2 Parallel Processing in a Value-Added Context

Parallel processing, specifically curve-of-factors methodology, applied in a value-added context can extend the analysis of student achievement data to situations in which multiple tests with potentially different scales are given each year in a particular subject.

Instead of estimating a teacher’s effect on changes in a student’s scores over time, curve-of-factors models can allow the estimation of a teacher’s effect on changes in some common, latent trait measured by the multiple instruments in each year. These models can be extended to account for longitudinal student achievement data cross-classified in nature.

**Figure 4.1: Visual of Cross-Classified Data Structure**

Figure 4.1 shows an example of such a structure, where students at level one are nested within level two teachers’ classrooms at each time. However, not all students who have the same teacher at one time have the same teacher at another time, resulting in a cross-classified structure.

The level one curve-of-factors model shown in Figure 4.2 has two indicators, or observed measurements, e.g., CA1 and MA1, of a latent trait at each of the four time points. The latent factors A1-A4 represent the latent trait of interest, e.g., achievement, at each of the times. The set of latent factors for a single individual is assumed to follow a multivariate normal distribution with a mean vector of zero and an unstructured covariance matrix. The covariance between two latent factors is represented by a two-sided arrow connecting the two factors, with the variance of a factor represented by a self-connecting arrow. Latent factors are assumed independent across students. The measurement error variance of an indicator or test, i.e., the portion of an indicator’s variance not explained by the underlying construct, is also estimated and represented by a self-connecting arrow.

**Figure 4.2: Level One Curve-of-Factors Model** (* freely estimated, i.e., unconstrained, parameter)

All measurement errors are assumed to be independent both within and across students, with constant variance across indicators and time. The loadings, or coefficients, regressing the indicators on their corresponding factors, or constructs, are constrained to one for identification purposes.
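A small simulation, with illustrative covariance values, shows the level one measurement structure just described: two indicators per year loading on a single latent factor, loadings fixed at one, and independent measurement errors with constant variance. The covariance matrix and indicator names are hypothetical choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 500, 4  # students, time points

# Latent achievement factors A1-A4: multivariate normal, mean zero,
# unstructured covariance across time, independent across students.
cov_A = np.array([[1.0, 0.7, 0.5, 0.4],
                  [0.7, 1.0, 0.7, 0.5],
                  [0.5, 0.7, 1.0, 0.7],
                  [0.4, 0.5, 0.7, 1.0]])
A = rng.multivariate_normal(np.zeros(T), cov_A, size=n)

# Two indicators per year (e.g., CA_t and MA_t): loadings fixed at 1
# for identification, i.i.d. measurement error, constant variance.
sigma_e = 0.5
CA = A + rng.normal(0, sigma_e, (n, T))
MA = A + rng.normal(0, sigma_e, (n, T))

# Because errors are independent, the within-year covariance of the
# two indicators estimates Var(A_t), here set to 1.0.
print(np.cov(CA[:, 0], MA[:, 0])[0, 1])
```

The key identification point is visible in the last line: the shared variance of the two indicators in a year isolates the latent factor’s variance from the measurement error variance.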

Including the level two teacher effects in the curve-of-factors model structure (Figure 4.2) cannot be easily depicted in a diagram because of the cross-classified nature of the data. Instead, this relationship can be given by the model equations,

where Z_t = Z_t^a ⊗ 1_2, Z_d = I ⊗ 1_2, and A ⊗ 1_2, with 1_2 a 2 × 1 vector of ones, adjust for the shift from one response (i.e., the latent factor) at level two to two responses (i.e., the test indicators) at level one in each year. In Equation 4.3, y is the vector of test scores, X is the coefficient matrix for β, the vector of test means, and Z_t is the coefficient matrix for t, the vector of random teacher effects. As indicated in Equation 4.1, the teacher effects are estimated based on changes in a student’s common, latent trait, instead of on changes in a student’s scores over time.
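The role of the Kronecker product with a vector of ones, duplicating each level two link so it applies to both level one indicators in a year, can be sketched as follows. The latent-level design matrix here is a hypothetical 3-students-by-2-teachers example.

```python
import numpy as np

# Each student-year has one latent response at level two but two test
# indicators at level one; kron(., ones) repeats each latent-level row
# once per indicator so both tests share the same teacher links.
ones2 = np.ones((2, 1))

# Hypothetical latent-level teacher design: 3 rows, 2 teachers.
Zt_latent = np.array([[1, 0],
                      [1, 0],
                      [0, 1]])

Zt = np.kron(Zt_latent, ones2)  # each row duplicated for both tests
print(Zt.shape)  # → (6, 2)
```

The same expansion applies to Z_d: the identity matrix linking each latent factor to its own disturbance is stretched so both indicators of a year point back to one disturbance.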

Including teachers in this manner changes the constructs, A1-A4, from exogenous, or independent, factors, as illustrated in Figure 4.2, to endogenous, or dependent, factors.

Consequently, the variance of a factor originally represented by a self-connecting arrow in Figure 4.2 is replaced by a disturbance, or term representing factor-level variability not attributed to teachers. Additionally, the unstructured covariance matrix is no longer for the set of latent factors, but instead for the set of disturbances for the latent trait across time. The vector of random, factor-level disturbances, d, can be assumed to be normally distributed with a mean vector of zero and an unstructured covariance matrix, in which disturbances for the same student are allowed to covary but disturbances for two different students are assumed to be independent; Z_d is the corresponding coefficient matrix.

Random test measurement errors, e, can be assumed to be normally distributed with mean zero and constant variance, independent both within and across students.

Equation 4.3 can be extended to a layered model in which teacher effects are assumed to persist undiminished over time. Wright and Sanders (2008) distinguish between the layered and non-layered model in the construction of the Zt matrix (Table 4.1). In the non-layered model, each student’s outcome for a given indicator is linked only to the current teacher. In contrast, the layered model links a student’s achievement to current and previous teachers within a given time span. Therefore, the Zt matrix for the layered model can have several “1”s in a row, connecting past teachers with subsequent student levels of the latent trait. Because multiple indicators are available in a given year, two outcomes in the same year in either the non-layered or the layered model are identically linked to teachers in the respective Zt matrix.
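A toy contrast of the two linking schemes, using one student, two years, and two teachers (dimensions hypothetical), makes the layered structure concrete:

```python
import numpy as np

# One student taught by teacher j1 in year 1 and teacher j2 in year 2;
# columns index teachers (j1, j2), rows index yearly outcomes.

# Non-layered: each year's outcome links only to the current teacher.
Z_nonlayered = np.array([[1.0, 0.0],
                         [0.0, 1.0]])

# Layered: effects persist undiminished, so the year-2 row keeps a "1"
# for the year-1 teacher as well as the current teacher.
Z_layered = np.array([[1.0, 0.0],
                      [1.0, 1.0]])

# With two indicators per year, both outcomes in a year are identically
# linked to teachers: expand each row with a Kronecker product.
Z_layered_2tests = np.kron(Z_layered, np.ones((2, 1)))
print(Z_layered_2tests.tolist())
# → [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
```

Reading down a teacher’s column shows which outcomes that teacher’s effect enters; in the layered version a teacher’s column keeps accumulating "1"s for all subsequent years.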

**Table 4.1: Comparison of Z Matrix in Non-Layered and Layered Models**

4.3 Example: Student Achievement Simulation Study

The analysis of student achievement data can be challenging, particularly when high-quality longitudinal data on a single developmental scale are not readily available.