Christian School Education

Getting the Best From Your Tests

Last Updated Mar 26, 2009


Ann Fielden has been a measurement consultant for sixteen years, including thirteen years with Harcourt Brace Educational Measurement. For five years she enjoyed teaching third graders. Since then she has worked with schools and school districts in Idaho, Montana, Utah, Wyoming, Nevada, New Mexico, and Colorado as they developed and implemented new testing programs. Ann now works with all thirteen western states and with ACSI on their large-scale testing programs.

Note: the test is now Stanford 10. Contact ACSI for updated test results.

A rite of spring in most ACSI schools is the administration of the Stanford 9 Achievement Tests. Parents, teachers, and administrators anticipate getting the results and seeing the accomplishments indicated by the scores. The investment that schools make in their achievement testing program, in both time and financial resources, warrants getting as much benefit as possible from a careful study of the results. As with most private schools, ACSI schools generally expect their numbers to be higher than the national norms and to exceed the normal year-to-year growth in student achievement.

It is challenging for teachers and school administrators to educate parents (and each other) about what the scores mean, and to keep them from overstating the implications of the results. The measurement of normal growth and the measurement of change in excess of normal growth are two different things. This distinction is important if the school is to make the most accurate use of the data.

The Guide for Organizational Planning, published by Harcourt Brace Educational Measurement, addresses this important distinction. The paragraphs below are from that publication, edited for length, and are helpful in assisting teachers and the school staff in interpreting and discussing their achievement test results.

ACSI-Member School Students

Grade    Above National Average
K        7 Months
1        7 Months
2        10 Months
3        1 Year
4        1 Year, 7 Months
5        1 Year, 9 Months
6        2 Years, 5 Months
7        2 Years, 5 Months
8        3 Years, 6 Months
9        3 Years, 4 Months
10       2 Years, 4 Months
11       1 Year, 4 Months
12       4 Months

Source: Stanford 9, Spring 1997
Christian School Education

 

Measuring Growth in Achievement

At first glance, the measurement of change seems simple: assess how students have progressed from one point in time to another. Interpreting that change, however, is more complicated. It is subject to some of the same problems as measuring what a student or class knows at any given time. In addition, the measurement of change gives rise to a unique set of problems that an administrator should be aware of and share with his or her staff.

The measurement of growth normally involves obtaining multiple measures on the same scale. For example, to measure a child’s growth in height, a comparison is made between two or more readings made on the same scale, usually a linear measure of feet and inches. In this example, the scale represents a single dimension. Unfortunately, the single dimension concept is not as clearly defined or effective when we measure academic growth.

Reading ability, as measured at the second-grade level, is not the same as reading ability as measured at the eighth-grade level. In the primary grades, for example, greater emphasis is likely to be given to the development of decoding skills, while at the middle school and high school levels, comprehension usually receives increased attention. Similarly, computation skills measured at the middle school level may not be quite the same kind of skills that are measured at the primary level.

For the most part, achievement tests measure successive levels of an ability or trait and only approximate a one-dimensional scale. Even so, most test users assume that successive test levels are measuring the same underlying trait and operationally define academic gain, or growth, as the difference between two test scores. In effect, the same content field is being measured throughout, but the specific skills tested shift gradually from level to level within that field.

The measurement of academic growth is applied to the performance of individual students as well as to groups of students. One important function a testing program can serve is to provide regular feedback regarding a student’s academic progress. When students are tested annually, parents and teachers are able to monitor each student’s progress from the primary grades through high school.

The achievement of groups of students can be measured by following one group from year to year (longitudinal study) or by comparing one group with another group (cross-sectional study), assuming that the groups are similar enough to give a valid comparison.

Problems and Solutions

The problems involved in measuring change may appear relevant only to statisticians and researchers. However, it is important that administrators be aware of these problems since they have practical significance for the interpretation of test results.

One problem involves the definition of “normal growth.” When asked what constitutes “normal” growth in terms of test scores, many educators are likely to respond with “one year’s progress in one year’s time.” This is only partially true.

Growth is most commonly defined using grade equivalents, percentile ranks, and Normal Curve Equivalents (NCEs). The grade equivalent scale is constructed so that, for average students, there is a 1.0 increment from grade to grade. Average students tested at the beginning of grade 3, for example, will score at 3.0, while average students tested at the beginning of grade 4 will score at 4.0. It is important to realize, however, that this expected gain pattern is true only for students, or groups, performing at or near the average of the norm group. It is not the expected pattern for students who perform outside the average range, particularly those at the extremes. Thus, it is a very poor scale for measuring the growth of all students.

The amount of growth required to keep the same standing the next year varies with a student’s achievement level in each domain (above average, average, or below average). A superior student or group typically must gain more than 1.0 grade equivalent per year to maintain position relative to the norm group, while a low-achieving student or group typically maintains position with a gain of less than 1.0 grade equivalent. This explains why low-achieving students, while maintaining their percentile position on successive levels of testing, may appear to fall further and further behind in terms of grade equivalents.

When percentile ranks are used, normal growth is defined as maintaining one’s position relative to the norm group. A grade 5 student who scores at the 90th percentile in language one year, and again at the 90th percentile the following year, has exhibited normal growth. However, if the same student had scored at the 75th percentile in grade 6, normal growth would not have been demonstrated in spite of above-average scores. Likewise, a student who scored at the 10th percentile one year but at the 25th percentile the following year has demonstrated greater than normal growth, even though both scores are below average. In terms of percentiles, “normal” growth is defined as maintaining one’s relative position, regardless of level of achievement.
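The percentile-rank rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_growth` is hypothetical, and the sketch treats any change in rank as meaningful, whereas in practice small shifts may reflect nothing more than measurement error.

```python
def classify_growth(previous_percentile: int, current_percentile: int) -> str:
    """Classify year-to-year growth by comparing two percentile ranks.

    Simplifying assumption: any change in rank counts, with no allowance
    for measurement error.
    """
    for rank in (previous_percentile, current_percentile):
        if not 1 <= rank <= 99:
            raise ValueError("percentile ranks run from 1 to 99")

    if current_percentile > previous_percentile:
        return "greater than normal growth"
    if current_percentile < previous_percentile:
        return "less than normal growth"
    # Holding the same position relative to the norm group is normal growth.
    return "normal growth"


# The three cases discussed in the paragraph above.
print(classify_growth(90, 90))
print(classify_growth(90, 75))
print(classify_growth(10, 25))
```

Note that the level of achievement never enters the decision; only the change in relative position does.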

A major disadvantage of using percentile ranks to measure growth is that percentile ranks do not represent an equal-interval scale. Thus, a “15 percentile change” in one part of the scale is not equivalent to a “15 percentile change” in another part of the scale.

Normal Curve Equivalent scores (NCEs) are derived from percentile ranks and were developed to solve the problem of inequality of units. In effect, NCEs result from dividing the normal curve into 99 equal units. Because of their equal-interval nature, any difference, such as five NCEs, has the same meaning, regardless of the part of the scale being referenced. For this reason, NCEs have become the preferred mode for measuring change. NCEs show real change and are more sensitive to smaller increments of change.
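A short Python sketch may make the difference between percentile ranks and NCEs concrete. It uses the standard definition of the NCE scale (mean 50, standard deviation 21.06, applied to the normal-curve z-score of the percentile rank). That formula is not quoted in the Harcourt guide excerpted here, and the function name `percentile_to_nce` is an assumption made for illustration.

```python
from statistics import NormalDist

NCE_MEAN = 50.0
# Standard NCE scaling constant: it makes NCEs of 1, 50, and 99 line up
# with percentile ranks of 1, 50, and 99.
NCE_SD = 21.06


def percentile_to_nce(percentile_rank: float) -> float:
    """Convert a percentile rank (between 0 and 100, exclusive) to an NCE."""
    if not 0 < percentile_rank < 100:
        raise ValueError("percentile rank must be between 0 and 100")
    # z-score of the percentile rank on the standard normal curve
    z = NormalDist().inv_cdf(percentile_rank / 100)
    return NCE_MEAN + NCE_SD * z


# Two "15 percentile" changes in different parts of the scale.
gain_near_middle = percentile_to_nce(65) - percentile_to_nce(50)
gain_near_top = percentile_to_nce(95) - percentile_to_nce(80)

print(f"50th to 65th percentile: {gain_near_middle:.1f} NCEs")
print(f"80th to 95th percentile: {gain_near_top:.1f} NCEs")
```

Run as written, the move from the 80th to the 95th percentile comes out as roughly twice as many NCEs as the move from the 50th to the 65th, which is the inequality of percentile units that NCEs were designed to correct.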

Additional Test Measures

At several ACSI schools last year, the Differential Aptitude Tests (DAT) and Career Interest Inventory (CII) were given to students in grades 7 through 12. The DAT measures verbal reasoning, numerical reasoning, abstract reasoning, perceptual speed and accuracy, mechanical reasoning, space relations, spelling, and language usage. Each of the eight tests is relevant to certain types of courses and occupations.

The Career Interest Inventory is a career-guidance instrument designed to provide information about educational goals, interest in a variety of school subjects and school-related activities, and interest in fields of work. This information can then be used to help explore educational and occupational alternatives, learn about careers, and begin to set goals for the future. When used in conjunction with the DAT, it links information about students’ aptitudes, as measured by the DAT, with information about occupations that match their interests, providing a more complete profile of the student’s career potential. Likes and dislikes are marked according to this scale: like very much; like a little; don’t know or undecided; dislike a little; dislike very much.

When DAT/CII results are studied with Stanford 9 results, it is useful to compare achievement and interests. Together the results may suggest areas requiring further investigation.

Norm-referenced tests provide only one measure of school performance. As teachers and administrators examine their school’s results, it is helpful to partner those results with other indicators to get a more complete picture of achievement and trends. Sharing multiple forms of evaluation with parents, teachers, and governing boards helps them see more clearly the results of the educational experience their schools provide.

 
