For help with questions not addressed below, please call the Measurement Incorporated Grade 8 ELA Helpline at (877) 516-2403. The line is available weekdays, May 11–25, from 8:00 a.m. to 5:00 p.m.

INTRODUCTION - After the videotaping of Grade 8 ELA training sessions, the Scoring Leaders participated in Question and Answer sessions. Staff from Measurement, Inc. and the State Education Department provided responses to participants' questions. Many of the questions refer to particular student answer papers in the Practice Sets, and responses may refer to the Scoring Guide. The following transcript places questions of a general nature first. These are followed by questions organized by content area: Reading; Writing; Writing Mechanics; and Listening.

Q: How do I use the Videotapes?

A: Videotapes have been provided for each component of the Grade 8 English Language Arts test to assist in training scoring leaders/scorers. The trainer in each videotape will discuss the contents of the Scoring Guide and the Practice Set for that content area. The Scoring Guide will be presented first to demonstrate how the scoring rubric should be applied to student responses. We suggest that Training Leaders stop or pause the videotape before the videotaped trainer begins discussion of the Practice Set. This provides an opportunity for those being trained to read their Practice Sets and practice making scoring decisions.

We also suggest that scorers practice on only one or two student responses at a time, stopping and reviewing the correct score(s) before moving on to the next. The Scoring Leader may read and discuss the annotations and marginalia in their copies of the Practice Sets, or may resume the videotape at appropriate intervals. Several short practice segments followed by review maximize the opportunity to learn by doing and help build scorer skill and confidence.

Q: As a Scoring Leader, how should I prepare to train Table Facilitators and Scorers?

A: Training procedures and the logistics of live scoring are covered in the Scoring Leader Handbook, which should be read thoroughly before training. You should also review your Scoring Guide and Practice Set while viewing the videotape.

Q: How should a response be scored if it’s entirely blank, or it says the student refuses to answer, or it’s written in another language?

A: A list of Condition Codes can be found near the back of the Scoring Guides. The Scoring Leader Handbook contains the procedures for assigning such codes. Responses written in another language should always be scored as an "E" even if the Scoring Leader or another scorer understands the other language, since the test is intended to assess English communication skills.

Q: Sometimes a student will respond to some but not all items. What Overall score should such responses receive?

A: Near the back of the Scoring Guide is a list of Scoring Considerations. These outline the effect of missing responses on the Overall score.

Q: Suppose a student leaves the short responses blank and answers only the extended response, but in the extended response clearly demonstrates understanding of all of the questions posed in the other items? Since the Overall score is supposed to holistically reflect the understanding of the student, can such a response receive a "6"?

A: No. If only the extended response is answered, the Scoring Considerations limit the Overall score to a "2."

Q: When training is over, should scorers refer to the training materials while scoring actual student responses?

A: YES! To maintain accuracy and consistency in scoring, it is very helpful to refer occasionally to the student responses used in the training materials as examples of the various score points. These responses are often called "anchor papers" because they help to fix the acceptable range within a score point and prevent the scorer from "drifting" higher or lower in their expectations for awarding a score point.

Q: I understand that holistic scoring involves weighing and balancing various factors. What are these factors, and what weight should be given to each?

A: The scoring rubric addresses the factors that should be considered in determining the score of a response by listing characteristics that tend to occur among the score points. These characteristics reflect the degree to which focus, development, organization, and writing style are found within a response. Focus is how well the response fulfills the requirements of the task and the connections to the task found within the response. Development is how much information is presented: the details, the specifics, and the amount of elaboration on ideas. Organization is the order in which the information is presented. Does one idea logically follow another? If it's a narrative, how tight is the sequence of events? Writing style generally concerns word choice and sentence patterns. How fluent is the response? Is it easy to read?

Writing style should not be confused with writing mechanics. Style concerns what word is used, whereas mechanics concerns how the word is spelled. Style looks at how the sentence patterns create a flow of ideas, while mechanics looks at how the sentences are punctuated. Remember that writing mechanics is scored separately and should not be a factor in scoring Independent Writing.

In assigning a score to an Independent Writing response, all relevant factors should be assessed. However, the most important factor by far, and the one accorded the most weight, is DEVELOPMENT. The amount of development is central to each score point. How much information are we being given? What are the details and the specifics? Are ideas or events elaborated and expanded upon? Development is not only important in and of itself; it also affects the other factors. There must be a certain amount of information presented for a scorer to be able to assess a response's focus, organization, and fluency.

Caution: development is not synonymous with length! Obviously, the process of presenting the amount of information necessary to reach a higher score point will result in longer responses. However, note that the training materials contain several examples where a response that appears longer receives a lower score than a response that appears shorter. For example, in the Writing Scoring Guide, #4 is a high "1" that covers a page and a half, while #6 is a solid "2" that covers only about half of a page. Handwriting size obviously makes a difference, but other considerations also come into play. Repetition will make a response appear longer, but it does not add to the quantity and quality of the elaboration. Word choice can also affect development: specific and/or vivid words pack more information into less space. One example is in Writing Scoring Guide #12, where words like "independent" and "outspoken" clearly convey the idea of the American woman in the later part of the 20th century.

Q: Our scorers are experienced teachers who adhere to certain standards in their classrooms. Some scorers may find it difficult to follow the standards set by the rubric and the training materials if those standards seem higher or lower than those used by the scorer in the classroom. How should I advise a scorer who hesitates to apply the standards appropriate for this test?

A: We value the classroom experience of our scorers, and we realize that some variation of expectations may exist between districts, schools, and individuals. However, it is very important that all scorers separate their classroom expectations from the standards used in scoring this statewide test. Every scorer should use the same standards in applying the rubric to student responses. Uniform standards in scoring are crucial to obtaining the consistency and accuracy necessary for a valid assessment of student performances across the entire state. Accurate assessments ultimately benefit everyone.

Q: How can a scorer avoid "drifting" from the correct standards while scoring?

A: After scoring a number of responses, a scorer may gradually, even unconsciously, begin to accept more or less than is appropriate in awarding a particular score point. This could result in scoring inequity, where a student response could receive a different score from the same person depending on when it was scored. To maintain the consistency and accuracy of all scores, it is important to prevent any "drift" in scoring standards. This is best accomplished by frequent reference to the "anchor papers" in the training materials, and by encouraging scorers to consult their Table Facilitators or Scoring Leaders with responses that seem on the line between two score points.

Q: What if I should encounter a response where the student indicates that he or she is in a crisis situation and needs intervention? How should such sensitive responses be handled?

A: Sometimes a student in a difficult situation will use the test as an opportunity to reach out and ask for help. The Scoring Leader Handbook and the School Administrator's Manual have information on the procedures to be followed if such a response is encountered. Scorers should be instructed to bring such responses to the immediate attention of the Scoring Leader.

Q: What if a student puts the correct information for a response on a different page, such as the planning page, instead of on the correct response page?

A: If the response page is blank, it must be scored to reflect that it is blank. However, if a student indicates graphically on the correct response page that a response is written or continued onto another page, then the scorer can follow the student’s instructions and consider the information on the indicated page.

Q: The rubric says a "6" will have "vivid language and a sense of engagement or voice." Where in each of the "6"s in the training materials can I find examples of vivid language and voice?

A: Not all "6"s will have vivid language or a sense of engagement. However, the precision of language and the manner of expression can be factors in strengthening a response if all of the other elements are present. Voice, where the personality of the student shows itself in the manner of expression, is like the cherry on a sundae: the sundae must be there first before the cherry can be seen as adding anything substantial. Keep in mind also that what counts as vivid language or voice for an 8th grade student may be different from what you or I would consider vivid.

Q: On borderline calls, when deciding between adjacent score points, should the scorer always give the "benefit of the doubt" to the student and award the higher score?

A: No. Such a practice can result in scoring "drift." After scoring a number of responses, a scorer may gradually, even unconsciously, begin to accept less (or demand more) than is appropriate in awarding a particular score point. Scoring "drift" can create an unfair situation where a student response could receive a different score from the same scorer depending on when the response was scored. To prevent "drift" and maintain the consistency and accuracy of all scores, it is helpful to refer occasionally to the student responses used in the training materials as examples of the various score points. These responses are often called "anchor papers" because they help to fix the acceptable range within a score point and prevent the scorer from "drifting" higher or lower in their expectations for awarding a score point. Scorers should also be encouraged to consult their Table Facilitators and Scoring Leaders with responses that seem on the line between two score points.

Q: Where do the student sample responses used in training come from, and what procedure was used to decide how to score them?

A: These responses were generated by New York State students a few years ago, when items for the NYSTP were originally field tested. After the field tests were completed, teachers from all over New York State were invited to take part in rangefinding sessions to help determine how to apply the rubrics and arrive at scores for these responses. The generic and specific rubrics were discussed, and then packets of randomly selected student responses were scored. The scores were recorded, and any discrepancies were discussed and resolved using the rubric. Measurement Incorporated then used these scored responses to create the training materials.

Q: Many times students repeat some information they have already presented in the short responses when they answer the essay question. How should scorers treat this when evaluating the entire response? Should they award this type of response a lower score?

A: Not necessarily. When a student repeats information from a short response, it should be viewed as a missed opportunity to further develop ideas using other information from the text that could further demonstrate the student's level of understanding. The scorer must balance how much of the information is merely repetition (which should be treated as neutral rather than penalized) against how much new elaboration and other text support the student presents. Since, as mentioned previously, development is key to achieving a high score, it is the new information that will move the response up the score-point scale.

Q: If we look at the specific rubric for item #30 at the score point level 4, we see the word comparison, but in the question the word used is difference. Are we saying these two words are interchangeable?

A: We must use the widest definition of the word comparison. In this case, comparison in the sense used in the rubric does include the concept of describing differences.

Q: The score point "6" response in the Practice Set (#7) does not seem to be as strong as the example of a "6" in the Guide (#11). Which should we have our scorers use to help them score effectively?

A: Recommend to your scorers that they use the "6" in the Practice Set (#7) to help them decide whether a response is a "5" or a "6". If the response they are scoring is not as strong as Practice Set #7, they should score it a "5"; anything they perceive to be as good as or better than #7 should receive a score of "6".

Q: What are the key words to describe each score point?

A: Score Point 1: Understanding of only parts

Score Point 2: Limited understanding

Score Point 3: Partial understanding

Score Point 4: Predominantly literal understanding

Score Point 5: Clear understanding

Score Point 6: Thorough understanding

Q: Practice Set response # 5, which received a score of 5, seems very similar to Practice Set response # 7, which was scored a 6. What makes # 7 a stronger response?

A: Let's compare how successful each student was in responding to the different parts of the task. The first short responses (#30) are very similar in quantity and quality of development. Both students give specific text detail in describing the differences between Martha Washington and Eleanor Roosevelt as First Ladies (P.S. #5: Eleanor Roosevelt did the traveling her husband could not. P.S. #7: Eleanor Roosevelt traveled and gathered ideas for her husband). The graphic organizer is complete, accurate, and thorough in both responses. So far both responses are very similar.

When we look at the second short response (#32), we start seeing some differences. Practice Set #5 gives an accurate response, but the quality of the support stays fairly general (she was helping the President with his job). It is also fairly brief. Response #7, however, draws strong conclusions and supports them with much more specific text details (she helped him get his peace treaty signed and to approve the League of Nations). Here the response gives a specific example of "helping."

Finally, in the extended response (#33), Practice Set #5 demonstrates clear understanding, but the development of ideas is uneven (the support in the second paragraph remains general). Practice Set #7, however, brings in specific text details to support every conclusion presented. Since we need to evaluate the entire response to arrive at the appropriate score, the more thorough development of #32 and #33 in response #7 is the deciding factor in its score of 6.


Q: In the video the trainer spoke of this as a 4-point rubric. Isn't this a 3-point rubric?

A: The trainer was describing the range of scores on the rubric. Remember that there is a description for a score point 0, so the rubric contains four score points (0 through 3), even though the highest score is a 3.

Q: Can a response that does not address all the bullets on the rubric receive a score of 3 (full credit)?

A: Yes. A response will not be penalized for not addressing all the bullets; however, it must still have all the qualities of a score point 3: it must be fully developed, clearly focused, and effectively organized.

Q: In the Writing Practice Set the first response receives a score of 2 even though it only focuses on the present role of women in sports. How can we clearly explain this to our scorers?

A: The description of a score point 2 says that responses at this level need to fulfill only some requirements of the task and make some connections. This is an example of what is meant by "some". Throughout the response there is an implication that the role of women in sports is different now, that there has been change (Many women are now into sports . . .) and the last paragraph continues this theme by mentioning next year.

Q: In scoring Writing Mechanics what happens if the student has not answered all three of the extended responses? How is the cluster scored?

A: In this case, refer to the Scoring Considerations page near the back of the Writing Mechanics Scoring Guide, before the Scorer Practice Set begins, across from page 4g. The last section addresses how these responses should be scored. At least two of the responses must be answered for the cluster to be considered sufficient to score. If only one response is present, Condition Code C (insufficient to score) should be applied.

Q: Can we mark on the responses to keep track of the type and number of errors we find?

A: No. Although we encourage scorers to make notes on their training materials, actual student responses should never be marked or written on. Remember that scoring time is limited; spending it keeping track of errors should not be encouraged. A strictly numerical approach to errors can result in inaccurate scoring. Remember, it is not only the number of errors that matters, but also the length and complexity of the response and the degree to which the errors interfere with readability and comprehension.


Q: What happens if a student lists panna as a difference rather than a similarity because they do not connect it with the cream sauces available for pastas today? Is this considered accurate?

A: Yes. Students are not penalized for lacking prior knowledge of present day cuisine and what is available. A student can list panna and/or pesto as a difference or a similarity but not as both.

Q: It seems that in all our examples of 5's and 6's in the guide, the extended responses rely on engagement and voice rather than insight and synthesis to achieve their scores. Are engagement and voice more important than the other characteristics?

A: While it is true that most strong writers have developed the qualities of engagement and voice, and these are present in the 5's and 6's in the training examples, it is the quality of the development of ideas that demonstrates the student's understanding of the task.