8 smooth steps
SOLID FOOTWORK MAKES EVALUATION OF STAFF DEVELOPMENT PROGRAMS A SONG

By Joellen Killion
Journal of Staff Development, Fall 2003 (Vol. 24, No. 4)
Copyright, National Staff Development Council, 2003. All rights reserved.
Evaluating the effectiveness of staff development and demonstrating its impact on student achievement are more important than ever. The language in staff development policies requires districts to show evidence of professional learning's ability to improve student learning. The National Staff Development Council, some states' legislation, and the federal No Child Left Behind Act all call for rigorous evaluation of professional learning programs (see "Dancing to the same tune" on the next page).

With more emphasis on accountability, staff developers will want to explore ways to evaluate their programs and to link staff development to student learning. An evaluation also will help providers and leaders improve their programs.

"Evaluation is a systemic, purposeful process of studying, reviewing, and analyzing data gathered from multiple sources in order to make informed decisions about a program" (Killion, 2002, p. 42). A good evaluation of a professional learning program can be accomplished by following eight steps. This eight-step process is drawn from extensive practice and research in program evaluation.

Step 1: The first step is determining the degree to which a program, as planned, is ready to be evaluated. Sometimes staff development leaders and providers want to link an episode of staff development, such as a workshop or single professional development day, to student learning. This is nearly impossible because the workshop or professional development day alone is insufficient to produce results for students or teachers. Evaluations of partial or insufficient staff development programs likely will yield disappointing results.

Most staff development programs are inadequate to produce the results they seek. "We cannot expect results for students from a staff development program that is unlikely to produce them. And we cannot expect an evaluation to produce useful results when the program being evaluated is poorly conceived and constructed.
Perhaps Chen (Chen, 1990) said it best: 'Current problems and limitations of program evaluation lie more with a lack of adequate conceptual framework of the program than with methodological weakness'" (Killion, 2002).

Before evaluating any staff development program, the evaluator asks whether the program is feasible, clear, sufficiently powerful to produce the intended results, and worth doing. To determine whether a program is ready to be evaluated, an evaluator analyzes the program's goals, its standard of success, indicators of success, theory of change, and logic model.

Goals

A staff development program's goals express its intended results in terms of student achievement. Instead of "provide training to all teachers" as its goal, a results-driven program has as a goal improving student achievement. A sample goal might be to improve student achievement in mathematics by 2005 by 10% as measured on the state assessment. When the goals are expressed in terms of student achievement, the program's design is more likely to include sufficient actions to achieve them.

Standard of success

A program's standard of success is the benchmark that defines its success. It typically is a number representing the performance increase that, when met, is sufficient to declare the program a success. If the goal does not specify a particular degree of improvement, then any degree of improvement, even 0.002, may indicate success. Most staff development leaders want a specific increase in student performance as a return on their investment. For example, in the goal above, the standard of success is 10%. If the staff development program increases student achievement by 10% in mathematics, it is declared a success. If not, it falls short of its intended results and may be altered to increase effectiveness in subsequent years.

Indicator of success

An indicator of success is the specific way success will be demonstrated.
It is the way an evaluator will know if the standard of success has been achieved. In the example above of a 10% increase in math test scores, the indicator of success is student performance on the state assessment in mathematics. Certainly other indicators might be used to demonstrate students' increased achievement in math: performance on other assessments, classroom tasks, enrollment of underrepresented populations in advanced level courses, grades, performance on a national standardized test, or a combination of these.

Program designers might specify single or multiple indicators of success. They must identify both a standard of success and an indicator of success early when planning a staff development program so the program's design can be tailored to achieve the desired results.

Theory of change

A theory of change requires program designers to think carefully about how their program will bring about the changes they want. A theory of change (see diagram below) specifies how change is expected to happen, the program's components, their sequence, and the assumptions upon which the program is based (Killion, 2002).

An explicit theory of change is a roadmap for program designers, managers, participants, stakeholders, and evaluators showing how the program will work. It is the big picture that serves as a planning tool, an implementation guide, a monitoring tool, and a tool for evaluating the program's success. It allows the program designers to explain how they see the connection between educator learning and student achievement. Without the theory of change, the connection between the program's components and its results may be unclear.

Any one program can have multiple theories of change. Individual theories are neither right nor wrong, but one may be more appropriate for a specific context and circumstances. Theories can be based on other theories, research, or best practice.
For example, the social interaction theory of learning might serve as the basis for designing how adult learning happens in a professional development program. Based on this theory, participants would have multiple, frequent, in-depth opportunities to process their learning with colleagues.

Logic model

A logic model is a particular kind of action plan that specifies the inputs, the activities, and the initial, intermediate, and intended outcomes that will accomplish the identified goal. Thorough planning increases a program's potential to succeed. Planning ensures that all the program's activities align with the intended outcomes and that initial and intermediate outcomes will lead to the intended results. A logic model provides a framework for conducting the formative program evaluation as well as for the program design. (See sample logic model above.)

The logic model identifies the benchmarks of progress toward a goal. The short-term outcomes lead to medium-term outcomes that lead to long-term outcomes. With this map of the outcomes in place, evaluators are able to determine which outcomes are important to collect evidence about in order to explain the link between staff development and student achievement (Killion, 2002).

A logic model has several components.
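For readers who keep planning documents in electronic form, the components named above can be recorded as a simple structure. The following is a minimal sketch in Python, not a format prescribed by the article: the class name `LogicModel`, its field names, and the helper `missing_components` are hypothetical, and the entries are illustrative wording only.

```python
from dataclasses import dataclass, field


@dataclass
class LogicModel:
    """Hypothetical record of the logic model parts named in the text."""
    goal: str
    inputs: list[str] = field(default_factory=list)
    activities: list[str] = field(default_factory=list)
    initial_outcomes: list[str] = field(default_factory=list)
    intermediate_outcomes: list[str] = field(default_factory=list)
    intended_outcomes: list[str] = field(default_factory=list)

    def missing_components(self) -> list[str]:
        """Return the names of components that have no entries yet."""
        parts = {
            "inputs": self.inputs,
            "activities": self.activities,
            "initial_outcomes": self.initial_outcomes,
            "intermediate_outcomes": self.intermediate_outcomes,
            "intended_outcomes": self.intended_outcomes,
        }
        return [name for name, items in parts.items() if not items]


# Illustrative entries only; a real plan would use the program's own wording.
model = LogicModel(
    goal="Improve student achievement in mathematics by 10% by 2005",
    activities=["Coaching for classroom teachers"],
    initial_outcomes=["Teachers are motivated to use the new strategies"],
    intermediate_outcomes=["Teachers use the new strategies regularly"],
    intended_outcomes=["Student achievement increases"],
)
print(model.missing_components())  # prints ['inputs']
```

A check like `missing_components` is one way a planner might notice that a component has been left blank before the evaluation begins.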
Building on the program's theory of change, which identifies the program's key components, the logic model specifies what will change as a result of each program component. Staff development is most successful in increasing student achievement when it targets changes in knowledge, attitude, skill, aspiration, and behavior (see "Spelling out KASAB" below).

For example, if one component of a staff development program is providing coaching to classroom teachers, the initial outcome of this might be that teachers become more motivated to implement the strategies in their classroom (teachers' aspirations change). An intermediate outcome might be that teachers use the new strategies regularly (a teacher behavior change). The intended outcome is that student achievement increases (student knowledge, skill, and behavior change) as a result of teachers regularly implementing new instructional strategies in their classrooms.

Knowing the precursors to the goal, program developers can monitor for evidence that the precursors are affecting student and teacher learning and adjust the program design to ensure that the precursors occur. Without monitoring, one cannot expect the intended results.

For the evaluator, the precursors, or initial and intermediate outcomes, typically provide benchmarks for collecting evidence in the formative evaluation. To form a reasonable and supportable claim about the link between staff development and student achievement, the evaluator must know whether teachers received coaching, whether that coaching motivated them to implement the strategies, and whether teachers implemented the strategies.

When developing a theory of change and the logic model, program designers specify the types of changes they want to occur. By clearly delineating these changes, designers will be able to design the appropriate actions to accomplish them.
Often professional development program planners want teachers to change their behavior, for example, but plan actions that will change only teachers' knowledge and skills.

Step 2: Formulate evaluation questions

The questions an evaluation attempts to answer shape the evaluation's design. For example, if a formative evaluation asks whether teachers are integrating new technologies in their classrooms, the evaluation questions might be:
The theory of change and the logic model are used to generate formative evaluation questions. Questions can be formulated from each initial and intermediate outcome in the logic model, from each step of the theory of change, from both, or from steps in either that are pivotal to the program's success. For example, for the theory of change and logic model above, an evaluator may choose not to measure whether teachers and principals learned about the value of technology, but rather to measure whether teachers are integrating technology in their classrooms and whether principals are providing the appropriate level of support to their teachers. An evaluator may assume that, if a teacher is using technology appropriately, that teacher knows how technology contributes to student learning.

Summative evaluation questions ask whether the program achieved its goals. If the goals are written as student achievement goals, then the evaluation is able to yield evidence about the staff development's impact on student achievement. If the goals are not expressed as student achievement goals, then the evaluation will allow claims about merit (the degree to which the program achieved its results) but not about its impact on student achievement. The summative evaluation question for the goal expressed earlier is: Does student achievement in mathematics increase by 10% by 2005 as a result of integrating technology into the classroom?

Evaluators craft questions that allow them to know whether the goal is achieved. To know whether technology integration influenced students' achievement in mathematics, evaluators first examine the theory of change and logic model to understand how teacher learning influences student achievement and then design formative and summative evaluation questions that allow them to gather the appropriate evidence to make a claim that teacher learning contributes to student learning.
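The arithmetic behind the summative question is a simple comparison against the standard of success. The sketch below assumes that "10%" means a relative increase over a baseline value on the chosen indicator; the article does not say whether the increase is relative or measured in percentage points, so that reading, the function names, and the sample numbers are all assumptions made for illustration.

```python
def relative_increase(baseline, current):
    """Fractional change from baseline to current (0.10 means a 10% increase)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) / baseline


def meets_standard(baseline, current, standard=0.10):
    """True when the relative increase reaches the standard of success."""
    return relative_increase(baseline, current) >= standard


# Made-up indicator values (for example, a mean scale score); not real data.
print(meets_standard(baseline=50, current=56))  # 12% increase, standard met
print(meets_standard(baseline=50, current=52))  # 4% increase, standard not met
```

A result of `True` shows only that the indicator moved by the required amount; it says nothing about why it moved.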
Without first answering the formative questions, evaluators will be unable to claim that teachers' learning contributes to student learning in mathematics.

Step 3: Construct evaluation framework

The evaluation framework is the plan for the evaluation. Decisions made in this step determine the evidence needed to answer the formative and summative evaluation questions, the appropriate sources of that evidence, appropriate and feasible data collection methods, the timeline for data collection, the person(s) responsible for the data collection, and the data analysis method.

Knowing what change is expected helps the evaluator determine the best source of evidence and the most appropriate data collection method. For example, if the evaluator wants to know whether teachers are using technology, teachers themselves are the best source of that information. To triangulate, the evaluator may want to include students, principals, and documents as other data sources to confirm the accuracy of teachers' judgments. Classroom observations of teachers integrating technology may be the most authentic data collection method for knowing whether teachers are using technology; however, evaluators may select alternative data collection methods that will be less time-consuming or costly. Approximate indicators of teachers' use of technology might include assignments, student work samples, student surveys about technology use, principals' observations, and system administrators' records about student time using particular software programs.

Step 4: Collect data

The evaluator next prepares for and collects the data. Evaluators will want to pilot newly developed or modified data collection instruments to ensure the instruments' accuracy and clarity. Data collectors may require training to ensure consistency and data reliability if more than one individual is collecting data.
Data collection processes must be refined for accuracy, and appropriate protocols for collecting data must be developed that give detailed explanations for how to collect data. Once these responsibilities are met, data are collected. This is relatively routine work for most evaluators, although this step holds the potential for compromising the quality of the evaluation if data are not accurately collected and recorded.

When collecting data, evaluators adhere to standards established by the American Evaluation Association (1995) and the Joint Committee on Standards for Educational Evaluation (1994) on working with human subjects, if applicable. They ensure that they have met all the policy expectations of schools and districts for notification, privacy of records, or other areas, and abide by a code of ethics for evaluators.

Data collection requires a systematic and thoughtful process to ensure that data collected are accurate and have been collected as planned. To ensure accuracy in this step, evaluators often create checks and balances for themselves to ensure that data are recorded accurately, that errors in data entry are found and corrected, and that missing data or outlier data are handled appropriately. Evaluators who attend to details well and who are methodical in their work collect data well.

Step 5: Organize and analyze data

Evaluators must organize and analyze data collected. Evaluators ensure the data's accuracy by checking for any abnormalities in the data set and checking that data are recorded appropriately and records are complete. Once evaluators are confident that the data have integrity, they analyze the data.

Many practitioners distrust their own ability to do a statistical analysis. But in most cases, simple analyses such as counting totals, finding patterns and trends, or simple calculations such as determining the mean, median, mode, and range are sufficient.
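The simple calculations named above take only a few lines with Python's standard `statistics` module. This is a minimal sketch: the function name `summarize` and the sample scores are hypothetical and are not drawn from any evaluation.

```python
import statistics


def summarize(scores):
    """Return the count, mean, median, mode(s), and range of a list of numbers."""
    if not scores:
        raise ValueError("scores must not be empty")
    return {
        "count": len(scores),
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        # multimode returns every most-common value, so ties are not hidden.
        "modes": statistics.multimode(scores),
        "range": max(scores) - min(scores),
    }


# Made-up scores for illustration only.
sample_scores = [62, 70, 70, 75, 81, 88]
summary = summarize(sample_scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Reporting all modes rather than a single one is a deliberate choice here: with small data sets, ties are common, and showing only one value could mislead a reader.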
Sometimes it may be appropriate to use more sophisticated comparisons that include factoring, assessing covariance, or creating statistical models. When evaluators want this level of analysis, they might want to get help from someone experienced in inferential statistics.

Once data are analyzed, they are displayed in charts, tables, graphs, or other appropriate formats to allow people with different preferences to find the format that works best for them. Careful titling and labeling helps ensure that readers interpret the data accurately.

Step 6: Interpret data

While data analysis is the process of counting and comparing, interpreting is making sense of what the analysis tells us. "Interpretation is the 'meaning-making' process that comes after the data have been counted, sorted, analyzed, and displayed" (Killion, 2002, p. 109). For example, we can tell that the scores went up if we compare scores over three years (analysis). In the interpretation phase, we ask what that means in terms of our work: what contributed to the increase, what does the increase mean, was the increase consistent across all grades, etc.?

Evaluators seek multiple interpretations and talk with stakeholders about which interpretations are most feasible from their perspective. The evaluators then determine which interpretations are most supported by the analyzed data (Killion, 2002). Interpreting data is best done as a collaborative process with program designers and key stakeholders, including participants. In most evaluations of staff development programs, this means that teachers, principals, and central office staff together study the data and form claims about the program's effectiveness and impact on student learning, and then recommend improvements.

Evaluators form claims about a program's merit (the degree to which it achieved its goals), its worth (participants' perception of the program's value), and the program's contribution to student learning.
Claims of contribution, those stating that the program influenced student achievement, are made when the evaluation design is descriptive or quasi-experimental. Claims of attribution, that staff development and nothing else caused the results, require an experimental, randomized design not often used in evaluation studies.

Step 7: Disseminate findings

After they interpret data, evaluators share their findings. Evaluators must decide what audiences will receive results and the most appropriate formats in which to share those results, since different audiences require different formats. Formats for sharing evaluation results include technical reports, brief executive summaries, pamphlets, newsletters, news releases to local media, and oral presentations. Evaluations sometimes fail to have an impact on future programs because results are not widely shared with key stakeholders.

Step 8: Evaluate the evaluation

Evaluations rarely include this step. Evaluating the evaluation involves reflecting on the evaluation process to assess the evaluator's work, the resources expended for evaluation, and the overall effectiveness of the evaluation process. Evaluating the process is an opportunity to improve future evaluations and strengthen evaluators' knowledge and skills. "When evaluators seek to improve their work, increase the use of evaluation within an organization, and build the capacity of others to engage in 'evaluation think,' they contribute to a greater purpose. Through their work, they convey the importance of evaluation as a process for improvement and ultimately for increasing the focus on results" (Killion, 2002, p. 124).

Conclusion

Evaluating staff development requires applying a scientific, systematic process to ensure reliable, valid results. Evaluation not only provides information to determine whether programs are effective, it also provides information about how to strengthen a program to increase its effectiveness.
With more limited resources available today for professional learning, staff development leaders will face harder decisions about how to use those resources. Evaluations can provide the evidence needed to make these critical decisions.

References

American Evaluation Association Task Force. (1995). Guiding principles for evaluators. New Directions for Evaluation, 66, 19-34.

Chen, H. (1990). Theory-driven evaluation. Newbury Park, CA: Sage Publications.

Killion, J. (2002). Assessing impact: Evaluating staff development. Oxford, OH: National Staff Development Council.

Killion, J., Munger, L., & Psencik, K. (2002). Technology innovation: Initiating change for students, teachers, and communities. Oxford, OH: National Staff Development Council.

National Staff Development Council. (2000). Staff development code of ethics. Oxford, OH: Author.

National Staff Development Council. (2001). NSDC's standards for staff development, revised. Oxford, OH: Author.

The Joint Committee on Standards for Educational Evaluation. (1994). Program evaluation standards (2nd ed.). Thousand Oaks, CA: Sage Publications.
About the Author

Joellen Killion is director of special projects for the National Staff Development Council. You can contact her at 10931 W. 71st Place, Arvada, CO 80004-1337, (303) 432-0958, fax (303) 432-0959, e-mail: joellen.killion@nsdc.org.