
The age of our accountability
Evaluation must become an integral part of staff development
By Thomas R. Guskey
Journal of Staff Development, Fall 1998 (Vol. 19, No. 4)
For many years, educators have operated under the premise that professional development is good by definition, and therefore more is always better. If you want to improve your professional development program, the thinking goes, simply add a day or two.
Today, however, we live in an age of accountability. Students are expected to meet higher standards, teachers are held accountable for student results, and professional developers are asked to show that what they do really matters.
For many, this is scary. They live in fear that a new superintendent or board member will come in who wants to know about the payoff from the district's investment in professional development. If the answers aren't there, heads may roll and programs may get axed.
Now it may be that your professional development programs and activities are state-of-the-art efforts designed to turn teachers and school administrators into reflective, team-building, global-thinking, creative, ninja risk-takers. They also may be bringing a multitude of priceless benefits to students, teachers, parents, board members, and the community at large. If that is the case, you can stop reading now.
But if you're not sure, and if there's a chance you'll be asked to document those benefits to the satisfaction of skeptical parties, you may want to continue. In order to provide that evidence, you're going to have to give serious attention to the issues of evaluation.
Historically, many professional developers have considered evaluation a costly, time-consuming process that diverts attention from important planning, implementation, and follow-up activities. Others believe they simply lack the skill and expertise to become involved in rigorous evaluations. As a consequence, they either neglect evaluation issues, or leave them to "evaluation experts" who are called in at the end and asked to determine if what was done made any difference. The results of such a process are seldom very useful.
Good evaluations are the product of thoughtful planning, the ability to ask good questions, and a basic understanding about how to find valid answers. In many ways, they are simply the refinement of everyday thinking. Good evaluations provide information that is sound, meaningful, and sufficiently reliable to use in making thoughtful and responsible decisions about professional development processes and effects.
What is evaluation?
Just as there are many forms of professional development, there are also many forms of evaluation. In fact, each of us engages in hundreds of evaluations every day. We evaluate the temperature of our shower in the morning, the taste of our breakfast, the chances of rain and the need for an umbrella when we go outdoors, and the likelihood we will accomplish what we set out to do on any particular day. These everyday acts require the examination of evidence and the application of judgment.
The kind of evaluation on which we focus here, however, goes beyond these informal acts. Our interest is in evaluations that are more formal and systematic. While not everyone agrees on the best definition of this kind of evaluation, for our purposes, a useful operational definition is: Evaluation is the systematic investigation of merit or worth. (This definition is adapted from the Joint Committee on Standards for Educational Evaluation, 1994.)
Let's look carefully at this definition. The word "systematic" distinguishes this process from the many informal evaluations we conduct every day. "Systematic" implies that evaluation in this context is thoughtful, intentional, and purposeful. It's done for clear reasons and with explicit intent. Although the specific purpose of evaluation may vary from one setting to another, all good evaluations are deliberate and systematic.
Because it's systematic, some educators have the mistaken impression that evaluation in professional development is appropriate for only those activities that are "event-driven." In other words, they believe evaluation applies to formal professional development workshops and seminars, but not to the wide range of other less formal, ongoing, job-embedded professional development activities. Regardless of its form, however, professional development is not a haphazard process. It is, or should be, purposeful and results- or goal-driven. Its objectives remain clear: To examine staff development activities to see if they're making a difference in teaching, helping educators reach high standards and, ultimately, having a positive impact on students. This is true of workshops and seminars, as well as study groups, action research, collaborative planning, curriculum development, structured observations, peer coaching and mentoring, and individually-guided professional development activities. To determine if the goals of these activities are met, or if progress is being made, requires systematic evaluation.
"Investigation" refers to collecting and analyzing appropriate and pertinent information. While no evaluation can be completely objective, the process isn't based on opinion or conjecture. Rather, it's based on acquiring specific, relevant, and valid evidence examined through appropriate methods and techniques.
Using "merit or worth" in our definition implies appraisal and judgment. Evaluations are designed to determine something's value. They help answer such questions as:
- Is this program or activity leading to the results that were intended?
- Is it better than what was done in the past?
- Is it better than another, competing activity?
- Is it worth the costs?
The answers to these questions require more than a statement of findings. They demand an appraisal of quality and judgments of value, based on the best evidence available.
Three purposes, three categories
The purposes of evaluation are generally classified in three broad categories, from which stem the three major types of evaluation. Most evaluations are actually designed to fulfill all three purposes, although the emphasis on each changes during various stages of the evaluation process. Because of this inherent blending of purposes, distinctions between the different types of evaluation are sometimes blurred. Still, differentiating their intent helps in clarifying our understanding of evaluation procedures (Stevens, Lawrenz, & Sharp, 1995). The three major types of evaluation are planning, formative, and summative evaluation.
1. Planning
Planning evaluation occurs before a program or activity begins, although certain aspects may be continual and ongoing. It's designed to give those involved in program development and implementation a precise understanding of what is to be accomplished, what procedures will be used, and how success will be determined. In essence, it lays the groundwork for all other evaluation activities.
Planning evaluation involves appraisal, usually on the basis of previously established standards, of a program or activity's critical attributes. These include the specified goals, the proposal or plan to achieve those goals, the concept or theory underlying the proposal, the overall evaluation plan, and the likelihood that plan can be carried out with the time and resources available. In addition, planning evaluation typically includes a determination of needs, assessment of the characteristics of participants, careful analysis of the context, and the collection of pertinent baseline information.
Evaluation for planning purposes is sometimes referred to as "preformative evaluation" (Scriven, 1991) and may be thought of as "preventative evaluation." It helps decision makers know if efforts are headed in the right direction and likely to produce the desired results. It also helps identify and quickly remedy the difficulties that might plague later evaluation efforts. Furthermore, planning evaluation helps ensure that other evaluation purposes can be met in an efficient and timely manner.
2. Formative
Formative evaluation occurs during the operation of a program or activity. Its purpose is to provide those responsible for the program with ongoing information about whether things are proceeding as planned and whether expected progress is being made. If not, this same information can be used to guide necessary improvements (Scriven, 1967).
The most useful formative evaluations focus on the conditions for success. They address issues such as:
- What conditions are necessary for success?
- Have those conditions for success been met?
- Can the conditions be improved?
In many cases, formative evaluation is a recurring process that takes place at multiple times throughout the life of the program or activity. Many program developers, in fact, are constantly engaged in the process of formative evaluation. The evidence they gather at each step of development and implementation usually stays in-house, but is used to make adjustments, modifications, or revisions (Worthen & Sanders, 1989).
To keep formative evaluations efficient and to avoid unrealistic expectations, Scriven (1991) recommends using them as "early warning" evaluations. In other words, use formative evaluations as an early version of the final, overall evaluation. As development and implementation proceed, formative evaluation can consider intermediate benchmarks of success to determine what is working as expected and what difficulties must be overcome. Flaws can be identified and weaknesses located in time to make the adaptations necessary for success.
3. Summative
Summative evaluation is conducted at the completion of a program or activity. Its purpose is to provide program developers and decision makers with judgments about the program's overall merit or worth. Summative evaluation describes what was accomplished, what the consequences were (positive and negative), what the final results were (intended and unintended), and, in some cases, whether the benefits justify the costs.
Unlike formative evaluations that are used to guide improvements, summative evaluations present decision makers with information they need to make crucial decisions about a program or activity. Should it be continued? Continued with modifications? Expanded? Discontinued? Ultimately, its focus is "the bottom line."
Perhaps the best description of the distinction between formative and summative evaluation is one offered by Robert Stake: "When the cook tastes the soup, that's formative; when the guests taste the soup, that's summative" (quoted in Scriven, 1991, p. 169).
Unfortunately, many educators associate evaluation with its summative purposes only. Important information that could help guide planning, development, and implementation is often neglected, even though such information can be key in determining a program or activity's overall success. Summative evaluation, although necessary, often comes too late to be much help. Thus, while the relative emphasis on planning, formative, and summative evaluation changes through the life of a program or activity, all three are essential to a meaningful evaluation.
Critical levels of professional development evaluation
Planning, formative, and summative evaluation all involve collecting and analyzing information. In evaluating professional development, there are five critical stages or levels of information to consider.
The five levels in this model are hierarchically arranged, from simple to more complex. With each succeeding level, gathering evaluation information is likely to require more time and resources. More importantly, each higher level builds on the ones that come before. In other words, success at one level is usually necessary for success at the levels that follow.
Level 1: Participants' Reactions
This is the most common form of professional development evaluation, the simplest, and the level at which educators have the most experience. It's also the easiest type of information to gather and analyze.
The questions addressed at this level focus on whether participants liked a particular professional development activity. When they completed the experience, did they feel their time was well spent? Did the material make sense? Were the activities meaningful? Was the leader or instructor knowledgeable and helpful? Do they believe what they learned will be useful?
Also important for professional development workshops and seminars are questions such as: Was the coffee hot and ready on time? Were the refreshments fresh and tasty? Was the room the right temperature? Were the chairs comfortable? To some, questions such as these may seem silly and inconsequential. But experienced professional developers know the importance of attending to these basic human needs.
Information on participants' reactions is generally gathered through questionnaires handed out at the end of a session or activity. These questionnaires typically include a combination of rating-scale items and open-ended response questions that allow participants to provide more personalized comments.
Measures of participants' reactions are sometimes referred to as "happiness quotients" by those who insist they measure only the entertainment value of an activity, not its quality or worth. But measuring participants' initial satisfaction with the experience provides information that can help improve the design and delivery of programs or activities in valid ways. In addition, positive reactions from participants are usually a necessary prerequisite to higher level evaluation results.
Level 2: Participants' Learning
In addition to liking their professional development experience, we also hope participants learned something. Level 2 focuses on measuring the knowledge, skills, and perhaps the new attitudes that participants gained. Depending on the goals of the program or activity, this can involve anything from a pencil-and-paper assessment (Can participants describe the critical attributes of mastery learning and give examples of how these might be applied in common classroom situations?) to a simulation or full-scale skill demonstration (Presented with a variety of classroom conflicts, can participants diagnose each situation, and then prescribe and carry out a fair and workable solution?). Oral or written personal reflections, or examination of the portfolios participants assemble can also be used to document their learning.
Although evaluation information at Level 2 sometimes can be gathered at the completion of a session, it seldom can be accomplished with a standardized form. Measures must be based on the learning goals prescribed for that particular program or activity. This means specific criteria and indicators of successful learning must be outlined before the professional development experience begins. Openness to possible "unintended learnings," either positive or negative, also should be considered. If there's concern that participants may already possess the requisite knowledge and skills, some form of pre- and post-assessment may be required. Analyzing this information provides a basis for improving the content, format, and organization of the program or activities.
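As a rough illustration of the pre- and post-assessment idea, the sketch below computes the average gain for a group of participants who took the same assessment before and after a program. The function name, the 20-point scale, and the scores are all hypothetical, and a simple mean gain is only one of many ways such data might be summarized.

```python
from statistics import mean


def mean_gain(pre_scores, post_scores):
    """Average per-participant gain (post minus pre) for paired scores."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("pre and post score lists must be paired")
    if not pre_scores:
        raise ValueError("at least one participant is required")
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))


# Hypothetical scores for five participants on the same 20-point assessment.
pre = [8, 12, 10, 15, 9]
post = [14, 15, 13, 16, 15]

gain = mean_gain(pre, post)
print(f"Average gain: {gain:.1f} points")
```

A small average gain among participants who scored high on the pre-assessment may simply mean they already had the knowledge, which is exactly the concern a pre-measure is meant to surface.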
Level 3: Organizational Support and Change
Organizational variables can be key to the success of any professional development effort. They also can hinder or prevent success, even when the individual aspects of professional development are done right (Sparks, 1996a).
Suppose, for example, a group of educators participates in a professional development program on cooperative learning. They gain a thorough understanding of the theory, and organize a variety of classroom activities based on cooperative learning principles. Following their training, they try to implement these activities in schools where students are generally graded "on the curve," according to their relative standing among classmates, and great importance is attached to selecting the class valedictorian. Organizational policies and practices such as these make learning highly competitive and will thwart the most valiant efforts to have students cooperate and help each other learn (Guskey, 1996).
The lack of positive results in this case isn't caused by poor training or inadequate learning, but by organizational policies that are incompatible with implementation efforts. The gains made at Levels 1 and 2 are essentially canceled by problems at Level 3 (Sparks & Hirsh, 1997). That's why it's essential to gather information on organizational support and change.
Questions at this level focus on the organizational characteristics and attributes necessary for success. Was the advocated change aligned with the organization's mission? Was change at the individual level encouraged and supported at all levels? Did the program or activity affect organizational climate and procedures? Was administrative support public and overt? Were problems addressed quickly and efficiently? Were sufficient resources made available, including time for sharing and reflection (Langer & Colton, 1994)? Were successes recognized and shared? Such issues can be major contributors to the success of any professional development effort.
Gathering information on organizational support and change is generally more complicated than at previous levels. Procedures also differ depending on the goals of the program or activity. They may involve analyzing district or school records, or examining the minutes from follow-up meetings, for example. Questionnaires sometimes can be used to tap issues such as the organization's advocacy, support, accommodation, facilitation, and recognition of change efforts. Structured interviews with participants and district or school administrators also can be helpful. This information is used not only to document and improve organizational support, but also to inform future change initiatives.
Level 4: Participants' Use of New Knowledge and Skills
Here our central question is: Are participants using what they learned, and using it well? The key to gathering relevant information at this level rests in the clear specification of indicators that reveal both the degree and quality of implementation. Depending on the goals of the program or activity, this may involve questionnaires or structured interviews with participants and their supervisors. Oral or written personal reflections, or examination of participants' journals or portfolios, also can be considered. The most accurate information is likely to come from direct observations, either by trained observers or using video and/or audiotapes. When observations are used, however, they should be kept as unobtrusive as possible. (For examples, see Hall & Hord, 1987.)
At this level, information can't be gathered at the completion of a professional development session. Measures of use must be made after sufficient time has passed to allow participants to adapt new ideas and practices to their setting. Also, remember that meaningful professional development is an ongoing process, not just a series of episodic training sessions. Because implementation is often a gradual and uneven process, measures also may be necessary at several time intervals. Analysis of this information provides evidence on current levels of use and can help staff developers improve future programs and activities.
Level 5: Student Learning Outcomes
Here we address "the bottom line" in education: What was the impact on students? Did the professional development program or activity benefit students in any way? The particular student outcomes of interest will depend, of course, on the goals of each specific professional development effort. In addition to the stated goals, certain "unintended" outcomes may be important as well. For this reason, multiple measures of student learning are always essential (Joyce, 1993).
Consider this example: A group of elementary educators devotes their professional development time to finding ways to improve the quality of students' writing. In a study group, they explore the research on writing instruction, analyze various approaches, and devise strategies they believe will work for their students. Gathering Level 5 information, they find students' writing test scores increased significantly during the school year, compared to the scores of comparable students who were not involved in these strategies.
On further analysis, however, these educators discover that during the same time, students' math achievement declined. This "unintended" outcome apparently occurred, they conclude, because instructional time in mathematics was inadvertently sacrificed to provide more writing time for students. If the educators had only gathered information about improvements in student writing, this important "unintended result" wouldn't have been identified.
Measures of student learning typically include indicators of student performance and achievement, such as assessment results, portfolio evaluations, marks or grades, and scores from standardized examinations. But in addition to these cognitive indicators, affective (attitudes and dispositions) and psychomotor outcomes (skills and behaviors) may be considered as well. Examples include assessments of students' self-concepts, study habits, school attendance, homework completion rates, or classroom behaviors. Schoolwide indicators such as enrollment in advanced classes, memberships in honor societies, participation in school-related activities, disciplinary actions, and retention or drop-out rates also might be considered.
The major source of such information is student and school records. Results from questionnaires and structured interviews with students, parents, teachers, and/or administrators also could be included. The summative purpose of this information is to document a program or activity's overall impact. But formatively, it can be used to guide improvements in all aspects of professional development, including program or activity design, implementation, and follow-up. In some cases, information on student learning outcomes is used to estimate the cost-effectiveness of professional development, or what is sometimes referred to as "return on investment" or "ROI evaluation" (Parry, 1996; Todnem & Warner, 1993).
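The writing-and-math example can be sketched in a few lines of code. The sketch below compares a program group with a comparison group on every measure collected, then flags any measure on which the program group trails. The scores, group labels, and function name are invented for illustration; they are not data from the example, and a real analysis would also ask whether the differences are statistically meaningful.

```python
from statistics import mean


def compare_measures(program, comparison):
    """Program-minus-comparison difference in mean score for each shared measure."""
    return {
        measure: mean(program[measure]) - mean(comparison[measure])
        for measure in program
        if measure in comparison
    }


# Hypothetical end-of-year scores, keyed by the measure collected.
program_group = {"writing": [78, 82, 85, 80], "math": [70, 68, 72, 66]}
comparison_group = {"writing": [72, 75, 74, 71], "math": [74, 73, 76, 73]}

differences = compare_measures(program_group, comparison_group)
declines = [measure for measure, diff in differences.items() if diff < 0]

for measure, diff in differences.items():
    print(f"{measure}: {diff:+.2f}")
print("Measures where the program group trails:", declines)
```

Had only the writing scores been entered, the list of declines would have been empty and the drop in math would have gone unnoticed.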
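For readers unfamiliar with the term, return on investment is commonly computed as net benefit divided by cost. The sketch below applies that general formula; it is not a method prescribed in this article or its sources, and the dollar figures are hypothetical. The hard part in practice is the step the code takes for granted: expressing the benefits of professional development in monetary terms.

```python
def return_on_investment(benefits, costs):
    """Net benefit per unit of cost: (benefits - costs) / costs."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return (benefits - costs) / costs


# Hypothetical figures: $40,000 spent, benefits valued at $60,000.
roi = return_on_investment(benefits=60_000, costs=40_000)
print(f"ROI: {roi:.0%}")
```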
Evaluation at any of these five levels can be done well or poorly, laughably or convincingly. The information gathered at each level is important and can help improve professional development programs and activities. But as many have discovered, tracking effectiveness at one level tells you nothing about impact at the next. Although success at an early level may be necessary for positive results at the next level, it is clearly not sufficient. That is why each level is important. Sadly, the bulk of professional development today is evaluated only at Level 1, if at all. Of the rest, the majority are measured only at Level 2 (Cody & Guskey, 1997).
Twelve great guidelines
Good evaluations of professional development don't have to be costly. Nor do they demand sophisticated technical skills (although technical assistance can sometimes be helpful). What they do require is the ability to ask good questions, and a basic understanding about how to find valid answers. Good evaluations provide sound, useful, and sufficiently reliable information that can be used to make thoughtful and responsible decisions about professional development processes and effects.
The following guidelines are designed to improve the quality of professional development evaluations. Although adhering to these guidelines won't guarantee your evaluation efforts will be flawless, it will go a long way toward making your efforts more meaningful, more useful, and far more effective.
1. Clarify the intended goals. The first step in any evaluation is to make sure your professional development goals are clear, especially in terms of the results you hope to attain with students and the classroom or school practices you believe will lead to those results. Change experts refer to this as "beginning with the end in mind." It is also the premise of a "results-driven" approach to professional development (Sparks, 1995, 1996b).
2. Assess the value of the goals. Take steps to ensure the goals are sufficiently challenging, worthwhile, and considered important by all those involved in the professional development process. Broad-based involvement at this stage contributes greatly to a sense of shared purpose and mutual understanding. Clarifying the relationship between established goals and the school's mission is a good place to begin.
3. Analyze the context. Identify the critical elements of the context where change is to be implemented and assess how these might influence implementation. Such an analysis might include examining pertinent baseline information on students' and teachers' needs, their unique characteristics and background experiences, available resources, parent involvement and support, and organizational climate.
4. Estimate the program's potential to meet the goals. Explore the research base of the program or activity, and the validity of the evidence supporting its implementation in contexts similar to yours. When exploring the literature on a particular program, be sure to distinguish facts from persuasively argued opinions. A thorough analysis of the costs of implementation and what other services or activities must be sacrificed to meet those costs should be included as well.
5. Determine how the goals can be assessed. Decide up front what evidence you would trust. Ensure that evidence is appropriate, relevant to the various stakeholders, and meets at least minimal requirements for reliability and validity. Keep in mind, too, that multiple indicators will probably be necessary, in order to tap both intended and possible unintended consequences.
6. Outline strategies for gathering evidence. Determine how evidence will be gathered, who will gather it, and when it should be collected. Be mindful of the critical importance of intermediate or benchmark indicators that might be used to identify problems (formative) or forecast final results (summative). Select procedures that are thorough and systematic, but considerate of participants' time and energy. Thoughtful evaluations typically use a combination of quantitative and qualitative methods, based on the nature of the evidence sought. To document improvements you must also plan meaningful contrasts using appropriate comparison groups, pre- and post-measures, or longitudinal time-series measures.
7. Gather and analyze evidence on participants' reactions. At the completion of both structured and informal professional development activities, collect information on how participants regard the experience. A combination of items or methods is usually required to assess perceptions of various aspects of the experience. In addition, keeping the information anonymous generally guarantees more honest responses.
8. Gather and analyze evidence on participants' learning. Develop specific indicators of successful learning, select or construct instruments or situations in which that learning can be demonstrated, and collect the information through appropriate methods. The methods used will depend, of course, on the nature of the learning sought. In most cases, a combination of methods or procedures will be required.
9. Gather and analyze evidence on organizational support and change. Determine the organizational characteristics and attributes necessary for success, and what evidence best illustrates those characteristics. Then collect and analyze that information to document and improve organizational support.
10. Gather and analyze evidence on participants' use of new knowledge and skills. Develop specific indicators of both the degree and quality of implementation. Then determine the best methods to collect this information, when it should be collected, and how it can be used to offer participants constructive feedback to guide (formative) or judge (summative) their implementation efforts. If there is concern with the magnitude of change (Is this really different from what participants have been doing all along?), pre- and post-measures may need to be planned. The methods used to gather this evidence will depend, of course, on the specific characteristics of the change being implemented.
11. Gather and analyze evidence on student learning outcomes. Considering the procedures outlined in Step 6, collect the student information that most directly relates to the program or activity's goals. Be sure to include multiple indicators to tap the broad range of intended and possible unintended outcomes in the cognitive, affective, and psychomotor areas. Anecdotes and testimonials should be included to add richness and provide special insights. Analyses should be based on standards of desired levels of performance over all measures and should include contrasts with appropriate comparison groups, pre- and post-measures, or longitudinal time-series measures.
12. Prepare and present evaluation reports. Develop reports that are clear, meaningful, and comprehensible to those who will use the evaluation results. In other words, present the results in a form that can be understood by decision makers, stakeholders, program developers, and participants. Evaluation reports should be brief but thorough, and should offer practical recommendations for revision, modification, or further implementation. In some cases, reports will include information comparing costs to benefits, or the "return on investment."
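The "meaningful contrasts" called for in guidelines 6 and 11 can be combined: measure both a program group and a comparison group before and after, then compare their gains. The sketch below shows one simple way to do that, a difference in mean gains. This particular calculation is an illustrative choice rather than a procedure taken from the guidelines, and all names and scores are hypothetical.

```python
from statistics import mean


def difference_in_gains(program_pre, program_post, comparison_pre, comparison_post):
    """Program group's mean change minus the comparison group's mean change."""
    program_gain = mean(program_post) - mean(program_pre)
    comparison_gain = mean(comparison_post) - mean(comparison_pre)
    return program_gain - comparison_gain


# Hypothetical student scores, measured before and after the school year.
effect = difference_in_gains(
    program_pre=[60, 65, 70],
    program_post=[72, 75, 81],
    comparison_pre=[61, 66, 68],
    comparison_post=[64, 70, 70],
)
print(f"Gain beyond the comparison group: {effect} points")
```

Subtracting the comparison group's gain guards against crediting the program with growth that students would have shown anyway.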
Conclusion
Over the years, a lot of good things have been done in the name of professional development. So have a lot of rotten things. What professional developers haven't done is provide evidence to document the difference between the good and the rotten. Evaluation is the key, not only to making those distinctions but also to explaining how and why they occurred. To do this, we must recognize the important summative purposes that evaluation serves, and its vital planning and formative purposes as well.
Just as we urge teachers to plan carefully, and to make ongoing assessments of student learning an integral part of the instructional process, evaluation needs to become an integral part of the professional development process. Systematically gathering and analyzing evidence to inform what we do must become a central component in professional development technology. Recognizing and using this component will tremendously enhance the success of professional development efforts everywhere.
Author's note: The five levels are adapted from an evaluation model developed by Kirkpatrick (1959) for judging the value of supervisor training programs in business and industry. The model, although widely applied, has seen only limited use in education because it lacks explanatory power: It's helpful in addressing a broad range of "what" questions, but falls short when it comes to explaining "why" (Alliger & Janak, 1989; Holton, 1996). The model presented here is designed to resolve that inadequacy.
References
Alliger, G.M. & Janak, E.A. (1989). Kirkpatrick's levels of training criteria: Thirty years later. Personnel Psychology, 42(2), 331-342.
Cody, C.B. & Guskey, T.R. (1997). Professional development. In J.C. Lindle, J.M. Petrosko, & R.S. Pankratz (Eds.), 1996 Review of research on the Kentucky Education Reform Act (pp. 191-209). Frankfort, KY: The Kentucky Institute for Education Research.
Fullan, M.G. (1992). Visions that blind. Educational Leadership, 49(5), 19-20.
Gordon, J. (1991). Measuring the "goodness" of training. Training (August), 19-25.
Guskey, T.R. (1996). Reporting on student learning: Lessons from the past - Prescriptions for the future. In T.R. Guskey (Ed.), Communicating Student Learning. 1996 Yearbook of the Association for Supervision and Curriculum Development (pp. 13-24). Alexandria, VA: Association for Supervision and Curriculum Development.
Guskey, T.R. (1997). Research needs to link professional development and student learning. Journal of Staff Development, 18(2), 36-40.
Guskey, T.R., & Sparks, D. (1996). Exploring the relationship between staff development and improvements in student learning. Journal of Staff Development, 17(4), 34-38.
Hall, G.E. & Hord, S.M. (1987). Change in schools: Facilitating the process. Albany, NY: SUNY Press.
Holton, E.F. (1996). The flawed four-level evaluation model. Human Resource Development Quarterly, 7(1), 5-21.
Johnson, B.M. (1995). Why conduct action research? Teaching and Change, 3(1), 90-104.
Joint Committee on Standards for Educational Evaluation (1994). The program evaluation standards (2nd ed.). Thousand Oaks, CA: Sage.
Joyce, B. (1993). The link is there, but where do we go from here? Journal of Staff Development, 14(3), 10-12.
Kirkpatrick, D.L. (1959). Techniques for evaluating training programs. A four-part series beginning in the November issue (Vol. 13, No. 11) of Training and Development Journal (then titled Journal for the American Society of Training Directors).
Kirkpatrick, D.L. (1977). Evaluating training programs: Evidence vs. proof. Training and Development Journal, 31(11), 9-12.
Parry, S.B. (1996). Measuring training's ROI. Training & Development, 50(5), 72-75.
Scriven, M. (1967). The methodology of evaluation. In R. E. Stake (Ed.), Curriculum evaluation. American Educational Research Association Monograph Series on Evaluation, No. 1. Chicago: Rand McNally.
Scriven, M. (1991). Evaluation thesaurus (4th ed.). Newbury Park, CA: Sage.
Sparks, D. (1995, April). Beginning with the end in mind. School Team Innovator, 1(1), p. 1.
Sparks, D. (1996a, January). Results-driven staff development. The Developer, p. 2.
Sparks, D. (1996b, February). Viewing reform from a systems perspective. The Developer, pp. 2, 6.
Sparks, D., & Hirsh, S. (1997). A new vision for staff development. Alexandria, VA: Association for Supervision and Curriculum Development.
Stevens, F., Lawrenz, F. & Sharp, L. (1995). User-friendly handbook for project evaluation: Science, mathematics, engineering, and technology education. Arlington, VA: National Science Foundation.
Todnem, G. & Warner, M.P. (1993). Using ROI to assess staff development efforts. Journal of Staff Development, 14(3), 32-34.
Worthen, B.R. & Sanders, J.R. (1989). Educational evaluation. New York: Longman.
About the author
Thomas R. Guskey is a professor at the College of Education at the University of Kentucky. This article is based on material included in his new book, Evaluating Professional Development, which is due to be released later this year. He can be reached at the Taylor Education Building, College of Education, University of Kentucky, Lexington, KY 40506, (606) 257-8666, fax (606) 257-4243, or e-mail: guskey@pop.uky.edu.
An additional small, related article follows.
"You can collect awfully good evidence"
Knowing about planning, formative, and summative evaluation, are you ready to "prove" that your professional development programs and activities make a difference? Can you demonstrate that what was done in professional development, and nothing else, is solely responsible for that 10 percent increase in student achievement scores? For the five percent decrease in dropout rate? For the 50 percent reduction in recommendations to the office for disciplinary action?
Are you trying to say the counseling department had nothing to do with it? Do the principal and assistant principal get no credit for their support and encouragement? Might not year-to-year fluctuations in students have something to do with the results? And consider the other side of the coin: If achievement ever drops following some highly touted professional development initiative, would you be willing to accept full blame for the loss?
Arguments about whether you can absolutely, positively isolate the impact of professional development on improvements in student performance are generally irrelevant. In most cases, you simply cannot get ironclad proof (Kirkpatrick, 1977). To do so, you would need to eliminate or control for all other factors that could have caused the change. This requires the random assignment of educators and students to experimental and control groups: The experimental group would take part in the professional development activity while the control group did not. Comparable measures would then be gathered from each and the differences tested.
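As a rough illustration of the experimental design just described, the following Python sketch randomly splits a roster into experimental and control groups and computes the difference between the two groups' mean scores. The roster, the function names, and the seed are hypothetical and are not drawn from the article. The sketch reports only the raw difference; a formal significance test would be a further step.

```python
import random
from statistics import mean


def randomly_assign(people, seed=None):
    """Shuffle a copy of the roster and split it into two groups.

    With an odd number of people, the second group gets the extra person.
    """
    rng = random.Random(seed)  # a fixed seed makes the split repeatable
    shuffled = list(people)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_difference(experimental_scores, control_scores):
    """Mean score of the experimental group minus that of the control group."""
    return mean(experimental_scores) - mean(control_scores)


# Hypothetical roster, invented for illustration only
teachers = [f"teacher_{i}" for i in range(1, 9)]
experimental, control = randomly_assign(teachers, seed=42)
print("Experimental group:", experimental)
print("Control group:", control)
```

Once comparable measures have been gathered from both groups, `mean_difference` gives the quantity whose size and reliability would then be tested.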
The problem, of course, is that nearly all professional development takes place in real-world settings where such experimental conditions can't be created. The relationship between professional development and improvements in student learning in these real-world settings is far too complex, and there are too many intervening variables to allow for simple causal inferences (Guskey, 1997; Guskey & Sparks, 1996). What's more, most schools are engaged in systemic reform initiatives that involve the simultaneous implementation of multiple innovations (Fullan, 1992). Isolating the effects of a single program or activity under such conditions is usually impossible.
But despite the absence of this kind of proof, you can collect awfully good evidence about whether or not professional development is contributing to specific gains in student learning. Setting up meaningful comparison groups and using appropriate pre- and post-measures, for example, provides extremely valuable information. Time-series designs, which include multiple measures collected before and after implementation, are another useful alternative. Above all, you must be sure to gather evidence on measures that are meaningful to stakeholders in the evaluation process. Evidence is what most people want anyway. Superintendents and board members rarely ask, "Can you prove it?" What they ask for is evidence.
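The comparison-group design with pre- and post-measures mentioned above comes down to simple arithmetic, sketched here in Python. All scores and names are hypothetical, invented only for illustration. The result is evidence rather than proof, because the groups are not randomly assigned and other factors may differ between them.

```python
from statistics import mean


def mean_gain(pre, post):
    """Average post-minus-pre change for paired scores from one group."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain the same number of scores")
    return mean(after - before for before, after in zip(pre, post))


def gain_difference(part_pre, part_post, comp_pre, comp_post):
    """Mean gain of the participating group minus that of the comparison group."""
    return mean_gain(part_pre, part_post) - mean_gain(comp_pre, comp_post)


# Hypothetical class-average scores, invented for illustration only
participants_pre = [60, 65, 70, 55]
participants_post = [70, 72, 78, 64]
comparison_pre = [61, 66, 69, 56]
comparison_post = [64, 68, 72, 60]

diff = gain_difference(
    participants_pre, participants_post, comparison_pre, comparison_post
)
print(f"Participants gained {diff} more points than the comparison group")
```

In this invented example the participating classes gain 8.5 points on average and the comparison classes gain 3, so the difference in gains is 5.5 points. A time-series design would extend the same idea by collecting several measures before and after implementation instead of one of each.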
Sources and types of evidence
Consider, for example, the use of anecdotes and testimonials. From a methodological perspective, they're a poor source of data, typically biased and highly subjective. They may be inconsistent and unreliable. Nevertheless, they can be powerful and convincing. And as any trial attorney will tell you, they offer the kind of evidence that most people believe. Although it would be imprudent to base your entire evaluation on anecdotes and testimonials, they are an important source of evidence that should never be ignored.
Keep in mind, too, that good evidence is not that hard to come by if you know what you're looking for before you begin. If you do a good job of clarifying your goals up front, most evaluation issues pretty much fall into line. The reason many educators think that higher levels of evaluation, such as those seeking to measure teachers' use of new information and student outcomes, are so difficult, expensive, and time-consuming is that they're coming in after the fact to search for results. It's as if they're saying, "We don't know what we're doing or why we're doing it, but let's find out if anything happened" (Gordon, 1991). If you don't know where you're going, it's very difficult to tell if you've arrived.
When it comes to evidence versus proof, the message is this: Always seek proof, but gather a lot of evidence along the way. Because of the nature of most professional development efforts, your evidence may be more exploratory than confirmatory. Still, it can offer important indications about whether you are heading in the right direction or whether you need to go back to the drawing board.
TRG