By Wolfgang Beywl and Amy Gullickson
Efforts to develop evaluation in transdisciplinary research have mostly been conducted without reference to the evaluation literature, effectively re-inventing and re-discussing key concepts. What do transdisciplinary researchers need to know to build on the in-depth knowledge available in evaluation science?
Here we add to other key contributions about evaluation in i2Insights, especially:
- Belcher and colleagues, who provide a tool for evaluating transdisciplinary research
- Nagy and Schäfer, who describe how to systematically design transdisciplinary project evaluation
- Meagher and Edwards, who provide a framework to evaluate the impacts of research on policy and practice
- Louder and colleagues, who discuss ways of choosing a framework to assess research impact.
We focus specifically on the origins and the current state of the evaluation field.
Evaluation science origins
Evaluation science has evolved over five generations, starting in the mid-19th century (Stufflebeam and Coryn, 2014; Alkin, 2022).
The first generation of modern evaluation (“measurement”) favoured methodological innovations (statistics, surveys) and technological innovations such as performance measurement in schools. Although also inspired by other disciplines (for example, agronomy, with its rigorous experiments in plant cultivation), evaluation was primarily undertaken and developed in the context of education.
The second generation (“analysis”) commenced about 1950. The goals of (curricular) programs were defined precisely, and the interventions assigned to them – together with, where possible, research-based assumptions about their mechanisms of impact – were critically analysed. Textual-visual models of program logic emerged and became a core component of evaluation (e.g., the Context, Input, Process and Product (CIPP) evaluation model: Stufflebeam, 1969; Stufflebeam and Zhang, 2017). Data collection, analysis, and interpretation became increasingly systematic and rule-based within the framework of a critical-rationalist social science methodology.
The third generation (“valuation”), from the end of the 1960s, elaborated criteria as reference points for the valuation processes that are constitutive of evaluation. The goals set by policy makers or program directors were thus questioned by evaluators, alternative values were brought into play, and the dependence of the evaluation process on social and cultural (power) constellations was made explicit. It was increasingly doubted that evaluation could be politically and socially neutral.
Since the end of the 1970s, the fourth generation (“negotiation”) has focused on brokering evaluation criteria among the stakeholders who have interests in the object to be evaluated (the evaluand) (Guba and Lincoln, 1989). The claim: evaluation should provide information to as many legitimate stakeholders as possible that is potentially useful to them and actually used by them. Evaluation itself became the object of systematic description and valuation within the framework of meta-evaluation. The central reference document for this was the “Program Evaluation Standards”, first published as a book in 1981 (now in its third revised edition: Joint Committee, 2011). In many countries these evaluation standards, originally grounded in the field of education, came to be regarded as the criteria to use for evaluations across broad policy domains.
The fifth generation of evaluation (“engagement”) emerged in the 2000s. With globalization and the awareness promoted by natural science research, not only the values and interests of current stakeholders but also those of future generations came to the fore (Gullickson and Hannum, 2019; Roorda and Gullickson, 2019). The economically and technologically developed industrial nations were influencing living conditions all over the world, and an understanding of Gaia as a complex, scarcely predictable system emerged. The key question for evaluation science now is what role it should take in view of this urgency and unpredictability (Better Evaluation, 2022; Patton, 2019; https://bluemarbleeval.org; Uitto, 2019).
Evaluation science now
As in any emerging scientific community, there is no single authoritative definition of evaluation. Through innumerable textbooks, journal articles and debates, the field has accumulated the following definitional elements, with increasingly established technical terms:
- Evaluation can be defined as a scientific endeavour and professional service that describes and valuates evaluands (i.e., programmes, projects, measures, policies, etc.) in a reasonably exhaustive way.
- It is guided by purposes (e.g., improvement, fundamental decisions, accountability) and evaluation questions (e.g., how well did this programme increase equity for marginalised populations?), which are clarified collaboratively by clients and stakeholders.
- The achievement of the prioritised evaluation purposes, and the answers to the evaluation questions, should be reflected in stakeholder utilisation – an important prerequisite for generating the intended influence on both the evaluand and wider social, economic, and natural systems.
- To obtain reliable information for the descriptive task, evaluation uses a wide range of empirical methods, especially from the social sciences: qualitative, quantitative, and mixed.
- For the valuation task – which is unique to evaluation and comprises the determination of context-independent merit, situationally bound worth, and socially attributed significance – criteria (and threshold points, if applicable) are clarified during the evaluation process, and transparent valuation procedures are likewise carried out in a methodical fashion (for more, see Gullickson, 2020; Balzer et al., 2020).
Evaluation and transdisciplinary research in future
Given current trends in evaluation science, we expect that it could advance the efforts of transdisciplinary research to address the wicked questions and challenges of the Anthropocene by providing tools that help focus attention on values and stakeholders. What do you think? Are any of the resources we cited herein useful for your practice? Where and how do you think evaluation research could support transdisciplinary research? Are there other aspects of the history and current state of evaluation science that transdisciplinary researchers should be aware of? Do you have a favourite evaluation tool or resource to share?
Alkin, M. C. (2022, in press). Evaluation roots. 3rd Edition, Sage: Los Angeles, United States of America.
Balzer, L., Laupper, E., Eicher, V. and Beywl, W. (2020). The key to evaluation. 10 steps – ‘evaluiert’ in brief. Swiss Federal Institute for Vocational Education and Training, SFUVET: Zollikofen, Switzerland. (Online – open access): https://www.sfuvet.swiss/evaluiert
Better Evaluation. (2022). Footprint Evaluation. (Online): https://www.betterevaluation.org/en/themes/footprint_evaluation
Guba, E. G. and Lincoln, Y. S. (1989). Fourth generation evaluation. Sage: Newbury Park, United States of America.
Gullickson, A. M. (2020). The whole elephant: Defining evaluation. Evaluation and Program Planning, 79. (Online) (DOI): https://doi.org/10.1016/j.evalprogplan.2020.101787
Gullickson, A. M. and Hannum, K. M. (2019). Making values explicit in evaluation practice. Evaluation Journal of Australasia, 19, 4: 162–178. (Online) (DOI): https://doi.org/10.1177/1035719X19893892
Joint Committee on Standards for Educational Evaluation. (2011). The program evaluation standards. A guide for evaluators and evaluation users. 3rd Edition, Sage: Thousand Oaks, United States of America.
Patton, M. Q. (2019). Blue Marble evaluation. Premises and principles. Guilford Press: New York, United States of America.
Roorda, M. and Gullickson, A. M. (2019). Developing evaluation criteria using an ethical lens. Evaluation Journal of Australasia, 19, 4: 179–194. (Online) (DOI): https://doi.org/10.1177/1035719X19891991
Stufflebeam, D. L. (1969). Evaluation as enlightenment for decision-making. In: W. H. Beatty (ed.), Improving educational assessment and an inventory of measures of affective behavior. The Association for Supervision and Curriculum Development: Washington D.C., United States of America, pp. 41-73.
Stufflebeam, D. L. and Coryn, C. L. S. (2014). Evaluation theory, models, and applications. 2nd Edition, Jossey-Bass: San Francisco, United States of America.
Stufflebeam, D. L. and Zhang, G. (2017). The CIPP Evaluation Model: How to evaluate for improvement and accountability. Guilford Press: New York, United States of America.
Uitto, J. I. (2019). Evaluation for the Anthropocene: Global environmental perspectives. Evaluation Matters—He Take Tō Te Aromatawai, 5. (Online – open access) (DOI): https://doi.org/10.18296/em.0044
Biography: Wolfgang Beywl PhD is a professor at the Institute for Continuing Education at the School of Education, University of Applied Sciences of Northwestern Switzerland in Windisch, and science director of Univation, Institute of Evaluation, Cologne, Germany. He has published several textbooks, chaired the Evaluation Standards committee of the DeGEval-Evaluation Association (Austria and Germany), and developed the longstanding postgraduate evaluation training program at Berne University. His work helps to demonstrate the value and impact of transdisciplinary educational research and evaluation.
Biography: Amy Gullickson PhD is the Director of the Centre for Program Evaluation at the University of Melbourne in Australia, and Chair of the International Society for Evaluation Education. She led the development of the fully online Master's and Graduate Certificate programs in evaluation at the University of Melbourne, and has been deeply engaged with the Australian Evaluation Society in developing and refining its competencies for evaluators. She spends her time conducting evaluations, teaching people to conduct evaluations, teaching people to teach others how to conduct evaluations, and teaching organisations how to integrate evaluation into their day-to-day operations – and doing research on all of the above.