A tool for transdisciplinary research planning and evaluation

By Brian Belcher, Rachel Claus, Rachel Davel, Stephanie Jones and Daniela Pinto


What are the characteristics of high-quality transdisciplinary research? As research approaches increasingly cross disciplinary boundaries and engage stakeholders in the research process to more effectively address complex problems, traditional academic research assessment criteria are insufficient and may even constrain the development and use of transdisciplinary research. There is a need for appropriate principles and criteria to guide transdisciplinary research practice and evaluation.

In response to this need, Belcher et al. (2016) developed the Transdisciplinary Research Quality Assessment Framework based on a systematic review of literature that discussed the definition and measurement of research quality for inter- and transdisciplinary research. Through applied testing in case study evaluations, we have revised and refined the framework to improve its utility.

Principles and Criteria

The tool provides guiding principles and corresponding criteria for transdisciplinary research design and implementation. Four principles comprise the Quality Assessment Framework:

  1. Relevance, which refers to the appropriateness of the problem positioning, objectives, and research approach for intended users;
  2. Credibility, which pertains to the rigour of the design and research process needed to produce dependable and defensible conclusions;
  3. Legitimacy, which refers to the perceived fairness and representativeness of the research process; and
  4. Positioning for use, which assesses the degree to which research is likely to be taken up and used.

The criteria provide a checklist for planning and a scoring system for evaluation (see the figure below for the list of criteria).

Revised principles and criteria of the Transdisciplinary Research Quality Assessment Framework. (Source: the authors)

Using the Tool

The Quality Assessment Framework was designed to be applied at any stage of a research intervention (e.g., project, program). For planning, the Quality Assessment Framework can help support comprehensive thinking on how to design and implement a research project to (co)produce knowledge that will be useful and used, and that will contribute to a process of change. The criteria and associated guidance:

  • encourage a holistic and in-depth understanding of the problem context;
  • facilitate the integration of diverse perspectives and opportunities for a genuine collaborative research process, while ensuring transparency and ethical conduct;
  • identify opportunities to build knowledge, skills, and relationships, and to influence the attitudes and behaviours of stakeholders of the project or program; and
  • tailor outputs for uptake and use in the respective research/program context and beyond.

For monitoring and adaptive management, the criteria guide reflection on progress and help identify gaps and opportunities to make adjustments. For evaluation, the framework provides a comprehensive assessment of a research project or program with respect to its purpose. The criteria help guide research projects aiming for impact, highlighting critical elements of successful research design and implementation. This can help evaluators identify strengths and weaknesses in research design and implementation for future learning. It also provides a systematic way to compare projects.

The Quality Assessment Framework was designed for multiple users, including research funders and research managers assessing proposals; researchers designing and planning a research project; and research evaluators like ourselves who apply the tool to learn lessons about effective research practice.

In evaluation, each criterion should be assessed considering the purpose of the project. Document review, interviews, and survey data can serve as evidence to inform the scoring process. Criteria are scored using a three-point Likert scale:

  • 2 = criterion is fully satisfied
  • 1 = criterion is partially satisfied
  • 0 = criterion is not at all satisfied

The scores of one or multiple projects can be mapped in a diagram, as shown in the figure below for three projects (A, B, and C). Please note that in this figure, the original principles and criteria were used – these have since been revised, as shown in the figure above.

Example of Transdisciplinary Research Quality Assessment Framework ‘spidergrams’ for three projects (A, B, C) using the original principles and criteria (subsequently modified). Figure adapted from Belcher et al. (2021).
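To illustrate the mechanics behind such spidergrams, the scoring and averaging can be sketched in a few lines of code. The criterion names, principle groupings, and scores below are hypothetical placeholders for illustration only, not data from any actual evaluation:

```python
from statistics import mean

# Hypothetical evaluator scores on the 0/1/2 Likert scale, per criterion.
scores = {
    "clearly defined problem": [2, 2, 1],
    "appropriate methods": [1, 2, 2],
    "genuine collaboration": [1, 1, 1],
    "practical application": [0, 1, 0],
}

# Hypothetical mapping of criteria to the four principles.
principle_of = {
    "clearly defined problem": "relevance",
    "appropriate methods": "credibility",
    "genuine collaboration": "legitimacy",
    "practical application": "positioning for use",
}

# Average individual evaluators' scores per criterion.
criterion_avg = {c: mean(s) for c, s in scores.items()}

# Roll criterion averages up to principle level for plotting on a spidergram.
by_principle = {}
for crit, avg in criterion_avg.items():
    by_principle.setdefault(principle_of[crit], []).append(avg)
principle_avg = {p: mean(avgs) for p, avgs in by_principle.items()}
```

The resulting principle-level averages are what each axis of a spidergram would display, making it easy to compare the profiles of several projects at a glance.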

Contributions Welcome!

We encourage broader testing and application of the Quality Assessment Framework in various transdisciplinary contexts. We welcome suggestions from the transdisciplinary research community to further improve the framework to comprehensively capture the unique characteristics of transdisciplinary research. What other criteria do you think would be useful to integrate?

To find out more:

Sustainability Research Effectiveness (Online): https://researcheffectiveness.ca/

For early access to our two pre-recorded contributions for the 2021 International Transdisciplinarity Conference, see 1) the Integration and Implementation Sciences (i2S) YouTube channel at: https://youtu.be/YdY8UadvERc and 2) our Prezi at https://prezi.com/p/52ap5yagsezm/qaf/. The conference will be held online from September 13–17, 2021. (Online): https://transdisciplinarity.ch/de/veranstaltungen/itd-conferences/itd-conference-2021/.

Belcher, B. M., Rasmussen, K. E., Kemshaw, M. R. and Zornes, D. A. (2016). Defining and assessing research quality in a transdisciplinary context. Research Evaluation, 25, 1: 1-17. (Online – open access): https://doi.org/10.1093/reseval/rvv025

Belcher, B. M., Claus, R., Davel, R. and Jones, S. M. (2021). Evaluating and improving the contributions of university research to social innovation. Social Enterprise Journal. Online (open access): https://doi.org/10.1108/SEJ-10-2020-0099


Brian Belcher PhD is the Ashoka Chair in Research Effectiveness and professor in the College of Interdisciplinary Studies at Royal Roads University in Victoria, Canada. He leads the Sustainability Research Effectiveness Program, developing theory, methodology, and methods for evaluating research in complex transdisciplinary contexts. This work helps to demonstrate the societal value and impact of research and to learn lessons for improving future research. He is also a Senior Associate Scientist with the Centre for International Forestry Research and the Consortium Research Program on Forests, Trees and Agroforestry.

Rachel Claus MSc is a research assistant with the Sustainability Research Effectiveness Program at Royal Roads University in Victoria, Canada. Her research expertise and interests are in research design for impact and theory-based research evaluation methods. She has five years of experience developing and applying theories of change to projects and programs to support monitoring and evaluation strategies to optimize effectiveness.

Rachel Davel MDev is a research assistant with the Sustainability Research Effectiveness Program at Royal Roads University in Victoria, Canada. With four years of experience in theory-based research evaluation methods, her current work focuses on impact assessment and research effectiveness to understand how research and development projects contribute to societal change.

Stephanie Jones MSc is a consultant with the Center for International Forestry Research, a non-profit research organisation headquartered in Bogor, Indonesia, working on a program-level impact evaluation of the Forests, Trees and Agroforestry research program. She is based in Vancouver, Canada. She is also an ongoing collaborator with the Sustainability Research Effectiveness Program at Royal Roads University in Victoria, Canada.

Daniela Pinto MA is a research assistant with the Sustainability Research Effectiveness Program at Royal Roads University in Victoria, Canada. She has more than 15 years of experience monitoring multi-country, multi-stakeholder projects and programs focused on human rights, gender equality, and environmental protection.

7 thoughts on “A tool for transdisciplinary research planning and evaluation”

  1. It is interesting to note how much overlap there is between the four principles and the four standard groups that structure 30 criteria for scientific evaluations.

    TDR – Evaluation standards

    Positioning for Use near to Utility
    Legitimacy near to Propriety
    Credibility near to Accuracy
    Relevance – [missing, as the client confirms relevance]
    [Missing, why?] Feasibility
    [Missing] Evaluation Accountability Standards [specific to second-order observation]

    In Europe, the four JCSEE standards groups, originally ‘born in education’, have been transferred to all policy fields. http://www.degeval.de/fileadmin/user_upload/Sonstiges/STANDARDS_2008-12_kurz_engl.pdf

    • Thank you for your comment, Wolfgang, and for sharing these two sets of program evaluation standards. There do seem to be several parallels between the principles and criteria used to design and assess the quality of evaluations and those for transdisciplinary research, though there are also those which seem unique to each (probably down to differences in purpose, scale, and context of the evaluand). Many criteria seem to be organized differently; taking the JCSEE/DeGEval feasibility standards as an example, some resemble other criteria in the Quality Assessment Framework’s credibility principle (e.g., ‘feasible research project’, ‘ongoing monitoring and reflexivity’). The utility standards seem to overlap with criteria in the Quality Assessment Framework’s relevance, legitimacy, and positioning for use principles.

      Are there any specific criteria from the JCSEE or DeGEval standards not currently reflected in the Quality Assessment Framework that you think are relevant for assessing the quality of transdisciplinary research? Likewise, are there any quality criteria from the Quality Assessment Framework that may be useful to inform program evaluation standards and practice?

      As for your point about Evaluation Accountability Standards, are you familiar with any evaluation standards that exist specifically to assess an evaluation tool? We would be interested to learn more from what you can share on this.

  2. This looks like an exciting approach worthy of testing in a variety of contexts. I wonder whether you have faced differences among researchers / stakeholders in assigning the rating on the three-level Likert scale? For instance, under Credibility, the extent to which the findings are generalizable / transferable may be open to different interpretations among the research team members. A second challenge could be the difficulty of ascertaining Use, given that oftentimes cutting-edge research may have limited initial acceptance by potential research users, who have first to become sensitized / informed on the topic. I will be interested in hearing about such practicalities from your experience.

    • Great questions, Ricardo! In our experiences applying the criteria (i.e., predominantly for ex-post evaluation) – and the scoring approach which we recommend – we have multiple evaluators (four in our case) individually review evidence to inform the scoring process (e.g., project documents, external documents about the project, researcher interviews, project stakeholder interviews, etc.), record their individual scores and justifications, and then we meet as a group to discuss scores, justifications, and evidence. As part of this process, we continuously reflect on the assessment of each criterion against the purpose of the project. In some instances, these discussions have uncovered misinterpretations and enabled the evaluators to reassess with a shared understanding, or adjust initial scores based on convincing justifications or new evidence (other times, evaluators may choose to stick with their original scores). This is also why we take an average of the evaluators’ scores, which can help to illustrate whether a partially satisfied criterion leans stronger or weaker. Non-consensus between evaluators would only be concerning if some evaluators assign a 0 and others a 2, indicating drastically different interpretations and/or evidence.

      As part of the revisions made to the Quality Assessment Framework, one aspect that we focused on was to better clarify the definitions and provide guidance for elements to consider for each criterion by ensuring they are sufficiently descriptive. We did this to help build common understanding of the criteria amongst evaluators and reduce the potential for different interpretations. You raise an important point with regard to ascertaining use; we have had case studies of projects bridging the science-policy divide and these processes take time to influence. As you can see in the revised definitions and guidance for the criterion ‘practical application’ (https://bit.ly/QAF2definitions), we tried to account for this situation by adding framing for potential future use and indications of “system actors[‘ …] intentions to use or apply the research” that have not yet happened but are likely to (e.g., because it is too early in the sensitization process to have evidence of uptake).

      Our team is keen to learn from others’ testing of the Quality Assessment Framework in new contexts, particularly what aspects of the tool are more challenging to apply, what criteria may be missing, and where improvements for clarity can be made.
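The consensus check described in the reply above (flagging criteria where one evaluator assigns a 0 and another a 2) could be sketched as follows. The criterion names and scores are hypothetical, purely for illustration:

```python
# Flag criteria where evaluators' scores diverge drastically (a 0 and a 2),
# which the authors note would signal very different interpretations or
# evidence bases. Criterion names and scores below are hypothetical.
def flag_divergent(scores_by_criterion):
    """Return criteria whose evaluator scores span the full 0-2 range."""
    return [crit for crit, s in scores_by_criterion.items()
            if min(s) == 0 and max(s) == 2]

example = {
    "effective communication": [2, 2, 1, 2],   # broad agreement
    "practical application": [0, 2, 1, 0],     # drastic disagreement
}
print(flag_divergent(example))  # prints ['practical application']
```

Criteria flagged this way would be natural candidates for the group discussion of scores, justifications, and evidence that the reply describes.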

  3. It is great to see the four criteria being developed and specified, especially the one now called “positioning for use”. I also strongly support the general comment that “each criterion should be evaluated in light of the purpose of the project.” What I would like to know more about is Project A. It seems to be the perfect transdisciplinary project. Until now, I thought that the perfect project does not exist because there are always trade-offs needed between the diverse goals of transdisciplinary research. It seems I was wrong.

    • Thank you, Christian! Indeed, we felt it necessary to expand upon the criteria under the ‘Positioning for Use’ principle (formerly ‘Effectiveness’) to interrogate more closely the ways in which the research process and research outputs stimulate the uptake and use of knowledge as well as outcomes (i.e., changes in knowledge, attitudes, skills, and/or relationships that manifest as changes in behaviour). We also saw potential for misinterpretation of the spidergrams if a project scored on average lower in the first three principles but scored stronger in the ‘Effectiveness’ principle, leading to (misguided) conclusions that less relevant, credible, and legitimate research can still be effective.

      Project A was a highly transdisciplinary research project and is a valuable learning case study, not only in how it was designed and implemented in an extremely sensitive context, but also in terms of what it set out to do and was able to accomplish. You can learn more about Project A at https://bit.ly/ProjectAevaluation. We agree that ‘perfect’ scoring is unusual; interestingly, the principal investigator of Project A shared that there were aspects of the project that they would do differently in hindsight (so maybe not ‘perfect’). Yet, it is clear that the principal investigator fully instilled transdisciplinarity throughout Project A’s approach. To emphasize, the Quality Assessment Framework does not intend to judge projects’ excellence, but is to be used to assess the extent to which a project fulfills transdisciplinary characteristics. In many of our case studies, the projects were not intentionally designed as transdisciplinary projects and it would be unfair to evaluate them against that standard (which connects to your second point). It would be an interesting exercise for our team to rescore Project A using the revised set of criteria to see whether the newly introduced or revised criteria continue to be fully satisfied.

    • Thank you for your comment, Christian. I think you can safely continue to assume that there is no perfect project. The universally high scores on Project A are, in part, an artefact of the 3-point scoring system we used. In the next round we will test a 4-point system for more sensitivity. Also, the TDR QAF (Transdisciplinary Research Quality Assessment Framework) is intended to be used to guide and to evaluate TDR design and implementation. In this application we used it more as a checklist of TDR characteristics, as a basis to compare projects (which were not explicitly designed as TDR projects) and to assess whether projects that used more TDR elements had more or different kinds of outcomes.

