Scatterplots as an interdisciplinary communication tool

Community member post by Erin Walsh

Erin Walsh (biography)

Scatterplots are used in many disciplines, which makes them useful for communicating across disciplines. They are also common in newspapers, online media and elsewhere as a tool to communicate research results to stakeholders, ranging from policy makers to the general public. What makes a good scatterplot? Why do scatterplots work? What do you need to watch out for in using scatterplots to communicate across disciplines and to stakeholders?

What makes a good scatterplot?

In his 1983 magnum opus, The Visual Display of Quantitative Information, statistician Edward Tufte outlined nine principles of excellence and integrity in data visualisation:

  1. Show the data
  2. Induce the viewer to think about the substance rather than about methodology, graphic design, the technology of the graphic production or something else
  3. Avoid distorting what the data have to say
  4. Present many numbers in a small space
  5. Make large datasets coherent
  6. Encourage the eye to compare different pieces of data
  7. Reveal the data at several levels of detail, from a broad overview to a fine structure
  8. Serve a reasonably clear purpose: description, exploration, tabulation or decoration
  9. Be closely integrated with the statistical and verbal descriptions of a dataset.

Noting “Graphics reveal data” (1983: 13), Tufte presented the classic case of Anscombe’s Quartet (Anscombe 1973) as an example of successful application of these principles. X, Y, and the relationship between X and Y in Anscombe’s four datasets are numerically indistinguishable (sharing a mean, variance, and correlation). Viewed as pure numbers, it is difficult to see any difference between the sets:

(Data generated by Erin Walsh in accordance with Anscombe’s Quartet (Anscombe 1973))

Striking differences become immediately obvious once the four datasets are displayed as scatterplots.

(Source: Erin Walsh)

This demonstrates the importance of data visualisation in a broad sense, and more specifically shows the power of the commonplace scatterplot.
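
For readers who want to check the numbers themselves, here is a minimal sketch in Python using only the standard library. It assumes the values published in Anscombe (1973), which are not necessarily the values in the table generated for this post. The names `QUARTET`, `summarise` and `ascii_scatter` are illustrative only. The sketch computes the shared summary statistics and then draws a crude text scatterplot of each set on a common grid.

```python
from statistics import mean, variance

# Values as published in Anscombe (1973); sets I-III share the same x values.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cov / (ss_x * ss_y) ** 0.5


def summarise(xs, ys):
    """Return the statistics that are near-identical across the quartet."""
    return {
        "mean_x": mean(xs),
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "mean_y": mean(ys),
        "var_y": variance(ys),
        "r": pearson_r(xs, ys),
    }


def ascii_scatter(xs, ys, width=40, height=12, x_range=(0, 20), y_range=(0, 14)):
    """Draw a crude text scatterplot on a fixed grid so the panels are comparable."""
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_range[0]) / (x_range[1] - x_range[0]) * (width - 1))
        row = round((y - y_range[0]) / (y_range[1] - y_range[0]) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width


if __name__ == "__main__":
    for name, (xs, ys) in QUARTET.items():
        stats = summarise(xs, ys)
        print(f"Set {name}: " + ", ".join(f"{k}={v:.2f}" for k, v in stats.items()))
        print(ascii_scatter(xs, ys))
        print()
```

The printed statistics agree to about two decimal places across the four sets, while the text plots look nothing alike, which is the point of the quartet.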

Emerging late in the nineteenth century, scatterplots are ubiquitous in the modern data visualisation landscape. Whether a simple monochrome display with two axes, or enhanced through colour, interactivity, motion, or the addition of a third dimension, scatterplots are in widespread use.

Why do scatterplots work?

So, what makes scatterplots so versatile? Scatterplots are remarkably accessible because their interpretation leverages the universal human capacity for pattern recognition. Apophenia is the unprompted awareness of connections and meaningfulness of phenomena.

Such heuristics are evolutionarily vital for making sense of ever-changing complex visual input that may represent important predator, prey or social interaction information. A more subtle, but equally pervasive, example of apophenia is the tendency to connect points to find lines, trends, and patterns. Scatterplots convey perceptually simple information, points within a field, which is straightforward to encapsulate neutrally and perceptually. The combination of perceptual simplicity and the bootstrapping of apophenic tendencies provides what appears, even to lay viewers, as conceptual simplicity and straightforward meaning extraction. This underlies the scatterplot’s appeal for conveying knowledge within, across and beyond disciplinary boundaries.

What do you need to watch out for in using scatterplots to communicate across disciplines and to stakeholders?

  • For cross-disciplinary communication:
    • Be aware of differences in conventions that underpin the data or topic (e.g., in chemistry beta means something very different from beta in psychology).
  • In the context of a single plot:
    • Try to always keep Tufte’s principles of excellence and integrity in data visualisation in mind.
    • Give yourself time to properly generate the plot (too many people leave it to the last minute).
    • Honest mistakes:
      • Too much data/overcrowding points.
      • Trying to say too much at once (multiple groups denoted by size and shape and colour…).
      • Too little (poor axis labels) or too much (caption takes more space than the figure) context.
    • Signs of nefarious intent:
      • Truncated axes without disclosure.
      • Aspect ratio distorted to exaggerate trends.
      • Plotting things which don’t make sense.
  • In the context of the larger communication, if multiple plots:
    • Use a consistent aesthetic across plots (so the eye focuses on meaning, not wondering why the fonts on the axes are different, or the colour scheme has changed).
    • Don’t use too many plots (only important things need a figure; nobody will properly read 10+).
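
Some of the items above can be turned into a rough pre-publication check. The sketch below is one hypothetical way to do that in Python. `PlotSpec`, `review` and both thresholds are assumptions made for illustration and are not tied to any plotting library. "Truncated" is taken here to mean axis limits that exclude some of the data. Judgement calls, such as whether a plot makes sense at all, are left to the author.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PlotSpec:
    """Hypothetical description of a scatterplot, independent of any plotting library."""
    xs: List[float]
    ys: List[float]
    x_label: str = ""
    y_label: str = ""
    x_limits: Optional[Tuple[float, float]] = None  # None means automatic limits
    y_limits: Optional[Tuple[float, float]] = None
    encodings: List[str] = field(default_factory=list)  # e.g. ["colour", "shape"]
    limits_disclosed: bool = False  # True if the caption mentions the restricted axes


MAX_POINTS = 2000   # assumed overcrowding threshold; tune for your figure size
MAX_ENCODINGS = 2   # assumed limit on simultaneous group encodings


def _clips_data(values, limits):
    """True if explicit axis limits leave at least one data point off the plot."""
    return limits is not None and any(v < limits[0] or v > limits[1] for v in values)


def review(spec):
    """Return a list of warnings based on the checklist; an empty list means none found."""
    warnings = []
    if len(spec.xs) > MAX_POINTS:
        warnings.append("Overcrowded: consider sampling, binning or transparency.")
    if len(spec.encodings) > MAX_ENCODINGS:
        warnings.append("Too many encodings: the plot may be trying to say too much.")
    if not spec.x_label.strip() or not spec.y_label.strip():
        warnings.append("Missing axis label: too little context.")
    clipped = _clips_data(spec.xs, spec.x_limits) or _clips_data(spec.ys, spec.y_limits)
    if clipped and not spec.limits_disclosed:
        warnings.append("Axis limits exclude data without disclosure.")
    return warnings


if __name__ == "__main__":
    draft = PlotSpec(
        xs=[1, 2, 3],
        ys=[2, 4, 60],
        y_limits=(0, 10),
        encodings=["colour", "shape", "size"],
    )
    for message in review(draft):
        print("-", message)
```

A check like this only catches mechanical slips. It is no substitute for keeping Tufte’s principles in mind while designing the figure.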

When have you found scatterplots helpful for either obtaining or sharing knowledge? Are there circumstances where they got in the way of information exchange?

References:
Anscombe, F. J. (1973). Graphs in Statistical Analysis. The American Statistician, 27: 17-21.

Tufte, E. (1983). The Visual Display of Quantitative Information. Graphics Press: Connecticut, United States of America.

Biography: Erin Walsh PhD is a postdoctoral fellow at the Centre for Research on Ageing, Health and Wellbeing, Research School of Population Health, The Australian National University in Canberra, Australia. She is also a freelance scientific illustrator with over ten years of experience converting scientific ideas, data, and excitement into visual form. Her primary research interest is the impact of blood glucose on the ageing brain, which she investigates with an eclectic cross-disciplinary range of concepts and statistical techniques, spanning the fields of animal biology, psychology, geography, computer science and population health.

Erin Walsh is a member of blog partner PopHealthXchange, which is in the Research School of Population Health at The Australian National University.

Structure matters: Real-world laboratories as a new type of large-scale research infrastructure

Community member post by Franziska Stelzer, Uwe Schneidewind, Karoline Augenstein and Matthias Wanner

What are real-world laboratories? How can we best grasp their transformative potential and their relationship to transdisciplinary projects and processes? Real-world laboratories are about more than knowledge integration and temporary interventions. They establish spaces for transformation and reflexive learning and are therefore best thought of as large-scale research infrastructure. How can we best get a handle on the structural dimensions of real-world laboratories?

What are real-world laboratories?

Real-world laboratories are a targeted set-up of a research “infrastructure” or a “space” in which scientific actors and actors from civil society cooperate in the joint production of knowledge in order to support a more sustainable development of society.

Although such a laboratory establishes a structure, most discussions about real-world laboratories focus on processes of co-design, co-production and co-evaluation of knowledge, as shown in the figure below. Surprisingly, the structural dimension has received little attention in the growing field of literature.

Overcoming structure as the blind spot

We want to raise awareness of the importance of the structural dimension of real-world laboratories, including physical infrastructure as well as interpretative schemes or social norms, as also shown in the figure below. A real-world laboratory can be understood as a structure for nurturing niche development, or a space for experimentation that interacts with (and aims at changing) structural conditions at the regime level.

Apart from this theoretical perspective, we want to add a concrete “infrastructural” perspective, as well as a reflexive note on the role of science and researchers. Giddens’ use of the term ‘structure’ helps to emphasize that scientific activity is always based on rules (e.g., rules of proper research and use of methods in different disciplines) and resources (e.g., funding, laboratories, libraries).

The two key challenges of real-world laboratories are that:

  1. both scientists and civil society actors are involved in the process of knowledge production; and,
  2. knowledge production takes place in real-world environments instead of scientific laboratories.

Franziska Stelzer (biography)

Uwe Schneidewind (biography)

Karoline Augenstein (biography)

Matthias Wanner (biography)

Sharing integrated modelling practices – Part 2: How to use “patterns”?

Community member post by Sondoss Elsawah and Joseph Guillaume

Sondoss Elsawah (biography)

In part 1 of our blog posts, on why to use patterns, we argued for making unstated, tacit knowledge about integrated modelling practices explicit by identifying patterns, which link solutions to specific problems and their context. We emphasised the importance of differentiating between the underlying concept of a pattern and a pattern artefact – the specific form in which the pattern is explicitly described.

Sharing integrated modelling practices – Part 1: Why use “patterns”?

Community member post by Sondoss Elsawah and Joseph Guillaume

Sondoss Elsawah (biography)

How can modellers share the tacit knowledge that accumulates over years of practice?

In this blog post we introduce the concept of patterns and make the case for why patterns are a good candidate for transmitting the ‘know-how’ knowledge about modelling practices. We address the question of how to use patterns in a second blog post.

Looking for patterns: An approach for tackling tough problems

Community member post by Scott D. Peckham

Scott D. Peckham (biography)

What does the word ‘pattern’ mean to you? And how do you use patterns in addressing complex problems?

Patterns are repetitions. These can be in space, such as patterns in textiles and wallpaper, which include houndstooth, herringbone, paisley, plaid, argyle, checkered, striped and polka-dotted.

The pattern concept can also be applied to repetitions in time, as occur in music. Those who know the temporal patterns can classify a piece of music as a blues, waltz or salsa. For each of these types of music, there are also classic dance steps that usually go by the same name; these are patterns of movement in space and time.

These examples get to the idea that patterns can be viewed more generally as any type of repetitive structure or recurring theme that we can look for and potentially recognize or discover and then assign a memorable name to, such as “houndstooth” or “waltz”. Recognizing the pattern may then indicate a particular course of action, such as “perform dance moves that go with a waltz”.

The ability to recognize a pattern and then take appropriate action is something that we associate with intelligence.