Data Visualization

Yu Chuan Shan
Nov 24, 2020

Learnings and reflections from CMU’s Communication Design Studio in Fall 2020

Week 1

For this project, I’ll be investigating climate data to explore and visualize the sources of greenhouse gas emissions and how each contributor might help reach net-zero carbon emissions.

Analyzing an existing visualization

We started the project by analyzing an existing visualization on the website Information Is Beautiful. Questions to consider include:

  • What data is introduced?
    The visualization presents the information in three panels. The left panel, which occupies half of the space, shows different sources of CO₂ emissions. Each source sits in a card showing the source name, its percentage share of emissions, and the amount it contributes in carbon dioxide equivalent (a unit for measuring the impact of multiple greenhouse gases). The sources are grouped into five categories: energy, buildings, land, industry, and transport. The middle panel lists approaches for each category that can help halve CO₂ emissions by 2030 and how much CO₂ each approach can reduce. The right panel shows the policies that need to be in place within each category to reach net-zero CO₂ emissions by 2050.
  • How would you characterize the steps in the story?
    When I first saw this visualization, my attention was caught by the boxes on the left. As I read the percentages and the text in each box, I realized that each box represents how much CO₂ that source emits. I then moved to the two panels on the right and read in more detail about ways to reduce CO₂. The information is laid out in chronological order from left to right, which I found helpful in understanding the whole story.
  • What relationships emerge from the visualization?
    The most prominent relationship is the color-coding of information groups. Elements with the same color fall under the same category of emission source. There is also a temporal relationship, as the information represents the current state, 10 years in the future, and 30 years in the future.
  • What do you believe the maker wants you to see?
    The main goal is to help the viewer identify the major sources of CO₂ emissions and ways to reduce them. The maker also demonstrates that achieving net-zero emissions is a progressive process that requires continued effort.
  • Why is their stance important? What other relationships might be inherent in the data and of value to highlight?
    This visualization’s emphasis is on identifying and categorizing the sources of CO₂. It invites the viewer to focus on the types of sources rather than on things like the geographic distribution of emissions or comparisons between countries.

Initial project ideation

  • What have you gathered from the readings and class activities to date?
    Traditional quantitative data visualizations can communicate information more effectively by providing an engaging experience that encourages the viewers to actively explore the data rather than passively interpret what’s given. When creating the experience, it is helpful to consider the audiences’ expectations and help them draw the connection to familiar objects so as to reduce cognitive load and foster a better understanding. Furthermore, abstract visual representations can make data more digestible by focusing people’s attention on the key parts.
  • What facets of your data are you considering using in your project and why?
    Because one of my main goals is to help people understand the current state of CO₂ emissions, I will focus on the current level of CO₂ emission by each sector and country.
  • What design research question is guiding your project?
    Who are the largest contributors to global carbon dioxide emissions and how is each country doing in reaching the goal of net zero emission by 2050?
  • What organization methods do you imagine leveraging in the data (LATCH)? What coordinate system(s) do you see emerging as logical and appropriate?
    As I’m interested in what reaching net-zero emissions means for different countries, I will leverage location and hierarchy as the main methods of data organization. The geographic coordinate system will be used to visualize spatial relationships, and the Cartesian coordinate system will be used to show quantitative variations over time.

“Charts and maps are particularly useful for layering multiple data sets on top of each other to show relationships between the sets; in order to be effective, though, the scales and coordinates must be consistent and relative.” — Nathan Shedroff

  • What may serve as a logical sequence for people to move through the content (narrative/indexical/combo)?
    The narrative will be structured based on the change in scales. The audience will first be introduced to the problem of CO₂ emission on a global scale. As the story unfolds, the audience will be able to explore the issue in the selected countries.

Week 2

Constructing the narrative

One thing I want to push myself to do in this project is to find a narrative that I can weave the data visualizations into. A narrative would help me determine the scale and range of the datasets that I need to collect. Building a narrative for data is about asking the right questions, so I started to jot down potential topics and questions around CO₂ emissions.

Ideas for the narrative.

The international debate on carbon reduction responsibilities

In my research on CO₂ emissions, I found that the international community has been debating which countries should take the most responsibility for reducing their carbon footprint. Developed countries such as the US and Australia believe they bear an unfair share of the burden, since developing countries are currently the major emitters of CO₂. Developing countries, however, point out that developed countries have emitted more CO₂ cumulatively over history, and that reducing CO₂ means a change in lifestyle for developed countries while for them it is a matter of survival.

“…under the [The Paris] agreement, China will be able to increase these emissions by a staggering number of years — 13. They can do whatever they want for 13 years. Not us. India makes its participation contingent on receiving billions and billions and billions of dollars in foreign aid from developed countries. There are many other examples. But the bottom line is that the Paris Accord is very unfair, at the highest level, to the United States.”
–Donald Trump, Statement to Withdrawal from the Paris Agreement, 2017

After further investigation into the problem, it occurred to me that the international debate stems from the different ways countries look at the CO₂ emission data. The developed countries focus on current CO₂ emissions, while the developing countries base their arguments on historical records.

This finding made me interested in how different representations of data can lead to different interpretations. So I decided to create multiple visualizations that analyze CO₂ emission data in different ways and explore what information each conveys. The key questions I want to ask are:

  • Which countries emit the most CO₂ at present?
  • Which countries emitted the most CO₂ in total?
  • Which countries emit the most CO₂ per person?
  • What’s the difference between the production-based and the consumption-based model for CO₂ emissions?
  • How much CO₂ has been emitted during different time periods?
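As a minimal sketch of how the first three questions can give different answers from the same data, the snippet below ranks three hypothetical countries by annual, cumulative, and per-capita emissions. The country names and every number are invented for illustration and are not taken from any dataset.

```python
# Toy records: every value here is made up for illustration only.
# annual and cumulative are in Mt of CO2, population is in millions.
countries = [
    {"name": "A", "annual": 9000, "cumulative": 200000, "population": 1400},
    {"name": "B", "annual": 5000, "cumulative": 400000, "population": 330},
    {"name": "C", "annual": 400, "cumulative": 15000, "population": 25},
]


def rank(records, key):
    """Return country names sorted from highest to lowest by key(record)."""
    return [r["name"] for r in sorted(records, key=key, reverse=True)]


by_annual = rank(countries, lambda r: r["annual"])
by_cumulative = rank(countries, lambda r: r["cumulative"])
# Mt divided by millions of people gives tonnes per person.
by_per_capita = rank(countries, lambda r: r["annual"] / r["population"])

print(by_annual)
print(by_cumulative)
print(by_per_capita)
```

With these made-up values each metric puts a different country at the top, which is the kind of disagreement the visualizations are meant to surface.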

Visualizations under consideration

Carbon clock

The Carbon Clock by Mercator Research Institute. (screenshot on 11/24/2020)

Week 3

A framework for visualizing data — a reflection on Chapter 3 of Nathan Yau’s book “Data Points: Visualization That Means Something”

Statistician Nathan Yau provides a systematic framework for data visualization by breaking a visualization down into four components: visual cues, coordinate systems, scales, and context. Each component can be further divided into different representations.

  • Visual Cues
    - Position
    - Length
    - Angle
    - Direction
    - Shape
    - Area and Volume
    - Color
  • Coordinate Systems
    - Cartesian
    - Polar
    - Geographic
  • Scales
    - Numeric
    - Categorical
    - Time
  • Context

As Yau summarized, the visual cues are “the main thing that people see, and the coordinate system and scale provide structure and a sense of space”. The context makes the data more comprehensible and relatable to the audience.

Week 4

Cleaning data

The CO₂ emissions dataset for this project was downloaded from Our World in Data (OWID), a nonprofit scientific online publication that originated from a research project at the University of Oxford. OWID’s CO₂ data include several types of annual emissions for each country and year, such as production-based, consumption-based, and cumulative emissions. Because 2017 is the latest year with data available for nearly all countries, the 2017 data were selected to represent the current situation.

The four variables we will use to compare emissions across countries are:
- annual CO₂ emissions,
- annual consumption-based emissions (CO₂ emissions adjusted for trade),
- per capita emissions,
- cumulative emissions.
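One way to pull a single-year snapshot of these four variables out of a CSV export might look like the sketch below. The column names (`co2`, `consumption_co2`, `co2_per_capita`, `cumulative_co2`) are assumptions modeled on OWID's CO₂ file and should be checked against the real header. The sample rows, including the countries "Xland" and "Yland", are made up.

```python
import csv
import io

# Assumed column names; verify them against the downloaded file's header.
COLUMNS = ["co2", "consumption_co2", "co2_per_capita", "cumulative_co2"]


def snapshot(csv_text, year):
    """Return {country: {column: float or None}} for the given year."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if int(row["year"]) != year:
            continue
        # Keep missing cells as None instead of guessing a value.
        result[row["country"]] = {
            col: float(row[col]) if row[col] else None for col in COLUMNS
        }
    return result


# Tiny made-up sample standing in for the downloaded file.
sample = """country,year,co2,consumption_co2,co2_per_capita,cumulative_co2
Xland,2016,10.0,12.0,2.0,100.0
Xland,2017,11.0,,2.1,111.0
Yland,2017,50.0,45.0,5.0,900.0
"""

data_2017 = snapshot(sample, 2017)
print(data_2017["Xland"])
```

Leaving empty cells as `None` keeps countries with missing consumption-based figures visible in the snapshot, so the gap can be handled explicitly in each chart.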

Proposed data in four buckets.

We will also investigate the temporal distribution of emissions by comparing time series data for 21 countries. The 21 countries were selected from the largest carbon emitters at present as well as those of the 19th century, and chosen so that every continent is represented.
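A rough sketch of how one country's annual series could be grouped into time periods for this comparison is shown below. The period boundaries and the annual values are hypothetical placeholders, not the ones used in the project.

```python
def emissions_by_period(annual, periods):
    """Sum annual emissions within each inclusive (start, end) year range."""
    totals = {}
    for start, end in periods:
        totals[(start, end)] = sum(
            value for year, value in annual.items() if start <= year <= end
        )
    return totals


# Made-up annual values for one hypothetical country.
annual = {1899: 1.0, 1900: 2.0, 1950: 5.0, 1999: 8.0, 2000: 9.0, 2017: 10.0}
# Assumed period boundaries, chosen only for illustration.
periods = [(1800, 1899), (1900, 1999), (2000, 2017)]

totals = emissions_by_period(annual, periods)
print(totals)
```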

Week 5

Sections of the story

  • 2017 Annual Emissions (Production-based)
  • Consumption-based Emissions
  • Cumulative Emissions
  • Per-capita Emissions
