We will be discussing fundamental graphs and data transformation. In the previous article, we discussed data and data abstraction. Here, I will go from data to visual representation.
It is a two-step process:
- Step 1: Select and Transform
- Step 2: Choose or Design appropriate representation
We will start with step 2, because it is easier to first study the predefined fundamental graphs: what kind of data each one can accommodate and what kind of information it can communicate. After that, we will see how the original data sometimes needs to be transformed before it can be visualized with these graphs.
Attribute types used below: Categorical (C), Ordinal (O), Quantitative (Q), plus T for time and S for spatial attributes.
- It allows visualizing how a quantity distributes across a set of categories
- C/O + Q
- It displays how a quantity changes with another quantity, which is most often time.
- T + Q
- Alternate: Area Chart
- It displays how a quantity relates to another quantity.
- Q + Q
- Alternate: Slope Chart
- It displays how a quantity distributes across two categorical (or ordinal) attributes.
- C/O + C/O + Q
- Alternate: Stacked Bar chart, Bar Graph
- It displays how a quantity distributes across two spatial coordinates.
- S + Q
- Alternate: Bar Graph
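The attribute combinations listed above can be read as a small lookup table. The sketch below is only an illustration: the chart names in `SUGGESTED_CHART` are my assumption of the conventional name for each combination, and `suggest_chart` is a hypothetical helper, not part of any library.

```python
# Hypothetical lookup from attribute-type codes to a fundamental graph.
# The chart names are assumptions (the usual names for these combinations).
SUGGESTED_CHART = {
    ("C", "Q"): "bar chart",
    ("Q", "T"): "line chart",
    ("Q", "Q"): "scatter plot",
    ("C", "C", "Q"): "matrix (heatmap)",
    ("Q", "S"): "symbol map",
}


def normalize(types):
    """Treat ordinal like categorical (the list writes C/O) and sort the codes."""
    return tuple(sorted("C" if t == "O" else t for t in types))


def suggest_chart(types):
    """Return the suggested graph for a combination of attribute types."""
    return SUGGESTED_CHART.get(normalize(types), "no fundamental graph listed")


print(suggest_chart(["O", "Q"]))       # ordinal + quantitative
print(suggest_chart(["T", "Q"]))       # time + quantitative
print(suggest_chart(["C", "O", "Q"]))  # two categorical/ordinal + quantitative
```

Sorting the codes means the order in which the attributes are given does not matter.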
More than two attributes:
Stacked Bar Chart
- We have as many bars as the number of categories that are included in the first categorical attribute and as many segments within each bar as the values that are in the other categorical attribute.
- It works very well when your main question is about proportions, that is, what share of each bar's total belongs to each value of the second attribute. This is sometimes called part-to-whole information.
Grouped Bar Chart
- We have the same bar graph repeated multiple times for the number of categories that exist for the other categorical attribute.
- It is better when the goal is to compare every single value one to another.
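To make the difference concrete, here is a minimal sketch of the numbers each chart needs, using made-up data. The `sales` records and the function names are hypothetical. A stacked bar needs a running bottom for each segment (and its share of the bar total), while a grouped bar just repeats the plain values per group.

```python
from collections import defaultdict

# Hypothetical records: (region, product, units sold)
sales = [
    ("North", "A", 30), ("North", "B", 70),
    ("South", "A", 20), ("South", "B", 20),
]


def stacked_segments(rows):
    """One bar per first attribute; segments are (second, bottom, height)."""
    bars = defaultdict(list)
    tops = defaultdict(float)
    for first, second, qty in rows:
        bars[first].append((second, tops[first], qty))
        tops[first] += qty  # the next segment starts where this one ends
    return dict(bars)


def proportions(rows):
    """Part-to-whole: share of each segment within its own bar."""
    totals = defaultdict(float)
    for first, _, qty in rows:
        totals[first] += qty
    return {(first, second): qty / totals[first] for first, second, qty in rows}


def grouped_bars(rows):
    """One small bar graph per value of the second attribute."""
    groups = defaultdict(list)
    for first, second, qty in rows:
        groups[second].append((first, qty))
    return dict(groups)


print(stacked_segments(sales)["North"])
print(proportions(sales)[("South", "A")])
print(grouped_bars(sales)["A"])
```

In the grouped layout every bar starts at zero, which is why individual values are easier to compare one to another.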
- Select one categorical/ordinal attribute
- Create as many subsets of the data as that attribute has values
- Create one plot for each subset
- The result is called small multiples: one bigger plot split into several smaller plots
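The splitting step can be sketched in a few lines. This is an illustration only: the `rows` data and the `split_into_panels` name are hypothetical, and each returned subset would then be drawn as its own small plot.

```python
from collections import defaultdict

# Hypothetical data: one dict per data item
rows = [
    {"year": 2021, "country": "IN", "sales": 5},
    {"year": 2022, "country": "IN", "sales": 7},
    {"year": 2021, "country": "US", "sales": 9},
]


def split_into_panels(items, attribute):
    """Create one subset of the data for each value of the chosen attribute."""
    panels = defaultdict(list)
    for item in items:
        panels[item[attribute]].append(item)
    return dict(panels)


panels = split_into_panels(rows, "country")
for value, subset in panels.items():
    # Each subset becomes one plot in the small-multiples grid.
    print(value, [item["sales"] for item in subset])
```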
Now we go back to the first step of data visualization: select and transform.
Designing a new visual representation always requires choosing which attributes it will use. This step is called selection.
Typically, we have more attributes than we need to visualize. So first we have to figure out which attributes to select for the visualization we want. Many visualizations then require an intermediate step, which is typically aggregation or another transformation.
- Aggregation - Common aggregation functions that are used are the sum, the maximum, the minimum, the average, the median, and the standard deviation, but there may be situations where we may need to calculate some other type of information.
- Transformations of attributes that encode time and date, for example aggregation by day, week, month, or year.
- Transformation related to spatial data
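As a rough sketch of aggregation over a time attribute, the example below groups daily values by month and applies one of the aggregation functions mentioned above. The `readings` data and the `aggregate_by_month` helper are hypothetical. Only the Python standard library is used.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, median, stdev

# Hypothetical daily readings: (date, value)
readings = [
    (date(2023, 1, 5), 10.0), (date(2023, 1, 20), 14.0),
    (date(2023, 2, 3), 8.0), (date(2023, 2, 17), 12.0),
]


def aggregate_by_month(rows, func):
    """Group values by (year, month) and reduce each group with func."""
    groups = defaultdict(list)
    for day, value in rows:
        groups[(day.year, day.month)].append(value)
    return {month: func(values) for month, values in groups.items()}


# The same grouping works with any aggregation function.
for name, func in [("sum", sum), ("max", max), ("min", min),
                   ("mean", mean), ("median", median), ("stdev", stdev)]:
    print(name, aggregate_by_month(readings, func))
```

Passing the aggregation function as an argument keeps the grouping logic separate from the choice of statistic, so other calculations can be plugged in when the common ones are not enough.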
Transformation of a quantitative attribute into an ordinal attribute (Binning)
- Taking quantities and binning them into several categories and then sorting them according to their values.
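A minimal binning sketch, assuming hand-picked bin edges. The edges, the labels, and the `bin_value` helper are made up for illustration. Because the labels are listed from lowest to highest, the result is an ordinal attribute.

```python
from bisect import bisect_right

# Hypothetical bin edges and ordered labels (one more label than edges)
EDGES = [18, 65]
LABELS = ["under 18", "18 to 64", "65 and over"]


def bin_value(value, edges=EDGES, labels=LABELS):
    """Map a quantity to the label of the bin it falls into."""
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    # bisect_right counts how many edges are <= value.
    return labels[bisect_right(edges, value)]


ages = [3, 18, 40, 64, 65, 90]
print([bin_value(age) for age in ages])
```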
Rescaling/Re-expressing a given quantitative attribute (Normalization)
- If the attribute has a given minimum and maximum value, one can represent the same range using a different scale.
- Transforming quantitative values into percentages
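Both transformations are short formulas. The sketch below uses hypothetical helper names: `rescale` maps a value from one minimum/maximum range onto another, and `to_percentages` re-expresses values as shares of their total.

```python
def rescale(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map value from [old_min, old_max] to [new_min, new_max]."""
    if old_max == old_min:
        raise ValueError("old range must not be empty")
    fraction = (value - old_min) / (old_max - old_min)
    return new_min + fraction * (new_max - new_min)


def to_percentages(values):
    """Express each value as a percentage of the total."""
    total = sum(values)
    if total == 0:
        raise ValueError("total must not be zero")
    return [100 * v / total for v in values]


print(rescale(50, 0, 100))            # halfway point of the range
print(rescale(212, 32, 212, 0, 100))  # top of the old range
print(to_percentages([1, 1, 2]))
```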
Creating the right, effective visual representation for a given problem is not only about finding the right graphical format, but also finding the right information. It's rarely the case that we can take the original data and represent it as it is. We need some intermediary transformations.
Data transformation is a crucial step in visualization design. Visualization design is never only about finding the right graphical representation, but also finding the right data transformation.
We are now clear about the process of data abstraction and about choosing a graph that is appropriate for a given type of data.
Now we will discuss which individual components can be used to create a visualization that is appropriate for the type of data and the goal we have. The concept of graphical components, and of the mapping between data and graphical components, is useful mainly for two reasons.
- We can better understand the visualization if we know how to visually encode and decode the representation, so it is a useful evaluation tool.
- It's helpful in designing and re-designing visualization.
These are the rules that a person implements in a computer program to transform data into a graphical representation.
- Graphical elements representing data items
- Points, Line, Bar, Area
- Encode properties of data-items
- Position, Size, Angle & Slope, Color, Texture & Shape
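As a sketch of what such rules look like in a program, the example below turns data items into bar marks: the item's index is mapped to the position channel and its value to the size (height) channel. All names and pixel sizes here are assumptions chosen for illustration, not a particular library's API.

```python
def linear_scale(domain, output_range):
    """Return a function mapping the data domain linearly onto a pixel range."""
    d0, d1 = domain
    r0, r1 = output_range

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    return scale


def encode_bars(items, chart_height=200, bar_width=20, gap=10):
    """Encode (label, value) items as bar marks with position and size channels."""
    max_value = max(value for _, value in items)
    height = linear_scale((0, max_value), (0, chart_height))
    marks = []
    for index, (label, value) in enumerate(items):
        marks.append({
            "mark": "bar",                     # graphical element for the item
            "label": label,
            "x": index * (bar_width + gap),    # position channel
            "width": bar_width,
            "height": height(value),           # size channel
        })
    return marks


for mark in encode_bars([("a", 5), ("b", 10)]):
    print(mark)
```

Starting the value scale at zero keeps bar heights proportional to the quantities they represent.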
- It is the reverse of visual encoding, going from observing a visualization and trying to figure out rules/mapping rules and graphical components of visualization.
- Step 1 -> Identify graphical components explicitly(Visual Marks)
- Step 2 -> Identify the mapping rules, that is, which data items the visual marks represent
- Step 3 -> Identify Visual channels
The goodness of a visual representation is decided by two principles
- It states that "the visual representation should represent all and only the relationships that exist in the data."
- It means that the visual representation should represent the information that is present in the data, but even more important, it shouldn't convey information that is not contained in the data.
- It states that the relevance of information that is displayed should match the effectiveness of the channel.
- Use more effective channels.
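One way to picture the second principle is as a matching between two ranked lists. In the sketch below the channel order is an assumption: it simply follows the order in which the channels were listed earlier, not a measured ranking. `assign_channels` is a hypothetical helper.

```python
# Assumed ranking, most effective first (an assumption for illustration).
CHANNELS_BY_EFFECTIVENESS = [
    "position", "size", "angle/slope", "color", "texture/shape",
]


def assign_channels(attributes_by_importance):
    """Give the most important attribute the most effective channel, and so on."""
    if len(attributes_by_importance) > len(CHANNELS_BY_EFFECTIVENESS):
        raise ValueError("more attributes than available channels")
    return dict(zip(attributes_by_importance, CHANNELS_BY_EFFECTIVENESS))


# Hypothetical attributes, ordered from most to least important
print(assign_channels(["revenue", "units sold", "region"]))
```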
They are really important for a visualization and help in interpreting it.
Legends, Labels and Annotations
- Legends and Labels enable the interpretation of the graphical elements.
- Annotations guide attention and explain patterns of interest.
Axis, Grids and Reference lines
- These enable value reading and comparison.
This concludes the article. We introduced a series of fundamental graphs, described how to use them effectively, and showed how to transform data to give it the shape needed to convey certain types of information. We also discussed the individual graphical components that one can use to build a visualization, and the encoding rules that transform data, and information more generally, into visual representations.
I would be really grateful if you let me know by sharing it on Twitter!
Follow me @ParthS0007 for more tech and blogging content :)