Creating alternatives with Python to use instead of pie charts
A pie chart is a typical graph for showing the proportions of categorical data. Basically, it is a circular graphic divided into slices, each showing a category’s contribution to the total. The slices can be read as percentages, since the full 360 degrees corresponds to 100%.
This chart is frequently used in data visualization because it is simple to create and the result is easy to understand.
However, there are some controversial issues. Some sources explain that it is hard for humans to measure quantity from the slices on the graph (link). Moreover, the information can be distorted and mislead the reader (link).
Fortunately, the pie chart is not the only choice we have. Various graphs can express proportions or percentages. This article presents nine alternatives that can show the same kind of data as a pie chart.
This article is not an argument against the pie chart. Every chart has its pros and cons. The main purpose is to present some graphs that can express data as proportions or percentages of a total.
Please take into account that these visualizations are not perfect; they also have pros and cons.
Let’s get started.
Start by importing the libraries.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
To show that the methods in this article can be applied to real-world data, I will use data from the List of countries by coal production on Wikipedia. That page lists sovereign states and territories with coal production larger than 5 million tonnes as of 2020.
The data from Wikipedia is used under the terms of the Creative Commons Attribution-ShareAlike 3.0 Unported License.
To download the data, I followed the useful and practical steps in Web Scraping a Wikipedia Table into a Dataframe.
After downloading, parse the data with BeautifulSoup.
In this article, I will use the 2020 coal production of some European countries: Russia, Germany, Poland, the Czech Republic, Ukraine, Romania, Greece, and Bulgaria.
If you want to select other countries or change the year, feel free to modify the code below.
list_country = ['Russia', 'Germany', 'Poland', 'Czech Republic',
'Ukraine', 'Romania', 'Greece', 'Bulgaria']
Melt the DataFrame and create a percentage column for use later
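Here is a minimal sketch of this step with pandas. The wide table `df_wide`, the column names `Country`, `Year`, `Value`, and `Percent`, and every number in it are assumptions made for illustration; the values are placeholders, not the Wikipedia figures.

```python
import pandas as pd

list_country = ['Russia', 'Germany', 'Poland', 'Czech Republic',
                'Ukraine', 'Romania', 'Greece', 'Bulgaria']

# Placeholder wide table: one row per country, one column per year.
# The values are made up for illustration, NOT the Wikipedia figures.
df_wide = pd.DataFrame({
    'Country': list_country,
    '2019': [85, 75, 65, 55, 45, 35, 25, 15],
    '2020': [80, 70, 60, 50, 40, 30, 20, 10],
})

# Melt from wide to long format: one row per (country, year) pair.
df_long = df_wide.melt(id_vars='Country', var_name='Year', value_name='Value')

# Percentage of each country within the total of its year.
year_total = df_long.groupby('Year')['Value'].transform('sum')
df_long['Percent'] = (df_long['Value'] / year_total * 100).round(2)

# Keep only 2020 for the charts that follow.
df_coal = df_long[df_long['Year'] == '2020'].reset_index(drop=True)
```

Computing the percentage per year with `groupby(...).transform('sum')` keeps the column correct even if more years are added later.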
Before continuing, let’s plot a pie chart as a baseline to compare with the alternatives later in this article.
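A baseline pie chart could be drawn with Matplotlib roughly as follows. The DataFrame below is a stand-in with assumed column names (`Country`, `Percent`) and placeholder percentages that simply add up to 100; they are not the real coal production shares.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder percentages for illustration only (NOT the Wikipedia figures).
df_coal = pd.DataFrame({
    'Country': ['Russia', 'Germany', 'Poland', 'Czech Republic',
                'Ukraine', 'Romania', 'Greece', 'Bulgaria'],
    'Percent': [30, 20, 15, 12, 10, 6, 4, 3],
})

fig, ax = plt.subplots(figsize=(6, 6))
# autopct prints each slice's share; start at 12 o'clock and go clockwise.
wedges, texts, autotexts = ax.pie(
    df_coal['Percent'],
    labels=df_coal['Country'],
    autopct='%1.1f%%',
    startangle=90,
    counterclock=False,
)
ax.set_title('Share of coal production (placeholder data)')
```

In a notebook with `%matplotlib inline`, the figure is displayed at the end of the cell.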
This article will cover 9 visualizations to use instead of a pie chart. These alternatives can be categorized into two groups, charts built from circles and charts built from rectangles:
- Dumbbell chart (aka barbell chart)
- Bubble chart
- Circle packing
- Interactive pie chart
- Interactive donut chart
- Treemap
- Waffle chart
- Bar chart
- Stacked bar chart
1 Comparing each category with a Dumbbell chart (aka barbell chart)
A dumbbell chart is a graph for comparing two data points. As mentioned earlier, comparing slices in a pie chart can be difficult. We can make the comparison for each category with a dumbbell chart.
As its name suggests, the dumbbell chart consists of two circles joined by a straight line. Normally, the dumbbell chart is used to compare data values. In this article, we are going to set the X-axis range from 0 to 100 percent to show percentages of coal production.
For example, we can compare countries or show each country’s percentage compared with the rest. Firstly, we will create another DataFrame to use with the dumbbell chart.
Plot two countries with the highest coal production in 2020
Plot each country’s percentage compared with other countries
The result looks nice, but the data points are all the same size. This can make it inconvenient to compare countries. We can improve the result by scaling each circle’s size with its percentage value. Different sizes make it easier to compare the categories’ percentages.
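The size adjustment described above can be sketched with Matplotlib as follows. This is one possible approach, not necessarily the one used for the original figures: `hlines` draws the bar of each dumbbell and two `scatter` calls draw the ends, with the marker area scaled by the percentage. The column names, the scale factor of 10, and the placeholder percentages are all assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder percentages for illustration only (NOT the Wikipedia figures).
df_db = pd.DataFrame({
    'Country': ['Russia', 'Germany', 'Poland', 'Czech Republic',
                'Ukraine', 'Romania', 'Greece', 'Bulgaria'],
    'Percent': [30, 20, 15, 12, 10, 6, 4, 3],
})
# Second end of each dumbbell: everything that is not this country.
df_db['Rest'] = 100 - df_db['Percent']

y_pos = list(range(len(df_db)))
fig, ax = plt.subplots(figsize=(8, 5))

# The line joining the two ends of each dumbbell.
ax.hlines(y=y_pos, xmin=df_db['Percent'], xmax=df_db['Rest'],
          color='grey', zorder=1)
# Marker area (in points squared) grows with the percentage.
ax.scatter(df_db['Percent'], y_pos, s=df_db['Percent'] * 10,
           color='tab:orange', zorder=2, clip_on=False, label='Country')
ax.scatter(df_db['Rest'], y_pos, s=df_db['Rest'] * 10,
           color='tab:blue', zorder=2, clip_on=False, label='Other countries')

ax.set_yticks(y_pos)
ax.set_yticklabels(df_db['Country'])
ax.set_xlim(0, 100)
ax.set_xlabel('Percent of total')
ax.legend(loc='center right')
```

Because the `s` argument of `scatter` is an area, a circle for 40 percent covers twice the area of a circle for 20 percent, not twice the diameter.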
2 Use circular areas with a Bubble chart
Instead of just one circle as in a pie chart, we can use multiple circles in a bubble chart. Basically, a bubble chart is a scatter plot with data points of different sizes. This makes it an ideal plot for displaying three dimensions of data: X value, Y value, and size.
The good thing about using a bubble chart as an alternative to a pie chart is that we don’t have to worry about the X and Y values. The bubbles can be located the way we want. For example, let’s plot the bubbles horizontally.
Sorting the values before plotting will make the result look organized.
Add X and Y columns
df_coal['Y'] = [1]*len(df_coal)
list_x = list(range(0,len(df_coal)))
df_coal['X'] = list_x
Plot the Bubble chart
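A horizontal bubble chart along these lines could look like the sketch below. The `X` and `Y` columns are built the same way as above; the `Country` and `Percent` column names, the scale factor of 150, and the placeholder percentages are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder percentages for illustration only (NOT the Wikipedia figures).
df_coal = pd.DataFrame({
    'Country': ['Greece', 'Russia', 'Poland', 'Bulgaria',
                'Germany', 'Romania', 'Ukraine', 'Czech Republic'],
    'Percent': [4, 30, 15, 3, 20, 6, 10, 12],
})

# Sort first so the bubbles go from largest to smallest.
df_coal = df_coal.sort_values('Percent', ascending=False).reset_index(drop=True)
df_coal['Y'] = [1] * len(df_coal)           # all bubbles on one horizontal line
df_coal['X'] = list(range(len(df_coal)))    # evenly spaced positions

fig, ax = plt.subplots(figsize=(12, 3))
ax.scatter(df_coal['X'], df_coal['Y'], s=df_coal['Percent'] * 150,
           c=df_coal['Percent'], cmap='viridis', alpha=0.8)

# Label each bubble with its country and percentage.
for x, y, name, pct in zip(df_coal['X'], df_coal['Y'],
                           df_coal['Country'], df_coal['Percent']):
    ax.annotate(f'{name}\n{pct}%', (x, y), ha='center', va='center', fontsize=8)

ax.set_xlim(-1, len(df_coal))
ax.axis('off')  # the X and Y values carry no meaning here
```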
A concern of applying the bubble chart is the plotting space. The more circles are plotted, the more area is needed.
3 Organizing the bubbles with Circle packing
Circle packing arranges multiple circles close together, with small gaps and no overlapping areas. This technique helps save plotting space when working with many circles.
A drawback of circle packing is that the difference between bubbles of similar size can be hard to see. An easy solution is to label each circle with its information.
We need to calculate each circle’s size and position before plotting. Fortunately, a library called circlify can be used to make the calculation easy.
Plot the circle packing
4 Insisting on using the pie chart… No problem… Let’s make an interactive pie chart.
Even though a pie chart has some drawbacks, as previously mentioned, we cannot deny that it is easy to understand. Knowing your audience is a must. If your readers are not accustomed to complex charts, a pie chart is still a good option to communicate the information.
We can improve on a typical pie chart by making it interactive. The readers can then filter and play with the graph to get the data they want. However, readers may not always know about these functions, so there should be instructions or notes explaining how to use them.
Plotly is a useful library for creating an interactive chart.
5 Cutting out the center to create an interactive donut chart
Practically, a donut chart is a pie chart with a blank center. Some sources explain that it has some advantages over the pie chart, such as supporting the readers’ narrative and leaving room in the center for more information (link1 and link2).
The interactive donut chart shares some advantages and drawbacks with the interactive pie chart. We can also create an interactive donut chart with Plotly.
6 Using rectangular areas with Treemap
Theoretically, a treemap is a visualization for displaying hierarchical data. Inside a large rectangle, multiple rectangular areas are used to compare the proportions. Even though our data have no hierarchy, we can still apply the method to show the proportional contribution.
As with the pie chart, the total area is equal to 100 percent.
One thing to consider is that if there are too many categories or large differences between values, small areas may be hard to distinguish from the others.
Plot a treemap
7 Combining little rectangles with a Waffle chart
We have worked with pie and donut charts; now it is time for a waffle chart. Despite the fancy name, this graph simply combines multiple small rectangles of the same size into one large rectangular graphic.
The waffle chart is usually used to show a task’s progress percentage. Thus, we can apply the concept to show the percentages of categorical data.
Plot a waffle chart
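To illustrate the idea, here is a minimal waffle chart drawn with plain Matplotlib: a 10 by 10 grid in which each cell stands for one percentage point. This is a swapped-in technique for illustration, not a dedicated waffle chart library, and it assumes integer percentages that add up to exactly 100. The column names and placeholder percentages are assumptions as well.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import ListedColormap
from matplotlib.patches import Patch

# Placeholder percentages for illustration only (NOT the Wikipedia figures).
df_coal = pd.DataFrame({
    'Country': ['Russia', 'Germany', 'Poland', 'Czech Republic',
                'Ukraine', 'Romania', 'Greece', 'Bulgaria'],
    'Percent': [30, 20, 15, 12, 10, 6, 4, 3],
})

# One cell per percentage point: category index i is repeated Percent[i] times.
cells = np.repeat(np.arange(len(df_coal)), df_coal['Percent'].to_numpy())
if cells.size != 100:
    raise ValueError('percentages must be integers that add up to 100')
grid = cells.reshape(10, 10)

colors = plt.get_cmap('tab10').colors[:len(df_coal)]
fig, ax = plt.subplots(figsize=(7, 5))
# White edges separate the cells; vmin/vmax centre each index on its colour.
ax.pcolormesh(grid, cmap=ListedColormap(colors), edgecolors='white',
              linewidth=2, vmin=-0.5, vmax=len(df_coal) - 0.5)
ax.set_aspect('equal')
ax.invert_yaxis()  # fill the grid from the top row downwards
ax.axis('off')

handles = [Patch(color=c, label=f'{name} ({pct}%)')
           for c, name, pct in zip(colors, df_coal['Country'], df_coal['Percent'])]
ax.legend(handles=handles, bbox_to_anchor=(1.02, 1), loc='upper left')
```

With real data, the percentages would first need to be rounded so that they still sum to 100.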
Even though the result looks nice, distinguishing between similar colors may be difficult. This can be a drawback of applying the waffle chart to data with many categories.
To cope with the issue, we can plot each category’s percentage separately and combine the plots as a photo collage. Note that the code below exports the results to your computer so that they can be imported later.
Define a function to create a photo collage. I found excellent code on Stack Overflow (link) to combine the plots.
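The Stack Overflow code itself is not reproduced here. As a stand-in, here is a minimal sketch of such a function using Pillow. Only the function name and argument order follow the call used in this article; the body, the parameter names, and the row-by-row filling order are assumptions.

```python
from PIL import Image


def get_collage(cols_n, rows_n, width, height, input_names, save_name):
    """Paste images onto a cols_n x rows_n grid and save the result.

    width and height are the size of the whole collage in pixels;
    input_names is a list of image file paths, filled row by row.
    """
    cell_w, cell_h = width // cols_n, height // rows_n
    collage = Image.new('RGB', (width, height), 'white')
    # Extra images beyond the grid capacity are ignored.
    for idx, name in enumerate(input_names[:cols_n * rows_n]):
        row, col = divmod(idx, cols_n)
        with Image.open(name) as img:
            # Resize every image to exactly one grid cell.
            tile = img.convert('RGB').resize((cell_w, cell_h))
        collage.paste(tile, (col * cell_w, row * cell_h))
    collage.save(save_name)
    return collage
```

Resizing every image to the cell size keeps the grid regular, which works best when all exported plots already share the same dimensions.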
Apply the function
# to create a fit photo collage:
# width = number of columns * figure width
# height = number of rows * figure height
get_collage(1, 7, 644, 123*7, keep_sname, 'collage_waffle.png')
Now we can see each country’s percentage more clearly than in the previous result. The photo collage can also be used as an infographic.
8 Back to basics with a bar chart
Another typical graph is a bar chart, which is a 2-dimensional graph with rectangular bars on the X-axis or Y-axis. These bars are used to analyze data values by comparing their heights or lengths. Compared with the pie chart, the bar chart requires more space to handle a large number of categories.
In this article, we are going to improve a normal bar chart by using Plotly to display information when hovering the cursor over the bars. It is recommended to sort the data before plotting to facilitate the analysis.
A benefit of creating an interactive bar chart is that categories with small annotation text can be read more easily by hovering the cursor over each bar.
9 Saving space with a stacked bar chart
A stacked bar chart is a type of bar graph showing the proportions of individual data points compared to a total. Based on this concept, we will use a stacked bar chart to display the proportions of the data we have. The total area is equal to 100 percent.
It has the same problem as a bar chart. Small areas may be hard to read if there are too many categories or large differences between categories. Creating an interactive stacked bar chart helps by displaying information when hovering the cursor over an area.
Plot the stacked bar chart
We can see that, in this case, the stacked bar chart takes less plotting area than the bar chart.
The pie chart is typical in data visualization. It has some advantages, such as saving plotting space and being easy to understand. However, nothing is perfect. There are some drawbacks: it is hard for readers to estimate quantities, and the information can be distorted.
This article has shown 9 alternatives that can be used instead of a pie chart. Please keep in mind that they also have their pros and cons.
I am sure there are more charts to display proportions or percentages than mentioned in this article. If you have any suggestions, feel free to leave a comment.
Lastly, the bar chart is used as frequently as the pie chart. However, using too many bar charts may result in a dull display. If you are looking for ideas to use instead of a bar chart, you might find the article 9 Visualizations that Catch More Attention than a Bar Chart interesting.
Thanks for reading.