Let’s use Holoviz tools to build a dashboard for exploring the 1.8 million wildfires recorded in the United States between 1992 and 2015.
I recently wrote a post about making data dashboards using a Python library called Panel. We created a simple dashboard that allowed us to visualize various summary plots for each year of our dataset. In this post, I want to build on those skills and use Panel to make a dashboard for viewing a time series of wildfire records in the United States.
Our dataset consists of over 1.8 million wildfires that occurred in the US between 1992 and 2015. Our goal will be to construct a data dashboard that helps us better understand what causes these wildfires, where they occur, and more.
Importing our data is pretty simple. We can use sqlite3 to connect to our SQLite database and query the information we want to work with. This info is stored in a Pandas DataFrame, where we can then drop Alaska and Hawaii to make plotting easier. Finally, we can calculate burn time by subtracting the date the fire was discovered from the date it was contained.
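Here is a minimal sketch of that loading step. It assumes the SQLite release of this dataset, where the records sit in a Fires table and the two date columns hold Julian day numbers; the column names and the load_fires helper are my assumptions for illustration, so adjust them to match your copy of the data.

```python
import sqlite3

import pandas as pd

# Table and column names are assumptions based on the SQLite release of the dataset.
QUERY = """
SELECT FIRE_YEAR, DISCOVERY_DATE, CONT_DATE, STAT_CAUSE_DESCR,
       FIRE_SIZE, FIRE_SIZE_CLASS, LATITUDE, LONGITUDE, STATE
FROM Fires
"""


def load_fires(conn: sqlite3.Connection) -> pd.DataFrame:
    """Read the fire records, drop Alaska and Hawaii, and add burn time in days."""
    df = pd.read_sql_query(QUERY, conn)

    # Keep only the contiguous US so the map stays compact.
    df = df[~df["STATE"].isin(["AK", "HI"])].copy()

    # Both dates are assumed to be Julian day numbers, so the difference is in days.
    # Fires with no recorded containment date end up with a missing burn time.
    df["BURN_TIME"] = df["CONT_DATE"] - df["DISCOVERY_DATE"]
    return df
```

Usage would look something like df = load_fires(sqlite3.connect("FPA_FOD_20170508.sqlite")), where the file name is a guess based on the dataset identifier in the citation at the end of this post.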
Let’s start by mapping all of the fires so we can see the spatial distribution. We can do this easily using hvPlot since it takes advantage of Datashader to rasterize our 1.8 million points, which makes them more manageable to render.
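A rasterized map along these lines might look like the sketch below. The LONGITUDE and LATITUDE column names are assumptions about the DataFrame, fire_map is a hypothetical helper name, and the styling options are just one reasonable choice rather than the exact settings behind the figures in this post.

```python
import pandas as pd


def fire_map(df: pd.DataFrame):
    """Rasterized map of fire locations (needs hvplot, datashader, and geoviews)."""
    # Imported here so the helper can be defined without the plotting stack installed.
    import hvplot.pandas  # noqa: F401  (registers the .hvplot accessor on DataFrames)

    return df.hvplot.points(
        x="LONGITUDE",
        y="LATITUDE",
        geo=True,           # treat x/y as lon/lat so the basemap tiles line up
        tiles="CartoDark",  # dark basemap to contrast with the fire colormap
        rasterize=True,     # Datashader aggregates the points into a fixed-size image
        cnorm="eq_hist",    # histogram-equalized colors keep sparse regions visible
        cmap="fire",
        frame_width=700,
    )
```

The rasterize=True flag is what keeps this responsive: the browser receives an image of per-pixel counts instead of 1.8 million individual points, and the image is recomputed whenever we zoom or pan.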
You can interact with these plots using the interactive notebook that is posted here. It’s impressive how easy it is to create a quality spatial visualization with hvPlot. Since it takes advantage of other Holoviz libraries — Holoviews, Geoviews, Datashader, and Colorcet — it simplifies many of the usual steps needed to create an interactive map that is still responsive with large datasets.
Time series data can be annoying to visualize with static plots. However, Panel’s widgets give us all kinds of ways to manipulate and subset our data, like a slider to select the year.
Making a dashboard with Panel is a three-step process:
- Define a widget, like an integer slider to choose the year or a dropdown.
- Define a plotting function that takes the year value from the slider as an input.
- Lay out and render our dashboard.
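The three steps above can be sketched as follows. This assumes a DataFrame with FIRE_YEAR, LONGITUDE, and LATITUDE columns; fires_in_year and year_dashboard are hypothetical helper names I am using for illustration.

```python
import pandas as pd


def fires_in_year(df: pd.DataFrame, year: int) -> pd.DataFrame:
    """Rows for a single year (FIRE_YEAR is an assumed column name)."""
    return df[df["FIRE_YEAR"] == year]


def year_dashboard(df: pd.DataFrame):
    """Slider plus map, following the three steps listed above."""
    # Imported here so fires_in_year works without the dashboard stack installed.
    import hvplot.pandas  # noqa: F401
    import panel as pn

    # Step 1: define a widget.
    year_slider = pn.widgets.IntSlider(name="Year", start=1992, end=2015, value=2003)

    # Step 2: define a plotting function that takes the slider value as input.
    def plot_fires(year):
        return fires_in_year(df, year).hvplot.points(
            x="LONGITUDE", y="LATITUDE", geo=True, tiles="CartoDark",
            rasterize=True, cnorm="eq_hist", frame_width=700,
        )

    # Step 3: lay out the widget and the bound plot; pn.bind re-runs
    # plot_fires whenever the slider value changes.
    return pn.Column(year_slider, pn.bind(plot_fires, year=year_slider))
```

In a notebook, we would call pn.extension() once and then display year_dashboard(df) in a cell.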
A Faster Way to Make Simple Dashboards
It may seem like a lot of extra code to lay out the dashboard when we only have one plot and a widget to display. For a simple dashboard like this, we can instead use .interactive to make an interactive copy of our DataFrame and/or data pipeline.
To see how this works, let’s see if we can make a widget that lets us choose the cause of a fire and have the map only display those fires.
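A sketch of that idea is below. It assumes the cause lives in a STAT_CAUSE_DESCR column; cause_options and interactive_cause_map are hypothetical helper names, not part of hvPlot or Panel.

```python
import pandas as pd


def cause_options(df: pd.DataFrame) -> list:
    """Sorted list of distinct fire causes, used to fill the dropdown."""
    return sorted(df["STAT_CAUSE_DESCR"].dropna().unique())


def interactive_cause_map(df: pd.DataFrame):
    """Dropdown plus map built from an interactive DataFrame."""
    # Imported here so cause_options works without the dashboard stack installed.
    import hvplot.pandas  # noqa: F401  (adds .interactive() and .hvplot)
    import panel as pn

    cause_select = pn.widgets.Select(name="Cause", options=cause_options(df))

    # The interactive copy lets a widget stand in for a concrete value,
    # so the filter below is re-evaluated whenever the selection changes.
    idf = df.interactive()
    filtered = idf[idf["STAT_CAUSE_DESCR"] == cause_select]

    return filtered.hvplot.points(
        x="LONGITUDE", y="LATITUDE", rasterize=True, cnorm="eq_hist", frame_width=700,
    )
```

The returned object already bundles the dropdown with the plot, which is why there is no explicit layout step here.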
It’s nice that we don’t have to worry about the layout of our dashboard and can instead just define our widget and accompanying plot. We won’t focus on interactive DataFrames in this post, but I thought they were worth mentioning for creating these less complex dashboards.
We have our basic map, but one of the most powerful parts of dashboarding is being able to see multiple aspects of our data at once. Let’s create a few plots to go with our map and then use Panel to link them all together.
Fires by Size
We can start by plotting the number of fires that occurred in each size class; wildfires are categorized by the size of the area they burned, class A being the smallest and class G being the largest.
All of our plotting functions take the year as a parameter so that they can subset the data when the slider value changes. Here we can use that subsetted data to group our DataFrame by the size classification and use .size() to count the number of fires in each group.
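A minimal version of that function could look like this. The FIRE_YEAR and FIRE_SIZE_CLASS column names are assumptions about the DataFrame, and I have split the counting from the plotting so the counts are easy to inspect on their own.

```python
import pandas as pd


def fires_by_size(df: pd.DataFrame, year: int) -> pd.Series:
    """Number of fires in each size class for one year."""
    subset = df[df["FIRE_YEAR"] == year]
    # .size() counts the rows in each group, giving one value per size class.
    return subset.groupby("FIRE_SIZE_CLASS").size().rename("fires")


def plot_fires_by_size(df: pd.DataFrame, year: int):
    """Bar chart of the counts (needs hvplot installed)."""
    import hvplot.pandas  # noqa: F401  (registers the .hvplot accessor)

    return fires_by_size(df, year).hvplot.bar(
        title=f"Fires by size class, {year}",
        xlabel="Size class",
        ylabel="Number of fires",
    )
```

Because groupby sorts its keys, the bars come out in class order from A to G without any extra sorting.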
Burn Time by Cause
Our next two plots will both look at the causes of wildfires; wildfires have many causes that range significantly in their frequency and potential danger. Let’s start by looking at the average burn time for each wildfire cause in 2003.
Number of Fires by Cause
Now, instead of burn time, let’s simply look at the number of fires attributed to each cause in 2003.
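Both cause-based summaries can come out of a single groupby, which makes them easy to compare side by side. This sketch assumes a STAT_CAUSE_DESCR column and a BURN_TIME column holding burn time in days; cause_summary is a hypothetical helper name.

```python
import pandas as pd


def cause_summary(df: pd.DataFrame, year: int) -> pd.DataFrame:
    """Average burn time and number of fires per cause for one year."""
    subset = df[df["FIRE_YEAR"] == year]
    summary = subset.groupby("STAT_CAUSE_DESCR").agg(
        # Mean ignores fires that have no recorded burn time.
        mean_burn_days=("BURN_TIME", "mean"),
        # Size counts every fire, including those with a missing burn time.
        fires=("BURN_TIME", "size"),
    )
    # Longest-burning causes first.
    return summary.sort_values("mean_burn_days", ascending=False)
```

From here, each column can be turned into its own horizontal bar chart, for example with cause_summary(df, 2003)["fires"].sort_values().hvplot.barh() once hvplot.pandas has been imported.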
We can see some interesting differences between the two plots, like how smoking essentially causes the second longest burning wildfires while it ranks 9th in the total number of wildfires caused. Aside from lightning, the longest burning fires are not the most frequently occurring ones.
This makes sense when we think about what makes a wildfire hard to put out; if a fire is in a more remote location, then it can be hard to contain early. Lightning, smoking, and campfires all have the potential to start fires in these kinds of areas where there is lots of wood to burn and fewer people around to report the first signs of smoke.
There are also some years that can look quite different, like 2006, where a small number of powerline failures caused large fires that burned for several days on average. Let’s take a look:
Something to take note of here is how Panel and hvPlot recognize that these two plots share the same y-axis and reorganize the second plot to match the first. All of this happens even though I sorted both DataFrames, so if you want to avoid this, add .opts(axiswise=True) to your plots.
Putting it Together
Now that we have our widgets and a few plots to explore across our time series, we can go ahead and create our dashboard!
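A layout along these lines is sketched below, with one slider driving four plots. The column names (FIRE_YEAR, FIRE_SIZE_CLASS, STAT_CAUSE_DESCR, BURN_TIME, LONGITUDE, LATITUDE) and the function names are assumptions for illustration, and the arrangement is just one option.

```python
import pandas as pd


def fires_in_year(df: pd.DataFrame, year: int) -> pd.DataFrame:
    """Rows for a single year (FIRE_YEAR is an assumed column name)."""
    return df[df["FIRE_YEAR"] == year]


def build_dashboard(df: pd.DataFrame):
    """Slider on top, map and size plot in one row, cause plots in the next."""
    # Imported here so fires_in_year works without the dashboard stack installed.
    import hvplot.pandas  # noqa: F401
    import panel as pn

    year_slider = pn.widgets.IntSlider(name="Year", start=1992, end=2015, value=2003)

    def fire_map(year):
        return fires_in_year(df, year).hvplot.points(
            x="LONGITUDE", y="LATITUDE", rasterize=True, cnorm="eq_hist",
            title="Fire locations",
        )

    def size_plot(year):
        counts = fires_in_year(df, year).groupby("FIRE_SIZE_CLASS").size()
        return counts.hvplot.bar(title="Fires by size class")

    def burn_time_plot(year):
        by_cause = fires_in_year(df, year).groupby("STAT_CAUSE_DESCR")
        return by_cause["BURN_TIME"].mean().sort_values().hvplot.barh(
            title="Average burn time (days) by cause"
        )

    def cause_count_plot(year):
        by_cause = fires_in_year(df, year).groupby("STAT_CAUSE_DESCR")
        return by_cause.size().sort_values().hvplot.barh(title="Number of fires by cause")

    # Every plot is bound to the same slider, so they all update together.
    return pn.Column(
        "# US Wildfires, 1992 to 2015",
        year_slider,
        pn.Row(pn.bind(fire_map, year_slider), pn.bind(size_plot, year_slider)),
        pn.Row(pn.bind(burn_time_plot, year_slider), pn.bind(cause_count_plot, year_slider)),
    )
```

Calling build_dashboard(df) in a notebook displays the whole layout, and adding .servable() to the result lets panel serve run it as a standalone app.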
And just like that, we have a working dashboard! I know gifs aren’t ideal, so make sure to check out the interactive notebook so you can play around with these plots yourself. Gifs aside, this is a pretty good dashboard considering how quickly we threw it together.
Room for Improvement
I’m sure we could make this more responsive by simplifying the code running in the plotting functions. I kept the snippets as readable as possible rather than optimizing them, since I didn’t want to explain what each line of code does in this article.
For example, there is a lot of repetition when it comes to subsetting the DataFrame by year; we could probably subset the DataFrame once each time the year value changes and then pass the subsetted data to each plotting function.
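One way to do that, sketched under the same column-name assumptions as before, is to bind a single view function to the slider. It filters once and hands the subset to helpers that no longer need to know about the year. All function names here are hypothetical.

```python
import pandas as pd


def fires_in_year(df: pd.DataFrame, year: int) -> pd.DataFrame:
    """Rows for a single year."""
    return df[df["FIRE_YEAR"] == year]


def size_counts(subset: pd.DataFrame) -> pd.Series:
    """Fires per size class in an already-filtered DataFrame."""
    return subset.groupby("FIRE_SIZE_CLASS").size()


def cause_counts(subset: pd.DataFrame) -> pd.Series:
    """Fires per cause in an already-filtered DataFrame, smallest first."""
    return subset.groupby("STAT_CAUSE_DESCR").size().sort_values()


def build_plots(df: pd.DataFrame):
    """Dashboard that filters the data once per slider change."""
    import hvplot.pandas  # noqa: F401
    import panel as pn

    year_slider = pn.widgets.IntSlider(name="Year", start=1992, end=2015, value=2003)

    def view(year):
        subset = fires_in_year(df, year)  # the only place the year filter runs
        return pn.Row(
            size_counts(subset).hvplot.bar(title="Fires by size class"),
            cause_counts(subset).hvplot.barh(title="Number of fires by cause"),
        )

    return pn.Column(year_slider, pn.bind(view, year_slider))
```

The trade-off is that all of the plots are rebuilt together on every change, which is fine here since they all depend on the same slider anyway.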
We barely had to put in any effort beyond making the actual plots, and it’s pretty cool how much of the heavy lifting hvPlot and Panel can do for us when it comes to “linking” our plots together.
Holoviz tools can feel a bit daunting at times because they have created a large and interconnected ecosystem with many ways to produce similar visualizations. One key thing to remember when working with data visualization in Python is that there is always a multitude of tools and methods to accomplish your goal. Some may have advantages over others, but the important thing is to just start trying them.
Dataset: Wildfires (Public Domain CC0)
Short, Karen C. 2017. Spatial wildfire occurrence data for the United States, 1992–2015 [FPAFOD20170508]. 4th Edition. Fort Collins, CO: Forest Service Research Data Archive. https://doi.org/10.2737/RDS-2013-0009.4