Predicting wildfire spread is a challenging but important task in the study of natural disasters. This is a benchmark dataset for that application.
Wildfires are uncontrolled and unpredictable fires in areas of combustible vegetation. Although they are sometimes beneficial to their ecosystem, they can cause major damage to rural and urban areas. Whether they occur naturally or are caused by humans, they can damage property and endanger human life. They are among the most common natural disasters, and it is important for governments to be able to study them and, ideally, to predict how they will spread.
“Next Day Wildfire Spread” is a large-scale, multivariate dataset of historical wildfires that aggregates nearly a decade of remote-sensing data across the United States. In contrast to existing fire datasets based on Earth observation satellites, this dataset combines 2D fire data with many explanatory variables (e.g., topography, vegetation, weather, drought index, population density) aligned over 2D regions, providing a feature-rich dataset for machine learning applications. It can be used as a benchmark for developing wildfire propagation models based on remote-sensing data with a lead time of one day.
The data spans 2012 to 2020 and comprises 18,445 samples. Each sample covers a 64 km x 64 km region at 1 km resolution, taken from a location and time at which a fire occurred. The fire information is given as a fire mask over each region, marking each location as ‘fire’ or ‘no fire’, with an additional class for uncertain labels (i.e., cloud coverage or other unprocessed data). Fire masks are provided for both time T and time T + 1, one day later. The dataset contains the following features: elevation, wind direction, wind speed, minimum and maximum temperature, humidity, precipitation, drought index, normalized difference vegetation index (NDVI), energy release component (ERC), and population density.
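To make the mask structure concrete, the sketch below builds two synthetic 64 x 64 fire masks (standing in for times T and T + 1) and scores a simple persistence baseline, which predicts that fire at T + 1 is exactly where it was at T, while skipping cells whose label is uncertain. Everything here is illustrative: the label encoding (1 = fire, 0 = no fire, -1 = uncertain), the function names, and the random data are assumptions, not the dataset's actual file format or the paper's evaluation protocol.

```python
import random

GRID = 64  # a 64 km x 64 km region at 1 km resolution gives 64 x 64 cells

# Assumed label encoding for illustration only; check the dataset itself.
FIRE, NO_FIRE, UNCERTAIN = 1, 0, -1


def make_synthetic_mask(rng, p_fire=0.05, p_uncertain=0.02):
    """Return a GRID x GRID mask of random labels (not real fire data)."""
    def draw():
        r = rng.random()
        if r < p_uncertain:
            return UNCERTAIN
        if r < p_uncertain + p_fire:
            return FIRE
        return NO_FIRE

    return [[draw() for _ in range(GRID)] for _ in range(GRID)]


def persistence_scores(prev_mask, next_mask):
    """Score the 'fire stays where it is' baseline.

    Cells whose target label is uncertain are excluded from scoring.
    Cells that are uncertain at time T are treated as 'no fire' predictions.
    Returns (precision, recall, number_of_scored_cells).
    """
    tp = fp = fn = valid = 0
    for prev_row, next_row in zip(prev_mask, next_mask):
        for pred, target in zip(prev_row, next_row):
            if target == UNCERTAIN:
                continue
            valid += 1
            predicted_fire = pred == FIRE
            if predicted_fire and target == FIRE:
                tp += 1
            elif predicted_fire:
                fp += 1
            elif target == FIRE:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, valid


rng = random.Random(0)  # fixed seed so the sketch is reproducible
mask_t = make_synthetic_mask(rng)   # stand-in for the fire mask at time T
mask_t1 = make_synthetic_mask(rng)  # stand-in for the fire mask at time T + 1

precision, recall, n_valid = persistence_scores(mask_t, mask_t1)
print(f"scored cells: {n_valid}, precision: {precision:.3f}, recall: {recall:.3f}")
```

With real samples, the two synthetic masks would be replaced by the masks at T and T + 1 for the same region, and the explanatory variables would be stacked as additional 64 x 64 channels. An array library such as NumPy would be the more typical choice; plain lists are used here only to keep the sketch dependency-free.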
The paper is available at:
and the dataset can be found at: