pandasgui is a library used for exploratory data analysis. With pandasgui one can perform many tasks, such as statistical analysis, querying a DataFrame (i.e. applying filters to it), plotting various types of graphs (line, box, bar, histogram, etc.), and reshaping the DataFrame.
In this article, we'll learn exploratory data analysis (EDA) with pandasgui. Note that this library is not part of pandas; it is a standalone library that we need to install.
pip install pandasgui
import pandas as pd
from pandasgui import show
# Load the cars dataset into a DataFrame
data = pd.read_csv("../dataset/Voitures.csv")
# Open the pandasgui window on the DataFrame
show(data)
Here the DataFrame tab shows the pandas DataFrame, similar to how a Jupyter notebook would show it.
The DataFrame has 18 rows and 7 columns.
We can create filters easily: we click on the name of the column that we want to filter.
Keep in mind that if we apply filters to a dataset, all the views and operations we do from now on will apply to the filtered data as opposed to the original dataset.
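For comparison, the same idea can be sketched in plain pandas. The column names (`model`, `max_speed`) and the values below are made up for illustration, since the columns of Voitures.csv are not listed here. The sketch shows that a filter produces a reduced set of rows, and that anything computed afterwards on that result only sees the filtered rows.

```python
import pandas as pd

# Hypothetical stand-in for the cars dataset; names and values are illustrative only.
cars = pd.DataFrame(
    {
        "model": ["A", "B", "C", "D"],
        "max_speed": [140, 180, 165, 200],
    }
)

# Keep only the rows that satisfy the condition.
fast_cars = cars.query("max_speed > 160")

# Operations on the filtered frame only see the rows that passed the filter;
# the original frame still holds every row.
print(len(cars), "rows before filtering,", len(fast_cars), "rows after")
print("mean max_speed (filtered):", fast_cars["max_speed"].mean())
```

In plain pandas the original `cars` frame is left untouched, so the unfiltered data is always available by going back to it.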
This is one of the best features in pandasgui. We can modify data values directly in the DataFrame tab by selecting a cell and typing a new value. Everything we modify there is stored and reflected automatically in the underlying DataFrame.
On the Statistics tab, we can see statistical information about the variables, such as the data type, the count, the mean, etc.
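A similar summary can be built by hand in pandas, which helps when checking what the tab reports. This is only a rough sketch on a hypothetical DataFrame (the column names and values are assumptions, not taken from Voitures.csv), and it does not claim to reproduce exactly what pandasgui computes.

```python
import pandas as pd

# Hypothetical stand-in for the cars dataset; names and values are illustrative only.
cars = pd.DataFrame(
    {
        "model": ["A", "B", "C", "D"],
        "max_speed": [140, 180, 165, 200],
    }
)

# One row per variable: its data type, number of non-null values, and mean.
# The mean is only defined for numeric columns, so it is NaN for "model".
summary = pd.DataFrame(
    {
        "dtype": cars.dtypes.astype(str),
        "count": cars.count(),
        "mean": cars.mean(numeric_only=True),
    }
)
print(summary)
```

For a quick look at numeric columns only, `cars.describe()` gives the count, mean, standard deviation, and quartiles in one call.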
To create a chart, go to the Grapher section and select the type of graph you need. Then select the variables, and pandasgui will plot them for you.
In our example, we plot the maximum speed of each car by model. We can see that the Renault 30 has the highest maximum speed.
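A reading taken from a chart can be double-checked numerically in pandas. The sketch below uses a hypothetical DataFrame with made-up model names and speeds (the real column names in Voitures.csv may differ) and looks up the row with the largest maximum speed.

```python
import pandas as pd

# Hypothetical stand-in for the cars dataset; names and values are illustrative only.
cars = pd.DataFrame(
    {
        "model": ["A", "B", "C", "D"],
        "max_speed": [140, 180, 165, 200],
    }
)

# idxmax returns the index label of the largest value; loc fetches that row.
fastest = cars.loc[cars["max_speed"].idxmax()]
print("fastest model:", fastest["model"], "at", fastest["max_speed"])

# Sorting gives the full ranking, i.e. the order a bar chart sorted by speed would show.
ranking = cars.sort_values("max_speed", ascending=False)
print(ranking)
```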
- pandasgui gives users a lot of flexibility to interact with the data by plotting and reshaping the dataset.
- It lets us work on multiple DataFrames simultaneously.
- Unlike pandas_profiling and sweetviz, pandasgui doesn't provide a set of predefined analyses for each variable (data column).
- pandasgui also doesn't provide the correlation coefficient matrix.
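If a correlation matrix is needed, it can be computed directly in pandas before or after exploring the data in the GUI. The following is a minimal sketch on a hypothetical DataFrame; the `power` column and all values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the cars dataset; names and values are illustrative only.
cars = pd.DataFrame(
    {
        "model": ["A", "B", "C", "D"],
        "max_speed": [140, 180, 165, 200],
        "power": [60, 110, 90, 150],
    }
)

# Correlation is only defined for numeric columns, so select those first.
corr = cars.select_dtypes(include="number").corr()
print(corr)
```

The result is itself a DataFrame, so it can be inspected like any other table.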