This article introduces 10 Python data visualization libraries that are useful across many disciplines, ranging from the very famous to the lesser-known.
1. Matplotlib
Matplotlib is the cornerstone of Python visualization libraries. More than a decade after its first release, it remains the most widely used plotting library among Python users. Its design closely follows MATLAB, a commercial programming language developed in the 1980s. Because matplotlib was the first Python visualization library, many other libraries are built on top of it or call it directly. For instance, the plotting functions in pandas and Seaborn are essentially wrappers around matplotlib that let you call its methods with less code. Matplotlib makes it easy to get a general sense of the data, but producing publication-ready charts quickly is less straightforward. As Chris Moffitt mentioned in “A Brief Introduction to Python Visualization Tools”: “It’s very powerful but also very complex.” Its default plotting style, with its strong 1990s aesthetic, was criticized for years; matplotlib 2.0, released in 2017, replaced those defaults with a more modern style.
- Developer: John D. Hunter
- More Info: http://matplotlib.org/
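To give a feel for the explicit, step-by-step style described above, here is a minimal sketch of a matplotlib line chart. The data and labels are invented for illustration, and the `Agg` backend is selected only so the script runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt

# Illustrative data: the first few square numbers
x = [0, 1, 2, 3, 4, 5]
y = [value ** 2 for value in x]

# Every element (figure, axes, labels, legend) is set explicitly
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, y, marker="o", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A basic matplotlib line chart")
ax.legend()

# Render to an in-memory PNG; pass a filename to savefig to write to disk instead
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=100)
png_bytes = buffer.getvalue()
```

Even this small chart takes several calls, which is the trade-off between control and brevity that the quote above points to.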
2. Seaborn
Seaborn uses matplotlib to create aesthetically pleasing plots with concise code. The biggest difference between the two is Seaborn’s modern default styles and color palettes. Because Seaborn is built on matplotlib, you still need to understand matplotlib to adjust Seaborn’s default parameters.
- Developer: Michael Waskom
- More Info: http://seaborn.pydata.org/index.html
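The relationship between the two libraries can be sketched as follows: Seaborn draws the chart in one call, and the object it returns is an ordinary matplotlib Axes that you adjust with matplotlib methods. This assumes Seaborn 0.11 or later (for `set_theme`); the DataFrame contents are invented for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Apply Seaborn's default theme and color palette
sns.set_theme(style="whitegrid", palette="deep")

# Illustrative data only
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2.0, 4.1, 5.9, 8.2, 9.8, 12.1],
    "group": ["a", "a", "a", "b", "b", "b"],
})

# One call produces a styled scatter plot colored by group
ax = sns.scatterplot(data=df, x="x", y="y", hue="group")

# Fine-tuning goes through the underlying matplotlib Axes
ax.set_title("Seaborn scatter plot")
ax.set_xlabel("Input value")
ax.set_ylabel("Measured value")

buffer = io.BytesIO()
ax.get_figure().savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```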
3. ggplot
ggplot is based on R’s ggplot2 package and applies concepts from The Grammar of Graphics. The key difference from matplotlib is that ggplot builds a plot by overlaying layers. For example, you can start with axes and then add points, lines, trendlines, and so on. While The Grammar of Graphics is praised for its method that “approaches the thought process,” users familiar with matplotlib may need time to adapt to this way of thinking. The author of ggplot notes that it is not intended for highly customized plots: it sacrifices some complexity in favor of a simpler way of plotting. ggplot is tightly integrated with pandas, so it is best to store your data in a DataFrame when using it.
- Developer: ŷhat
- More Info: http://ggplot.yhathq.com/
4. Bokeh
Like ggplot, Bokeh is based on the concepts of The Grammar of Graphics. Unlike ggplot, however, it is native to Python rather than ported from R. Its strength lies in creating interactive, web-ready plots: charts can be output as JSON objects, HTML documents, or interactive web applications. Bokeh also supports streaming and real-time data. It offers three levels of control for different kinds of users. The highest level is for quickly plotting common charts (e.g., bar charts, box plots). The middle level, similar to matplotlib, lets you control the basic elements of a chart (like the points in a scatter plot). The lowest level is aimed at developers and engineers: it has no defaults, so you must define every element of the chart yourself.
- Developer: Continuum Analytics
- More Info: https://docs.bokeh.org/en/latest/
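As a rough illustration of the middle level of control and of the HTML and JSON output mentioned above, the following sketch uses the `bokeh.plotting` interface. The data, the plot title, and the `"plot-div"` target id are placeholders chosen for this example.

```python
from bokeh.embed import file_html, json_item
from bokeh.plotting import figure
from bokeh.resources import CDN

# Illustrative data only
x = [1, 2, 3, 4, 5]
y = [2, 5, 4, 8, 7]

# Mid-level interface: create a figure, then add glyphs one by one
plot = figure(
    title="Bokeh line and scatter",
    x_axis_label="x",
    y_axis_label="y",
    tools="pan,wheel_zoom,reset",
)
plot.line(x, y, line_width=2)
plot.scatter(x, y, size=8)

# Standalone HTML document that loads BokehJS from the CDN
html = file_html(plot, CDN, title="Bokeh sketch")

# JSON description that a web page can render into an element with id "plot-div"
item = json_item(plot, "plot-div")
```

The `html` string can be written to a file and opened in a browser, while `item` can be serialized with `json.dumps` and handed to BokehJS on the client side.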
5. pygal
Like Bokeh and Plotly, pygal provides interactive charts that can be embedded directly in web pages. Its key distinction is that it outputs charts as SVG. If your dataset is relatively small, SVG is perfectly adequate. With hundreds of thousands of data points, however, rendering becomes sluggish. Because each chart type is packaged into its own method and the default style is attractive, you can create a nice-looking chart in just a few lines of code.
- Developer: Florian Mounier
- More Info: http://www.pygal.org/en/latest/index.html
6. Plotly
You might have heard of the online charting tool Plotly, but did you know you can use it from Python? Like Bokeh, Plotly focuses on interactive charts, but it offers several chart types hard to find elsewhere, such as contour plots, dendrograms, and 3D charts.
- Developer: Plotly
- More Info: https://plotly.com/python/
7. geoplotlib
geoplotlib is a toolbox for creating maps and plotting geographical data. You can use it to create a variety of map types, such as choropleths, heatmaps, and dot density maps. It requires Pyglet (an object-oriented programming interface) to be installed. Still, since most Python visualization tools don’t provide maps, having a library dedicated to cartography is quite convenient.
- Developer: Andrea Cuttone
- More Info: https://github.com/andrea-cuttone/geoplotlib
8. Gleam
Gleam draws inspiration from R’s Shiny package. It lets you turn your analyses into interactive web applications using only Python, so you don’t need to know HTML, CSS, or JavaScript. Gleam works with any Python visualization library. Once you have created a plot, you can build input fields on top of it so that users can sort and filter the data.
- Developer: David Robinson
- More Info: https://github.com/dgrtwo/gleam
9. missingno
Dealing with missing data is always a pain. missingno lets you quickly assess data missingness visually, instead of struggling through spreadsheets. You can sort or filter data based on completeness, or use heatmaps or dendrograms to inform data correction strategies.
- Developer: Aleksey Bilogur
- More Info: https://github.com/ResidentMario/missingno
10. Leather
Leather is best defined by its author, Christopher Groskopf: “Leather is for people who need charts now and don’t care if they’re perfect.” It works with any data type and generates SVG images, so quality isn’t lost when resizing.
- Developer: Christopher Groskopf
- More Info: https://github.com/wireservice/leather