Interactive Visualisation and Dashboards

Note

If you have not yet set up Python on your computer, you can execute this tutorial in your browser via Google Colab. Click on the rocket in the top right corner and launch “Colab”. If that doesn’t work, download the .ipynb file and import it into Google Colab.

Then install the required packages by executing the following command in a Jupyter cell at the top of the notebook.

!pip install pypsa atlite pandas geopandas xarray matplotlib hvplot geoviews plotly highspy
%env HV_DOC_HTML=true
import pypsa
import atlite
import pandas as pd
import geopandas as gpd
import xarray as xr
import matplotlib.pyplot as plt
import holoviews as hv

plt.style.use("bmh")
from urllib.request import urlretrieve
from os.path import basename

urls = [
    "https://tubcloud.tu-berlin.de/s/2oogpgBfM5n4ssZ/download/PORTUGAL-2013-01-era5.nc",
]
for url in urls:
    urlretrieve(url, basename(url))

Load Example Data

First, let’s load a few example datasets you know from previous tutorials.

A PyPSA network:

n = pypsa.Network(
    "https://tubcloud.tu-berlin.de/s/kpWaraGc9LeaxLK/download/network-cem.nc"
)
INFO:pypsa.io:Retrieving network data from https://tubcloud.tu-berlin.de/s/kpWaraGc9LeaxLK/download/network-cem.nc
WARNING:pypsa.io:Importing network from PyPSA version v0.21.3 while current version is v0.31.0. Read the release notes at https://pypsa.readthedocs.io/en/latest/release_notes.html to prepare your network for import.
INFO:pypsa.io:Imported network network-cem.nc has buses, carriers, generators, global_constraints, loads, storage_units
n.optimize(solver_name="highs");
WARNING:pypsa.consistency:The following buses have carriers which are not defined:
Index(['Germany'], dtype='object', name='Bus')
WARNING:pypsa.consistency:The following buses have carriers which are not defined:
Index(['Germany'], dtype='object', name='Bus')
/opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages/linopy/common.py:147: UserWarning:

coords for dimension(s) ['Generator'] is not aligned with the pandas object. Previously, the indexes of the pandas were ignored and overwritten in these cases. Now, the pandas object's coordinates are taken considered for alignment.

INFO:linopy.model: Solve problem using Highs solver
INFO:linopy.io:Writing objective.
Writing constraints.:   0%|          | 0/17 [00:00<?, ?it/s]
Writing constraints.:  47%|████▋     | 8/17 [00:00<00:00, 79.51it/s]
Writing constraints.:  94%|█████████▍| 16/17 [00:00<00:00, 63.47it/s]
Writing constraints.: 100%|██████████| 17/17 [00:00<00:00, 67.59it/s]

Writing continuous variables.:   0%|          | 0/6 [00:00<?, ?it/s]
Writing continuous variables.: 100%|██████████| 6/6 [00:00<00:00, 156.64it/s]
INFO:linopy.io: Writing time: 0.31s
INFO:linopy.solvers:Log file at /tmp/highs.log
Running HiGHS 1.7.2 (git hash: 184e327): Copyright (c) 2024 HiGHS under MIT licence terms
Coefficient ranges:
  Matrix [1e-04, 2e+02]
  Cost   [4e-02, 3e+05]
  Bound  [0e+00, 0e+00]
  RHS    [3e+04, 8e+04]
Presolving model
25230 rows, 18665 cols, 69120 nonzeros  0s
25230 rows, 18665 cols, 69120 nonzeros  0s
Presolve : Reductions: rows 25230(-25151); columns 18665(-3241); elements 69120(-41526)
Solving the presolved LP
Using EKK dual simplex solver - serial
  Iteration        Objective     Infeasibilities num(sum)
          0     0.0000000000e+00 Pr: 2190(1.32738e+09) 0s
      12190     1.7507428468e+10 Pr: 5106(8.0176e+13); Du: 0(9.94798e-08) 5s
      22596     6.5781554169e+10 Pr: 406(2.97246e+09); Du: 0(9.42109e-08) 10s
      23199     6.5813180032e+10 Pr: 0(0); Du: 0(4.35755e-12) 11s
Solving the original LP from the solution after postsolve
Model   status      : Optimal
Simplex   iterations: 23199
Objective value     :  6.5813180032e+10
HiGHS run time      :         10.68
INFO:linopy.constants: Optimization successful: 
Status: ok
Termination condition: optimal
Solution: 21906 primals, 50381 duals
Objective: 6.58e+10
Solver model: available
Solver message: optimal
INFO:pypsa.optimization.optimize:The shadow-prices of the constraints Generator-ext-p-lower, Generator-ext-p-upper, StorageUnit-ext-p_dispatch-lower, StorageUnit-ext-p_dispatch-upper, StorageUnit-ext-p_store-lower, StorageUnit-ext-p_store-upper, StorageUnit-ext-state_of_charge-lower, StorageUnit-ext-state_of_charge-upper, StorageUnit-energy_balance were not assigned to the network.
Writing the solution to /tmp/linopy-solve-46eiy6rj.sol

Wind, solar and demand time series:

url = (
    "https://tubcloud.tu-berlin.de/s/nwCrNLrtL6LAN3W/download/time-series-lecture-2.csv"
)
ts = pd.read_csv(url, index_col=0, parse_dates=True)

Power plants in Europe:

url = (
    "https://raw.githubusercontent.com/PyPSA/powerplantmatching/master/powerplants.csv"
)
ppl = pd.read_csv(url, index_col=0)
geometry = gpd.points_from_xy(ppl["lon"], ppl["lat"])
ppl = gpd.GeoDataFrame(ppl, geometry=geometry, crs=4326)

NUTS2 regions:

url = "https://tubcloud.tu-berlin.de/s/RHZJrN8Dnfn26nr/download/NUTS_RG_10M_2021_4326.geojson"
nuts = gpd.read_file(url).set_index("id").query("LEVL_CODE == 2")

An atlite cutout:

cutout = atlite.Cutout("PORTUGAL-2013-01-era5.nc")

Limitations of Static Plotting with Matplotlib

Static plots made with matplotlib are great for reports, but they offer no interactivity: you cannot zoom, pan or hover over data points in the rendered figure.

ts["onwind [pu]"].plot(figsize=(10, 2))
<Axes: >
[Figure: static matplotlib line plot of the onshore wind capacity factor time series]

There are many Python-based interactive plotting libraries, and it can be hard to keep an overview. This tutorial introduces two of them:

  • hvPlot

  • Plotly Express

Both tools let you produce polished interactive figures with minimal code, at the expense of fewer customisation options.

hvPlot

.hvplot() is a powerful and interactive Pandas-like .plot() API. You just replace .plot() with .hvplot() and you get an interactive figure. Simple as that.

Documentation can be found here: https://hvplot.holoviz.org/index.html

To use it, we have to import hvplot.pandas. This import makes the .hvplot accessor available on pandas DataFrame and Series objects: afterwards, df.hvplot is a valid statement, whereas before it would raise an AttributeError.
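How an attribute can appear on every Series after some registration code has run can be illustrated with a toy accessor. The sketch below uses pandas' public `register_series_accessor` hook and a made-up accessor name `demo`; it is only meant to convey the general idea, and hvPlot's own registration mechanism may differ.

```python
import pandas as pd

s = pd.Series([0.2, 0.5, 0.9], name="onwind [pu]")

# Before registration, the (hypothetical) accessor does not exist.
before = hasattr(s, "demo")


@pd.api.extensions.register_series_accessor("demo")
class DemoAccessor:
    """Toy accessor; 'demo' is an illustrative name, not part of hvPlot."""

    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    def summary(self):
        # Return a short text description instead of a plot.
        return f"{self._obj.name}: {len(self._obj)} values"


# After registration, every Series (including existing ones) has .demo
after = hasattr(s, "demo")
text = s.demo.summary()
print(before, after, text)
```

Importing hvplot.pandas plays the role of the registration step here: the import is done for its side effect, not for any name it provides.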

import hvplot.pandas

Let’s try it by plotting the onshore wind time series for the whole year…

# for Google Colab add:
# hv.extension('bokeh')
ts["onwind [pu]"].hvplot(height=200)

… or the load time series for February

# for Google Colab add:
# hv.extension('bokeh')
ts.loc["2015-02", "load [GW]"].hvplot(height=200)
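The selection `ts.loc["2015-02", "load [GW]"]` relies on pandas' partial string indexing on a DatetimeIndex. A minimal sketch with synthetic hourly data (the values are invented; only the column name mirrors the tutorial):

```python
import numpy as np
import pandas as pd

# Synthetic hourly index for 2015 (stand-in for the tutorial's time series).
index = pd.date_range("2015-01-01", "2015-12-31 23:00", freq="h")
demo = pd.DataFrame({"load [GW]": np.linspace(40, 80, len(index))}, index=index)

# A year-month string selects every timestamp within that month.
february = demo.loc["2015-02", "load [GW]"]

# A slice of date strings selects an inclusive range of whole days.
first_week = demo.loc["2015-02-01":"2015-02-07", "load [GW]"]

print(len(february), len(first_week))
```

The same pattern works for coarser or finer strings, such as a year or a single day.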

We can also plot geographic data with hvPlot, for instance, the locations of all hard coal power plants in Europe.

The geo=True argument declares that the data will be plotted in a geographic coordinate system. Once hvPlot knows that your data is in geo-coordinates, you can use the tiles keyword argument to overlay the plot on top of map tiles.

Note

For a list of available tiles, see the tile sources listed in the HoloViews documentation.

# for Google Colab add:
# hv.extension('bokeh')
ppl.query("Fueltype == 'Hard Coal'").hvplot(
    geo=True, tiles=True, frame_height=600, frame_width=600
)

As in geopandas, we can let columns of the DataFrame control the appearance of the points: c sets the colour and s the size. We can also change the opacity with alpha and the colormap with cmap.

# for Google Colab add:
# hv.extension('bokeh')
plot = ppl.query("Fueltype == 'Hard Coal'").hvplot(
    geo=True,
    tiles="CartoLight",
    frame_height=600,
    c="DateIn",
    cmap="viridis",
    s="Capacity",
    alpha=0.6,
)
plot

A few more properties of the plot can be tweaked with .opts(), for instance hiding the axes or choosing which tools are active by default.

# for Google Colab add:
# hv.extension('bokeh')
plot = plot.opts(xaxis=None, yaxis=None, active_tools=["pan", "wheel_zoom"])
plot

All of this works not only with points but also with shapes. With hover_cols, we can pick the columns that are shown when hovering over a shape.

# for Google Colab add:
# hv.extension('bokeh')
nuts.hvplot(
    geo=True,
    tiles="OSM",
    hover_cols=["NUTS_NAME", "NUTS_ID"],
    c="CNTR_CODE",
    frame_height=500,
    alpha=0.2,
).opts(xaxis=None, yaxis=None, active_tools=["pan", "wheel_zoom"])

We can also plot the solar capacity factor time series for Germany as a heatmap, averaged by hour of day and month:

# for Google Colab add:
# hv.extension('bokeh')
ts.hvplot.heatmap(
    x="index.hour", y="index.month", C="solar [pu]", cmap="blues"
).aggregate(function="mean")
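The numbers behind such a heatmap can be reproduced with plain pandas: group by month and hour of the index and take the mean. A sketch with synthetic data (the half-sine "solar" profile is invented for illustration and is not the tutorial's dataset):

```python
import numpy as np
import pandas as pd

index = pd.date_range("2015-01-01", "2015-12-31 23:00", freq="h")
hours = index.hour.to_numpy()

# Invented profile: zero at night, half-sine between 06:00 and 18:00.
profile = np.clip(np.sin((hours - 6) / 12 * np.pi), 0, None)
solar = pd.Series(profile, index=index, name="solar [pu]")

# Mean capacity factor for each (month, hour) cell of the heatmap.
cells = solar.groupby([solar.index.month, solar.index.hour]).mean()
grid = cells.unstack()  # rows: month 1..12, columns: hour 0..23
grid.index.name, grid.columns.name = "month", "hour"

print(grid.shape)
```

Each cell of `grid` corresponds to one tile of the heatmap, which is what the mean aggregation computes before colouring.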

hvPlot also offers stacked area charts that come in handy for plotting the power dispatch of a solved PyPSA network:

dispatch = (
    pd.concat([n.generators_t.p, n.storage_units_t.p], axis=1).loc["2015-02"].div(1e3)
)
# for Google Colab add:
# hv.extension('bokeh')
dispatch.where(dispatch > 0, 0).hvplot.area(
    stacked=True,
    line_width=0,
    width=1300,
    height=350,
    hover=False,
    color=[n.carriers.at[c, "color"] for c in dispatch.columns],
    ylabel="electricity supply [GW]",
    ylim=(0, 180),
)
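The call `dispatch.where(dispatch > 0, 0)` keeps positive values and replaces everything else with zero, so that negative entries (for example a storage unit while charging) do not distort the stacked areas. A toy illustration with invented numbers and column names:

```python
import pandas as pd

# Invented dispatch in GW; negative storage values stand for charging.
toy = pd.DataFrame(
    {"wind": [30.0, 10.0, 0.0], "battery": [-5.0, 8.0, -2.0]},
    index=pd.date_range("2015-02-01", periods=3, freq="h"),
)

# Keep values where the condition holds, put 0 everywhere else.
supply = toy.where(toy > 0, 0)

# For numeric data without missing values, clipping at zero gives the same result.
same = supply.equals(toy.clip(lower=0))

print(supply["battery"].tolist(), same)
```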

Plotly Express

The plotly.express module (usually imported as px) contains functions that can create entire figures at once. Plotly Express is a built-in part of the plotly library, and is the recommended starting point for creating most common figures. Every Plotly Express function uses graph objects internally and returns a plotly.graph_objects.Figure instance. Throughout the plotly documentation, you will find the Plotly Express way of building figures at the top of any applicable page, followed by a section on how to use graph objects to build similar figures. Any figure created in a single function call with Plotly Express could be created using graph objects alone, but with between 5 and 100 times more code.

Documentation is available here: https://plotly.com/python/plotly-express/

import plotly.io as pio
import plotly.express as px
import plotly.offline as py

Note

We need to import plotly.io and plotly.offline, so that the interactive plots are also visible on the course’s static website.

Let’s reproduce the plots we previously created with hvPlot. Onshore wind capacity factor time series:

px.line(ts["onwind [pu]"])

Load time series in February:

px.line(ts.loc["2015-02", "load [GW]"])

Hard coal power plants in Europe:

df = ppl.query("Fueltype == 'Hard Coal'")
px.scatter_mapbox(
    df, lat="lat", lon="lon", mapbox_style="carto-positron", zoom=2, height=600
)

The same map with marker colour and size taken from the DateIn and Capacity columns:

px.scatter_mapbox(
    df,
    lat="lat",
    lon="lon",
    mapbox_style="carto-positron",
    color="DateIn",
    size="Capacity",
    zoom=2,
    height=600,
)

NUTS2 regions coloured by country code:

px.choropleth_mapbox(
    nuts,
    geojson=nuts.geometry,
    locations=nuts.index,
    mapbox_style="carto-positron",
    zoom=2,
    height=600,
    color="CNTR_CODE",
    center={"lat": 48, "lon": 12},
)

In plotly, hover information on the stacked area chart works much better than in hvPlot, where we switched it off with hover=False.

dispatch = (
    pd.concat([n.generators_t.p, n.storage_units_t.p], axis=1).loc["2015-02"].div(1e3)
)
df = (
    dispatch.where(dispatch > 0, 0)
    .stack()
    .reset_index()
    .rename(columns={"level_1": "technology", 0: "GW"})
)
fig = px.area(df, x="snapshot", color="technology", y="GW", line_group="technology")
fig.update_traces(line=dict(width=0))
fig
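Plotly Express expects long-format ("tidy") data here: one row per snapshot and technology. The reshaping step above can be tried on a toy table; `melt` is shown as an alternative that names the new columns directly instead of relying on the default `level_1` and `0` labels. All numbers are invented.

```python
import pandas as pd

wide = pd.DataFrame(
    {"wind": [30.0, 10.0], "solar": [0.0, 12.0]},
    index=pd.date_range("2015-02-01", periods=2, freq="h", name="snapshot"),
)

# Variant used in the tutorial: stack columns into rows, then rename.
long_stack = (
    wide.stack()
    .reset_index()
    .rename(columns={"level_1": "technology", 0: "GW"})
)

# Alternative: melt names the new columns explicitly.
long_melt = wide.reset_index().melt(
    id_vars="snapshot", var_name="technology", value_name="GW"
)

print(long_stack.columns.tolist())
print(len(long_stack), len(long_melt))
```

Both variants contain the same rows; they differ only in row order, which does not matter for `px.area` because traces are grouped by the `color` column.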

Interactive Dashboards

There are many different options for building interactive dashboards. Some are brand new, some have been around for a few years.

Each of them has different characteristics, for instance in terms of customisation options and ease of use.

If you want to read a detailed comparison, the best one I found is this one:

https://www.datarevenue.com/en-blog/data-dashboarding-streamlit-vs-dash-vs-shiny-vs-voila

Just tell me which one to use

As always, “it depends” – but if you’re looking for a quick answer, you should probably use:

  • Dash if you already use Python for your analytics and you want to build production-ready data dashboards for a larger company.

  • Streamlit if you already use Python for your analytics and you want to get a prototype of your dashboard up and running as quickly as possible.

  • Shiny if you already use R for your analytics and you want to make the results more accessible to non-technical teams.

  • Jupyter if your team is very technical and doesn’t mind installing and running developer tools to view analytics.

  • Voila if you already have Jupyter Notebooks and you want to make them accessible to non-technical teams.

  • Flask if you want to build your own solution from the ground up.

  • Panel if you already have Jupyter Notebooks, and Voila is not flexible enough for your needs.

In this tutorial, we look at Streamlit because it gets you to results most quickly. The trade-off is that it offers fewer configuration options than other dashboarding libraries.

Documentation for this package can be found here: https://docs.streamlit.io/

Streamlit can be installed, for example, with conda, mamba or pip:

conda install -c conda-forge "streamlit>=1.18"

or

pip install "streamlit>=1.18"

Note

This tutorial requires streamlit>=1.18.

This tutorial is stored on GitHub with instructions on how to install, run and deploy it:

https://github.com/fneum/streamlit-tutorial

You can see a live demo of the final product here:

https://ppm-dash.streamlit.app/