Best tutorials on how to visualize user journeys with Sankey Diagrams with Python

September 29, 2023

In this article, you will find the best Python tutorials you can use to visualize user journeys on your app or website!

User journey analysis is tricky

Coding in Python and SQL as a data analyst

Here is my story.

I have been a data analyst for several years at several app and website companies. Some were social networks, some were group chat apps, some were even dating apps. I also did some analytics on e-commerce, as well as one-pager websites - believe it or not, there is a lot you can analyse in one single landing page!

While some metrics are very easy to analyse (DAU, conversion rates, event counts), some are a bit more challenging (retention rates, stickiness metrics such as the famous DAU/MAU and some other ratios). Overall, with the most popular tools on the market such as Excel, Looker Studio (previously called Data Studio), Power BI, or even Tableau, almost everything can be looked at without coding too much.

Of course, sometimes a little bit of code is needed. Coding adds great flexibility when you need to come up with a more custom solution. In my honest opinion, when it comes to behavioral data analysis in an app or a website, the coding part is not too hard. I'm far from being the best at coding, and I quite easily managed to get the work done after a couple of weeks of learning. I would not say the learning curve is too steep, and there are many great free tutorials online.

For SQL queries, BigQuery is intuitive and flexible, and the documentation around it is quite rich. Since it can be plugged into Looker Studio, it makes almost everything possible to analyse. For investigations that require some Python code, the pandas and matplotlib libraries are great and simplify the work a lot.

However, there is one field in user behavior analytics that I find particularly tricky: user journey analysis (sometimes called customer journey analysis).

The number of possible paths grows exponentially with the number of steps you look at, and finding the most common ones requires plotting the aggregated paths in a clever way. This is where Sankey diagrams come in!

Sankey diagrams can help identify areas of the application that are causing users to drop off or get stuck, which can be useful for improving the user experience. Additionally, Sankey diagrams can provide insights into the most common paths that users take through the application, which can help inform decisions about which features or content to prioritize. In my opinion, this is one of the best and most underrated data visualization techniques.

Example of Sankey Diagram showing all user journeys in one image.
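
To make the path-counting problem concrete, here is a minimal sketch in plain Python. With k distinct events and n steps there can be up to k**n different paths, so the first thing you usually want is a count of which paths actually occur. The event log, its values, and the count_paths helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical event log: (user_id, timestamp, event_name). All values are made up.
events = [
    ("u1", 1, "Home"), ("u1", 2, "Search"), ("u1", 3, "Cart"),
    ("u2", 1, "Home"), ("u2", 2, "Search"), ("u2", 3, "Cart"),
    ("u3", 1, "Home"), ("u3", 2, "Cart"),
    ("u4", 2, "Search"), ("u4", 1, "Home"), ("u4", 3, "Purchase"),
]


def count_paths(events, max_steps=3):
    """Count how many users followed each ordered path of up to max_steps events."""
    journeys = {}
    # Sort by user, then by timestamp, so each user's events end up in order.
    for user_id, _, event_name in sorted(events, key=lambda e: (e[0], e[1])):
        journeys.setdefault(user_id, []).append(event_name)
    # A path is the tuple of the first max_steps events of a user.
    return Counter(tuple(steps[:max_steps]) for steps in journeys.values())


path_counts = count_paths(events)
top_paths = path_counts.most_common(2)
```

A table like path_counts answers "what is the most common path?", but it quickly becomes unreadable as the number of steps grows, which is exactly the gap a Sankey diagram fills.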

Using Sankey diagrams to plot user journeys with Python

As a data analyst, when I figured out that Sankey diagrams were the way to go to analyse user journeys, I knew my coding skills in Python would not be enough.

Lucky me, the Python developer community is awesome, and there are great, 100% free tutorials online that explain how to visualize customer journeys using Python.

In this article, I will list the 3 best Python tutorials or pieces of documentation I found online to plot my Sankey charts and better understand my users' paths!

SankeyJourney - the no-code solution to visualize customer journey with Sankey charts

Before starting, for those who want to skip the coding part, here is an easy and very flexible solution for you.

Funnily enough, I got so convinced by the great potential of flow charts in the specific context of user journey analysis that my dad and I developed SankeyJourney, a no-code tool to generate Sankey diagrams in seconds. All you need to do is import the CSV file containing your events data, and... that's it!

Your Sankey graph is generated automatically, and you can interact with it. Real-time interaction is the biggest advantage I see with SankeyJourney compared to alternatives such as building Sankey graphs with Python. You can choose the number of steps to display and the events to filter on, you can filter out some details, and there are many other features. You may want to give it a try!

Sankey Journey landing page. Click here to get there!

Tutorial 1: Visualizing In-App User Journey Using Sankey Diagrams In Python

This tutorial was created by Nicolas Esnis. You can find his complete tutorial here.

Let's start with what is, in my opinion, the best tutorial ever created for this purpose.

The article presents a Python script that reads data from a CSV file containing information about user behavior in a web application. The input data consists of a sequence of events for each user, where each event is represented by the fields user_id, time_install, event_name, and time_event. The script uses Pandas to process the data and create a dictionary of nodes and links. Each node represents a page or screen in the application, while each link represents a transition from one page to another.

The script then uses Plotly to create a Sankey diagram from the nodes and links dictionary. The resulting diagram shows the flow of users between different pages or screens in the application, as well as the number of users who followed each path and the average time it took them to make each transition.

And here is the final result! Pretty clean, no?
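
To give an idea of what the node-and-link approach described in this tutorial looks like in practice, here is a minimal sketch with pandas and Plotly. It is not the author's code: the sample data is made up, the column names user_id, event_name and time_event come from the description above (time_install is left out), and the build_links and plot_sankey helpers are hypothetical names I chose for the example.

```python
import pandas as pd

# Made-up events using the column names mentioned above (time_install omitted).
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 3, 3],
    "event_name": ["Home", "Search", "Cart", "Home", "Cart",
                   "Home", "Search", "Cart"],
    "time_event": pd.to_datetime([
        "2023-01-01 10:00", "2023-01-01 10:01", "2023-01-01 10:05",
        "2023-01-01 11:00", "2023-01-01 11:02",
        "2023-01-01 12:00", "2023-01-01 12:03", "2023-01-01 12:04",
    ]),
})


def build_links(events):
    """Return one row per transition with the user count and mean duration."""
    df = events.sort_values(["user_id", "time_event"]).copy()
    # Rank of each event within its user's journey: 1, 2, 3, ...
    df["step"] = df.groupby("user_id").cumcount() + 1
    # Prefix the step so "Cart" at step 2 and "Cart" at step 3 are distinct nodes.
    df["source"] = df["step"].astype(str) + ". " + df["event_name"]
    # The target of a transition is the same user's next event.
    df["target"] = df.groupby("user_id")["source"].shift(-1)
    next_time = df.groupby("user_id")["time_event"].shift(-1)
    df["seconds"] = (next_time - df["time_event"]).dt.total_seconds()
    return (
        df.dropna(subset=["target"])
          .groupby(["source", "target"], as_index=False)
          .agg(users=("user_id", "nunique"), avg_seconds=("seconds", "mean"))
    )


def plot_sankey(links):
    """Build a Plotly figure (Plotly is imported here so build_links runs without it)."""
    import plotly.graph_objects as go

    labels = sorted(set(links["source"]) | set(links["target"]))
    index = {label: i for i, label in enumerate(labels)}
    return go.Figure(go.Sankey(
        node=dict(label=labels),
        link=dict(
            source=links["source"].map(index).tolist(),
            target=links["target"].map(index).tolist(),
            value=links["users"].tolist(),
            customdata=links["avg_seconds"].round(1).tolist(),
            hovertemplate="%{value} users, avg %{customdata} s<extra></extra>",
        ),
    ))


links = build_links(events)
```

Calling plot_sankey(links).show() should open the diagram, with link thickness showing the number of users and the hover text showing the average transition time. Numbering the steps in the node labels is my own choice to keep repeated events apart; the tutorial may handle this differently.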

Tutorial 2: A Python Plotly example of the customer journey

This tutorial was written by Summer He. You can find the complete tutorial here.

The tutorial provides a step-by-step guide for creating a Sankey diagram using Python Plotly. The guide starts with a made-up dataset on customer behavior, which includes columns for user ID, event name, platform, and timestamp. The dataset is transformed and aggregated to generate the source and target data for the Sankey diagram, which is then plotted using Python Plotly.

The code first creates a simulated dataset of user events, which includes a user ID, the time of the event, the name of the event (e.g. "Home", "Cart", "Purchase"), and the platform on which the event occurred (e.g. "iOS", "PC").

The main function in the code is user_journey, which takes the dataset as input along with a starting step (i.e. the event that the user started their journey with) and an optional number of steps to include in the visualization (default is 5). The function sorts the dataset by the time of the events and selects the users that have performed the starting step. It then aggregates the first n steps of the user journey for each of these users and counts the number of identical journeys. The resulting dataframe is transformed into a source-target pair format suitable for the Sankey diagram. The function also defines the colors of the nodes and links based on the events in the dataset.

The code then creates a list of labels for the nodes and a list of colors for the nodes and links based on the user_journey function output. Finally, it creates the Sankey diagram using the plotly library and displays it in the output. The Sankey diagram shows the flow of users between the different events on the website/app, with the thickness of the links representing the number of users that followed that particular path.

Along the way, the tutorial provides code snippets and explanations of the various parameters that can be customized to create a more effective Sankey diagram. These parameters include labels for annotation, node colors and link colors, and unique integer IDs for both source and target.
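
The steps described above can be sketched roughly as follows. This is my own simplified reconstruction, not the tutorial's code: the function name user_journey and its default of 5 steps come from the description, while the sample data, the to_links helper, the exact column names, and the choice to start each journey at the user's first occurrence of the starting step are assumptions. Colors are left out to keep it short.

```python
from collections import Counter

import pandas as pd

# Made-up dataset with the four kinds of columns described above.
events = pd.DataFrame({
    "user_id": ["u1"] * 4 + ["u2"] * 2 + ["u3"] * 4 + ["u4"] * 2,
    "timestamp": [1, 2, 3, 4, 1, 2, 1, 2, 3, 4, 1, 2],
    "event_name": ["Home", "Product", "Cart", "Purchase",
                   "Home", "Cart",
                   "Search", "Home", "Product", "Cart",
                   "Search", "Product"],
    "platform": ["iOS"] * 4 + ["PC"] * 2 + ["iOS"] * 4 + ["PC"] * 2,
})


def user_journey(events, start_step, n_steps=5):
    """Count identical journeys of up to n_steps events starting at start_step."""
    df = events.sort_values(["user_id", "timestamp"])
    # Keep each user's events from their first start_step onward (assumption);
    # users who never performed start_step are dropped entirely.
    is_start = (df["event_name"] == start_step).astype(int)
    df = df[is_start.groupby(df["user_id"]).cumsum() > 0]
    first_steps = df.groupby("user_id").head(n_steps)
    journeys = first_steps.groupby("user_id")["event_name"].agg(tuple)
    return Counter(journeys)


def to_links(journey_counts):
    """Turn journey counts into labels plus integer source/target/value lists."""
    pair_counts = Counter()
    for journey, count in journey_counts.items():
        pairs = zip(journey, journey[1:])
        for position, (src, dst) in enumerate(pairs, start=1):
            pair_counts[(f"{position}. {src}", f"{position + 1}. {dst}")] += count
    labels = sorted({name for pair in pair_counts for name in pair})
    ids = {label: i for i, label in enumerate(labels)}  # unique integer IDs
    return {
        "label": labels,
        "source": [ids[src] for src, _ in pair_counts],
        "target": [ids[dst] for _, dst in pair_counts],
        "value": list(pair_counts.values()),
    }


counts = user_journey(events, "Home", n_steps=3)
sankey = to_links(counts)
```

The resulting dictionary lines up with what plotly.graph_objects.Sankey expects: label goes into node, and source, target and value go into link.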

Tutorial 3: Google Charts and its Sankey Diagrams

Google Charts has excellent documentation for generating Sankey diagrams, making it a great choice for visualizing complex user journeys on an app or website. Sankey diagrams are particularly useful for showing how users navigate through a series of pages or steps in a process, and Google Charts makes these diagrams easy to create.

For our project SankeyJourney, we have used Google Charts extensively to develop our no-code tool, which helps businesses visualize user journeys on their apps or websites. With its wide range of customization options and straightforward documentation, Google Charts has allowed us to create diagrams that are informative, good-looking, and easy to understand. We had to mention it in our top 3!

Google Charts Sankey Diagram example
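
Google Charts itself is a JavaScript library, but you can still prepare its input in Python. As far as I understand the documentation, the Sankey chart takes rows of the form [from, to, weight] and does not render if the data contains cycles. Here is a minimal sketch that builds such rows from per-user event sequences; the google_sankey_rows helper and the sample journeys are made up, and numbering the steps is just one way to avoid cycles.

```python
import json
from collections import Counter


def google_sankey_rows(journeys):
    """Build [from, to, weight] rows from a list of per-user event sequences."""
    weights = Counter()
    for steps in journeys:
        pairs = zip(steps, steps[1:])
        for position, (src, dst) in enumerate(pairs, start=1):
            # Step-numbered labels keep the graph acyclic even when a user
            # comes back to an event they already visited (e.g. Home twice).
            weights[(f"{position}. {src}", f"{position + 1}. {dst}")] += 1
    return [[src, dst, weight] for (src, dst), weight in sorted(weights.items())]


# Made-up journeys; the first user returns to "Home" before going to "Cart".
journeys = [
    ["Home", "Search", "Home", "Cart"],
    ["Home", "Search", "Cart"],
    ["Home", "Cart"],
]

rows = google_sankey_rows(journeys)
# JSON text that can be pasted or served to the page that draws the chart.
rows_json = json.dumps(rows)
```

On the JavaScript side, these rows would then be added to a data table with a string column for the source, a string column for the target, and a number column for the weight.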

And that wraps it up!
Hopefully, these 3 tutorials will make it easier for you to generate your own Sankey diagram. And remember! If you are searching for an easy-to-use solution for building your own Sankey diagrams to better understand your users' or customers' paths, SankeyJourney is here for you!

If you like tutorial listings, you might also like this article about the best tutorials to generate Sankey diagrams in Excel.
