How to format your data to build Sankeys and alluvial diagrams

April 29, 2023

Sankey diagrams are a great way to visualize flow data, making it easy to see how items or people move from one category to another. They're commonly used in fields such as marketing, finance, and supply chain management. To build a Sankey diagram, your data has to be in the format the tool expects, and that format depends on the use case. In this article, we'll explore how to format your data to build Sankey diagrams, both in the generic case and in the specific case of user journey analysis.

Generic Case: Source, Target, and Weight

The generic case involves importing a table with three columns: "source," "target," and "weight." The source and target columns name the nodes at each end of a flow, while the weight column gives the size of that flow. For example, consider the following table:

| Source | Target         | Weight |
| ------ | -------------- | ------ |
| Wages  | Budget         | 1500   |
| Other  | Budget         | 250    |
| Budget | Taxes          | 450    |
| Budget | Housing        | 420    |
| Budget | Food           | 400    |
| Budget | Transportation | 295    |
| Budget | Savings        | 25     |

In this example, the nodes are Wages, Other, Budget, Taxes, Housing, Food, Transportation, and Savings. Each row describes one flow, and the weight column gives its size: the first row, for instance, says that 1500 flows from Wages into Budget.
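
As a rough illustration of this format, the sketch below holds the same rows as plain Python tuples and derives the node list and per-node totals from them. The helper names `list_nodes` and `node_totals` are made up for this example and do not come from any Sankey tool.

```python
from collections import defaultdict

# (source, target, weight) rows copied from the budget table
flows = [
    ("Wages", "Budget", 1500),
    ("Other", "Budget", 250),
    ("Budget", "Taxes", 450),
    ("Budget", "Housing", 420),
    ("Budget", "Food", 400),
    ("Budget", "Transportation", 295),
    ("Budget", "Savings", 25),
]


def list_nodes(rows):
    """Return every distinct node, in order of first appearance."""
    seen = {}
    for source, target, _ in rows:
        seen.setdefault(source, None)
        seen.setdefault(target, None)
    return list(seen)


def node_totals(rows):
    """Sum the weight flowing into and out of each node."""
    inflow, outflow = defaultdict(int), defaultdict(int)
    for source, target, weight in rows:
        outflow[source] += weight
        inflow[target] += weight
    return dict(inflow), dict(outflow)


nodes = list_nodes(flows)
inflow, outflow = node_totals(flows)
print(nodes)
print(inflow["Budget"], outflow["Budget"])
```

Comparing a node's inflow and outflow is a quick sanity check before importing: with the rows above, Budget receives 1750 but sends out 1590, and some tools will show that kind of imbalance in the diagram.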

Once you have your data in this format, you can easily import it into a Sankey diagram tool like SankeyMATIC or RAWGraphs. These tools will then generate a Sankey diagram based on your data.
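
If your table lives in a script rather than a spreadsheet, a few lines of code can turn it into text you can paste into a tool. The sketch below assumes SankeyMATIC's text input takes one flow per line in the form `Source [Amount] Target`; check the tool's own manual before relying on it. The function name `to_sankeymatic` is invented for this example.

```python
def to_sankeymatic(rows):
    """Format (source, target, weight) rows as 'Source [weight] Target' lines.

    Assumption: this one-flow-per-line syntax matches what SankeyMATIC accepts.
    """
    return "\n".join(
        f"{source} [{weight}] {target}" for source, target, weight in rows
    )


rows = [("Wages", "Budget", 1500), ("Budget", "Taxes", 450)]
text = to_sankeymatic(rows)
print(text)
```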

Example of Sankey diagram generated on SankeyMATIC

Specific Case for User Journey Analysis

User journey analysis involves tracking how users move through a website or app, from one page or action to another. To build a Sankey diagram for user journey analysis, you need to have your data in a specific format. Fortunately, there are tools like SankeyJourney that make this process easy.

SankeyJourney takes an event table, exported from your analytics or tracking tool, with the columns "user_id," "event_name," and "event_timestamp." The tool then reorganizes the data into the generic format with source, target, and weight columns.

For example, consider the following events:

| user_id | event_name      | event_timestamp     |
| ------- | --------------- | ------------------- |
| 123     | click on button | 2022-01-01 10:00:00 |
| 123     | open page       | 2022-01-01 10:01:00 |
| 123     | purchase        | 2022-01-01 10:05:00 |
| 456     | click on button | 2022-01-01 11:00:00 |
| 456     | open page       | 2022-01-01 11:01:00 |
| 456     | purchase        | 2022-01-01 11:05:00 |

SankeyJourney would reorganize this data into the generic format, with the nodes representing the events and the flows representing the number of users who moved from one event to another.
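
To make that reorganization concrete, here is a minimal sketch of one way to do it by hand. It is not SankeyJourney's actual implementation, which this article does not describe. It assumes timestamps in the `YYYY-MM-DD HH:MM:SS` form used in the example, and the function name `events_to_flows` is hypothetical. Each user's events are sorted by time, consecutive events are paired into (source, target) transitions, and each transition is counted once per user.

```python
from collections import Counter, defaultdict
from datetime import datetime

# (user_id, event_name, event_timestamp) rows from the example events
events = [
    ("123", "click on button", "2022-01-01 10:00:00"),
    ("123", "open page", "2022-01-01 10:01:00"),
    ("123", "purchase", "2022-01-01 10:05:00"),
    ("456", "click on button", "2022-01-01 11:00:00"),
    ("456", "open page", "2022-01-01 11:01:00"),
    ("456", "purchase", "2022-01-01 11:05:00"),
]


def events_to_flows(rows, timestamp_format="%Y-%m-%d %H:%M:%S"):
    """Turn an event log into (source, target, weight) rows.

    weight = number of distinct users who went directly from source to target.
    """
    # Group events by user, keeping the parsed timestamp for sorting.
    by_user = defaultdict(list)
    for user_id, event_name, timestamp in rows:
        parsed = datetime.strptime(timestamp, timestamp_format)
        by_user[user_id].append((parsed, event_name))

    counts = Counter()
    for user_events in by_user.values():
        user_events.sort()  # chronological order within one user
        names = [name for _, name in user_events]
        # Pair each event with the next one; dict.fromkeys drops repeated
        # pairs so a user is counted at most once per transition.
        for pair in dict.fromkeys(zip(names, names[1:])):
            counts[pair] += 1

    return [(source, target, weight) for (source, target), weight in counts.items()]


flows = events_to_flows(events)
for source, target, weight in flows:
    print(source, target, weight)
```

For the six example events this gives two flows of weight 2: click on button to open page, and open page to purchase. If you would rather count every occurrence of a transition instead of distinct users, iterate over `zip(names, names[1:])` directly.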

To use SankeyJourney, simply upload your event table in CSV format. The tool supports various formats for the user_id and event_timestamp columns, making it easy to work with data from a variety of sources.
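
If you prepare the CSV yourself, normalizing the timestamp column first can save some trial and error. The sketch below is only an illustration: the article does not list which formats SankeyJourney accepts, so the two handled here (ISO-style date-times and Unix epoch seconds, read as UTC) are assumptions, and `parse_timestamp` and `load_events` are hypothetical helper names.

```python
import csv
import io
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse an ISO-style date-time or Unix epoch seconds (assumed formats)."""
    value = value.strip()
    if value.isdigit():
        # Epoch seconds, interpreted as UTC and returned as a naive datetime.
        return datetime.fromtimestamp(int(value), tz=timezone.utc).replace(tzinfo=None)
    return datetime.fromisoformat(value)


def load_events(csv_file):
    """Read user_id, event_name, event_timestamp rows from an open CSV file."""
    reader = csv.DictReader(csv_file)
    return [
        (row["user_id"], row["event_name"], parse_timestamp(row["event_timestamp"]))
        for row in reader
    ]


# In-memory stand-in for an uploaded file, mixing the assumed formats.
sample = io.StringIO(
    "user_id,event_name,event_timestamp\n"
    "123,click on button,2022-01-01 10:00:00\n"
    "123,open page,2022-01-01T10:01:00\n"
    "456,purchase,1641035100\n"
)
rows = load_events(sample)
for row in rows:
    print(row)
```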

Sankey diagram displaying customer journeys on SankeyJourney

In conclusion, formatting your data correctly is crucial when building Sankey diagrams. Depending on the use case, you may need to format your data differently. In the generic case, you need to have your data in a table with source, target, and weight columns. For user journey analysis, tools like SankeyJourney can help you easily convert your event data into the generic format, so you can create a Sankey diagram that visualizes the flow of users through your website or app.

Sankey diagrams can provide valuable insights into complex systems by presenting data in a clear and easy-to-understand format. Whether you're analyzing user journeys, financial data, or any other type of flow data, it's important to have your data in the correct format to create a useful and informative Sankey diagram. With the information provided in this article, you should be able to format your data correctly and start visualizing your data with Sankey diagrams.
