How to Build an Interactive Data Analytics Dashboard in Microsoft Power BI using YugabyteDB
In this blog, I will walk you through how to quickly build an interactive data analytics dashboard in Power BI using the data stored in your YuabyteDB database and specify the storage mode of a table. The storage mode lets them control whether Power BI Desktop caches table data in-memory for reports.
Introducing Power BI and YugabyteDB
Microsoft Power BI is an interactive data visualization tool that focuses primarily on business intelligence and data analytics. It is part of the Microsoft Power Platform. With the tool you can create common data models and aggregate data from different sources, including databases, online services, and different files types. YugabyteDB is a multi-cloud distributed SQL database for transactional (OLTP) applications. YugabyteDB is open source and built to run in any cloud – private, public, or hybrid.
Why Use YugabyteDB with Power BI?
As part of the Microsoft Intelligent Data Platform (MIDP) partner ecosystem, YugabyteDB connects seamlessly with Power BI using the standard PostgreSQL connector available in Power BI (either through “Import” or “Direct Query” Modes).
As an ecosystem partner, we work hard to deliver new YugabyteDB distributed database capabilities and to simplify the process of connecting the two systems. Power BI can help connect heterogeneous data sources, transform and clean the data, place it into a data model, and create visualizations (reports, graphs, dashboards). With Power BI you can quickly and easily explore and visualize the large amounts of data stored you have stored in YugabyteDB.
Benefits of Yugabyte DB / Power BI integrations
Specifically, the expanded partnership between Yugabyte and Microsoft Intelligent Data Platform (MIDP) delivers some key benefits:
- Simplify data connectivity: Via a 1-click experience, our joint customers will be able to generate and run the Power BI visualization from the YugabyteDB database using the PostgreSQL Connector
- Aggregation Offloading: By using the aggregated table feature of Power BI, you can fetch one or more fields from the Direct Query of YugabyteDB table.
- Power of Composite Model: Power BI can combine data from more than one DirectQuery source (e.g. YugabyteDB) or combine DirectQuery with import data (e.g. Azure Synapse SQL Pool or Azure Data Lake Gen2)
- Cloud/Infra Agnostic: Power BI can connect from its own Infrastructure to the YugabyteDB where it can reside either at on-premise or public cloud.
Hybrid Storage Feature of Power BI
Using the storage mode properties of a YugabyteDB table in Power BI, you can specify which visuals require a query to the back-end data sources. Visuals that don’t require a query are imported even if they’re based on DirectQuery. This feature helps improve performance and reduce back-end load.
Storage Modes in Power BI: As referred by Microsoft, the following storage modes are supported for DB communication.
- Import: Imported tables with this setting are cached. Queries submitted to the Power BI dataset that return data from Import tables can be fulfilled only from cached data.
- DirectQuery: Tables with this setting aren’t cached. Queries that we submit to the Power BI dataset—for example, DAX queries—and that return data from DirectQuery tables are fulfilled only by executing on-demand queries to the data source.
- Dual: Tables with this setting act as either cached or not cached, depending on the context of the query that’s submitted to the Power BI dataset. In some cases, we fulfil queries from cached data. In other cases, we fulfil queries by executing an on-demand query to the data source.
Pros and Cons of Storage Modes in Power BI:
Storage Mode | Pros | Cons |
Import | Faster compared to Direct Query since it retrieves data from Power Bi directly. It leverages full Power BI functional features. | Power BI files are limited based on the license model. Data is not as fresh compared to Direct Query. it’s not suited for real-time dashboards or spot decision applications. Query caching is only available on Power BI Premium or Power BI Embedded, for Import datasets. |
Direct Query | Always keeps the data fresh compared to Import data. It doesn’t need any storage. | Doesn’t use the full feature set of Power BI queries like DAX/Custom Models |
Dual | Tables can be configured based on the data refresh requirement. | Query context will decide whether it needs cached or non-cached data. It’s difficult to decide at the time of design; configure over the period of usage time. |
Visualizing Real-World Business Data:
Now let’s explore Northwind schema (available as a sample DB with YugabyeDB installation). Let’s also create a dashboard showing different metrics like Number of Orders by different customers, country, product and supplier and metrics like Discount, Quantity in Hand, Reorder Level by Product. We will demonstrate how seamless and easy it is to connect Microsoft Power BI with YugabyteDB using the standard PostgreSQL connector and create this dashboard or report.
Setup the Northwind Schema
Install YugabyteDB
You have multiple options to install or deploy YugabyteDB if you don’t have one already available.
Note: if you are running a Windows Machine then you can leverage Docker on Windows with YugabyteDB.
Load the sample Northwind database
We’ll use the sample Northwind database that comes packaged with YugabyteDB and only needs to be loaded. If using yugabyted you can load the DB with ./bin/yugabyted connect command from your shell, otherwise follow these instructions.
Install Power BI Desktop
Power BI can then be installed from scratch through a link on the Microsoft website.
Connect YugabyteDB using Power BI’s PostgreSQL Driver
Click File New -> Get Data and it will showdialog below to establish the connection with YugabyteDB using PostgreSQL Database option.
Note: if you choose Data Connectivity Mode as DirectQuery, it will establish a direct connection to YugabyteDB, and it will fetch the data immediately to your dashboard or report based on your request. If you choose Data Connectivity Mode as Import mode then data will be offloaded from YugabyteDB to PowerBI Storage. It requires a scheduled refresh to update the data residing in Power BI Storage based on the schedule frequency (e.g. by x minutes or x hours or x day)After clicking OK, you will be presented with the “Login” Window to enter your YugabyteDB credentials. After a successful connection, Navigator window will show the list of YugabyteDB’s tables (see example below). Select the list of tables that are required to build the visualization.
Model the Data and Assign the Storage Type
After selecting of tables from the Navigator, the Power BI will show “Report”, “Data”, “Model” views (see example below). As you can see, we selected Model Option (circled in the left) to choose the table’s storage mode (circled in the lower right corner).
This dual storage is a hybrid approach. Just as when importing data, the dual storage mode caches the data in the table. However, Power BI determines the best way to query the table depending on the query’s context.
Create the Visualization using PowerBI’s Visualization tools
Using Power BI’s visualization toolset, we can easily build the dashboard by dragging and dropping the fields from the respective tables/columns. Based on the nature of the data (either “measure” or “dimension”) it can be plotted to different graphs’ x- or y- axis or visuals which need a single metric (e.g. order ID) or single dimension (e.g. product name).
Conclusion and Summary
In this blog post we walked you through how to analyze and visualize the data stored in YugabyteDB using Microsoft’s Power BI. YugabyteDB is PostgreSQL compatible, and you can seamlessly connect Power BI with YugabyteDB using the standard PostgreSQL connector that is part of the Power BI Database Drivers. Using the “Dual” storage feature with hybrid tables, we can avoid database calls. This will help us to query the transactional data only through Direct Query. The rarely changing table’s data can be imported at Power BI. This will significantly reduce the number of database calls and ensure that the report’s response time is improved. YugabyteDB and Power BI are both designed for modern enterprises that need to create business value from the large amounts of data they collect and store.