Best Practices for data modeling in Power BI

Power BI is a powerful business intelligence tool developed by Microsoft, designed to transform raw data into meaningful insights and visualizations for informed decision-making. It offers a wide range of data analysis capabilities, including data modeling, data transformation, and advanced analytics, empowering users to uncover hidden patterns and trends in their data. In this article we are going to mainly focus on how to make the best out of your report by building a robust data model and following these steps for a clean and easy-to-use report.

Let’s first define what we mean by data modeling. Data modeling is the process of organizing data into a logical and consistent structure. This structure makes it easier to understand, analyze, and report on data. Power BI is a powerful business intelligence tool that can be used to create data models.

A significant point to underline here is that Power BI works best when creating a star schema. The star schema structure keeps things simple and efficient. Instead of having data spread out all over the place, the star schema brings the main fact or facts table right to the center, surrounded by smaller dimension tables. This way, when you want to analyze your data, Power BI can quickly navigate through the connected stars to fetch the relevant information

Source: microsoft.com

How to connect tables in Power BI

The main way to connect tables in Power BI is through relationships. Relationships play a key role in connecting different data tables in a friendly and efficient way. Think about them as special bonds between tables that define how they relate to each other. By setting up these relationships, you can avoid data confusion and ensure that your visuals and calculations display accurate information. It is crucial to have a good understanding of your data before starting to connect tables. There are three types of relationships in Power BI:

  • One-to-one relationships link a unique value in one table to a unique value in another, like a personal ID connecting to a specific person's details.

  • One-to-many relationships link a single value in one table to multiple matching values in another, such as a product ID appearing in multiple sales records.

  • Many-to-many relationships are less common but still useful for connecting multiple values from one table to multiple values in another, often through a bridge table.

You can let Power BI detect automatically these relationships or you can manually create or modify them to suit your data model.

How to connect columns from fact and dimensions

Another key factor for efficient and effective data modeling is to also connect IDs. Try to avoid creating relationships based on columns that include texts. Every dimension should have an ID column, if it does not have one, you can create one by using the auto-index functionality of Power Query. The process for that is the following:

  1. Open Power Query by clicking on the Transform Data at the Home ribbon

  2. Select the dimension you want to create the unique key

  3. Click on the Add Column ribbon at the top of the page

  4. Click on the Add Index option

  5. Click on the drop down menu and choose the most convenient option for you

Remember the above process is to make sure that all the dimensions have unique IDs, which means that you need to repeat the same process for the rest of your dimensions.

After creating all the ID for your dimensions, it is time to move to the fact table. The most appropriate process for connecting the fact with the dimensions is to replace all the text columns with the IDs you created before. The process is as follows:

  1. Again open Power Query

  2. Select your fact table

  3. From the home ribbon, click on the merge queries at the right hand side

  4. Select the dimension you want to connect it with

  5. Select the common column

  6. Specify the type of join

  7. After merging the two tables, you need to expand the columns and bring the unique ID of the respective dimension

  8. Rename the column

  9. Delete the text column and keep only the ID column

  10. Close and apply all your steps

Now, you are ready to connect the fact table with the relevant dimension table through an ID column.

And since we are in the Power Query environment, it is very important to always go back and rename your steps in a way which would make sense to other users. It is crucial to not neglect this step and name your Power Query steps properly. Always keep in mind that your report will be consumed by other people, and every step of the process needs to be clear to them too.

Tips and Tricks for Power Query

Besides that some other advices regarding Power Query are the following:

  1. Try to delete all the unnecessary columns in order to not load unused data in the memory of Power BI

  2. Reorder the columns, so that the IDs or the foreign keys are first and then the rest of the columns follow

  3. Change the type of the data as a last step, when it is applicable

  4. Try to combine steps, instead of repeating the same process over and over again

  5. Instead of Removing columns, try to select them. You can achieve that by selecting the column you want to keep and on the home ribbon select Remove Other Columns

Another crucial step to take in order to not only avoid overloading your data model but also to make your report clean and clear for other users is to disable the loading of tables that you do not need. This will dramatically improve your report's performance. There are many cases in which we use tables that ultimately append to a new query, but serve no other purpose. The rule of thumb is to consume as little memory as possible, and you can achieve that by following the disable load step. Here is how you can do that:

  1. Open Power Query. Right-click on the table that you do not need in your model.

  2. By default, the option Enable Load is selected.

  3. You need to click on it.

  4. You can be assured that the specific table is not loaded into the data model because its name will be displayed in an acrylic font.

How to structure properly the measures

Once you have applied all the steps above, try to rename all your tables in a self-explanatory way. On top of that, in the data model section of Power BI, try to hide all the columns that you are not going to use in your report. This leads to a cleaner report.

The last step to proper data modeling is to structure your measures. Instead of having a general measures table, the approach wed follow is to create all the measures in the respective fact and keep only these as visible columns. With this approach, you implicitly create measure tables for each of your facts. Another significant thing to have a clean report and a proper data model is to stick to logical and simple name conventions. Try to name your measures in a way that is understandable to everyone. Lastly, it is important to create folders and sub-folders to group your measures. If you want to create folders for your measures:

  1. Open the data modeling tab in Power BI

  2. On the right hand side click on the measure

  3. Open the Properties tab

  4. On the Display folder you can write the name of your folders

  5. If you want to create sub folders, you can use the back slash, eg [Name of Folder] \ [Name of the Sub folder ]

And there you have it! We've taken a journey through the best practices for data modeling in Power BI. So, go ahead and embrace these practices, and watch as your Power BI experience reaches new heights of awesomeness!

Previous
Previous

The TO_JSON_STRING Function: Everything You Need to Know

Next
Next

Keep Your Data Fresh with the Last Refresh Date in Power BI