Lytix’ XTL Framework

  You’ll often hear a Lytix consultant say that he/she is proud of our XTL Framework. What is it exactly? And how can it improve your business?  

Why do I need a Data Warehouse (DWH) ? 

Data Warehouses support organisations in their decision making by providing data in a well-governed way, this includes:

– Integration of source systems: organisations prefer a mix of solutions to achieve a competitive advantage. Often, large ERP systems (Dynamics AX for Finance and Operations, SAP, Odoo, …) are combined with niche products (SalesForce, Google Analytics, Silverfin, Exact Online, … ) to support their business processes. In an analytical setting, this data also needs to be enriched with open-source APIs (weather APIs, ID verification, … ).

Single source of truth: valuable time can be wasted if numbers cannot be trusted. A DWH provides you one validated catalogue of data.

Easy Managed Access: Data is provided in a way that business users can crunch large quantities of data. This, while still applying security; users should be able to see only what they’re entitled to see (no more, no less).

A foundation for Data Science / AI: Provide a well-structured data repository for your data scientists and minimize their valuable time of cleaning-up and combining sources.

Why do I need the Lytix’ XTL Framework?

Lack of internal engineering experience can result in an unsustainable solution on the long-term. It gets even worse if this person leaves the organisation. Our framework provides unified dimensional models and minimizes ETL engineering. 

 

Advantages of the XTL Framework:

– Unified way of Working: In each of our supported technologies (Azure SQL DB, SnowFlake, DataBricks and Azure Synapse), the main task of engineers is to convert business logic to SQL, a query language that is widely known and which can be relocated to other technologies.

– Maximized logging (with minimal effort during development): Logging modules are automatically wrapped around ETL statements, which reduces a time-intensive burden of developers. Within these modules, all events are captured: number of reads, inserts, updates and deletes, as well as warnings and errors.

– Built for the cloud: The framework is built in such a way it take advantage of cloud benefits; pay-as-you-use, scale-up and scale-down easily.

– Applied Kimball: The modelling techniques are based on best-practices defined by the renowned data-modeller Ralph Kimball (The Data Warehouse Toolkit, Ralph Kimbal & Margy Ross). The most-used and popular Slowly Changing Dimension (SCD) types are supported, such as ‘SCD0: Retain the Original’, ‘SCD1: Overwrite with newer values’, ‘SCD2: Track history by adding new rows’, ‘SCD3: Add new attribute’, ‘SCD6: Add type 1 attributes to a type 2 dimension’ and ‘SCD7: Dual Type 1 and Type 2 Dimensions’. Switching an attribute from SCD Type 1 to SCD Type 2 (… and thus tracking history) is just a matter of a click.

Additional Features

Automated Documentations

The XTL framework is accompanied by a comprehensive report providing you with insights on the ETL processes. This report both provide you standard metric evolutions (number of updates, deletes, failures, …) as well as a dependency view which maps your sources to the targets and shows you the applied transformations.

Dependency Tree

Our framework allows to automatically generate dependency trees based on the metadata of your transformations. Generating the most optimal parallel execution steps within your whole ETL becomes child play. Do you need to refresh only one, two or three tables because you need updated data fast (e.g. for month-closing)? No problem, the dependency tree will only trigger those ETL steps which are directly related to the tables you want updated.  

No ‘All Or Nothing Approach’ in Batches

We often hear following complaint from customers not using our framework “I know one little package failed, but does that mean that the whole dataflow to the business should fail?!

Our framework adresses this issue! The dependency tree allows you to decide what to do upon failures, will you just ignore the failure and continue with your ETL? Or do you stop abruptly to make 100% sure the error doesn’t cause any future problems? This is fully configurable in our framework and it allows you to make a decision per ETL step. 

Data Quality Reporting

Our framework includes a ‘Data Quality’ module. By inputting business rules, the framework will look for inconsistencies and malicious input and report the findings back to the business.

 

Components

Following components are used throughout the framework:

– Azure Data Factory is used as orchestrator and scheduler. This component kicks-off at the requested time (at midnight, once per hour, …) and makes sure all dependencies are initiated in logical order.


– A ‘Storage and Compute Module’ to make the necessary transformations and to provide the data to data consumers; being it business users, data scientists or even mobile devices and webportals. One of four technologies can be implemented as a back-end for this component:
    o Azure SQL Database: A Microsoft product for customers who are already familiar with SQL Servers and in which the Data Warehouse needs lots of integrations.
    o SnowFlake provides both a data lake and strong compute for your analytical needs.
    o DataBricks: if streaming and distributed computing are necessary to build an all-comprehensive Data Warehousing solution. This platform integrates best with organisations that have a large data science team!
   o Azure Synapse (former Azure SQL Data Warehouse) providing large limitless compute for customers with more than 4 TBs of data.


An analytics engine which allows business users to create their own reports, being it Azure Analysis Services or Power BI Pro/Premium. 

Wrap-Up

This article describes one of our proven ways to fuel your data driven organization, covering a multitude of use-cases ranging from traditional reports to data science and this in a governed way.

In need of some help? Do not hesitate to contact Lytix or the author Sander Allert and we/he will be glad to exchange ideas on this subject!

 

Lytix’ Power BI 360° Certified: Uplift your Power BI game

We want you to attend!

Hello analytics enthusiasts! Lytix organizes four (4!) free meetups that demonstrate to you the 360 degree process of using Power BI within an organization. After completing the four hands-on session, you’ll be granted the ‘Lytix 360° Power BI’ certification.  Are you ready to become a Power BI expert? Then make sure to tune in at our next session on the 18th of December!

We are a group of motivated BI and Data Science consultants eager to share our knowledge with you.
Lytix is a start-up specialized in Microsoft Analytics and frequently organizes knowledge sharing sessions.
You are invited to follow our evening sessions (18:00 – 21:00) and we can already promise you will not leave empty-headed nor with an empty stomach.  🍕

This is the first part of our Power BI Track on becoming a 360° Power BI Track:

  • Quadrant 1 on 2/10: Power BI Introduction – Learn to create a simple datamodel and create reports on your data.
  • Quadrant 2 on 23/10: Power BI Measures & Columns – Learn the difference between calculated columns and measures, get acquainted with Power BI.
  • Quadrant 3 on 27/11: Power BI Service – Deploy your first report to the Power BI Service and get to know the differences with Power BI ReportServer
  • Quadrant 4 on 18/12: Power BI DAX Advanced – Learn how to create everything imaginary while using DAX, there are limits!

Our ‘Quadrant Trainings’ will be hosted in the region of Mechelen, Belgium. We hope to see you there!

We want you to attend!

Keep those Power BI reports in (source) control!

While self-service is an easy way of creating and changing existing reports, organizations may struggle with version control of the reports that are published to a Power BI service. We’ve often heard the sentence “Sh**, I’ve overwritten one of my important reports!”. Yes, this problem can also occur to experienced developers and even consultants of Lytix. In most development environments you can setup version-control to keep track of your changes, and in operational systems a daily backup is taken. Power BI Service however, doesn’t support version control (*)! Despite a ton of ideas have been posted to provide source/version control for reports, I couldn’t really find any on the backlog or planned to integrate with Power BI.

This article describes a way of setting-up your own source control by using the Power BI API to export all reports of a workspace. Having a local copy of your database may come-in handy for those organizations which need to adhere to any compliancy (e.g. SOX compliancy, regulatory requirements) or those that just want to revert loss of data/effort.

Walkthrough

Setting-up the download script is not that hard, but it does require you to follow some steps. Instead of copying the different guides, we’ll redirect you to the different guides on the documentation-site of Microsoft. If you’ve followed the steps one-by-one, you can download our script for free.


Step 1: Make sure you can use the Azure Powershell module. This adds some new functions to your default PowerShell environment to manage Azure objects. https://docs.microsoft.com/en-us/powershell/azure/install-az-ps?view=azps-2.4.0


Step 2: To validate your ‘requests to download’, we will obtain an identification token. This can be retrieved by following the next steps: https://docs.microsoft.com/nl-be/power-bi/developer/register-app 
Choose a ‘Native’ app and make sure to note down the ApplicationId. 

Step 3: One of the most forgotten steps to take is to ‘Grant Consent’ in the Azure portal. ‘Mostly forgotten’ because of two reasons: firstly, it is hidden in the documentation of Microsoft and, secondly, it is not so easy to retrieve in the Azure Portal as well. As a friendly reminder, I’ve provided an easy screenshot to find this.

Step 4: Download our PowerShell script and change the parameters at the top of the script, easy as ‘one-two-three(-four)’. This script will first create a connection to the Power BI API, then it will search for the WorkspaceID you’ve defined and it will download all the different reports within this workspace. https://lytix.be/product/power-bi-source-control-using-power-bi-api/

Wait, what if I use DirectQuery?

This method will only download what is published to your Power BI Service tenant. If your report is connected in DirectQuery mode to a source, then that data doesn’t get uploaded to Power BI Service and thus can also not be downloaded by our script. Combine this technique with some snapshots of your source system (… and even keeping track of history in your DWH and DataVault) for the best experience.

 

(*) OneDrive and Power BI

Some of the attentive readers may know that there is a way to do out-of-the-box version control with Power BI Service. In the old-way of creating workspaces, a OneDrive folder is created when you make a new workspace. Putting PBIX files on this folder will automatically publish the report to the Service. And the best functionality of OneDrive is to see the version history. BUT(!), Microsoft is now pushing a newer version of the workspaces with better defined roles which doesn’t support the creation of a OneDrive folder anymore. 

Wrap-Up

Thanks for reading through this blog post, we at Lytix hope you’ve learned something new. If you need advice on your data & analytics landscape, send a message to Lytix or the author Sander Allert and we’ll be more than happy to introduce ourselves. 🤜

5 Tips for Power BI/Analysis Services to improve the use of Excel

 

Wow, you may recall our previous post ‘5 tips for Excel users when connecting to Power BI’, its popularity needed us to upgrade our webservers. Luckily we run on azure and could easily scale up 🙂 Don’t fear, we are more than happy to provide you with five (5!) additional tips to improve your BI environment. In this blog post, we’ll show you different ways to optimize the user-experience for your Excel users focussed on tips and tricks for Power BI report creators and SQL Server Analysis Services developers! 

Power BI Tip 1: Create (a lot of) Measures

Have you ever noticed that there is a different experience when creating reports using PivotTables in Excel and creating reports in Power BI? In Excel you can only work with those measures that you’ve explicitly defined within your Power BI Report. Unfortunately, the ‘implicit measures’ where you drag an attribute in your chart and change the aggregation type (min, max, count, count distinct, …) cannot be used in Excel. Hence, each aggregation that a business user wants to use in its Excel report needs to be written in DAX beforehand, which is a nice incentive to start working on those DAX skills.

Power BI Tip 2: Folders in Power BI

Since a few months already, Power BI now also supports the creation of folders. Hence, design a well-arranged folder structure to locate your measures in. Here are three of the lesser known magic tricks when working with folders:

  • Use a backslash ‘’ if you want to create a subfolder.
  • Use a semicolon ‘;’ if you want to locate the same measure in multiple folders at the same time.
  • The backslash and semicolon can be used together!

While the tip above allows you to store the same measure in multiple folders, these folders always need to be stored on one and the same table.
This tip also applies when designing Tabular cubes. 

While the tip above allows you to store the same measure in multiple folders, these folders always need to be stored on one and the same table. This tip also applies when designing Tabular cubes.

Power BI Tip 3: Use Measure Tables

Another aspect of a different user-experience between Power BI and Excel is the location of your measures. While Power BI allows you to store them in whatever table you like, Excel is stubborn and shows the measures on top of your field list. 

If you want to provide business users with the same experience for both tools, there is a workaround! If you store your measures in a table where all columns are hidden, the icon of your table will change to be a ‘measure table’ and will now always show at the top of your field list! Hence, users will now see measures on top of the field list in both tools Power BI and Excel.

 

Do mind, applying this method will alter the ‘show-details’ functionality for Excel and Power BI files. Up to you to decide the impact of this trade-off. Storing your measure in a completely unrelated table has no performance impact. You could even make different measure tables to make a distinction between the different functional domains of your KPIs.
This tip also applies when designing Tabular cubes.

Power BI Tip 4: ‘Force’ as Date Table

Date dimensions not only are an added value within Power BI, but also in Excel. It allows your business users to apply some more advanced filters which contain time intelligence; filter on the current month, current year, current year or date periods.

In data warehousing scenario’s, you’ll often work with dummy date values. For fields where you would expect a date but none is provided, a default value ‘1900-01-01’ fills-in the empty space to show that it is not known. Your date dimension then contains all dates of the applicable years and this one default value. A few moments later, you’ll notice that Power BI doesn’t allow you to mark this as a date table as this table doesn’t contain full years.
Follow these steps as a nifty workaround:

1) Import your date-dimension but filter-out the dummy value.

2) Mark your table as a date table (logically this works as you’ll have complete years).

3) Remove the filter of the dummy value, you’ll notice that your date dimension continues to work despite it containing incomplete years!

This tip also applies when designing Tabular cubes.

Analysis Services Tip 1: IsAvailableInMdx

Looking for a quick win to speed-up your Tabular cube processing time? Wanting to reduce the amount of RAM used by your Tabular model? If there are attributes that are not used in any Excel workbook (… such as the attributes that your measures are referring to), then update the ‘IsAvailableInMDX’ setting of these attributes to False! This will reduce the size of your tables and improve processing time.

Wrap-Up

Thanks for reading through this blog post, we at Lytix hope you’ve learned something. If you need advice on your BI landscape, send a message to Lytix or the author Sander Allert and we’ll be more than happy to introduce ourselves. 🤜

 

5 tips for Excel users when connecting to Power BI

 

Excel is not dead! We often notice organizations starting their self-service / Enterprise Data Warehouse (EDW) project by stating the goal ‘getting rid of Excel’. We at Lytix do agree that ‘standardized reporting’ (= same report on a timed schedule with the same manual transformations) should as much as possible be automated. On the otherhand, Excel does provide added value if you can use it as a one-time sandboxing environment, to store simple mappings (if a master data solution is too much overkill) or to search in long lists of values. Power BI indeed is a great tool, but some business users may be more attracted to the classical Excel cell-based way of working with calculations. This blog provides some general tips and tricks that can help you optimize your experience in these tools.

Excel Tip 1: Navigating through a long list of attributes and measures

When using the functionality ‘Analyze in Excel’, you’ll often be presented with a long list of tables, attributes and measures. The default visualization shows this long list on top, on the bottom you’ll have a matrix of the filters, columns, rows and measures of the PivotTable. You are just a click away of showing this a lot more convenient; right-click the gear button and organise this as ‘side-by-side’. 

Excel Tip 2: Getting rid of the PivotTable’s Compact Form

If you’re working with larger volumes of data or need to perform a VLOOKUP / INDEX-MATCH on your data, you are probably tired of always re-formatting your PivotTable to a more tabular format. Following actions clearly show what Excel users have been doing more than a 1000 times in their life:

Did you know that you can change the default way that a PivotTable is shown in Excel? This super-handy feature is available in the latest versions of Excel and can be found on navigating to ‘Options > Data > Edit Default Layout’. These are the settings you should apply: 

Excel Tip 3: Slicers with Multiple Connections

Few people know that multiple PivotTables can be controlled with one and the same filter. To do that, select one PivotTable and create a slicer with the desired filter. Once the slicer has been created, right-click it and select ‘report connections’. This opens a list of all Pivot Tables which are linked to this data source connection. Then select the other pivotable that you want to have linked to this filter. From now on, this filter controls both lists. Hint; you can always rename a PivotTable to make this selection easier.

Business users do love this feature and ‘slicers’ are a known functionality in Excel. What a lot of users do not know, is that they can severely slow down your refresh rate. Apply following steps to overcome this problem: Change the setting ‘Visually indicate items with no data’. If this is marked as active (which is the default), then Excel will fire several queries at your Power BI/Analysis model to verify if it should be greyed-out or not. This will often result in performance issues. It is a best-practice to disable this functionality when using slicers.

Excel Tip 4: Automatic refresh

If the same Excel file is used on a daily, weekly or monthly basis, then business users always need to repeat the steps of opening the Excel file (… which takes a while) and then pushing the ‘Refresh All’ button. Few users know that you can trigger the ‘Refresh All’ function right after the Excel file is opened. To do that, open the properties of the connection and enable the ‘Refresh data when opening the file’ option. There is also the option to refresh it every few minutes while you keep your file open.

If the same file is used repeatedly, why not convert it to a more standard way of reporting such as Power BI or Reporting Services? This way it is easier to keep the data up-to-date and the access roles can be configured properly. 

 

Excel Tip 5: Show items with no data on Rows/Columns

By default, PivotTables only show values if a measure has one value linked to it. In the example below, we have eight different product categories. If we add a measure, two of those categories disappear as there are no values for these product categories. Often, business users try to solve this by adding more and more measures until eventually the value pops up (= dirty fix). Excel also provides you with a cleaner solution: open the PivotTable options, navigate to the Display tab and select the option ‘show items with no data on rows’. This functionality also works for columns.

Wrap-Up

Thanks for reading through this blog post, we at Lytix hope you’ve learned something new. If you need advice on your data & analytics landscape, send a message to Lytix or the author Sander Allert and we’ll be more than happy to introduce ourselves. 🤜