Source: Äri-IT, autumn 2022
Author: Mihkel Nugis, business analysis developer and consultant at BCS Itera
The architecture of BI solutions can vary greatly and each has its own pros and cons. Let’s take a closer look at the main options and see why a three-layer solution is the most reasonable option in companies where business is based on data.
In the previous issue of Äri-IT, we explained the three-layer architecture of a business intelligence solution. In that article, we pointed out that a fully-fledged BI solution includes the following layers:
- Presentation or reporting layer
The task of the database layer is to collect, process, and store data.
In the analysis layer, the data are converted into a format that meets the requirements of the business logic, relationships and calculations are created, and a dimensional data model is compiled.
The presentation layer consists of applications where end users create queries for the analysis layer and prepare reports in tabular or graphical form.
The data pass through several stages as they move from data sources to reports: they are cleaned up, standardised, and connected to the whole. A classic data analysis solution is characterised by data moving in one direction: from data sources, they are collected into a data warehouse, then the tables in the analysis layer are processed, and finally the information is communicated as reports based on the requests made. The overall purpose of such data processing is to ensure that reports are fast and that the results are understandable and interpretable to every user. Of course, the data must be correct. No matter how the data are transformed during the process, the end result must not be distorted and unrealistic.
However, not all reporting solutions are built with the three-layer structure in mind. Let’s take a closer look at why a three-layer solution is preferred and what the pros or cons of certain types of architecture are.
ARCHITECTURE OF SINGLE-LAYER REPORTING
This is a reporting system where reports are generated directly from a data source. This means that it only has the reporting layer, but no data warehouse or cubes.
This includes all in-app reports. An example is any enterprise resource planning software program used in a company. All of which enable the possibility to generate reports. Data for such reports are requested directly from the operational database, on which the application itself runs.
- The report is part of the ERP application and requires an extra external solution. The user does not have to switch to another environment to view the reports.
- The report reflects the current state of play. As requests are submitted directly to the operational base, it ensures that the data are up to date.
- No need to set up separate permissions. The user already has access to the data based on their role and permissions in the ERP application.
- There is no possibility of follow-up analysis. The format of the reports is predetermined, the user cannot change the level of detail of the report or the choice of fields if they need to analyse the numbers presented in the report in depth.
- Updating reports requires help from IT support. Because reports are built into the app, changing them or adding new ones requires the help of an IT professional.
- Bulkier reports burden the operational base. ERP application bases are optimised to respond quickly to process small-scale transactions. Large-scale data requests can occupy the resources of the entire base, paralysing the daily operations of other users.
One variation of a single-layer solution is one where reports are generated in an application outside the ERP, such as Excel. In this case, the Excel request directly addresses the operational base. We recommend to avoid submitting direct requests to the base that serves as the data source. This creates problems with securing access to data and ensuring reliability, and it is also difficult to control who has access when and to which data.
ARCHITECTURE OF TWO-LAYER REPORTING
As the name suggests, two of the three layers of business solutions are represented in a two-layer solution.
In the first option, analysis and reporting layers are present, but there is no data warehouse. Similar reporting can be built by using, for example, Power Query + Power Pivot functionality in Excel or data queries using the Power BI import method.
Microsoft apps for Excel and Power BI have the built-in ability to create data models. In essence, it is an integrated solution, where the analysis and reporting layers are in one place. Figure 3 shows a fairly common solution where a Power BI application accesses an ERP solution through an API interface and sends data from there to a data model built into Power BI. Data are visualised for reports in the same Power BI environment.
- Reporting is affordable and time- and resource-efficient. Most people are familiar with Excel and the number of Power BI users is also growing rapidly. Creating data models in these applications may require some training, but you do not need to be an IT professional for this. There is also no need for additional investments to start preparing these reports.
- The reports allow for follow-up analysis. This means that when someone completes a data model for a report, everyone involved in the reporting can use it to further analyse the data.
- Data models that are built for reports do not have to be limited to a single ERP application. You can import data into a model from multiple data sources.
- There is no central data policy. Every person who prepares a report creates their independent report with a model that may include calculation logic that is different from the others. As a result, they cover the same thing, but the numbers are different.
- Loading more large-scale data covering several years can be time consuming and may not always be completely successful.
- The API interface has a fixed option. When data are entered into the model through an API interface, the options presented to the person making the request are limited to a predetermined data structure. For additional requirements, an IT specialist must be contacted.
Another variant of the two-layer architecture that we have come across is one in which the data warehouse and reports are represented, but there is no analysis layer.
- The data are pre-processed and standardised in the data warehouse. Data collected from various sources have been brought under common denominators. It is much easier for the person preparing the report request to prepare the request compared to working with raw data.
- Data requests do not burden applications related to primary sources. Since the data are located in a separate base from the original sources, and often also on a separate server, requests for reports do not affect the operational work. Filling the data warehouse with new data is done outside of working hours.
- Users of reports do not need to have permission to use the native applications. If an employee of the company only needs reporting, there is no need for additional expenses, for example, to acquire a licence for an ERP application.
- The data in the data warehouse do not reflect the current state. As the data warehouse is usually updated once a day, the data available to the person preparing a report have a one-day delay.
- More complex requests take a long time. Requests that require large-scale computing are slow because large amounts of data need to be processed in the course of the request.
THREE-LAYER BUSINESS INTELLIGENCE ARCHITECTURE
The fact that complex and large-scale queries that involve several years of data are sometimes slow is the main reason why we need to implement an analytics layer in addition to a data warehouse.
The analysis layer can be an analysis service with a tabular data model, which, as already mentioned, is included in Power BI and Excel, or it can be realised in the Microsoft SQL Analysis service environment. In earlier deployments, the multidimensional data model was widely used, but more recently it is increasingly being replaced by a newer and more popular tabular model.
Why is the analysis layer so much faster in responding to report requests compared to when requests are submitted directly to the data warehouse? The reason lies in the technology of how the data in the data model of the analysis layer are organised. They are structured to provide quick answers to queries even in the case of large-scale data.
We have seen reports that took 5 to 10 minutes to request data directly from the data warehouse base. One can argue whether this is a lot or not. If you compare it with the time it takes to prepare a report manually, which can be anything from a few hours to a few days, this is acceptable. However, if the user needs to analyse the data and change the time filters often or change the level of detail of the view and the structure of the report, even a few minutes of waiting after each adjustment is quite unpleasant. Adding an analysis layer to such requests means that even if the answers are not received in an instant, they will definitely come in seconds instead of minutes.
If you wish to develop a business intelligence solution that is able to aggregate data from a wide range of sources, is available for reporting to many employees, and ensures fast, reliable and consistent reporting, you should follow world-renowned methodological and established practices.
The three-layer BI solution architecture is precisely what we would recommend.