What does a modern cloud data warehouse look like?

Source: Äri-IT, autumn 2020

Author: Urmas Tutt

What exactly is a data warehouse and when can it be considered modern? In this article, you will find answers to these and many other questions.

 


DATA WAREHOUSE

A data warehouse is a computer system that collects and analyses data. The goal is to discover trends, patterns, and relationships in the data to provide information and insight into what is really going on in the company’s business. Traditionally, the data are collected from internal sources such as marketing, sales, production, and financial systems. In most cases, these systems are transaction-based databases known collectively as ERP*, CRM** and DMS***, which are, respectively, software/systems for managing the company’s business processes, customers, and documents.

* ERP or Enterprise Resource Planning – business software for managing all of the company’s core processes.

** CRM or Customer Relationship Management – a system for administering the company’s interactions with customers.

*** DMS or Document Management System – a system for managing the company’s documents.

The first data warehouses were created when companies realised that analysing data directly from these transaction-based databases was slowing down their normal operations, often even causing systems to crash. A solution was found in copying the data to a data warehouse database created specifically for analysis purposes, thereby enabling the operational systems to focus on processing business transactions.
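The idea of copying transactional data into a separate analysis database can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two in-memory SQLite databases stand in for a real operational system and a real data warehouse, and the table names `orders` and `orders_fact`, the sample rows, and the function names are all hypothetical.

```python
import sqlite3


def copy_orders(operational: sqlite3.Connection, warehouse: sqlite3.Connection) -> int:
    """Copy all order rows from the operational database into the warehouse."""
    rows = operational.execute("SELECT id, customer, amount FROM orders").fetchall()
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS orders_fact "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    # INSERT OR REPLACE keeps the copy idempotent if the job runs again.
    warehouse.executemany("INSERT OR REPLACE INTO orders_fact VALUES (?, ?, ?)", rows)
    warehouse.commit()
    return len(rows)


def revenue_by_customer(warehouse: sqlite3.Connection) -> list:
    """Analytical query that runs against the copy, not the operational system."""
    return warehouse.execute(
        "SELECT customer, SUM(amount) FROM orders_fact "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()


# Hypothetical operational system holding a few sales transactions.
operational = sqlite3.connect(":memory:")
operational.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
operational.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Acme", 120.0), (2, "Birch", 80.0), (3, "Acme", 30.0)],
)
operational.commit()

# Separate database used only for analysis.
warehouse = sqlite3.connect(":memory:")
copied = copy_orders(operational, warehouse)
report = revenue_by_customer(warehouse)
print(copied, report)
```

Once the rows are copied, the aggregate query touches only the warehouse connection, so the operational database is free to keep processing transactions.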

Over the years, the range of available data sources has expanded well beyond companies’ internal systems. The volume, diversity, and rate of creation of data have all increased exponentially. Data can be sourced from various websites, real-time web services, mobile devices and their apps, and machinery. Organisations are producing a huge amount of data through the Internet of Things.

Traditional data warehousing solutions were not designed to cope with the volume, diversity, and rate of creation of data we are seeing today. Thus, newer systems are being sought to overcome these shortcomings and to adapt to modern data access and analysis needs.

 

CHALLENGES RELATED TO DATA WAREHOUSES:

  • Numerous and diverse data sources mean you have a multitude of data structures that must co-exist in one place to ensure systematic and comprehensive analysis.
  • Traditional architectures make system users and data integration operations compete for the same resources. It is difficult to load new data into the data warehouse and ensure adequate performance for users at the same time.
  • Loading data in batches is still common, but many organisations today require continuous data loading, or micro-batches, and the use of streamed data, i.e. immediate loading.
  • Scaling conventional solutions to meet modern storage space needs and workloads is extremely expensive, cumbersome, and slow.
  • The latest alternative data platforms are often complex and require specialised knowledge, not to mention extensive setup and configuration. This in turn makes it difficult to cope with the growth of diverse data sources, user management, and analytical queries.
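The difference between batch loading and micro-batches mentioned above can be illustrated with a small Python sketch. It is a simplified illustration, not a description of any particular product: the `micro_batches` function, the sample sensor events, and the list standing in for a warehouse table are all assumptions made for the example.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def micro_batches(events: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group an incoming stream of events into small batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical events arriving from a machine sensor.
events = [{"sensor": "pump-1", "reading": i} for i in range(7)]

warehouse_table: List[dict] = []  # stands in for a table in the data warehouse
batch_sizes: List[int] = []

for batch in micro_batches(events, size=3):
    # Each small batch is loaded as soon as it is full,
    # instead of waiting for one large periodic load.
    warehouse_table.extend(batch)
    batch_sizes.append(len(batch))

print(batch_sizes, len(warehouse_table))
```

In this sketch, a `size` equal to the total number of events behaves like a single classic batch load, while a `size` of 1 approximates streaming, where every event is loaded immediately.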

 

CLOUD DATA WAREHOUSE

A cloud data warehouse, on the other hand, is the most cost-effective way for businesses to benefit from the latest technologies and architectures without a dauntingly steep initial investment. It eliminates the need to purchase new hardware, software, and infrastructure and to arrange for their installation, setup, and maintenance.

For data warehousing, the options for using cloud technologies are generally divided into three groups:

  1. A traditional data warehousing solution tailored to the cloud services infrastructure. This option is similar to traditional in-house solutions. The same software code is used with minimal adjustments to accommodate the new infrastructure. A so-called virtual server is rented from the cloud. While this eliminates the need to buy new hardware and software, the customer needs to have extensive IT knowledge to manage the data warehouse as a whole. This model is known as IaaS (Infrastructure as a Service).
  2. A traditional data warehouse maintained and managed by a third party in a cloud environment. This option is known in the market as the ASP (Application Service Provider) model. With this option, the data warehouse customer still needs to be able to predict disk space and computing power usage. The supplier takes care of hardware and software installation and upgrades. The customer handles the management, configuration, and optimisation of the software. Many of the limitations of traditional data warehousing solutions remain. Such hybrid solutions are also known as PaaS (Platform as a Service).
  3. A whole data warehouse as a software service, or SaaS (Software as a Service), also more specifically referred to as DWaaS (Data Warehouse as a Service). In this case, the supplier provides the customer with a complete cloud data warehouse solution that includes the full range of hardware and software tools. This relieves the customer of constantly managing the performance, governance policies, and security of the data warehousing solution. The customer only pays for the resources they actually use. Scalability is automatic and on-demand. The workload is distributed and system performance is guaranteed on an ongoing basis.

Many providers of cloud data warehouse solutions have been on the market for a very long time. They started with so-called classical architectures running on client-owned hardware and software. Now they are moving on to infrastructure and platform rental models, with only minimal adjustments to the software they have developed. Another group of cloud data warehouse solution providers consists of new arrivals, who have used the full range of cloud services from the get-go.

The key factors when deciding between a hybrid and true cloud solution are the location of the main data sources and the need to include external data to maximise the benefits of the analytics system.

Additionally, it is important to consider how many of the main usage scenarios the cloud environment can cover. In traditional environments, you have a multitude of systems to handle different use cases: operational reporting, departmental analytics, data lakes for data analysis, custom solutions for predictive analytics. Each of these requires hardware, a copy of the data, separate administration, know-how, and so on. As a rule, cloud data warehouse solutions relieve you of the headaches and costs that come with traditional systems: copying data, administering each system, and handling errors.

The best-known cloud services are Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform. Look into the new technological possibilities now and start modernising your legacy infrastructure. For new businesses, however, I would recommend seizing the opportunities offered by cloud services right away when implementing and enhancing business analytics solutions.

For more information, visit www.bi365.ee.
