How Incorta works and how it can fit into your enterprise BI strategy
Incorta is the new kid on the block in the very competitive enterprise business intelligence tools space. The company was founded around six years ago, and we have been using it successfully for operational reporting for the last 2.5 years.
This blog was written in October 2019 and represents Incorta features as of their 4.3.x version. Please keep in mind Incorta builds significant new capabilities in each of their quarterly releases and the product changes over time. I will try to add new articles to go over significant new capabilities in the platform.
Incorta has taken a very different approach to solving BI and analytics problems than any other vendor in the market. It is essentially four integrated BI tools in one.
The four layers are -
1. Data Extraction Layer
2. Database Layer
3. Data Modeling Layer
4. Data Visualization Layer
In addition to the above, Incorta integrates closely with Spark, which ships in the same installation package. The Spark integration provides two functions for Incorta -
1. Advanced calculations and joins
2. SQL Interface (SQLi) or Datahub for external BI tools to connect with Incorta
In a traditional BI environment, you need at least three tools (four if you have a separate semantic layer tool) to do the same set of tasks. You will have an ETL tool like Informatica or SSIS to load data into a database like Oracle or SQL Server, and then you model and visualize the data in a BI tool like MicroStrategy, OBIEE or Tableau. Keep in mind that these are best-of-breed tools that have been around for a long time and have tons of features.
Incorta does a lot of these tasks very well, but it cannot match the feature set of these best-of-breed solutions in each category. As a combined package, though, it delivers a lot of capabilities for BI developers and users. But first let’s take a look at the various components of the Incorta platform.
Components of Incorta platform
Here is a very brief and simplified description of the various components of the Incorta platform and what is special or unique about them -
1. Data Extraction Layer
The first layer of the Incorta platform is the Data Extraction layer. Here they have built a technology called Direct Data Mapping, where you can extract data table by table from the source system, either as a full load or as an incremental load. The data lands in a Parquet file in the Incorta platform and is then loaded into memory. Each table can have both a full load SQL and an incremental load SQL.
Full Load SQL can look as simple as the following -
Select COL_A, COL_B, COL_C from TABLE_1 where CREATED_DATE >= '01-01-2016'
Incremental Load SQL can look like the following -
Select COL_A, COL_B, COL_C from TABLE_1 where LAST_UPDATE_DATE>?
When a load is kicked off, the full load populates the table, and each subsequent incremental run pulls only the new or changed data from the source. Incorta can pull data from almost all traditional databases like Oracle and SQL Server, from cloud applications like Salesforce, or from files stored in on-premise or cloud storage. The number of supported source applications increases with each release of the Incorta platform.
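The full and incremental pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Incorta's implementation: it uses an in-memory SQLite database as a stand-in source, ISO-formatted dates so that string comparison works, and it assumes the `?` placeholder is bound to a watermark remembered from the previous load (here, the highest LAST_UPDATE_DATE seen so far). The function names and the sample rows are hypothetical.

```python
import sqlite3

# Stand-in source system (hypothetical data; not an Incorta API).
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE TABLE_1 (COL_A INTEGER PRIMARY KEY, COL_B TEXT, "
    "CREATED_DATE TEXT, LAST_UPDATE_DATE TEXT)"
)
source.executemany(
    "INSERT INTO TABLE_1 VALUES (?, ?, ?, ?)",
    [
        (1, "old", "2015-06-01", "2015-06-01"),  # excluded by the full-load filter
        (2, "alpha", "2016-03-01", "2016-03-01"),
        (3, "beta", "2017-01-15", "2017-01-15"),
    ],
)

FULL_LOAD_SQL = (
    "SELECT COL_A, COL_B, LAST_UPDATE_DATE FROM TABLE_1 "
    "WHERE CREATED_DATE >= '2016-01-01'"
)
INCREMENTAL_LOAD_SQL = (
    "SELECT COL_A, COL_B, LAST_UPDATE_DATE FROM TABLE_1 "
    "WHERE LAST_UPDATE_DATE > ?"
)


def full_load(conn):
    """Rebuild the target from scratch, keyed on COL_A."""
    rows = conn.execute(FULL_LOAD_SQL).fetchall()
    return {row[0]: row for row in rows}


def high_watermark(target):
    """Assumed watermark: the latest LAST_UPDATE_DATE already loaded."""
    return max(row[2] for row in target.values())


def incremental_load(conn, target, watermark):
    """Fetch only rows changed after the watermark and upsert them."""
    rows = conn.execute(INCREMENTAL_LOAD_SQL, (watermark,)).fetchall()
    for row in rows:
        target[row[0]] = row
    return len(rows)


target = full_load(source)
watermark = high_watermark(target)

# Simulate activity in the source between two loads.
source.execute(
    "UPDATE TABLE_1 SET COL_B = 'alpha v2', LAST_UPDATE_DATE = '2019-10-01' "
    "WHERE COL_A = 2"
)
source.execute("INSERT INTO TABLE_1 VALUES (4, 'gamma', '2019-10-02', '2019-10-02')")

changed = incremental_load(source, target, watermark)
```

After the incremental run, only the updated row and the newly inserted row have been fetched from the source; the untouched row is not re-read.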
After data is loaded from the source systems into Parquet files, materialized views (MVs) can be created in the Incorta UI, written in either SQL or PySpark. At run time, Incorta sends the request to Spark, which reads the data from the Parquet files, computes the result, and writes it to a separate Parquet file that is then loaded into memory during the loading stage. Spark comes with the Incorta package, but it does need some setup to work properly.
Pulling data table by table has two advantages. First, loads are very fast because no joins need to be performed in the source system. Second, adding new columns is very easy: all you need to do is add the column to the extract SQL and run a full load. A new column can therefore be added in Incorta in a few minutes, compared to days or months with traditional ETL methods.
2. Database Layer
Once the data is extracted into Parquet files, it is loaded into the in-memory database engine, and the joins defined in the schema are precomputed at the end of the load. This precomputation of joins is what gives Incorta reports their revolutionary performance. Even if a report has 30 or 40 joins across large tables containing millions of rows, the data still comes back in a few seconds. This kind of performance is unheard of in a traditional database.
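To give a feel for why precomputing joins helps, here is a small conceptual sketch in Python. It is not how Incorta's engine is actually built (the internals are not covered here); it only shows the general idea of resolving a join once at load time so that queries can follow a stored row position instead of matching keys again. The tables, column names and function names are all hypothetical.

```python
# Hypothetical in-memory tables, held as lists of rows.
customers = [
    {"customer_id": 10, "name": "Acme"},
    {"customer_id": 20, "name": "Globex"},
]
orders = [
    {"order_id": 1, "customer_id": 20, "amount": 250.0},
    {"order_id": 2, "customer_id": 10, "amount": 100.0},
    {"order_id": 3, "customer_id": 20, "amount": 75.0},
]


def precompute_join(child_rows, parent_rows, key):
    """Run once at load time: for each child row, store the position
    of its matching parent row (or None if there is no match)."""
    parent_pos = {row[key]: i for i, row in enumerate(parent_rows)}
    return [parent_pos.get(row[key]) for row in child_rows]


# Load time: the join between orders and customers is resolved once.
order_to_customer = precompute_join(orders, customers, "customer_id")


def revenue_by_customer():
    """Query time: no key matching, just follow the stored positions."""
    totals = {}
    for i, order in enumerate(orders):
        pos = order_to_customer[i]
        if pos is None:
            continue  # order without a matching customer
        name = customers[pos]["name"]
        totals[name] = totals.get(name, 0.0) + order["amount"]
    return totals


result = revenue_by_customer()
```

The cost of matching keys is paid once per load instead of once per report, which is the trade-off the paragraph above describes.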
3. Data Modeling Layer
Incorta's data modeling layer comes in two forms: physical schemas and business schemas.
In the physical schema you can create aliases to base tables and materialized views and create joins between various tables and MVs. Business schemas can be created to present a flat, user friendly representation of the physical schema to the user building the reports. Columns can be brought in from one or more tables in different physical schemas and renamed. Formula columns can be added either in physical or business schemas.
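The relationship between a business schema and the physical schemas underneath it can be pictured as a simple mapping from friendly names to physical columns. The Python sketch below is only a mental model, not an Incorta API; the schema, table and column names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalColumn:
    """A column in a physical schema (all names here are hypothetical)."""
    schema: str
    table: str
    column: str

    def qualified(self):
        return f"{self.schema}.{self.table}.{self.column}"


# A flat, user-friendly view that renames columns and draws them
# from tables in two different physical schemas.
business_view = {
    "Order Number": PhysicalColumn("SALES", "ORDERS", "ORD_ID"),
    "Customer Name": PhysicalColumn("CRM", "CUSTOMERS", "CUST_NM"),
    "Order Amount": PhysicalColumn("SALES", "ORDERS", "AMT"),
}


def resolve(friendly_names):
    """Translate the names a report builder picks into physical columns."""
    return [business_view[name].qualified() for name in friendly_names]


def source_schemas(view):
    """List the physical schemas a business view depends on."""
    return sorted({col.schema for col in view.values()})


picked = resolve(["Customer Name", "Order Amount"])
schemas = source_schemas(business_view)
```

The report builder only ever sees the friendly names on the left; the physical layout on the right can change without the report having to.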
4. Data Visualization Layer
Incorta comes with its own visualization layer. Insights (reports) and dashboards can be created on top of physical or business schemas. Analyzer users can create insights or reports with built-in filters, prompts and bookmarks that end users can then consume. Multiple types of visualizations, such as charts and pivot tables, are available, along with drill-downs.
Incorta's visualization is good enough for most purposes, but compared to a best-of-breed product like Tableau, Power BI or MicroStrategy it does fall a bit short. The Incorta development team is rapidly adding features in each release, and they should be able to catch up to the other tools in terms of UI capabilities sooner rather than later.
In addition, Incorta provides a SQL Interface through which other BI tools like Tableau or Power BI can connect to the business schema, physical schema or the Parquet layer.
Where does Incorta fit in your BI architecture
Incorta comes with several distinct advantages compared to traditional BI tools -
- revolutionary performance of reports
- no tuning or DB maintenance needed
- can hold massive amounts of data (billions of rows of data) in memory
- complex data modeling is possible
- simple to develop and make changes as reshaping of data is not needed
- easy to learn and get started for any developer
- a platform where all the four layers of BI are available
- cross-data-source joins are very easy to achieve
There are three kinds of reporting in any company: operational reporting, real-time reporting and analytical reporting (snapshots, period-over-period analysis). Incorta’s sweet spot is operational reporting, where data can be pulled from various transaction systems and joined together to provide lightning-fast reports. You can load data into Incorta several times a day from your source systems and provide the operational insights that users need to make the daily decisions that run their business.
In summary, if you are looking for a solution that delivers lightning-quick operational reporting on data joined from multiple complex source systems, then Incorta is one of the best options out there today.