Cube vs. Dataset: Why Move to Datasets Now?
A cube is a multi-dimensional data structure that organizes data along different axes, known as dimensions (such as time, location, or product). Think of it as a 3D spreadsheet, where each axis represents a different way to analyze the data. Cubes are optimized for fast, pre-defined queries but lack the flexibility required for new types of analysis.
In contrast, a dataset stores raw data in a simple tabular format, similar to a regular spreadsheet. Datasets offer much greater flexibility, allowing different types of analysis, combining data from multiple sources, and creating custom reports. They are compatible with modern business intelligence (BI) tools, supporting real-time reporting and advanced analytics.
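As a rough illustration of the difference, consider the SQL below. The table and column names are hypothetical and only sketch the idea; Smarten builds cubes and datasets through its own UI rather than hand-written SQL.

```sql
-- Hypothetical source table: sales(order_date, region, product, amount)

-- A cube behaves like a pre-aggregated summary along fixed dimensions,
-- decided at build time:
SELECT region, product, SUM(amount) AS total_amount
FROM sales
GROUP BY region, product;

-- A dataset keeps the raw tabular rows, so new slices, joins, and
-- custom reports remain possible later:
SELECT order_date, region, product, amount
FROM sales;
```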
Why Shift to Datasets?
We are advising all clients to move to datasets now because they provide:
- Greater flexibility and adaptability to changing business needs.
- Real-time data processing capabilities.
- Seamless integration with modern BI platforms.
- A broader range of analytical possibilities.
- Better performance through a columnar data structure.
Migrating to datasets enables more dynamic and comprehensive data management and better performance, aligning with the evolving demands of modern analytics.
While there is no specific timeline for discontinuing cube support, Smarten will eventually transition to supporting only datasets.
Step-by-Step Migration Process: Cube to Dataset
Migration Steps Summary
The following is a concise list of steps providing a structured guide for migrating elements from the cube to the dataset, with detailed instructions below.
- Check Cube Configuration Details: Identify and review cube configuration details such as financial year, start month, and cube type.
- Verify Incremental Update Settings: Ensure settings such as incremental or scratch updates are configured identically in the new dataset.
- Extract the Base Query and Build the Dataset: Extract the base query from the cube, then use it to construct the dataset from the same data source.
- Check Cube Information: Examine and map cube information, including general details, columns, and objects.
- Map Other Cube Management Elements to the Dataset: For each section within Cube Management, map essential components directly to the dataset:
  - Dimension Maps: Identify and replicate dimension mappings from the cube to the dataset.
  - Data Value Display Mappings (DDVMs): Transfer data mappings applied to the cube to ensure consistency.
  - Global Variables: Note and replicate any global variables used in the cube for alignment with the dataset.
- Replicate Front-End Elements (UDDCs and UDHCs): Identify and replicate any front-end elements, such as User Defined Dimension Columns (UDDCs) and User Defined Header Columns (UDHCs), that are present in reports. This involves creating a "dummy" report to capture all necessary columns in the dataset before reassociation.
- Reassociate and Validate Reports: Reassociate existing reports with the new dataset, then export and validate values to ensure data consistency after migration.
- Validate and Delete the Existing Cube: After thorough validation of the dataset and associated reports, delete the original cube and its objects to complete the migration.
Cube to Dataset Migration: Step-by-Step
Step 1: Check Cube Configuration Details
First, we need to check the cube configuration details. Follow these steps:
- Identify and Search the Cube: Navigate to the Cube Management section and go to the Cube Repository. Search for the respective cube; this guide uses the "Sales Data" cube as an example. Identify the cube by its icon to determine whether it is built from scratch or is a real-time cube.
- Edit and Verify Cube Details: Edit the cube “Sales Data” to check which financial year and month it starts from.
Image 1 – Edit option for the Cube
Image 2 – Financial Year configuration
Step 2: Verify Incremental Update Settings
To check whether the cube is set for incremental or scratch updates, first access the cube repository and search for your specific cube; in this example, the "Sales Data" cube. Select the "Sales Data" cube from the search results, then go to the "Cube Rebuild" option and review the configuration to determine whether the cube is set for incremental or scratch updates.
Image 3 – Cube Rebuild option
If the cube is rebuilt from scratch, as shown in the configuration for the "Sales Data" cube in Image 4, replicate the same scratch update configuration in the new dataset.
Image 4 - Cube update settings under Rebuild option
If the cube uses incremental updates, navigate to the Increment option, where you will find two choices: "Append all rows retrieved from Data source" and "Append rows based on unique column". For "Append rows based on unique column", a dropdown list displays the unique columns used in the selected cube. Note down the unique column specified here and ensure the same column is selected in your new dataset. A quick way to sanity-check that column is shown after Image 5.
Image 5 – Incremental rebuild option
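Before relying on "Append rows based on unique column", it is worth confirming that the chosen column really is unique in the source table. A minimal SQL sketch, assuming a hypothetical sales table with an order_id column:

```sql
-- Any rows returned here are duplicates, which would make incremental
-- appends based on order_id unsafe (table and column names are hypothetical).
SELECT order_id, COUNT(*) AS occurrences
FROM sales
GROUP BY order_id
HAVING COUNT(*) > 1;
```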
Step 3: Extract the Base Query and Build the Dataset
After checking the cube rebuild configuration, click on the "Next" button to access the base query. This base query will be used to build your dataset by using the same table from the Data source that was used for the cube.
Image 6 – Base Query
We will use this query to create the dataset via the "Paste to Ready Query" option, using the same data source from which the cube was built; an illustrative base query follows Image 7.
Image 7 - Paste to ready query option for Dataset creation
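For illustration, a base query extracted from a sales cube might look like the following. The table and column names are hypothetical; paste your cube's actual base query instead.

```sql
-- Illustrative base query to paste into the "Paste to Ready Query" option.
SELECT order_id, order_date, region, product, quantity, amount
FROM sales;
```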
Once the dataset is created, publish it and set up scheduled updates. Configure the rebuild type for the dataset to match the cube settings noted earlier. For instance, since the "Sales Data" cube in our example is configured with a scratch rebuild type, apply the same rebuild type to the dataset to ensure consistency.
Image 8 – Created Dataset with Various features
Image 9 – Publish option under dataset
Image 10 – Rebuild method option under Publish option in dataset
If the cube is set for incremental updates, you will need to select the incremental option here. This includes two choices: "Append rows based on unique column" or "Append all rows retrieved from Data source".
Image 11 – Incremental updates option
Image 12 – Incremental updates based on unique column
Since our current rebuild configuration for the cube is set to scratch, we will configure the dataset rebuild method as "From scratch" as well.
Step 4: Checking Cube Information
We will check the cube information by navigating to the admin panel and then into the Cube Management section. From there, go to the Cube Repository, tick the checkbox for the "Sales Data" cube, and click on the "Cube Information" option above. This will provide all the necessary information in three tabs: General, Cube/Dataset Columns, and Objects.
Image 13 - Cube Management section
Image 14 - Cube information option
Image 15 - Cube Information Box with Three tabs
Step 5: Checking Other Cube Management Sections
Next, review each section under the Cube Management section to transfer all the required changes into the dataset. Specifically, check the following sections:
Image 16 - Cube management section
Dimension Maps: To check the dimension maps, go to the Cube Management section and select the Dimension Maps option. There, use the dropdown menu to select the cube. After selecting it, you will see a list of dimension maps applied to the cube. Ensure all dimension mappings applied to the "Sales Data" cube are correctly transferred to the dataset.
Image 17 - Dimension Map list
Image 18 – Detail of “Product” Dimension Map
Repeat the same steps for the dataset to replicate the dimension maps from the cube as described above. Alternatively, you can add a dimension map directly within the created dataset under the Tools option.
Image 19 – Dimension map option under Dataset management
Image 20 – Dimension Map option under Dataset
Data Value Display Mappings: To check the list of Data Value Display Mappings (DDVMs), go to the Admin Panel and navigate to the Cube Management section. Under Data Value Display Mapping, select the required cube from the dropdown menu. This will list all DDVMs applied to the cube.
Image 21 - Data value display mapping list
Image 22 - Data value display mapping Detail Under One Specific Dimension
We need to replicate the same data value display mappings in the dataset. To do this, go to the Admin Panel, navigate to the Dataset Management section, and follow the same steps used for the cube in the Data Value Display Mapping option.
Image 23 – DDVM under Dataset management section
Global Variables: Review and replicate any global variables used in the cube. Select the required cube, "Sales Data", from the "Select Cube/Dataset" dropdown to see the list of global variables used in that cube, as shown in Image 24 below.
Image 24 - List of Global Variables
Image 25 – Detail of one of the global variables of the Sales Data cube
You need to repeat the same steps for the created dataset. Check the details of all global variables related to the Sales Data cube and replicate them into the dataset by using the options available in the Admin Panel under the Dataset Management section.
Image 26 – Global variable option under Dataset management section
Data Access Permissions and Column Access Permissions
After checking the Cube information, review the data access permissions and column access permissions assigned to the cube. Replicate these permissions in the dataset to ensure user-wise and role-wise data allocation.
To check this, go to the admin section, navigate to the Cube Management section, and find the Data Access and Column Access Permission sections. Here, you will find the user-wise and role-wise permissions assigned to users.
Image 27 - Access Permission options
In the Column Access Permissions option, first select the required cube, "Sales Data", from the "Select Cube/Dataset" dropdown. You will also find a dropdown menu for assigning permissions role-wise or user-wise. In the upper panel, separate tabs let you set permissions for dimensions and measures. Refer to the image below to see how to assign permissions role-wise or user-wise. After checking this, perform the same checks in the Data Access Permissions section.
Image 28 - Column Access Permission
For example, in the "Sales Data" cube, a permission for a specific column is assigned to a specific user, as shown in the image below.
Image 29 - Column Access Permission to a specific user
Similarly, we need to set access permissions for the dataset. To do this, go to the Dataset Management section in the Admin Panel, where you will find options for configuring access permissions, as shown in Image 30. Additionally, you can set access permissions directly from the menu provided above the list of datasets under the Data option, as shown in Image 31.
Image 30 - Access Permissions configuration of dataset
Image 31 – Permissions options in dataset list
Image 32 – Permissions options dialog box
Geo Map Columns: Open the Geo Map Columns option under the Cube Management section to get the list of geo map columns. Mark these same columns as geo columns in your dataset to ensure accurate mapping.
Image 33 - Geo Map columns
After identifying which columns are marked as geo map columns, replicate these steps in the dataset by right-clicking and marking the required columns as geo data columns. This option is also available under the Admin Panel in the Dataset Management section, similar to the Cube Management section.
Image 34 - Right click operation for Geomap column Marking
We have an option in the Admin Panel under the Dataset Management section to mark geo map columns. Follow these steps:
- Go to the Admin Panel, select the Dataset Management section, and choose the Geo Map Column option.
Image 35 – GeoMap column option
- Use the "Select Dataset" dropdown to choose the new dataset, then add the geo columns identified from the "Sales Data" cube.
Image 36 – Select Dataset dropdown
- Add the respective column and mark it as state, country, etc., using the Geo Map Field Type dropdown provided in the Add Geo Column dialog box.
Image 37 – Add option for GeoMap column
Image 38 – Add Geo Column box with GeoMap Field Type dropdown
Custom Cube Columns: Check for any custom columns that need to be replicated in the dataset. By editing a column, you can see the logic behind it and replicate the same logic in the dataset.
Image 39 – Custom Cube Columns
Image 40 – Custom Cube Column Logic
For custom columns in the dataset, the SSDP feature is available directly under the right-click operation as "Add Column". Here, you can add custom columns by replicating the logic of the custom columns from the cube into the dataset.
Image 41 – Add Column option under right-click operation
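As a hypothetical example of replicating such logic: if a cube's custom column computed a profit margin, the same calculation would need to be reproduced in the dataset's "Add Column" expression. It is expressed here in SQL terms purely for clarity; Smarten's own expression syntax may differ.

```sql
-- Hypothetical custom column: profit margin derived from amount and cost.
-- NULLIF avoids division by zero for zero-amount rows.
SELECT order_id, amount, cost,
       (amount - cost) / NULLIF(amount, 0) AS profit_margin
FROM sales;
```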
Step 6: Replicating Front-End Elements
Now, we need to check for the elements used in the front end that we need to replicate. Follow these steps:
- Check for any front-end columns, known as UDDC and UDHC in Smarten, and ensure they are included in the new dataset.
- Verify if any filters have been applied in the reports and replicate them in the new reports.
UDDC (User Defined Dimension Column) and UDHC (User Defined Header Column)
To prepare UDDC (User Defined Dimension Column) and UDHC (User Defined Header Column), follow these steps:
- Check the UDDCs and UDHCs in the existing report built from the cube. Navigate to the UDDC and UDHC options to get the list of these front-end columns and rows.
Image 42 – Manage UDDC
Image 43 – Manage UDHC
Image 44 – "Manage UDDC" option listing all UDDCs used in the report.
Image 45 – "Manage UDHC" option listing all UDHCs used in the report.
To prepare for reassociating existing reports with the new dataset, we need to create a dummy report that includes all columns utilized in the UDDCs. Replicate each UDDC's logic to ensure its presence in the dataset with which we are reassociating our reports.
Note - This dummy report guarantees that all necessary front-end columns (UDDCs) are established within the dataset beforehand. Consequently, during reassociation, every required column will be accounted for, ensuring the reports operate seamlessly with the new dataset without any missing front-end columns in them.
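For instance, suppose one of the report's UDDCs concatenates region and product into a single display column. The dummy report would need an equivalent column available in the dataset; written as SQL purely for illustration (UDDCs are defined through Smarten's front-end options, not SQL, and these names are hypothetical):

```sql
-- Hypothetical UDDC equivalent: a derived display column combining
-- two existing dimensions.
SELECT region, product,
       region || ' - ' || product AS region_product
FROM sales;
```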
In addition to the above steps, verify the following front-end elements, shown in the images below, in the existing report derived from the cube and replicate them in the dataset-based reports:
Image 46 – Settings option
Image 47 – Front-end options
- Filters: Ensure all applied filters accurately capture required data without omissions.
Image 48 – Front-end filter in report
- Spotlighters: Check all the spotlighters applied to the data for accuracy and representation.
Image 49 – Spotlighter option
- Sorting: Confirm data sorting logic for consistency in the report presentation.
Image 50 – Sort option
Step 7: Reassociation and Validation of Reports
Now we need to reassociate the reports created from the cube with the new dataset. Before reassociation, we must validate the existing reports by replicating their conditions in the dataset and checking a specific value. This involves applying the necessary filters and logic in the dataset to match the conditions of the existing reports. Export the reports created from the cube to keep a record of their values; this will allow you to validate the reassociated reports against the Excel exports of the cube-based reports. Additionally, check whether any objects remain to be reassociated by reviewing the Objects tab under the cube information, where all the objects created from the cube are listed.
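One simple way to validate, assuming hypothetical table and column names: compute control totals against the source and compare them with the Excel export of the cube-based report and with the reassociated dataset report. All three should agree.

```sql
-- Control totals per region: row counts and summed amounts should match
-- both the cube-based export and the reassociated dataset report.
SELECT region,
       COUNT(*)    AS row_count,
       SUM(amount) AS total_amount
FROM sales
GROUP BY region
ORDER BY region;
```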
To demonstrate the reassociation process, we take one of our existing reports as an example in Image 51 below. To reassociate reports, go to the Admin Panel and navigate to the Repository. Search for the reports, select them, and choose the "Associate Data" option from the menu that appears, as shown in the image below.
Image 51 – Associate Data option for reassociation
Image 52 – Dropdown for selecting the new data for reassociation
After meticulously replicating filters, spotlighters, sorting logic, and front-end elements in the existing reports derived from cube data, proceed to reassociate these reports with the dataset. This step ensures a seamless migration from cube to dataset, guaranteeing complete accuracy and alignment between the newly prepared dataset and the front-end representations in your reports.
Step 8: Validate and Delete the Existing Cube
- Validation of Reports and Data: Before proceeding with the deletion, ensure that all reports and data have been thoroughly validated. This includes checking that every report, dashboard, and data object reassociated with the new dataset functions as expected. Even though this step might seem obvious, it is often skipped. Proper validation is crucial to prevent errors and ensure all data, customizations, and permissions have been accurately transferred.
- Deleting the Cube and Its Objects: Once validation is complete, delete the existing "Sales Data" cube and all related objects. This step is essential to free up storage space and confirm that the migration to the dataset is fully executed. Navigate to the Admin Panel, go to the Cube Management section, and select the "Delete" option for the "Sales Data" cube. Ensure all associated objects, such as custom columns, global variables, and access permissions, are also removed to avoid redundancy and unnecessary storage use.
Warning: Test Thoroughly Before Deleting the Cube
Before deleting the existing cube, it is crucial to thoroughly test the new dataset and all associated objects (such as reports, dashboards, and other data-dependent components) to ensure they function correctly and meet all requirements. Deleting the cube is an irreversible action, and without proper testing, there is a risk of data loss or reporting issues. Ensure that all customizations, data access permissions, and front-end elements have been accurately replicated in the dataset. Only proceed with deleting the cube after confirming that everything is working as expected.