Welcome!

Monday, August 9, 2010

How to Build OLAP Application Using Mondrian + XMLA + SpagoBI? (1)

Mondrian is an OLAP (Online Analytical Processing) server that enables business users to analyze large quantities of data in real-time. Users explore business data by drilling into and cross-tabulating information with speed-of-thought response times to complex analytical queries.

XMLA (XML for Analysis) is the industry standard for data access in analytical systems, such as OLAP and Data Mining. XMLA is based on other industry standards such as XML, SOAP and HTTP. In this tutorial, we use Mondrian OLAP server as XMLA provider.

SpagoBI use SpagoBIXMLAEngine to query external XMLA compliant server like Mondrian OLAP server.

This tutorial demonstrate the steps to build the OLAP application using Mondrian OLAP server as XMLA provider and SpagoBI as reporting frontend.


Environment
============
SpagoBI: version 2.6
Mondrian: 3.2.0.13661
Schema Workbench: 3.2.0
Aggregation Designer: 1.2.0
Database: Oracle 11g R2 + SH (Sales History) schema
Java: JDK 1.6
App Server: Tomcat 6.0.28


1. Prepare Data Warehouse

We use the Oracle sample database SH (Sales History) as our data warehouse.
The SH tables are in snowflake schema, as shown below:




2. Build OLAP Model

In Mondrian, we use schema file (in XML format) to define the OLAP model. The schema file contains a logical model, consisting of cubes, hierarchies, and members, and a mapping of this model onto a physical model.

There are two ways to build the schema file: 1) by using a text editor; or 2) by using schema workbench. In this demonstration, We use schema workbench to build the OLAP logical model. To learn more about the Mondrian schema syntax and how to use Mondrian Schema Workbench, please goto http://mondrian.pentaho.com




We can test the above schema by executing MDX query:
First, start the "MDX Query" window:



Then, test the OLAP schema as shown:



Finally, save the schema file as "SalesHistory.xml" and exit Schema Workbench.

Below is the detail schema file:


<Schema name="SalesHistory">
<Cube name="Sales" cache="true" enabled="true">
<Table name="SALES" schema="SH">
</Table>
<Dimension type="StandardDimension" foreignKey="CUST_ID" name="Gender">
<Hierarchy hasAll="true" allMemberName="All Genders" primaryKey="CUST_ID">
<Table name="CUSTOMERS" schema="SH">
</Table>
<Level name="Gender" column="CUST_GENDER" type="String"
uniqueMembers="true" levelType="Regular" hideMemberIf="Never">
</Level>
</Hierarchy>
</Dimension>
<Dimension type="TimeDimension" foreignKey="TIME_ID" name="Time">
<Hierarchy hasAll="true" primaryKey="TIME_ID">
<Table name="TIMES" schema="SH">
</Table>
<Level name="Year" column="CALENDAR_YEAR" type="Integer" uniqueMembers="true"
levelType="TimeYears" hideMemberIf="Never">
</Level>
<Level name="Quarter" column="CALENDAR_QUARTER_NUMBER" type="Integer"
uniqueMembers="false" levelType="TimeQuarters">
</Level>
<Level name="Month" column="CALENDAR_MONTH_NUMBER" uniqueMembers="false"
type="Integer" levelType="TimeMonths" hideMemberIf="Never">
</Level>
<Level name="Day" column="DAYS_IN_CAL_MONTH" type="Integer"
uniqueMembers="false" levelType="TimeDays" hideMemberIf="Never">
</Level>
</Hierarchy>
</Dimension>
<Dimension type="StandardDimension" foreignKey="CUST_ID" name="Location">
<Hierarchy hasAll="true" primaryKey="CUST_ID" primaryKeyTable="CUSTOMERS">
<Join leftKey="COUNTRY_ID" rightKey="COUNTRY_ID">
<Table name="CUSTOMERS" schema="SH">
</Table>
<Table name="COUNTRIES" schema="SH">
</Table>
</Join>
<Level name="Country" table="COUNTRIES" column="COUNTRY_NAME" type="String"
uniqueMembers="true" levelType="Regular" hideMemberIf="Never">
</Level>
<Level name="State" table="CUSTOMERS" column="CUST_STATE_PROVINCE" type="String"
uniqueMembers="true" levelType="Regular" hideMemberIf="Never">
</Level>
<Level name="City" table="CUSTOMERS" column="CUST_CITY" type="String"
uniqueMembers="true" levelType="Regular" hideMemberIf="Never">
</Level>
</Hierarchy>
</Dimension>
<Measure name="Quantity Sold" column="QUANTITY_SOLD" datatype="Numeric"
formatString="#,###" aggregator="sum" visible="true">
</Measure>
<Measure name="Amount Sold" column="AMOUNT_SOLD" datatype="Numeric"
formatString="#,###.##" aggregator="sum" visible="true">
</Measure>
</Cube>
</Schema>


(to be continued)

No comments:

Post a Comment