Hive2MySQL
The data includes logs, Dblogs, etc., which are connected to the big data platform HDFS through Sqoop batch or Kafka in real time. After the ETL processing is completed on the big data platform, it is written regularly every day through the big data scheduling system Ooize To the relational database MySQL, and then generate various reports based on the data in MySQL.
Cons:
- The entire platform is limited by the capabilities of MySQL. MySQL cannot support fast query and analysis of large amounts of data. Generally, after tens of millions, MySQL cannot support
- This method lacks the precipitation of common capabilities. Case by Case handles user needs and requires a long development time
OLAP platform based on kylin
The architecture of the OLAP platform is divided into 3 layers from bottom to top: OLAP engine layer, here is Apche Kylin; above Kylin is the indicator platform layer, which provides unified API, unified indicator definition and caliber management, and supports query of indicators. In addition, under Kylin is the data warehouse layer, which has ODS, DWD, DWS, and OLAP layer data warehouse tables. Cons:
- Kylin supports a limited number of dimensions, and the business side has more requirements for the number of indicator dimensions. Many indicators need to have 30 or 40 dimensions.
- It can’t meet the rapid changes of business requirements. It takes a long time to develop a new indicator and change indicator dimensions.
- Insufficient flexibility, you need to refresh the Cube every time you modify the Cube, which takes a long time
OLAP platform based on multi engines
The architecture of the OLAP platform is divided into 4 layers from bottom to top: OLAP layer, which includes various engines such as Kylin, and above the OLAP layer is the query engine layer, which is responsible for shielding the underlying OLAP engine query for the indicator platform The difference of the interface; on the top is the indicator platform, which provides a unified API, unified indicator definition and caliber management, and supports the query of indicators; on the top is the application layer and various Data application products, they obtain data through a unified indicator API, and do not directly use SQL to access OLAP.