MatrixOne从入门到实践——SSB性能测试

SSB 星型模式基准测试是 OLAP 数据库性能测试的常用场景,通过本篇教程,您可以了解到如何在 MatrixOne 中实现 SSB 测试。

测试环境

  • 机器配置

    机器数量 部署方式 CPU 内存 磁盘
    1 单节点 6 36G 100G
  • MO版本

    0.5.1版本

编译dbgen

  • 获取源码

    git clone https://github.com/vadimtk/ssb-dbgen.git

    如果因为网络问题导致clone失败,建议使用gitee将上述链接项目导入到自己的仓库,然后使用gitee的链接clone

  • 编译

    cd ssb-dbgen
    make

生成数据

当使用 -s 1dbgen 命令会生产近600万行数据(670MB),当使用-s 10时会生产近6000万行数据,会耗费大量时间。

./dbgen -s 1 -T c
./dbgen -s 1 -T l
./dbgen -s 1 -T p
./dbgen -s 1 -T s
./dbgen -s 1 -T d

生成完成后,会有以下数据文件,这里只生成了多表的数据。

[root@motest ssb]# ll -h |grep tbl
-r-sr-S--T. 1 root root 3.2M Oct 1 09:30 customer.tbl
-rw-r--r--. 1 root root 270K Oct 1 09:31 date.tbl
-rw-r--r--. 1 root root 641M Oct 1 09:31 lineorder.tbl
-rw-r--r--. 1 root root 20M Oct 1 09:31 part.tbl
-rw-r--r--. 1 root root 187K Oct 1 09:31 supplier.tbl

SSB的大宽表数据集请下载:

wget https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/lineorder_flat.tar.bz2

解压数据集:

 tar -jxvf lineorder_flat.tar.bz2

MatrixOne 准备工作

  • 安装 、 启动MatrixOne

    详情请参考MatrixOne部署一文

  • 建表

    create database if not exists ssb;
    use ssb;
    drop table if exists lineorder;
    drop table if exists part;
    drop table if exists supplier;
    drop table if exists customer;
    drop table if exists dates;
    drop table if exists lineorder_flat; create table lineorder (
    lo_orderkey bigint,
    lo_linenumber int,
    lo_custkey int,
    lo_partkey int,
    lo_suppkey int,
    lo_orderdate date,
    lo_orderpriority char (15),
    lo_shippriority tinyint,
    lo_quantity double,
    lo_extendedprice double,
    lo_ordtotalprice double,
    lo_discount double,
    lo_revenue double,
    lo_supplycost double,
    lo_tax double,
    lo_commitdate date,
    lo_shipmode char (10)
    ) ; create table part (
    p_partkey int,
    p_name varchar (22),
    p_mfgr char (6),
    p_category char (7),
    p_brand char (9),
    p_color varchar (11),
    p_type varchar (25),
    p_size int,
    p_container char (10)
    ) ; create table supplier (
    s_suppkey int,
    s_name char (25),
    s_address varchar (25),
    s_city char (10),
    s_nation char (15),
    s_region char (12),
    s_phone char (15)
    ) ; create table customer (
    c_custkey int,
    c_name varchar (25),
    c_address varchar (25),
    c_city char (10),
    c_nation char (15),
    c_region char (12),
    c_phone char (15),
    c_mktsegment char (10)
    ) ; create table dates (
    d_datekey date,
    d_date char (18),
    d_dayofweek char (9),
    d_month char (9),
    d_yearmonthnum int,
    d_yearmonth char (7),
    d_daynuminweek varchar(12),
    d_daynuminmonth int,
    d_daynuminyear int,
    d_monthnuminyear int,
    d_weeknuminyear int,
    d_sellingseason varchar (12),
    d_lastdayinweekfl varchar (1),
    d_lastdayinmonthfl varchar (1),
    d_holidayfl varchar (1),
    d_weekdayfl varchar (1)
    ) ; CREATE TABLE lineorder_flat(
    LO_ORDERKEY bigint key,
    LO_LINENUMBER int,
    LO_CUSTKEY int,
    LO_PARTKEY int,
    LO_SUPPKEY int,
    LO_ORDERDATE date,
    LO_ORDERPRIORITY char(15),
    LO_SHIPPRIORITY tinyint,
    LO_QUANTITY double,
    LO_EXTENDEDPRICE double,
    LO_ORDTOTALPRICE double,
    LO_DISCOUNT double,
    LO_REVENUE int unsigned,
    LO_SUPPLYCOST int unsigned,
    LO_TAX double,
    LO_COMMITDATE date,
    LO_SHIPMODE char(10),
    C_NAME varchar(25),
    C_ADDRESS varchar(25),
    C_CITY char(10),
    C_NATION char(15),
    C_REGION char(12),
    C_PHONE char(15),
    C_MKTSEGMENT char(10),
    S_NAME char(25),
    S_ADDRESS varchar(25),
    S_CITY char(10),
    S_NATION char(15),
    S_REGION char(12),
    S_PHONE char(15),
    P_NAME varchar(22),
    P_MFGR char(6),
    P_CATEGORY char(7),
    P_BRAND char(9),
    P_COLOR varchar(11),
    P_TYPE varchar(25),
    P_SIZE int,
    P_CONTAINER char(10)
    );
  • 导入数据

    请根据自己生成的数据的路径,自行调整导入语句中的路径参数

    如我生成数据的路径为:/home/ssb/ssb/***.tbl

    则对应单表导入语句为:

    load data infile '/home/ssb/ssb/supplier.tbl' into table supplier FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';
    
    load data infile '/home/ssb/ssb/customer.tbl' into table customer FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';
    
    load data infile '/home/ssb/ssb/date.tbl' into table dates FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';
    
    load data infile '/home/ssb/ssb/part.tbl' into table part FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';
    
    load data infile '/home/ssb/ssb/lineorder.tbl' into table lineorder FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';

    大宽表导入语句为:

    load data infile '/home/ssb/ssb/lineorder_flat.tbl' into table lineorder_flat FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';

运行SSB测试

  • 大宽表查询

    查询语句

    --Q1.1
    SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM lineorder_flat WHERE year(LO_ORDERDATE)=1993 AND LO_DISCOUNT BETWEEN 1 AND 3 AND LO_QUANTITY < 25; --Q1.2
    SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM lineorder_flat WHERE year(LO_ORDERDATE)=1994 AND LO_DISCOUNT BETWEEN 4 AND 6 AND LO_QUANTITY BETWEEN 26 AND 35; --Q1.3
    SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue FROM lineorder_flat WHERE year(LO_ORDERDATE)=1994 AND LO_DISCOUNT BETWEEN 5 AND 7 AND LO_QUANTITY BETWEEN 26 AND 35; --Q2.1
    SELECT sum(LO_REVENUE),year(LO_ORDERDATE) AS year,P_BRAND FROM lineorder_flat WHERE P_CATEGORY = 'MFGR#12' AND S_REGION = 'AMERICA' GROUP BY year(LO_ORDERDATE), P_BRAND ORDER BY year,P_BRAND; --Q2.2
    SELECT sum(LO_REVENUE), year(LO_ORDERDATE) AS year, P_BRAND FROM lineorder_flat WHERE P_BRAND BETWEEN 'MFGR#2221' AND 'MFGR#2228' AND S_REGION = 'ASIA' GROUP BY year(LO_ORDERDATE), P_BRAND ORDER BY year, P_BRAND; --Q2.3
    SELECT sum(LO_REVENUE), year(LO_ORDERDATE) AS year, P_BRAND FROM lineorder_flat WHERE P_BRAND = 'MFGR#2239' AND S_REGION = 'EUROPE' GROUP BY year(LO_ORDERDATE), P_BRAND ORDER BY year, P_BRAND; --Q3.1
    SELECT C_NATION, S_NATION, year(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue FROM lineorder_flat WHERE C_REGION = 'ASIA' AND S_REGION = 'ASIA' AND year(LO_ORDERDATE) between 1992 AND 1997 GROUP BY C_NATION, S_NATION, year(LO_ORDERDATE) ORDER BY year asc, revenue desc; --Q3.2
    SELECT C_CITY, S_CITY, year(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue FROM lineorder_flat WHERE C_NATION = 'CHINA' AND S_NATION = 'CHINA' AND year(LO_ORDERDATE) between 1992 AND 1997 GROUP BY C_CITY, S_CITY, year(LO_ORDERDATE) ORDER BY year asc, revenue desc; --Q3.3
    SELECT C_CITY, S_CITY, year(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue FROM lineorder_flat WHERE (C_CITY = 'UNITED KI0' OR C_CITY = 'UNITED KI7') AND (S_CITY = 'UNITED KI0' OR S_CITY = 'UNITED KI7') AND year(LO_ORDERDATE) between 1992 AND 1997 GROUP BY C_CITY, S_CITY, year(LO_ORDERDATE) ORDER BY year asc, revenue desc; --Q3.4
    SELECT C_CITY, S_CITY, year(LO_ORDERDATE) AS year, sum(LO_REVENUE) AS revenue FROM lineorder_flat WHERE (C_CITY = 'UNITED KI0' OR C_CITY = 'UNITED KI7') AND (S_CITY = 'MOZAMBIQU1' OR S_CITY = 'KENYA 4') AND year(LO_ORDERDATE)= 1997 GROUP BY C_CITY, S_CITY, year(LO_ORDERDATE) ORDER BY year asc, revenue desc; --Q4.1
    SELECT year(LO_ORDERDATE) AS year, C_NATION, sum(LO_REVENUE - LO_SUPPLYCOST) AS profit FROM lineorder_flat WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND (P_MFGR = 'MFGR#1' OR P_MFGR = 'MFGR#2') GROUP BY year(LO_ORDERDATE), C_NATION ORDER BY year, C_NATION; --Q4.2
    SELECT year(LO_ORDERDATE) AS year, S_NATION, P_CATEGORY, sum(LO_REVENUE - LO_SUPPLYCOST) AS profit FROM lineorder_flat WHERE C_REGION = 'AMERICA' AND S_REGION = 'AMERICA' AND (year(LO_ORDERDATE) = 1997 OR year(LO_ORDERDATE) = 1998) AND (P_MFGR = 'MFGR#1' OR P_MFGR = 'MFGR#2') GROUP BY year(LO_ORDERDATE), S_NATION, P_CATEGORY ORDER BY year, S_NATION, P_CATEGORY; --Q4.3
    SELECT year(LO_ORDERDATE) AS year, S_CITY, P_BRAND, sum(LO_REVENUE - LO_SUPPLYCOST) AS profit FROM lineorder_flat WHERE S_NATION = 'UNITED STATES' AND (year(LO_ORDERDATE) = 1997 OR year(LO_ORDERDATE) = 1998) AND P_CATEGORY = 'MFGR#14' GROUP BY year(LO_ORDERDATE), S_CITY, P_BRAND ORDER BY year, S_CITY, P_BRAND;
  • 多表查询

    查询语句

    --Q1.1
    select sum(lo_revenue) as revenue
    from lineorder join dates on lo_orderdate = d_datekey
    where year(d_datekey) = 1993 and lo_discount between 1 and 3 and lo_quantity < 25; --Q1.2
    select sum(lo_revenue) as revenue
    from lineorder
    join dates on lo_orderdate = d_datekey
    where d_yearmonthnum = 199401
    and lo_discount between 4 and 6
    and lo_quantity between 26 and 35; --Q1.3
    select sum(lo_revenue) as revenue
    from lineorder
    join dates on lo_orderdate = d_datekey
    where d_weeknuminyear = 6 and year(d_datekey) = 1994
    and lo_discount between 5 and 7
    and lo_quantity between 26 and 35; --Q2.1
    select sum(lo_revenue) as lo_revenue, year(d_datekey) as year, p_brand
    from lineorder
    join dates on lo_orderdate = d_datekey
    join part on lo_partkey = p_partkey
    join supplier on lo_suppkey = s_suppkey
    where p_category = 'MFGR#12' and s_region = 'AMERICA'
    group by year(d_datekey), p_brand
    order by year, p_brand; --Q2.2
    select sum(lo_revenue) as lo_revenue, year(d_datekey) as year, p_brand
    from lineorder
    join dates on lo_orderdate = d_datekey
    join part on lo_partkey = p_partkey
    join supplier on lo_suppkey = s_suppkey
    where p_brand between 'MFGR#2221' and 'MFGR#2228' and s_region = 'ASIA'
    group by year(d_datekey), p_brand
    order by year, p_brand; --Q2.3
    select sum(lo_revenue) as lo_revenue, year(d_datekey) as year, p_brand
    from lineorder
    join dates on lo_orderdate = d_datekey
    join part on lo_partkey = p_partkey
    join supplier on lo_suppkey = s_suppkey
    where p_brand = 'MFGR#2239' and s_region = 'EUROPE'
    group by year(d_datekey), p_brand
    order by year, p_brand; --Q3.1
    select c_nation, s_nation, year(d_datekey) as year, sum(lo_revenue) as lo_revenue
    from lineorder
    join dates on lo_orderdate = d_datekey
    join customer on lo_custkey = c_custkey
    join supplier on lo_suppkey = s_suppkey
    where c_region = 'ASIA' and s_region = 'ASIA' and year(d_datekey) between 1992 and 1997
    group by c_nation, s_nation, year(d_datekey)
    order by year asc, lo_revenue desc; --Q3.2
    select c_city, s_city, year(d_datekey) as year, sum(lo_revenue) as lo_revenue
    from lineorder
    join dates on lo_orderdate = d_datekey
    join customer on lo_custkey = c_custkey
    join supplier on lo_suppkey = s_suppkey
    where c_nation = 'UNITED STATES' and s_nation = 'UNITED STATES'
    and year(d_datekey) between 1992 and 1997
    group by c_city, s_city, year(d_datekey)
    order by year asc, lo_revenue desc; --Q3.3
    select c_city, s_city, year(d_datekey) as year, sum(lo_revenue) as lo_revenue
    from lineorder
    join dates on lo_orderdate = d_datekey
    join customer on lo_custkey = c_custkey
    join supplier on lo_suppkey = s_suppkey
    where (c_city='UNITED KI1' or c_city='UNITED KI5')
    and (s_city='UNITED KI1' or s_city='UNITED KI5')
    and year(d_datekey) between 1992 and 1997
    group by c_city, s_city, year(d_datekey)
    order by year asc, lo_revenue desc; --Q3.4
    select c_city, s_city, year(d_datekey) as year, sum(lo_revenue) as lo_revenue
    from lineorder
    join dates on lo_orderdate = d_datekey
    join customer on lo_custkey = c_custkey
    join supplier on lo_suppkey = s_suppkey
    where (c_city='UNITED KI1' or c_city='UNITED KI5') and (s_city='UNITED KI1' or s_city='UNITED KI5') and d_yearmonth = '199712'
    group by c_city, s_city, year(d_datekey)
    order by year(d_datekey) asc, lo_revenue desc; --Q4.1
    select year(d_datekey) as year, c_nation, sum(lo_revenue) - sum(lo_supplycost) as profit
    from lineorder
    join dates on lo_orderdate = d_datekey
    join customer on lo_custkey = c_custkey
    join supplier on lo_suppkey = s_suppkey
    join part on lo_partkey = p_partkey
    where c_region = 'AMERICA' and s_region = 'AMERICA' and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2')
    group by year(d_datekey), c_nation
    order by year, c_nation; --Q4.2
    select year(d_datekey) as year, s_nation, p_category, sum(lo_revenue) - sum(lo_supplycost) as profit
    from lineorder
    join dates on lo_orderdate = d_datekey
    join customer on lo_custkey = c_custkey
    join supplier on lo_suppkey = s_suppkey
    join part on lo_partkey = p_partkey
    where c_region = 'AMERICA'and s_region = 'AMERICA'
    and (year(d_datekey) = 1997 or year(d_datekey) = 1998)
    and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2')
    group by year(d_datekey), s_nation, p_category
    order by year, s_nation, p_category; --Q4.3
    select year(d_datekey) as year, s_city, p_brand, sum(lo_revenue) - sum(lo_supplycost) as profit, c_region, s_nation, p_category
    from lineorder
    join dates on lo_orderdate = d_datekey
    join customer on lo_custkey = c_custkey
    join supplier on lo_suppkey = s_suppkey
    join part on lo_partkey = p_partkey
    where
    (year(d_datekey) = 1997 or year(d_datekey) = 1998)
    and s_nation='ALGERIA'
    group by year(d_datekey), s_city, p_brand, c_region, s_nation, p_category
    order by year, s_city, p_brand;

查询报告

  • 大宽表

    查询ID 第一次查询 第二次查询 第三次查询 第四次查询 第五次查询 平均查询速度
    Q1.1 0.14 sec 0.13 sec 0.14 sec 0.13 sece 0.13 sec 0.134 sec
    Q1.2 0.17 sec 0.14 sec 0.14 sec 0.14 sece 0.15 sec 0.148 sec
    Q1.3 0.14 sec 0.17 sec 0.14 sec 0.16 sec 0.14 sec 0.15 sec
    Q2.1 0.64 sec 0.61 sec 0.59 sec 0.65 sec 0.62 sec 0.622 sec
    Q2.2 0.77 sec 0.77 sec 0.77 sec 0.74 sec 0.75 sec 0.76 sec
    Q2.3 0.58 sec 0.56 sec 0.55 sec 0.60 sec 0.54 sec 0.566 sec
    Q3.1 0.79 sec 0.74 sec 0.76 sec 0.73 sec 0.75 sec 0.754 sec
    Q3.2 0.71 sec 0.72 sec 0.70 sec 0.74 sec 0.70 sec 0.714 sec
    Q3.3 0.99 sec 0.97 sec 0.97 sec 1.10 sec 0.97 sec 1 sec
    Q3.4 1.07 sec 0.98 sec 0.98 sec 0.97 sec 0.97 sec 0.994 sec
    Q4.1 1.13 sec 1.10 sec 1.04 sec 1.04 sec 1.07 sec 1.076 sec
    Q4.2 1.19 sec 1.16 sec 1.18 sec 1.19 sec 1.19 sec 1.182 sec
    Q4.3 0.72 sec 0.74 sec 0.71 sec 0.76 sec 0.69 sec 0.724 sec
  • 多表查询

    查询ID 第一次查询 第二次查询 第三次查询 第四次查询 第五次查询 平均查询速度
    Q1.1 0.08 sec 0.11 sec 0.08 sec 0.09 sec 0.06 sec 0.084 sec
    Q1.2 0.12 sec 0.07 sec 0.07 sec 0.09 sec 0.07 sec 0.084 sec
    Q1.3 0.09 sec 0.08 sec 0.09 sec 0.07 sec 0.10 sec 0.086 sec
    Q2.1 0.28 sec 0.26 sec 0.27 sec 0.26 sec 0.28 sec 0.27 sec
    Q2.2 0.29 sec 0.28 sec 0.28 sec 0.28 sec 0.28 sec 0.282 sec
    Q2.3 0.26 sec 0.26 sec 0.28 sec 0.26 sec 0.23 sec 0.258 sec
    Q3.1 0.40 sec 0.34 sec 0.43 sec 0.36 sec 0.40 sec 0.386 sec
    Q3.2 0.12 sec 0.15 sec 0.10 sec 0.09 sec 0.15 sec 0.122 sec
    Q3.3 0.13 sec 0.06 sec 0.11 sec 0.07 sec 0.10 sec 0.094 sec
    Q3.4 0.08 sec 0.12 sec 0.07 sec 0.11 sec 0.07 sec 0.09 sec
    Q4.1 0.45 sec 0.39 sec 0.46 sec 0.42 sec 0.44 sec 0.432 sec
    Q4.2 0.32 sec 0.40 sec 0.32 sec 0.39 sec 0.31 sec 0.348 sec
    Q4.3 0.36 sec 0.32 sec 0.36 sec 0.32 sec 0.37 sec 0.346 sec
  • 查询对比报告

    通过SSB测试发现,MatrixOne多表join查询性能异常强大,超过了传统的大宽表模型,这意味着在MatrixOne中我们可以使用丰富的数据模型来满足我们的业务需要。

最新文章

  1. Json在PHP与JS之间传输
  2. 配置比对脚本(python)
  3. 使用lftp传输文件的shell脚本
  4. 用widthStep的方法来增加某范围的像素----与imageROI对比
  5. Java Annotation 机制源码分析与使用
  6. IE6中布局常见问题
  7. struct 理解 (需要经常理解)
  8. BZOJ 1631==USACO 2007== POJ 3268 Cow Party奶牛派对
  9. Python操作列表的常用方法
  10. bash中声明变量方法
  11. c/c++ 线性表之单向链表
  12. Alpha 冲刺 (4/10)
  13. Windows环境安装tesseract-ocr 4.00并配置环境变量
  14. iOS实时查看App运行日志
  15. 12.15 Daily Scrum
  16. Node.js + Express 接口请求(GET、POST、PUT)事例
  17. MSSQL索引视图(indexed view)之简述及使用
  18. HUST 1605 Gene recombination(广搜,位运算)
  19. geoserver 文件系统
  20. 2018.09.16 codeforces1041C. Coffee Break(双端队列)

热门文章

  1. github访问较慢问题初步解决方案
  2. YII事件EVENT示例
  3. 6.8 NOI 模拟
  4. P4315 月下“毛景树”(树链剖分)
  5. Apache DolphinScheduler 2.X保姆级源码解析,中国移动工程师揭秘服务调度启动全流程
  6. 软件装在D盘,实测有效
  7. Excel 运算符(三):文本连接符
  8. HTTPS 一定是安全的吗?
  9. 牛客IOI周赛26-提高组 A. 逆序对
  10. 【java】学习路径22-关于BigInteger类,大数字类