Greeplum 系列(四) 实战

表结构

(1) 拉链表结构

create table public.member_fatdt0 (
member_id varchar(64),
phoneno varchar(64),
dw_beg_date date,
dw_end_date date,
dtype char(1),
dw_status char(1),
dw_ins_date date
) with (appendonly=true, compresslevel=5)
distributed by(member_id)
partition by range (dw_end_date)
(
partition p20111201 start (date '2011-12-01') inclusive,
partition p20111202 start (date '2011-12-02') inclusive,
partition p20111203 start (date '2011-12-03') inclusive,
partition p20111204 start (date '2011-12-04') inclusive,
partition p20111205 start (date '2011-12-05') inclusive,
partition p20111206 start (date '2011-12-06') inclusive,
partition p20111207 start (date '2011-12-07') inclusive,
partition p30001231 start (date '3000-12-31') inclusive
end (date '3001-01-01') exclusive
);

(2) 增量表结构

create table public.member_delta (
member_id varchar(64),
phoneno varchar(64),
action char(1),
dw_ins_date date
) with (appendonly=true, compresslevel=5)
distributed by(member_id);

(3) 临时表结构

create table public.member_tmp0 (
member_id varchar(64),
phoneno varchar(64),
dw_beg_date date,
dw_end_date date,
dtype char(1),
dw_status char(1),
dw_ins_date date
) with (appendonly=true, compresslevel=5)
distributed by(member_id)
partition by list (dtype)
(
partition phis values('H'),
partition pcur values('C'),
default partition other
);

(4) 临时表结构

create table public.member_tmp1 (
member_id varchar(64),
phoneno varchar(64),
dw_beg_date date,
dw_end_date date,
dtype char(1),
dw_status char(1),
dw_ins_date date
) with (appendonly=true, compresslevel=5)
distributed by(member_id);

数据导入

1. insert

insert into public.member_delta values('mem006', '13100000006', 'I', date'2011-12-03');
insert into public.member_delta values('mem002', '13100000002', 'D', date'2011-12-03');
insert into public.member_delta values('mem003', '13100000003', 'U', date'2011-12-03');

2. insert

mem001,13100000001,2011-12-01,3000-12-31,C,I,2011-12-01
mem002,13100000002,2011-12-01,3000-12-31,C,I,2011-12-01
mem003,13100000003,2011-12-01,3000-12-31,C,I,2011-12-01
mem004,13100000004,2011-12-01,3000-12-31,C,I,2011-12-01
mem005,13100000005,2011-12-01,3000-12-31,C,I,2011-12-01
copy public.member_fatdt0_1_prt_p30001231 from '/home/gpadmin/member_his_init.dat' with delimiter ',';

3. 外部表

首先,启动 gpfdist 服务

nohup gpfdist -d /home/gpadmin/data -p 8888 -l /home/gpadmin/data/gpfdist.log &

其次,创建外部表

drop external table if exists public.member_ext;
create external table public.member_ext (
member_id varchar(64),
phoneno varchar(64),
action char(1),
dw_ins_date date
)
location ('gpfdist://localhost:8888/member_delta.dat')
format 'text' (delimiter ',' null as '' escape 'off')
encoding 'utf-8'
log errors into member_err segment reject limit 2 rows;

再编辑 /home/gpadmin/data/member_delta.dat 文件

mem001,13100000001,I,2011-12-01

最后,将外部表的数据加载到内部表

insert into public.member_delta select * from public.member_ext;

4. gpload

gpload 是对外部表的一层封装,首先编写 gpload 控制文件 gpload.yml,代码如下:

---
VERSION: 1.0.0.1
DATABASE: test
USER: gpadmin
HOST: localhost
PORT: 5432
GPLOAD:
INPUT:
- SOURCE:
LOCAL_HOSTNAME:
- master
PORT: 8888
FILE:
- /home/gpadmin/data/member_delta.dat
- COLUMNS:
- member_id: varchar(64),
- phoneno: varchar(20),
- action: char(1),
- dw_ins_date: date
- FORMAT: text
- DELIMITER: ','
- ERROR_LIMIT: 2
- ERROR_TABLE: public.member_err
OUTPUT:
- TABLE: public.member_delta
- MODE: INSERT
SQL:
- BEFORE: "truncate table public.member_delta"
- AFTER: "analyze public.member_delta"

执行 gpload 命令,结果如下:

gpload -f gpload.yml
2018-05-18 20:53:10|INFO|gpload session started 2018-05-18 20:53:10
2018-05-18 20:53:10|INFO|started gpfdist -p 8888 -P 8889 -f "/home/gpadmin/data/member_delta.dat" -t 30
2018-05-18 20:53:11|INFO|running time: 0.85 seconds
2018-05-18 20:53:11|INFO|rows Inserted = 1
2018-05-18 20:53:11|INFO|rows Updated = 0
2018-05-18 20:53:11|INFO|data formatting errors = 0
2018-05-18 20:53:11|INFO|gpload succeeded

5. 可执行的外部表

drop external table if exists public.member_ext;
create external web table public.member_ext (
member_id varchar(64),
phoneno varchar(64),
action char(1),
dw_ins_date date
)
execute 'cat /home/gpadmin/data/member_delta.dat' on master
format 'text' (delimiter ',' null as '' escape 'off')
encoding 'utf-8';

create table public.test_partition_list (

member_id numeric,

city character varying(32)

) distributed by(member_id)

partition by list(city)

(

partition guangzhou values('guangzhou'),

partition hangzhou values('hangzhou'),

partition shanghai values('shanghai'),

partition beijing values('beijing'),

default partition other_city

);


每天用心记录一点点。内容也许不重要,但习惯很重要!

最新文章

  1. Linux下Source Insight的安装和汉化
  2. 延时调用的php代码
  3. python redis使用心得
  4. git 常用
  5. spider
  6. 深入学习golang(3)—类型方法
  7. python 使用多线程进行压力测试
  8. 20145227《Java程序设计》课程总结
  9. JDBC第一次学习
  10. 实用Linux命令,不求最全但求实用-------磁盘使用情况du,df
  11. JQuery - 提交表单
  12. HTML5 在canvas绘制一个矩形
  13. zendstudio 安装 手册
  14. 【转】缓存淘汰算法系列之1——LRU类
  15. Spring Boot项目构建
  16. iBatis基础知识
  17. JavaScript之实例
  18. Python——cmd调用(os.system阻塞处理)
  19. 极简科普 1:什么是 VOIP
  20. 【Go】优雅的读取http请求或响应的数据

热门文章

  1. 在Google的GKE上创建支持Internal Load Balancer的Service
  2. IDEA開發 java web 初步
  3. AngularJS.js: 杂项
  4. [Java][Web]利用 referer 防盗链
  5. Solr 6.0 学习(五)solr基本查询和高级查询
  6. centos 安装php7
  7. ansible初识二
  8. multiprocessing.dummy
  9. Daemon进程
  10. Android 判断字符串是否相等