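dbt 0.13 added a new feature, sources, which can be used to do the following:

  • Select from source tables in your base models
  • Test assumptions about the source data
  • Calculate the freshness of the source data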

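Working with sources

  • Source definition template

    Note: for databases such as PostgreSQL, where tables live inside a schema, you may need to set the schema explicitly, or rely on the convention that the source name matches the schema name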

# This example defines two sources: `source_1`, containing the tables
# `table_1` and `table_2`, and `source_2`, containing one table called
# `table_1`. This is a minimal example of a source definition.
version: 2
sources:
  - name: source_1
    tables:
      - name: table_1
      - name: table_2
  - name: source_2
    tables:
      - name: table_1
 
 
  • Source definition with database and schema settings
# This source entry describes the table:
# "raw"."public"."Orders_"
#
# It can be referenced with:
# {{ source('ecommerce', 'orders') }}
version: 2
sources:
  - name: ecommerce
    database: raw # Tell dbt to look for the source in the "raw" database
    schema: public # You wouldn't put your source data in public, would you?
    tables:
      - name: orders
        identifier: Orders_ # To alias table names to account for strange casing or naming of tables
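
The testing and freshness capabilities listed at the start can be attached to the same kind of definition. The following is a minimal sketch rather than a configuration taken from the demo project: the column names `id` and `_loaded_at` are assumptions made for illustration, and the thresholds are arbitrary.

```yaml
# Sketch only: `id` and `_loaded_at` are hypothetical column names.
version: 2
sources:
  - name: ecommerce
    database: raw
    schema: public
    loaded_at_field: _loaded_at   # timestamp column used to measure freshness
    freshness:
      warn_after: {count: 12, period: hour}    # warn if newest row is older than 12 hours
      error_after: {count: 24, period: hour}   # error if newest row is older than 24 hours
    tables:
      - name: orders
        identifier: Orders_
        columns:
          - name: id
            tests:      # assumptions about the source data
              - unique
              - not_null
```

With a definition like this, `dbt test` should run the column tests against the source table, and in dbt 0.13 `dbt source snapshot-freshness` checks the freshness thresholds.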
 
 

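A simple example

I put the source definitions directly in the models folder. See https://github.com/rongfengliang/dbt-source-demo for the full project,
which also contains the table structures.

  • Environment setup (using a Python venv)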
python3 -m venv venv 
source venv/bin/activate
pip install dbt
  • Test database setup (using docker-compose)
version: '3.6'
services:
  postgres:
    image: postgres:9.6.11
    ports: 
    - "5432:5432"
    environment:
    - "POSTGRES_PASSWORD=dalong"
  graphql-engine:
    image: hasura/graphql-engine:v1.0.0-beta.2
    ports:
    - "8080:8080"
    depends_on:
    - "postgres"
    environment:
    - "HASURA_GRAPHQL_DATABASE_URL=postgres://postgres:dalong@postgres:5432/postgres"
    - "HASURA_GRAPHQL_ENABLE_CONSOLE=true"
    - "HASURA_GRAPHQL_ENABLE_ALLOWLIST=true"
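
dbt also needs a connection profile to reach this database, which the post does not show. A sketch of what `~/.dbt/profiles.yml` might look like is given here. The profile name `dbt_source_demo` is an assumption and has to match the `profile` value in the project's `dbt_project.yml`; the host assumes the container port is published on localhost, while the credentials, target name, schema, and thread count are taken from the docker-compose file and the run log in this post.

```yaml
# Sketch only: `dbt_source_demo` is a hypothetical profile name.
dbt_source_demo:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost     # assumes port 5432 is published as in docker-compose
      port: 5432
      user: postgres
      pass: dalong        # password used in the Hasura connection URL
      dbname: postgres
      schema: public      # models in the run log are created in public
      threads: 3          # matches "Concurrency: 3 threads" in the run log
```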
  • Model and source layout
models
├── apps
│   ├── app_summary.sql
│   └── sources.yml
└── users
    ├── sources.yml
    ├── user_summary.sql
    └── user_summary2.sql
  • Source contents

    The content is very simple: it just declares the table

version: 2
sources:
  - name: apps
    schema: public
    tables:
      - name: apps
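
The `sources.yml` under `users` is not shown in this post. Assuming it follows the same pattern, it would presumably look something like the sketch below; the source name `users` and the table name `users` are guesses, so check the linked repository for the actual file.

```yaml
# Sketch only: source and table names are assumed, not copied from the project.
version: 2
sources:
  - name: users
    schema: public
    tables:
      - name: users
```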
  • Running it
dbt run

Output

Running with dbt=0.13.1
Found 3 models, 0 tests, 0 archives, 0 analyses, 94 macros, 0 operations, 0 seed files, 2 sources
17:43:42 | Concurrency: 3 threads (target='dev')
17:43:42 | 
17:43:42 | 1 of 3 START view model public.app_summary........................... [RUN]
17:43:42 | 2 of 3 START view model public.user_summary.......................... [RUN]
17:43:42 | 3 of 3 START table model public.user_summary2........................ [RUN]
17:43:44 | 2 of 3 OK created view model public.user_summary..................... [CREATE VIEW in 0.26s]
17:43:45 | 1 of 3 OK created view model public.app_summary...................... [CREATE VIEW in 0.27s]
17:43:46 | 3 of 3 OK created table model public.user_summary2................... [SELECT 2 in 0.27s]
17:43:46 | 
17:43:46 | Finished running 2 view models, 1 table models in 4.46s.
Completed successfully
Done. PASS=3 ERROR=0 SKIP=0 TOTAL=3

References

https://github.com/rongfengliang/dbt-source-demo
