DW(六):polybase访问Azure Blob Storage
2024-10-16 06:41:32
目录:
- 连接hadoop配置语法
- 配置hadoop连接
- Pushdown配置
- Create external tables for Azure blob storage
连接hadoop配置语法:
global configuration settings for PolyBase Hadoop and Azure blob storage connectivity, Syntax:
- sp_configure: List all of the configuration options
- 配置与hadoop连接语法:
sp_configure [ @configname = ] 'hadoop connectivity',
[ @configvalue = ] { 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 }
[;] RECONFIGURE
[;]
- Option 0: Disable Hadoop connectivity
- Option 1: Hortonworks HDP 1.3 on Windows Server
- Option 1: Azure blob storage (WASB[S])
- Option 2: Hortonworks HDP 1.3 on Linux
- Option 3: Cloudera CDH 4.3 on Linux
- Option 4: Hortonworks HDP 2.0 on Windows Server
- Option 4: Azure blob storage (WASB[S])
- Option 5: Hortonworks HDP 2.0 on Linux
- Option 6: Cloudera 5.1, 5.2, 5.3, 5.4, and 5.5 on Linux
- Option 7: Hortonworks 2.1, 2.2, and 2.3 on Linux
- Option 7: Hortonworks 2.1, 2.2, and 2.3 on Windows Server
- Option 7: Azure blob storage (WASB[S])
- 示例代码:
- 显示所有可用的配置列表: EXEC sp_configure;
- 显示某个配置项的配置信息: EXEC sp_configure @configname='hadoop connectivity';
- 设置配置项示例语法如下:
sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
GO RECONFIGURE
GO
配置hadoop连接
- 在本节示例中,连接HDInsight3.2集,对应关系如下:
- 配置polybase连接配置项信息信息如下:
sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
GO RECONFIGURE
GO- 命令行: services.msc, 重启下面的服务
- SQL Server PolyBase Data Movement Service
- SQL Server PolyBase Engine
Pushdown配置:
- To improve query performance, enable pushdown computation to a Hadoop cluster
- Find the file yarn-site.xml in the installation path of SQL Server. Typically, the path is: C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Binn\Polybase\Hadoop\conf
- On the Hadoop machine, find the analogous(类似) file in the Hadoop configuration directory. In the file, find and copy the value of the configuration key yarn.application.classpath
- On the SQL Server machine, in the yarn-site.xml file, find the yarn.application.classpath property. Paste the value from the Hadoop machine into the value element.
$HADOOP_CONF_DIR,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*
Create external tables for Azure blob storage
- The Elastic(弹性) Database query feature relies on(依靠) the these four DDL statements. Typically, these DDL statements are used once or rarely when the schema of your application changes
- [CREATE MASTER KEY] (https://msdn.microsoft.com/library/ms174382.aspx)
- [CREATE CREDENTIAL] (https://msdn.microsoft.com/library/ms189522.aspx)
- [CREATE DATABASE SCOPED CREDENTIAL] (https://msdn.microsoft.com/library/mt270260.aspx)
- [CREATE/DROP EXTERNAL DATA SOURCE] (https://msdn.microsoft.com/library/dn935022.aspx)
- [CREATE/DROP EXTERNAL TABLE] (https://msdn.microsoft.com/library/dn935021.aspx)
- You can use the following syntax to drop the master key and credentials
- DROP CREDENTIAL <credential_name> ON DATABASE;
- DROP DATABASE SCOPED CREDENTIAL <credential_name>;
- DROP MASTER KEY;
- 模拟简单文本数据,如下:
- 将文本文件上传到 Azure blob storage, 如下图:
- 远程连接ploybase1, 打开ssms(注:此版本需单独安装),执行脚本如下:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Password01!'; CREATE DATABASE SCOPED CREDENTIAL JNAzureCredit
WITH IDENTITY = 'zhushy', Secret = XXXX'; /*存储访问Key*/ CREATE EXTERNAL DATA SOURCE JNAzureStorage with (
TYPE = HADOOP,
LOCATION ='wasb://<blob_container_name>@<azure_storage_account_name>.blob.core.chinacloudapi.cn',
CREDENTIAL = JNAzureCredit
); CREATE EXTERNAL FILE FORMAT TextFileFormat WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS (FIELD_TERMINATOR =' ', USE_TYPE_DEFAULT = TRUE)) CREATE EXTERNAL TABLE [dbo].[hospital] (
[ID] varchar(128) NULL,
[name] varchar(128) null
)
WITH (LOCATION='/zhu/data1.txt',
DATA_SOURCE = JNAzureStorage,
FILE_FORMAT = TextFileFormat
);- 验证数据,输入如下SQL: select * from [dbo].[hospital] ,如下图,则OK
最新文章
- 同步、更新、下载Android Source &; SDK from 国内镜像站(转载)
- vs2008注册组件
- 在Windows server 2008 R2上安装Python3.5
- iOS关于启动页自定义特殊处理
- MongoDB 介绍
- [课程设计]Scrum日记本项目进度
- POI设置边框
- vector list array deque
- C# Hashtable 使用说明 以及 Hashtable和HashMap的区别
- CODEFORCES#274 DIV2
- Web SCADA 电力接线图工控组态编辑器
- [HAOI 2009]逆序对数列
- python,列表,元祖,字典
- 洛谷 P1141 01迷宫
- Postgresql导出数据报版本不对
- 利用神经网络进行网络流量识别——特征提取的方法是(1)直接原始报文提取前24字节,24个报文组成596像素图像CNN识别;或者直接去掉header后payload的前1024字节(2)传输报文的大小分布特征;也有加入时序结合LSTM后的CNN综合模型
- .net core和.net 4.7区别和联系笔记
- GO介绍,环境的配置和安装 简单使用
- Tengine 添加第三方监控模块nginx-module-vts
- idea中Hibernate错误:无法解析表
热门文章
- 转载:CancellationToken
- Day01_JAVA语言基础第一天
- dede文章摘要字数的设置方法
- 多层CCLayer的touch冲突解决
- JAVA学习之Ecplise IDE 使用技巧(2)第二章:键盘小快手,代码辅助
- What are the 10 algorithms one must know in order to solve most algorithm challenges/puzzles?
- ANTLR3完全参考指南读书笔记[04]
- tyvj1013 - 找啊找啊找GF ——二维背包变种
- 工作中遇到的问题--Hibernate注解添加在一方和多方的区别
- json对象和字符串互相转换