Data Transfer By Sqoop2
Published: 2019-06-22


This article describes data migration between HDFS and relational (RDBMS) databases using Sqoop2.

1. Creating Links
This section describes how to create links for Oracle, PostgreSQL, MySQL, and HDFS.
1.1 Installing the JDBC Drivers
Download the Oracle and PostgreSQL JDBC drivers from their official sites, copy them into $SQOOP_HOME/server/lib, and restart the Sqoop2 server. The drivers used here are ojdbc6.jar for Oracle, postgresql-42.1.4.jar for PostgreSQL, and mysql-connector-java-5.1.44-bin.jar for MySQL.
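The copy-and-restart step looks roughly like this (a sketch; the jar locations and the $SQOOP_HOME layout are assumed from the description above):

$ cp ojdbc6.jar postgresql-42.1.4.jar mysql-connector-java-5.1.44-bin.jar $SQOOP_HOME/server/lib/
$ $SQOOP_HOME/bin/sqoop2-server stop
$ $SQOOP_HOME/bin/sqoop2-server start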
1.2 Creating the HDFS Link

sqoop:000> create link -c hdfs-connector
Creating link for connector with name hdfs-connector
Please fill following values to create new link object
Name: HDFS_Link

HDFS cluster

URI: hdfs://192.168.120.96:9000
Conf directory: /u01/hadoop/etc/hadoop
Additional configs::
There are currently 0 values in the map:
entry#
New link was successfully created with validation status OK and name HDFS_Link

1.3 Creating the MySQL Link

sqoop:000> create link -c generic-jdbc-connector
Creating link for connector with name generic-jdbc-connector
Please fill following values to create new link object
Name: MySQL_Link

Database connection

Driver class: com.mysql.jdbc.Driver
Connection String: jdbc:mysql://192.168.120.92:3306/smsqw?useSSL=false
Username: root
Password: *********
Fetch Size:
Connection Properties:
There are currently 0 values in the map:
entry#

SQL Dialect

Identifier enclose:
New link was successfully created with validation status OK and name MySQL_Link

1.4 Creating the PostgreSQL Link

sqoop:000> create link -c generic-jdbc-connector
Creating link for connector with name generic-jdbc-connector
Please fill following values to create new link object
Name: PostgreSQL_Link

Database connection

Driver class: org.postgresql.Driver
Connection String: jdbc:postgresql://192.168.120.93:5432/rhndb
Username: rhnuser
Password: ******
Fetch Size:
Connection Properties:
There are currently 0 values in the map:
entry#

SQL Dialect

Identifier enclose:
New link was successfully created with validation status OK and name PostgreSQL_Link

1.5 Creating the Oracle Link

sqoop:000> create link -c generic-jdbc-connector
Creating link for connector with name generic-jdbc-connector
Please fill following values to create new link object
Name: Oracle_link

Database connection

Driver class: oracle.jdbc.driver.OracleDriver
Connection String: jdbc:oracle:thin:@192.168.120.121:1521:rhndb
Username: spwuser
Password: ******
Fetch Size:
Connection Properties:
There are currently 0 values in the map:
entry#

SQL Dialect

Identifier enclose:
New link was successfully created with validation status OK and name Oracle_link

> Note: For the relational database links, you must type a space before pressing Enter at the "Identifier enclose:" prompt; otherwise the default delimiter (a double quote) is kept, which can break generated queries against MySQL.
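Once all four links exist, they can be checked from the sqoop shell (a minimal sketch; show link lists every link, and show link --name prints one in detail):

sqoop:000> show link
sqoop:000> show link --name MySQL_Link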

2. Creating RDBMS-to-HDFS Jobs
2.1 MySQL To HDFS Job

sqoop:000> create job -f MySQL_Link -t HDFS_Link
Creating job for links with from name MySQL_Link and to name HDFS_Link
Please fill following values to create new job object
Name: MySQL_To_HDFS

Database source

Schema name: smsqw
Table name: tbMessage
SQL statement:
Column names:
There are currently 0 values in the list:
element#
Partition column:
Partition column nullable:
Boundary query:

Incremental read

Check column:
Last value:

Target configuration

Override null value:
Null value:
File format:
  0 : TEXT_FILE
  1 : SEQUENCE_FILE
  2 : PARQUET_FILE
Choose: 0
Compression codec:
  0 : NONE
  1 : DEFAULT
  2 : DEFLATE
  3 : GZIP
  4 : BZIP2
  5 : LZO
  6 : LZ4
  7 : SNAPPY
  8 : CUSTOM
Choose: 1
Custom codec:
Output directory: /user/DataSource
Append mode:

Throttling resources

Extractors:
Loaders:

Classpath configuration

Extra mapper jars:
There are currently 0 values in the list:
element#
New job was successfully created with validation status OK and name MySQL_To_HDFS

2.2 PostgreSQL To HDFS Job

sqoop:000> create job -f PostgreSQL_Link -t HDFS_Link
Creating job for links with from name PostgreSQL_Link and to name HDFS_Link
Please fill following values to create new job object
Name: PostgreSQL_to_HDFS

Database source

Schema name: rhndb
Table name: rhnpackagechangelogdata
SQL statement:
Column names:
There are currently 0 values in the list:
element#
Partition column: id          -- if the table has no primary key, specify one of its columns here
Partition column nullable:
Boundary query:

Incremental read

Check column:
Last value:

Target configuration

Override null value:
Null value:
File format:
  0 : TEXT_FILE
  1 : SEQUENCE_FILE
  2 : PARQUET_FILE
Choose: 0
Compression codec:
  0 : NONE
  1 : DEFAULT
  2 : DEFLATE
  3 : GZIP
  4 : BZIP2
  5 : LZO
  6 : LZ4
  7 : SNAPPY
  8 : CUSTOM
Choose: 1
Custom codec:
Output directory: /user/DataSource/PostgreSQL/RHNPACKAGECHANGELOGDATA
Append mode:

Throttling resources

Extractors:
Loaders:

Classpath configuration

Extra mapper jars:
There are currently 0 values in the list:
element#
New job was successfully created with validation status OK and name PostgreSQL_to_HDFS

2.3 Oracle To HDFS Job

sqoop:000> create job -f Oracle_link -t HDFS_Link
Creating job for links with from name Oracle_link and to name HDFS_Link
Please fill following values to create new job object
Name: Oracle_To_HDFS

Database source

Schema name: spwuser
Table name: RHNCHANNELPACKAGE
SQL statement:
Column names:
There are currently 0 values in the list:
element#
Partition column:
Partition column nullable:
Boundary query:

Incremental read

Check column:
Last value:

Target configuration

Override null value:
Null value:
File format:
  0 : TEXT_FILE
  1 : SEQUENCE_FILE
  2 : PARQUET_FILE
Choose: 0
Compression codec:
  0 : NONE
  1 : DEFAULT
  2 : DEFLATE
  3 : GZIP
  4 : BZIP2
  5 : LZO
  6 : LZ4
  7 : SNAPPY
  8 : CUSTOM
Choose: 1
Custom codec:
Output directory: /user/DataSource/Oracle/RHNCHANNELPACKAGE
Append mode:

Throttling resources

Extractors:
Loaders:

Classpath configuration

Extra mapper jars:
There are currently 0 values in the list:
element#
New job was successfully created with validation status OK and name Oracle_To_HDFS

> Note: If the table has no primary key, you must specify a column at the "Partition column:" prompt.

3. Creating HDFS-to-RDBMS Jobs
3.1 HDFS To MySQL Job

sqoop:000> create job -f HDFS_Link -t MySQL_Link
Creating job for links with from name HDFS_Link and to name MySQL_Link
Please fill following values to create new job object
Name: HDFS_To_MySQL

Input configuration

Input directory: /user/DataSource/MySQL/tbDeliverMsg
Override null value:
Null value:

Incremental import

Incremental type:
  0 : NONE
  1 : NEW_FILES
Choose: 0
Last imported date:

Database target

Schema name: smsqw
Table name: tbDeliverMsg2
Column names:
There are currently 0 values in the list:
element#
Staging table:
Clear stage table:

Throttling resources

Extractors:
Loaders:

Classpath configuration

Extra mapper jars:
There are currently 0 values in the list:
element#
New job was successfully created with validation status OK and name HDFS_To_MySQL
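Note that Sqoop2 only loads into an existing table, so tbDeliverMsg2 has to be created in MySQL beforehand. A hypothetical sketch, assuming it mirrors the structure of the source table tbDeliverMsg:

mysql> -- create an empty structural copy of the source table (assumption about the setup)
mysql> CREATE TABLE smsqw.tbDeliverMsg2 LIKE smsqw.tbDeliverMsg;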

3.2 HDFS To PostgreSQL Job

sqoop:000> create job -f HDFS_Link -t PostgreSQL_Link
Creating job for links with from name HDFS_Link and to name PostgreSQL_Link
Please fill following values to create new job object
Name: HDFS_To_PostgreSQL

Input configuration

Input directory: /user/DataSource/PostgreSQL/RHNPACKAGECHANGELOGDATA
Override null value:
Null value:

Incremental import

Incremental type:
  0 : NONE
  1 : NEW_FILES
Choose: 0
Last imported date:

Database target

Schema name: public
Table name: rhnpackagechangelogdata
Column names:
There are currently 0 values in the list:
element#
Staging table:
Clear stage table:

Throttling resources

Extractors:
Loaders:

Classpath configuration

Extra mapper jars:
There are currently 0 values in the list:
element#
New job was successfully created with validation status OK and name HDFS_To_PostgreSQL

3.3 HDFS To Oracle Job

sqoop:000> create job -f HDFS_Link -t Oracle_link
Creating job for links with from name HDFS_Link and to name Oracle_link
Please fill following values to create new job object
Name: HDFS_To_Oracle

Input configuration

Input directory: /user/DataSource/Oracle/QRTZ_CRON_TRIGGERS
Override null value:
Null value:

Incremental import

Incremental type:
  0 : NONE
  1 : NEW_FILES
Choose: 0
Last imported date:

Database target

Schema name: spwuser
Table name: QRTZ_CRON_TRIGGERS
Column names:
There are currently 0 values in the list:
element#
Staging table:
Clear stage table:

Throttling resources

Extractors:
Loaders:

Classpath configuration

Extra mapper jars:
There are currently 0 values in the list:
element#
New job was successfully created with validation status OK and name HDFS_To_Oracle


Additionally, you can view the details of a specific job.
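A sketch of the shell commands, using one of the job names created above (show job with no arguments lists every job):

sqoop:000> show job
sqoop:000> show job --name Oracle_To_HDFS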
4. Running the Jobs
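Each of the jobs below is started from the sqoop shell in the same way. A minimal sketch using the first job's name, assuming a 1.99.7-style shell where -s (--synchronous) makes the shell wait for the run to finish and status job polls a running job:

sqoop:000> start job --name MySQL_To_HDFS --synchronous
sqoop:000> status job --name MySQL_To_HDFS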
4.1 MySQL_To_HDFS
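After the run, the extracted rows land as files under the job's output directory and can be spot-checked with the HDFS CLI (a sketch; the path is the Output directory from the MySQL_To_HDFS job definition above):

$ hdfs dfs -ls /user/DataSource
$ hdfs dfs -cat '/user/DataSource/*' | head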
4.2 PostgreSQL_to_HDFS
4.3 Oracle_To_HDFS
4.4 HDFS_To_MySQL
4.5 HDFS_To_PostgreSQL
4.6 HDFS_To_Oracle