Wednesday, 8 June 2016

Sqoop & SQL Server #12 - Sqoop Cheat Sheet

Theme
I want to…
Sqoop Example
General
See what version of Sqoop is installed.
sqoop version
General
List the databases I have permission to see.
sqoop list-databases --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];username=[UserName];password=[Password]"
General
List the tables I have permission to see for a specific database.
sqoop list-tables --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]"
Import
Import a specific table into HDFS.
sqoop import --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" --table [TableName]
Import
Import a specified table from a named instance into HDFS.
sqoop import --connect "jdbc:sqlserver://[IPAddress]\\[InstanceName];database=[Database];username=[UserName];password=[Password]" --table [TableName]
Import
Import a specific table from a specified schema into HDFS.
sqoop import --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" --table [TableName] -- --schema [SchemaName]
Import
Import all tables from a database.
sqoop import-all-tables --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]"
Import
Import all tables from a database with specified exceptions.
sqoop import-all-tables --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" --exclude-tables [TableName1],[TableName2]
Import
Import a specific table into a defined HDFS location.
sqoop import --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" --table [TableName] --target-dir [Location]
Import
Import specified columns from a table into HDFS.
sqoop import --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" --table [TableName] --columns "[Column1],[Column2],[Column3]"
Import
Import into HDFS using a T-SQL query.
sqoop import \
--connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" \
--query 'SELECT a.*, b.* FROM a JOIN b ON (a.id = b.id) WHERE $CONDITIONS' \
--split-by a.id \
--target-dir [Location]
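The `$CONDITIONS` token is where Sqoop injects a range predicate for each map task, derived from the `--split-by` column. A rough local sketch of how those ranges are carved up, assuming hypothetical MIN/MAX values of 1 and 900 and three mappers (Sqoop's real boundary query may round differently):

```shell
# Hypothetical split calculation: 3 mappers over ids 1..900
lo=1; hi=900; m=3
step=$(( (hi - lo + 1) / m ))
for i in 0 1 2; do
  s=$(( lo + i * step ))
  e=$(( s + step - 1 ))
  # Each mapper runs the query with $CONDITIONS replaced by its own range
  echo "mapper $i: WHERE a.id >= $s AND a.id <= $e"
done
```

Each mapper then imports only its slice, which is why a free-form query import without `--split-by` must fall back to a single mapper (`-m 1`).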

Import
Spread the load of an import across the cluster using parallel map tasks.
sqoop import --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" --table [TableName] -m [NumberOfMappers]
Import
Import a specific table into a parquet format file in HDFS.
sqoop import --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" --table [TableName] --as-parquetfile
General
Define an Options File for code reuse
#
# Options File for Sqoop Import
#

# Command
import

# Connection String
--connect
jdbc:sqlserver://[IPAddress]:[PortNumber];database=[DatabaseName]

# Credentials
--username
[UserName]
--password
[Password]

# End
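As a hypothetical local illustration of how Sqoop consumes an options file: blank lines and `#` comments are ignored, and every remaining line is spliced into the command line as a single option or argument, which is why each option and its value sit on separate lines:

```shell
# Write a throwaway options file (placeholder values for illustration)
cat > /tmp/sqoop-import.txt <<'EOF'
# Command
import

# Connection String
--connect
jdbc:sqlserver://192.168.1.10:1433;database=Sales
EOF

# Show the argument list Sqoop would expand the file into
grep -v '^#' /tmp/sqoop-import.txt | grep -v '^$'
```

The three surviving lines (`import`, `--connect`, and the JDBC URL) are exactly what would be typed on the command line.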

Import
Use an Options File to import data into HDFS
sqoop --options-file [LocationOfFile]/[NameOfFile] --table Customers

Import
Import a table into Hive
sqoop import --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" --table [TableName] --hive-import
Import
Overwrite an existing Hive table during an Import
sqoop import --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" --table [TableName] --hive-import --hive-overwrite
Import
Import a table into Hive and handle NULLs
sqoop import --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" --table [TableName] --hive-import --null-string '\\N' --null-non-string '\\N'
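Hive stores a SQL NULL as the literal text \N in its data files. The double backslash is needed because Sqoop applies its own escape processing to the argument, so the shell has to hand it the two-character sequence \\N; single quotes do exactly that (a sketch of the quoting only, not a Sqoop invocation):

```shell
# Inside single quotes the shell performs no escape processing,
# so the program receives the literal characters \\N; Sqoop then
# unescapes that to \N, Hive's on-disk NULL marker.
printf '%s\n' '\\N'
```

With double quotes instead, the shell would collapse \\N to \N before Sqoop ever saw it.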
Import
Import a table into Hive and remove Hive delimiters from the data
sqoop import --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" --table [TableName] --hive-import --hive-drop-import-delims
Import
Import a table into a Hive partition
sqoop import --connect "jdbc:sqlserver://[IPAddress]:[PortNumber];database=[Database];username=[UserName];password=[Password]" --table [TableName] --hive-import --hive-partition-key [PartitionColumn] --hive-partition-value "[PartitionValue]"
Export
Export data from HDFS to SQL Server
sqoop export --connect "jdbc:sqlserver://[IPAddress];database=[DatabaseName];instanceName=[InstanceName];username=[UserName];password=[Password]" --table [SQLServerTableName] --export-dir [HDFSTableLocation]

Export
Export specific columns from HDFS to SQL Server
sqoop export --connect "jdbc:sqlserver://[IPAddress];database=[DatabaseName];instanceName=[InstanceName];username=[UserName];password=[Password]" --table [SQLServerTableName] --columns "[Column1],[Column2],[Column3]" --export-dir [HDFSTableLocation]

Export
Export data from HDFS to SQL Server and handle NULLs
sqoop export --connect "jdbc:sqlserver://[IPAddress];database=[DatabaseName];instanceName=[InstanceName];username=[UserName];password=[Password]" --table [SQLServerTableName] --export-dir [HDFSTableLocation] --input-null-string '\\N' --input-null-non-string '\\N'

Export
Update SQL Server data from HDFS using an Export command
sqoop export --connect "jdbc:sqlserver://[IPAddress];database=[DatabaseName];instanceName=[InstanceName];username=[UserName];password=[Password]" --table [SQLServerTableName] --export-dir [HDFSTableLocation] --update-key "[KeyColumn]"

Export
Update SQL Server data and insert new rows from HDFS using an Export command
sqoop export --connect "jdbc:sqlserver://[IPAddress];database=[DatabaseName];instanceName=[InstanceName];username=[UserName];password=[Password]" --table [SQLServerTableName] --export-dir [HDFSTableLocation] --update-key "[KeyColumn]" --update-mode allowinsert
Export
Export data from HDFS to SQL Server in one batch process
sqoop export --connect "jdbc:sqlserver://[IPAddress];database=[DatabaseName];instanceName=[InstanceName];username=[UserName];password=[Password]" --table [SQLServerTableName] --export-dir [HDFSTableLocation] --batch
