Load Data Statements #
Description #
The LOAD DATA statement is used to load the data into a Hive table from the user specified directory or file. The load operation are currently pure copy/move operations that move data files into locations corresponding to Hive tables.
Syntax #
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]; Parameters #
-
filepath
The
filepathcan be:- a relative path, such as
warehouse/data1 - an absolute path, such as
/user/hive/warehouse/data1 - a full URL with schema and (optionally) an authority, such as
hdfs://namenode:9000/user/hive/warehouse/data1
The
filepathcan refer to a file (in which case, only the single file is loaded) or it can be a directory (in which case, all the files from the directory are loaded). - a relative path, such as
-
LOCALIf specify
LOCALkeyword, then:- it will look for
filepathin the local file system. If a relative path is specified, it will be interpreted relative to the users’ current working directory. The user can specify a full URI for local files as well - for example: file:///user/hive/warehouse/data1 - it will try to copy all the files addressed by
filepathto the target file system. The target file system is inferred by looking at the location attribution. The copied data files will then be moved to the location of the table.
If not, then:
- if schema or authority are not specified, it’ll use the schema and authority from the hadoop configuration variable
fs.default.namethat specifies the NameNode URI. - if the path is not absolute, then it’ll be interpreted relative to /user/
- It will try to move the files addressed by
filepathinto the table (or partition).
- it will look for
-
OVERWRITEBy default, the files addressed by
filepathwill be appended to the table (or partition). If specificOVERWRITE, the original data will be replaced by the files. -
PARTITION ( ... )An option to specify load data into table’s specific partitions. If the
PARTITIONclause is specified, the table should be a partitioned table.
NOTE:
For loading data into partition, the partition specifications must be full partition specifications. Partial partition specification is not supported yet.
Examples #
-- load data into table LOAD DATA LOCAL INPATH '/user/warehouse/hive/t1' OVERWRITE INTO TABLE t1; -- load data into partition LOAD DATA LOCAL INPATH '/user/warehouse/hive/t1/p1=1' INTO TABLE t1 PARTITION (p1=1);