The Apache Hive ™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. It helps in projecting structure on data already present in a storage.
Apache Hive supports following data types:
Numeric Types: TINYINT, SMALLINT, INT, BIGINT, FLOAT
Date/Time Types: TIMESTAMP, DATE, INTERVAL
String Types: STRING, VARCHAR, CHAR
Misc Types: BOOLEAN, BINARY
Complex Types: arrays, maps, structs, union
Numeric Types in Apache Hive are:
TINYINT: (1-byte signed integer, from -128 to 127)
SMALLINT: (2-byte signed integer, from -32,768 to 32,767)
INT/INTEGER: (4-byte signed integer, from -2,147,483,648 to 2,147,483,647)
BIGINT: (8-byte signed integer, from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807)
Float Types in Apache Hive are:
FLOAT (4-byte single precision floating point number)
DOUBLE (8-byte double precision floating point number)
DOUBLE PRECISION (alias for DOUBLE, only available starting with Hive 2.2.0)
DECIMAL Introduced in Hive 0.11.0 with a precision of 38 digits
NUMERIC (same as DECIMAL, starting with Hive 3.0.0)
Date/Time Types in Apache Hive are:
TIMESTAMP: Only available starting with Hive 0.8.0
DATE: Only available starting with Hive 0.12.0
INTERVAL: Only available starting with Hive 1.2.0
String Types in Apache Hive are:
STRING
VARCHAR: Only available starting with Hive 0.12.0
CHAR: Only available starting with Hive 0.13.0
Misc Types in Apache Hive are:
BOOLEAN
BINARY: Only available starting with Hive 0.8.0
Complex Types in Apache Hive are:
arrays: ARRAY<data_type> (Note: negative values and non-constant expressions are allowed as of Hive 0.14.)
maps: MAP<primitive_type, data_type> (Note: negative values and non-constant expressions are allowed as of Hive 0.14.)
structs: STRUCT<col_name : data_type [COMMENT col_comment], …>
union: UNIONTYPE<data_type, data_type, …> (Note: Only available starting with Hive 0.7.0.)
HiveQL DDL statements are as follows:
CREATE DATABASE/SCHEMA, TABLE, VIEW, FUNCTION, INDEX
DROP DATABASE/SCHEMA, TABLE, VIEW, INDEX
TRUNCATE TABLE
ALTER DATABASE/SCHEMA, TABLE, VIEW
HiveQL DDL statements are as follows:
MSCK REPAIR TABLE (or ALTER TABLE RECOVER PARTITIONS)
SHOW DATABASES/SCHEMAS, TABLES, TBLPROPERTIES, VIEWS, PARTITIONS, FUNCTIONS, INDEX[ES], COLUMNS, CREATE TABLE
DESCRIBE DATABASE/SCHEMA, table_name, view_name, materialized_view_name
Hive supports following DML statements:
LOAD
INSERT
UPDATE
DELETE
MERGE
A SELECT statement can be part of a union query or a subquery of another query.
table_reference indicates the input to the query.
It can be a regular table, a view, a join construct or a subquery.
Table names and column names are case insensitive.
Hive supports following in SELECT query:
GROUP BY
SORT/ORDER/CLUSTER/DISTRIBUTE BY
JOIN
UNION
TABLESAMPLE
Subqueries
Virtual Columns
Refer official documents on Apache Hive here:
Hive Documentation: https://hive.apache.org
Language Manual: https://cwiki.apache.org/confluence/display/Hive/LanguageManual