[Hive] overview: distributed data warehouse

📚 Database/Big Data 2022. 8. 1. 22:24

Apache Hive is a fault-tolerant distributed data warehouse that allows for massive-scale analytics.

- Hive is built on top of Apache Hadoop, an open-source platform for storing and processing large amounts of data.

- As a result, Hive is inextricably linked to Hadoop and is designed to process petabytes of data quickly.

- Using HiveQL, its SQL-like query language, Hive allows users to read, write, and manage petabytes of data.

- Hive is distinguished by its ability to query large datasets through a SQL-like interface, with queries executed as Apache Tez or MapReduce jobs.
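
To make the last point concrete, here is a minimal sketch of what "a SQL-like interface on top of Tez or MapReduce" can look like in practice. It only composes a HiveQL script as text and does not connect to a cluster. The `hive.execution.engine` setting is a real Hive property; the helper `build_hive_script` and the table `page_views` with its columns are hypothetical names made up for illustration.

```python
# Minimal sketch: compose a HiveQL script that selects an execution engine
# and then runs a SQL-like aggregate query. Nothing here talks to a cluster.

# Engines named in the text above: "mr" (MapReduce) and "tez" (Apache Tez).
VALID_ENGINES = {"mr", "tez"}


def build_hive_script(engine: str, query: str) -> str:
    """Return a HiveQL script that sets the engine, then runs the query."""
    if engine not in VALID_ENGINES:
        raise ValueError(f"unsupported engine: {engine!r}")
    statements = [
        f"SET hive.execution.engine={engine}",
        query.strip().rstrip(";"),
    ]
    # Terminate every statement with a semicolon, one statement per block.
    return "\n".join(s + ";" for s in statements) + "\n"


# Hypothetical table `page_views` with columns `country` and `user_id`.
EXAMPLE_QUERY = """
SELECT country, COUNT(DISTINCT user_id) AS visitors
FROM page_views
GROUP BY country
ORDER BY visitors DESC
LIMIT 10
"""

script = build_hive_script("tez", EXAMPLE_QUERY)
print(script)
```

The resulting text could be saved to a file and submitted to Hive, for instance with `beeline -f`, assuming a reachable HiveServer2 instance. The query itself stays the same whichever engine is chosen; only the `SET` line changes.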

