📚 데이터베이스/빅데이터
[Hive] overview: distributed data warehouse
써니(>_<)
2022. 8. 1. 22:24
Apache Hive is a fault-tolerant distributed data warehouse that allows for massive-scale analytics.
- Hive is built on top of Apache Hadoop, an open-source platform for storing and processing large amounts of data.
-As a result, Hive is inextricably linked to Hadoop and is designed to process petabytes of data quickly.
- Using SQL, Hive allows users to read, write, and manage petabytes of data.
-Hive is distinguished by its ability to query large datasets with a SQL-like interface utilizing Apache Tez or MapReduce.