Apache Hive Installation – Install Hive on Ubuntu in terminal

3/16/2025



Apache Hive Installation – Install Hive on Ubuntu in 5 Minutes

Apache Hive is a data warehousing tool built on top of Hadoop. This guide walks through installing Apache Hive 4.0.0 on Ubuntu, starting HiveServer2, and connecting to it with the Beeline command-line shell.

What is Apache Hive?

Apache Hive is a data warehouse system that lets you query and analyze data stored in Hadoop using a SQL-like language (HiveQL). Before installing Hive, Java and Hadoop must already be installed and working on your Linux system.

Prerequisites

  • Java installed and configured
  • Hadoop up and running
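
Before going further, it can help to confirm that both prerequisites are actually on the PATH. The following is a minimal sketch; have_cmd is a helper name made up for this example, not something shipped with Hive or Hadoop.

```shell
#!/bin/sh
# have_cmd NAME
# Returns 0 if NAME resolves to a command on the current PATH.
# (have_cmd is a hypothetical helper defined only for this check.)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of each prerequisite named in the list above.
for tool in java hadoop; do
    if have_cmd "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

If either tool is reported as missing, fix that first; the later steps depend on both.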

Step-by-step guide to installing Apache Hive 4.0.0 on Ubuntu

1. Download Hive

Step 1: Download the Hive 4.0.0 binary release from the official Apache website (releases that have been superseded are moved to https://archive.apache.org/dist/hive/):

wget https://dlcdn.apache.org/hive/hive-4.0.0/apache-hive-4.0.0-bin.tar.gz
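
Apache publishes checksums alongside its releases, so the download can be verified before it is extracted. The sketch below assumes the sha256sum utility is available (it is part of coreutils on Ubuntu) and that you copy the expected hash from the release's checksum file yourself; verify_sha256 is a hypothetical helper name.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HASH
# Returns 0 when the SHA-256 digest of FILE equals EXPECTED_HASH.
# The expected hash is lowercased first, since published hashes
# are sometimes written in uppercase.
verify_sha256() {
    actual=$(sha256sum "$1" | awk '{print $1}')
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    [ "$actual" = "$expected" ]
}

# Usage (PUBLISHED_HASH is a placeholder you must fill in yourself):
# verify_sha256 apache-hive-4.0.0-bin.tar.gz "$PUBLISHED_HASH" && echo "checksum OK"
```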

Step 2: Extract the downloaded tar file:

tar -xzf apache-hive-4.0.0-bin.tar.gz

2. Configuring Hive Files

Step 3: Set Hive environment variables in the .bashrc file:

nano ~/.bashrc

Add the following lines at the end of the file, adjusting the path if you extracted Hive somewhere other than /home/ubuntu:

export HIVE_HOME="/home/ubuntu/apache-hive-4.0.0-bin"
export PATH=$PATH:$HIVE_HOME/bin

Save the file by pressing CTRL+O and then Enter, then exit with CTRL+X.

Step 4: Load the updated environment variables:

source ~/.bashrc
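
If you prefer to script Steps 3 and 4 instead of editing the file by hand, one option is to append each line only when it is not already there, so that re-running the script does not create duplicates. This is a sketch; append_once is a hypothetical helper, and the example paths are the ones used in this guide.

```shell
#!/bin/sh
# append_once LINE FILE
# Appends LINE to FILE unless an identical line is already present.
append_once() {
    touch "$2"
    # -x: match the whole line, -F: treat LINE as a fixed string, not a regex
    grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Example (left commented out so that nothing is modified by accident):
# append_once 'export HIVE_HOME="/home/ubuntu/apache-hive-4.0.0-bin"' "$HOME/.bashrc"
# append_once 'export PATH=$PATH:$HIVE_HOME/bin' "$HOME/.bashrc"
```

After running it, open a new terminal or run source ~/.bashrc, then check the result with echo $HIVE_HOME.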

Step 5: Edit the core-site.xml file in Hadoop’s configuration directory. This guide assumes /etc/hadoop/; on many installations the directory is $HADOOP_HOME/etc/hadoop/ instead:

nano /etc/hadoop/core-site.xml

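Add the following properties. If the file already has a <configuration> element, place the two <property> blocks inside it rather than adding a second one. The "hive" in each property name is the user that is allowed to impersonate other users, so it should match the OS user that runs HiveServer2: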

<configuration>
  <property>
    <name>hadoop.proxyuser.hive.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hive.hosts</name>
    <value>*</value>
  </property>
</configuration>

Save and exit, then restart the Hadoop daemons so that the new settings take effect.
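
A quick way to confirm that both proxy-user properties made it into the file is a plain text search. This is only a sketch: has_property is a hypothetical helper, it matches text rather than parsing XML, and CORE_SITE defaults to the path assumed in this guide.

```shell
#!/bin/sh
# has_property FILE NAME
# Returns 0 if FILE contains a <name>NAME</name> element.
# This is a plain text match, not a real XML parse.
has_property() {
    grep -qF -- "<name>$2</name>" "$1"
}

# Path assumed by this guide; override CORE_SITE if yours differs.
CORE_SITE=${CORE_SITE:-/etc/hadoop/core-site.xml}

if [ -f "$CORE_SITE" ]; then
    for key in hadoop.proxyuser.hive.groups hadoop.proxyuser.hive.hosts; do
        if has_property "$CORE_SITE" "$key"; then
            echo "present: $key"
        else
            echo "missing: $key"
        fi
    done
fi
```

Once Hadoop is running, hdfs getconf -confKey hadoop.proxyuser.hive.hosts prints the value Hadoop actually resolved, which is a stronger check than searching the file.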

3. Create Hive Warehouse Directory in HDFS

Step 6: Create the Hive directories in HDFS. The -p flag creates any missing parent directories and does not fail if a directory already exists:

hadoop fs -mkdir -p /tmp
hadoop fs -mkdir -p /user/hive/warehouse

Step 7: Grant group write permission on both directories:

hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse
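
Steps 6 and 7 can be combined into one repeatable script. The sketch below adds a dry-run switch so you can see the commands before they touch HDFS; run, setup_hive_dirs and DRY_RUN are names invented for this example.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) only prints the commands.
# Set DRY_RUN=0 to actually execute them against HDFS.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]
# Prints the command in dry-run mode, otherwise executes it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Creates the two directories Hive needs and makes them group-writable.
setup_hive_dirs() {
    for dir in /tmp /user/hive/warehouse; do
        run hadoop fs -mkdir -p "$dir"
        run hadoop fs -chmod g+w "$dir"
    done
}

setup_hive_dirs
```

To verify the result on a live cluster, hadoop fs -ls -d /tmp /user/hive/warehouse lists both directories together with their permission bits.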

4. Initialize Derby Database

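Step 8: Initialize the metastore schema in the embedded Derby database. Derby creates a metastore_db directory in the current working directory, so run this command, and later HiveServer2, from the same directory: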

$HIVE_HOME/bin/schematool -dbType derby -initSchema
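
Running -initSchema a second time against an existing embedded Derby metastore typically fails, so a scripted install may want to check first. This sketch assumes Hive's default embedded Derby setup, where the database lives in a metastore_db directory under the current working directory; metastore_exists is a hypothetical helper.

```shell
#!/bin/sh
# metastore_exists DIR
# Returns 0 if DIR already contains an embedded Derby metastore_db directory.
metastore_exists() {
    [ -d "$1/metastore_db" ]
}

if metastore_exists "$PWD"; then
    echo "metastore_db already present in $PWD; skipping -initSchema"
elif [ -x "${HIVE_HOME:-}/bin/schematool" ]; then
    "$HIVE_HOME/bin/schematool" -dbType derby -initSchema
else
    echo "schematool not found; is HIVE_HOME set?"
fi
```

To inspect an already initialized metastore, schematool -dbType derby -info reports its schema version.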

5. Launch HiveServer2

Step 9: Start HiveServer2. The command runs in the foreground, so leave this terminal open:

$HIVE_HOME/bin/hiveserver2
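
HiveServer2 does not accept connections the instant the command is launched, so scripts that connect right away can fail. One way to handle this is to poll the port used in this guide (10000) until it opens. The sketch below assumes the nc (netcat) utility is installed; wait_for_port is a hypothetical helper.

```shell
#!/bin/sh
# wait_for_port HOST PORT TIMEOUT_SECONDS
# Polls about once per second until HOST:PORT accepts a TCP connection.
# Returns 0 on success, 1 if the timeout elapses first.
# Assumes the nc (netcat) utility is installed.
wait_for_port() {
    remaining=$3
    while [ "$remaining" -gt 0 ]; do
        if nc -z "$1" "$2" >/dev/null 2>&1; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}

# Example: wait up to two minutes for HiveServer2 (left commented out):
# wait_for_port localhost 10000 120 && echo "HiveServer2 is accepting connections"
```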

Step 10: Open a new terminal and connect using Beeline. If the connection is refused, HiveServer2 may still be starting, so wait a little and retry:

$HIVE_HOME/bin/beeline -u jdbc:hive2://localhost:10000 -n hive

6. Verify Hive Installation

Step 11: At the Beeline prompt, run a simple query to list databases. A fresh installation should show the default database:

show databases;
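
The same check can be run without opening the interactive prompt, which is handy in scripts. This sketch uses Beeline's -e option to execute a single statement and exit; hive_jdbc_url is a hypothetical helper, and the host, port and user name are the ones used in this guide.

```shell
#!/bin/sh
# hive_jdbc_url HOST PORT
# Prints a HiveServer2 JDBC URL for the given host and port.
hive_jdbc_url() {
    printf 'jdbc:hive2://%s:%s\n' "$1" "$2"
}

url=$(hive_jdbc_url localhost 10000)

if [ -x "${HIVE_HOME:-}/bin/beeline" ]; then
    # -e runs one statement and exits instead of starting the interactive shell
    "$HIVE_HOME/bin/beeline" -u "$url" -n hive -e 'show databases;'
else
    echo "beeline not found; would connect to $url"
fi
```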

Conclusion

You have now installed Apache Hive 4.0.0 on Ubuntu and can run SQL queries against your data in Hadoop through HiveServer2. For further Hive tutorials and advanced query operations, stay tuned to orientalguru.co.in!
