Sponsor

Install Apache Hive di Cluster Hadoop CentOS 7

Apache Hive merupakan project data warehouse dan analytics yang dibangun di atas platform Apache Hadoop. Tools ini awalnya dikembangkan oleh developer Facebook yang menilai cara akses/query data yang disimpan di platform Hadoop terlalu susah karena masih harus menggunakan konsep Map and Reduce yang implementasinya menggunakan programming Java. Untuk itu sebagai solusinya mereka menciptakan sebuah interface yang memungkinkan untuk melakukan retrieval dan manipulasi data menggunakan bahasa yang serupa dengan syntax SQL.

Pada tutorial sebelumnya kita sudah melakukan install apache hadoop multi cluster pada server CentOS 7, tutorial kali ini membahas cara menghubungkan apache hive dengan hadoop cluster yang sudah kita buat.

Download Hive Source

$ wget https://www-us.apache.org/dist/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
$ tar -zxvf apache-hive-3.1.2-bin.tar.gz
$ sudo mv apache-hive-3.1.2-bin /usr/local/hive

Setup Environment Variable

$ su - hadoop
$ vi ~/.bashrc
# Sesuaikan dengan teks di bawah ini
# Java
export JAVA_HOME=$(dirname $(dirname $(readlink $(readlink $(which javac)))))

# Hadoop
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_COMMON_LIB_NATIVE_DIR"
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar

# Yarn
export YARN_HOME=$HADOOP_HOME

# Hive
HIVE_HOME=/usr/local/hive
HIVE_CONF_DIR=/usr/local/hive/conf

# Classpath
export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar:$HADOOP_HOME/lib/*:$HIVE_HOME/lib/*

# Path
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin

Konfigurasi File hive-env.sh

$ cd $HIVE_CONF_DIR
$ sudo cp hive-env.sh.template hive-env.sh
$ sudo chmod +x hive-env.sh
$ sudo vi hive-env.sh
# tambahkan pada akhir baris
HIVE_HOME=/usr/local/hive
HIVE_CONF_DIR=/usr/local/hive/conf

Install MySQL dan JDBC MySQL Connector (Hive Metastore)

$ sudo yum install mariadb-server
$ sudo systemctl enable mariadb
$ sudo systemctl start mariadb
$ sudo mysql_secure_installation
$ wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.30.tar.gz
$ tar -zxvf mysql-connector-java-5.1.30.tar.gz
$ sudo cp mysql-connector-java-5.1.30/mysql-connector-java-5.1.30-bin.jar $HIVE_HOME/lib/
$ sudo chmod +x $HIVE_HOME/lib/mysql-connector-java-5.1.30-bin.jar

Buat database MySQL dengan nama metastore untuk Hive (initSchema)

$ mysql -u root -p
CREATE DATABASE metastore;
USE metastore;
SOURCE /usr/local/hive/scripts/metastore/upgrade/mysql/hive-schema-3.1.0.mysql.sql;
CREATE USER 'hive'@localhost IDENTIFIED BY '';
GRANT ALL ON *.* TO 'hive'@localhost;
FLUSH PRIVILEGES;

Konfigurasi File hive-site.xml

$ cp $HIVE_CONF_DIR/hive-default.xml.template $HIVE_CONF_DIR/hive-site.xml
$ vi $HIVE_CONF_DIR/hive-site.xml
# hive-site.xml
<property>
    <name>hive.exec.local.scratchdir</name>
    <value>/tmp/hive_temp</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.execution.engine</name>
    <value>mr</value>
    <description>
      Expects one of [mr, tez, spark].
      Chooses execution engine. Options are: mr (Map reduce, default)</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true&useSSL=false</value>
    <description>metadata is stored in a MySQL server</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>Username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>***</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive_tmp</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission</description>
  </property>

Install HBase

$ cd ~/
$ wget http://apache.mirror.gtcomm.net/hbase/stable/hbase-2.2.3-bin.tar.gz
$ tar -zxvf hbase-2.2.3-bin.tar.gz
$ sudo mv hbase-2.2.3-bin /usr/local/hbase
$ sudo chown -R hadoop:hadoop /usr/local/hbase
$ sudo chmod -R g+rwx /usr/local/hbase
$ vi ~/.bashrc
# tambahkan
export HBASE_HOME=/usr/local/hbase
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$HBASE_HOME/bin
$ source ~/.bashrc
$ cd $HBASE_HOME/conf
$ vi hbase-env.sh
# atur JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.242.b08-0.el7_7.x86_64/
$ vi hbase-site.xml
# tambahkan
<configuration>
   <property>
      <name>hbase.rootdir</name>
      <value>hdfs://localhost:8030/hbase</value>
   </property>
	
   <property>
      <name>hbase.zookeeper.property.dataDir</name>
      <value>/hadoop/zookeeper</value>
   </property>
   
   <property>
     <name>hbase.cluster.distributed</name>
     <value>true</value>
   </property>
</configuration>
$ sudo mkdir -p /hadoop/zookeeper
$ sudo chown -R hadoop:hadoop /hadoop/
$ start-hbase.sh 

Buat direktori database hive di HDFS

$ hdfs dfs -mkdir /user/
$ hdfs dfs -mkdir /user/hive
$ hdfs dfs -mkdir /user/hive/warehouse
$ hdfs dfs -mkdir /tmp
$ hdfs dfs -chmod -R a+rwx /user/hive/warehouse
$ hdfs dfs -chmod g+w /tmp

Test Hive Console

$ hive
hive> show tables;
HiveQL on Hive Console
HiveQL on Hive Console

Running Metastore dan Hive as Daemon (Background)

$ mkdir $HIVE_HOME/logs
$ hive --service metastore --hiveconf hive.log.dir=$HIVE_HOME/logs --hiveconf hive.log.file=metastore.log >/dev/null 2>&1 &
$ hive --service hiveserver2 --hiveconf hive.log.dir=$HIVE_HOME/logs --hiveconf hive.log.file=hs2.log >/dev/null 2>&1 &

Testing Hive Web UI

$ hiveserver2
Akses ke browser http://hadoop02:10002
Hiveserver2 web ui
HiveServer2 Web UI

Posting Komentar

0 Komentar