Using Liquibase with Impala/Hive

Written by: Evgenii Seliavka

The purpose of this document is to guide you through the process of creating a new Liquibase project for Impala/Hive. In this tutorial, you will learn how to install the required database drivers and configure the liquibase.properties file to establish a database connection to Impala/Hive.

Support for Impala/Hive is provided via a Liquibase extension. The extension provides basic database operations including:

  • DML statements
  • DDL statements

Prerequisites

  • Ensure that you have installed the latest version of Liquibase. If not, go to https://www.liquibase.org/download to install it.
  • Ensure that the Liquibase executable location is in the PATH environment variable.

Note: We will refer to the location of the Liquibase executable as $LIQUIBASE_HOME in this tutorial.

Tutorial

Before you can start with database update you have to build a Impala/Hive Liquibaseplugin, you must use Maven and perform several prerequisites steps:

  • Upload Impala/Hive JDBC driver to you local maven repository with the following command:
  • mvn install:install-file -Dfile=${file} -DgroupId=${groupId} -DartifactId=${artifactId} -Dversion=${version} -Dpackaging=jar

Where:

file groupid artifactid version
ImpalaJDBC41.jar com.cloudera.impala.jdbc ImpalaJDBC41 2.6.4
HiveJDBC41.jar com.cloudera.hive.jdbc HiveJDBC41 2.6.2
  • (optional, but recommended) Deploy the above artifacts to an internal, private Maven repository such as Nexus or Artifactory, for subsequent use
  • Build liquibase-impala by executing mvn clean install. This will install liquibase-impala in your local Maven repo and create a liquibase-impala.jar fat-jar in the target/ directory.
  • (optional, but recommended) Deploy liquibase-impala to your internal, private Maven repository.

After all the above steps were been completed you have to build extension with the command below, from the extension project directory root:

mvn clean package

By default, this will create a fat-jar under the target folder with the name liquibase-impala.jar, which includes both JDBC drivers, so you no need to copy JDBC drivers to the Liquibase/lib folder, liquibase-impala.jar is self sufficient.

To configure a Liquibase project for Impala/Hive, perform the following steps:

  1. Install the Liquibase extension.
Copy the extension jar file into the $LIQUIBASE_HOME/lib directory.
  1. Create a project folder.
The project folder will be used to store the Liquibase configuration files and changelogs.
  1. Create a liquibase.properties file in the project folder.

Note: If you are not familiar with configuring the liquibase.properties file, review the Creating and configuring a liquibase.properties file tutorial.

The liquibase.properties file should contain the following properties:

  • Impala:
changeLogFile: db.changelog.xml
driver: com.cloudera.impala.jdbc41.Driver
url: jdbc:impala://localhost:21050/impala_test
classpath: ../../Liquibase_ExtJars/liquibase-impala.jar
  • Hive:
changeLogFile: db.changelog.xml
driver: com.cloudera.hive.jdbc41.HS2Driver
url: jdbc:hive2://localhost:10000/hive_test;UID=eselyavka;UseNativeQuery=1
classpath: ../../Liquibase_ExtJars/liquibase-impala.jar

Use the following format for the URL property:

jdbc:databaseXYZ://<server>/<full_db_path>
  • If you placed your jar files in the Liquibase/lib install directory, there is no need to specify the classpath property in the liquibase.properties file. Otherwise, put the path to your extension as it is shown in the preceding example.
  • If you have a Liquibase Pro key and want to apply it to your project, add the following property to your liquibase.properties file
liquibaseProLicenseKey: <paste license key>
  1. Create a dbchangelog.xml file in the project folder.

The changelog files contain a sequence of changesets, each of which makes small changes to the structure of your database. See the following to learn more about changelogs: https://docs.liquibase.com/concepts/basic/changelog.html.

Liquibase provides a changelog template located at $LIQUIBASE_HOME/examples/xml/blank.changelog.xml.

Copy the blank.changelog.xml file to the project folder and rename it to dbchangelog.xml.

The contents of the changelog will be similar to what you see below. Note that each version of Liquibase will have an updated XML schema, so make sure to use the one that matches the version of Liquibase you have installed.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog" 
xmlns:ext="http://www.liquibase.org/xml/ns/dbchangelog-ext"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog-ext 
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-ext.xsd
http://www.liquibase.org/xml/ns/dbchangelog 
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.4.xsd">
		
<changeSet author="eselyavka" id="create-table-example">
	<createTable  tableName="person">
	<column  name="firstname"  type="STRING"/>
	<column  name="lastname"  type="STRING"/>
	</createTable>
</changeSet>

<changeSet author="eselyavka" id="insert-example">
	<insert tableName="person">
	<column name="firstname" value="Foo"/>
	<column  name="lastname"  value="Bar"/>
	</insert>
</changeSet>

</databaseChangeLog>
  1. Verify the project configuration

Run the Liquibase status command to verify that the configuration is complete.

Open a command prompt and navigate to the project folder. Run the command as shown below.

[19:41:04][0][eselyavka@ThinkPad-T510:~]$ /opt/liquibase/liquibase status
Liquibase Community 3.10.3 by Datical
####################################################
##   _     _             _ _                      ##
##  | |   (_)           (_) |                     ##
##  | |    _  __ _ _   _ _| |__   __ _ ___  ___   ##
##  | |   | |/ _` | | | | | '_ \ / _` / __|/ _ \  ##
##  | |___| | (_| | |_| | | |_) | (_| \__ \  __/  ##
##  \_____/_|\__, |\__,_|_|_.__/ \__,_|___/\___|  ##
##              | |                               ##
##              |_|                               ##
##                                                ##
##  Get documentation at docs.liquibase.com       ##
##  Get certified courses at learn.liquibase.com  ##
##  Get advanced features and support at          ##
##      liquibase.com/protrial                    ##
##                                                ##
####################################################
Starting Liquibase at Thu, 22 Oct 2020 19:41:10 PDT (version 3.10.3 #32 built at Mon Oct 12 07:24:42 UTC 2020)
2 change sets have not been applied to User@jdbc:impala://localhost:21050/impala_test
Liquibase command 'status' was executed successfully.

Note: Check the corresponding JDBC URL for Impala which should start with User@jdbc:impala. For Hive it is: User@jdbc:hive2.

  1. Generate and execute SQL

Open the command prompt, navigate to the project directory, and then run the following command to get the SQL output of the changelogs:

liquibase updateSQL

To execute the changeset in your schema, run the following:

liquibase update
  1. Verify that the tables have been created

From an impala-shell/beeline CLI tool, check your database changes. You will see a new person and table added to the database. For example:

impala-shell --quiet -i localhost -d impala_test -q "SELECT * FROM person;"

beeline --silent=true -u jdbc:hive2://localhost:10000/hive_test -n "$(whoami)" -e  "SELECT * FROM PERSON;"

  1. Verify that the DATABASECHANGELOG and DATABASECHANGELOGLOCK tables were created.

Using an impala-shell/beeline CLI tool, you should see the new Liquibase tables.

impala-shell --quiet -i localhost -d impala_test -q "SHOW TABLES"

beeline --silent=true -u jdbc:hive2://localhost:10000/hive_test -n "$(whoami)" -e  "SHOW TABLES;"

You should also see two more tables:

  • DATABASECHANGELOG
  • DATABASECHANGELOGLOCK

DATABASECHANGELOG Tracking Table

This table keeps a record of all the changesets that were deployed. The next time you deploy, the changesets in the changelog will be compared with the DATABASECHANGELOG tracking table, and only the new changesets not found in the DATABASECHANGELOG will be deployed. You will notice that a new row was created in that table with the changeset information we have just deployed.

Example:

DATABASECHANGELOGLOCK Tracking Table

This table is used internally by Liquibase to manage access to the changelog table during deployment.

Congratulations!

You have successfully configured your project and can begin creating changesets to migrate changes to your database.

See also