Using Liquibase with Impala/Hive

Apache Impala is an open source massively parallel processing (MPP) database for Apache Hadoop. For more information, see Apache Impala Documentation.

Apache Hive is a data warehouse that queries distributed data integrated with Apache Hadoop. For more information, see Apache Hive Documentation.

Support for Impala/Hive is provided via a Liquibase extension. The extension provides basic database operations, including DML and DDL statements.

Supported versions

  • Impala: 2.6.4
  • Hive: 2.6.2

Prerequisites

Install drivers

To use Liquibase and Impala/Hive, you need:

Place your JAR file(s) in the liquibase/lib directory. For more information, see Adding and Updating Liquibase Drivers.

If you use Maven, you also need to download the driver JAR file and configure your Maven pom.xml file to use the local copy of the driver. For example:

Impala:

<dependency>
    <groupId>com.cloudera.impala.jdbc</groupId>
    <artifactId>ImpalaJDBC41</artifactId>
    <version>2.5.41</version>
</dependency>

Hive:

<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>3.1.3</version>
</dependency>

Maven Fat JAR

To bundle the Impala and Hive driver JAR files into a single Fat JAR, follow these steps:

  1. Upload the Impala/Hive JDBC driver to your local maven repository with the following command:
  2. mvn install:install-file -Dfile=${file} -DgroupId=${groupId} -DartifactId=${artifactId} -Dversion=${version} -Dpackaging=jar

    Where:

    file groupid artifactid version
    ImpalaJDBC41.jar com.cloudera.impala.jdbc ImpalaJDBC41 2.5.41
    HiveJDBC41.jar com.cloudera.hive.jdbc HiveJDBC41 3.1.3
  3. (Recommended) Deploy the above artifacts to an internal, private Maven repository such as Nexus or Artifactory.
  4. Run mvn clean install. This installs liquibase-impala in your local Maven repository and creates a liquibase-impala.jar Fat JAR in the target directory.
  5. (Recommended) Deploy liquibase-impala to your internal, private Maven repository.
  6. From the root folder of the extension project directory, build the extension with this command:
  7. mvn clean package

By default, this creates a Fat JAR called liquibase-impala.jar in the target folder that include both the Impala and Hive JDBC drivers. If you use the Fat JAR, you do not also need to copy the individual JDBC drivers to the liquibase/lib folder.

Test your connection

  1. Ensure your Impala/Hive database is configured. See Post-Installation Configuration for Impala and Configuration Management Overview for Hive for more information.
  2. Specify the database URL in the Liquibase properties file:
  3. url: jdbc:impala://localhost:21050/impala_test
    url: jdbc:hive2://localhost:10000/hive_test;UID=your_uid;UseNativeQuery=1

    Tip: To apply a Liquibase Pro key to your project, add the following property to the Liquibase properties file: liquibaseProLicenseKey: <paste code here>

  1. Create a text file called changelog (.xml, .sql, .json, or .yaml) in your project directory and add a changeset.
  2. Navigate to your project folder in the CLI and run the Liquibase status command to see whether the connection is successful:
  3. liquibase --username=test --password=test --changelog-file=<changelog.xml> status

    Note: You can pass arguments in the CLI or keep them in the Liquibase properties file.

  4. Inspect the SQL with the update-sql command. Then make changes to your database with the update command.
  5. liquibase --changelog-file=<changelog.xml> update-sql
    liquibase --changelog-file=<changelog.xml> update
  6. From a database UI tool, ensure that your database contains the test_table you added along with the DATABASECHANGELOG table and DATABASECHANGELOGLOCK table.

For example, using impala-shell or beeline in the CLI:

impala-shell --quiet -i localhost -d impala_test -q "SELECT * FROM person;"

beeline --silent=true -u jdbc:hive2://localhost:10000/hive_test -n "$(whoami)" -e  "SELECT * FROM PERSON;"

impala-shell --quiet -i localhost -d impala_test -q "SHOW TABLES"

beeline --silent=true -u jdbc:hive2://localhost:10000/hive_test -n "$(whoami)" -e  "SHOW TABLES;"

Related links