Using Liquibase with Databricks Platforms

Databricks SQL is a fully managed analytics data warehouse that enables users to run SQL queries on large datasets stored in cloud-based data lakes. It is designed for data analysts and engineers who need to perform data exploration, transformation, and analysis. For more information, see Databricks SQL Documentation.

You can deploy most standard Liquibase Change Types to Databricks. Liquibase supports Databricks-specific functionality with some additional Change Types available in the Liquibase extension for Databricks. For a list of supported features, see the "Features" section.

Liquibase offers two extensions for Databricks: a commercial (Pro) extension for Liquibase Pro users and an open-source (OSS) extension for Liquibase Open Source users. The Databricks Pro extension contains all functionality in the Databricks OSS extension, as well as many additional features. To access Liquibase Pro features, you need a Liquibase Pro license key.

Getting started tutorial

To learn how to install, configure, and use the Liquibase Databricks extension with your instance of Databricks, see Using Liquibase with Databricks SQL. This page contains driver download links, permissions guidance, and a sample changelog to use for a test deployment.

Features

Supported Liquibase Pro features include:

  • Policy Checks: automatically analyze your changelogs for desired format and behavior to increase deployment success rates and uphold security best practices
  • Secrets Management: keep your authentication data secure by integrating with third-party secrets vaults
  • Structured Logging: improve your database observability by easily reading Liquibase data in your favorite analytics tool
  • Operation Reports: generate reports of operations you perform on your database
  • Flow Files: create repeatable, portable, and platform-independent Liquibase workflows to run in any of your CI/CD tools
  • DATABASECHANGELOGHISTORY table (DBCLH): record a history of all changes you make to the database, including changes that are not tracked by the DATABASECHANGELOG table (DBCL)
  • Remote file access: centralize file management with AWS S3 to build a reusable repository of Liquibase files you can update and retrieve
  • Targeted rollback: avoid collateral damage by specifying which changesets in your changelog to undo
  • Stored Logic: capture stored logic objects like functions

Supported Change Types

Databricks-specific Change Types available with the Liquibase Open Source Databricks extension and Liquibase Pro Databricks extension:

Databricks-specific Change Types that require the Liquibase Pro Databricks extension:

Liquibase Open Source Change Types that accept Databricks-specific attributes or sub-tags:

  • createTable: create a table 
    • Databricks sub-tag: extendedTableProperties: specify additional properties on a table, including clusters and partitions
  • createView: create a view on a Databricks table
    • Databricks attribute: tblProperties: similar to extendedTableProperties, but for a view

For more information, see Liquibase Change Types for Databricks.
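
As an illustration of the attributes above, here is a minimal sketch of a YAML changelog. It assumes that tableFormat and partitionColumns are set inside the extendedTableProperties sub-tag, and that tblProperties takes a comma-separated string of quoted key-value pairs. The table, view, column, and property names are hypothetical. Check the exact attribute names and value formats against Liquibase Change Types for Databricks before you deploy.

```yaml
databaseChangeLog:
  - changeSet:
      id: create-events-table        # hypothetical changeset id
      author: your.name
      changes:
        - createTable:
            tableName: events        # hypothetical table name
            columns:
              - column:
                  name: event_id
                  type: int
              - column:
                  name: event_date
                  type: date
            # Databricks sub-tag; the nesting of these attributes is an assumption
            extendedTableProperties:
              tableFormat: delta
              partitionColumns: event_date
  - changeSet:
      id: create-events-view
      author: your.name
      changes:
        - createView:
            viewName: recent_events  # hypothetical view name
            # Databricks attribute; the value format is an assumption
            tblProperties: "'team.owner'='analytics'"
            selectQuery: SELECT event_id, event_date FROM events
```

Because generated changelogs omit tableFormat and partitionColumns (see "Limitations"), set these values when you first create the table rather than expecting to add them later.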

Liquibase Change Types:

Supported parameters

You can use all normal Liquibase Parameters with Databricks. The Liquibase Pro extension for Databricks offers additional parameters for more advanced functionality.

Note: The following parameters, including databricks-catalog and databricks-schema, only work with OIDC authentication. They will not work for token authentication.

To configure Databricks-specific behavior, set the following Liquibase parameters in the CLI, in flow files, in your liquibase.properties file, or as environment variables:

For more information, see Liquibase Parameters for Databricks.

Supported commands

You can use the Liquibase CLI commands or Liquibase Maven goals with Databricks:

Data type handling

For information about how Liquibase handles data types in Databricks, see Liquibase Data Type Handling.

Limitations

  • Liquibase does not support the following Change Types:
  • You can deploy changes to Databricks using formatted SQL, YAML, JSON, and XML changelogs, but not unformatted SQL changelogs. To write SQL directly, you can use the sql and sqlFile Change Types.
  • Liquibase does not include the values of tableFormat and partitionColumns in files generated by the database inspection commands diff-changelog and generate-changelog. Databricks cannot add new partition columns to an existing table, nor can it change the table's format or location. Applying these attributes to an existing table would cause the deployment to fail, so Liquibase omits them from the changesets it generates.
    • However, the outputs of diff and snapshot do include tableFormat and partitionColumns.
  • Liquibase does not include all auto-increment (identity) information in files generated by diff-changelog. This information is omitted when a table has the same name in both the source and target databases, but different values for autoIncrement. Also, the addAutoIncrement Change Type is not supported for Databricks.  
    • However, files generated by the diff command do include auto-increment information even when the tables have different values for autoIncrement.
  • Databricks does not support directly adding a new column with a default value to a table that already contains columns with default values. In Liquibase Databricks 1.4.0, you cannot use the addDefaultValue Change Type in a YAML, JSON, or XML changelog to add a default value to an existing Databricks table column. Instead, you must specify the SQL queries for addDefaultValue in a formatted SQL changelog or use the sql or sqlFile Change Types in a YAML, JSON, or XML changelog. For syntax examples, see the "Troubleshooting" section of the addDefaultValue page.
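
The addDefaultValue workaround in the last limitation could look something like the following sketch, which wraps raw SQL in the sql Change Type inside a YAML changelog. The table name (orders), column name (status), and default value are hypothetical, and the ALTER TABLE statements assume the table already allows column defaults. On Delta tables you may first need to enable the delta.feature.allowColumnDefaults table property. Treat the "Troubleshooting" section of the addDefaultValue page as the authoritative syntax.

```yaml
databaseChangeLog:
  - changeSet:
      id: set-default-order-status   # hypothetical changeset id
      author: your.name
      changes:
        # Raw SQL used in place of the addDefaultValue Change Type
        - sql:
            sql: "ALTER TABLE orders ALTER COLUMN status SET DEFAULT 'new'"
      rollback:
        # Liquibase cannot infer a rollback for raw SQL, so one is given explicitly
        - sql:
            sql: "ALTER TABLE orders ALTER COLUMN status DROP DEFAULT"
```

The same pattern works with sqlFile if you would rather keep the statements in a separate .sql file.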

Verified database versions

Liquibase Pro Databricks extension

Liquibase Databricks Pro extension version | Liquibase Pro version required | Databricks SQL versions verified
1.0.0 | Liquibase Pro 4.31.0+ (requires Liquibase Pro license key) | Cloud: AWS, Azure, Google Cloud Platform

Note: The Liquibase Pro Databricks extension also contains the open-source extension binary.

Liquibase Open Source Databricks extension

Liquibase Databricks Open Source extension version | Liquibase version required | Databricks SQL versions verified
1.4.1 | 4.31.0+ | Cloud: AWS, Azure, Google Cloud Platform
1.4.0 | 4.30.0+ |
1.3.0 | 4.29.1+ |
1.2.0 | 4.27.0+ |
1.1.3 | 4.26.0+ |
1.1.0–1.1.2 | 4.25.0+ |
1.0.0–1.0.1 | 4.23.2+ |

For release notes, see Liquibase Open Source Databricks Extension Release Notes.
