Using Liquibase with Databricks Platforms

Databricks SQL is a fully managed analytics data warehouse that enables users to run SQL queries on large datasets stored in cloud-based data lakes. It is designed for data analysts and engineers who need to perform data exploration, transformation, and analysis. For more information, see Databricks SQL Documentation.

You can deploy most standard Liquibase Change Types to Databricks. Liquibase supports Databricks-specific functionality with some additional Change Types available in the Liquibase extension for Databricks. For a list of supported features, see the "Features" section.

Getting started tutorial

To learn how to install, configure, and use the Liquibase Databricks extension with your instance of Databricks, see Using Liquibase with Databricks SQL. This page contains driver download links, permissions guidance, and a sample changelog to use for a test deployment.

Verified database versions

Liquibase Databricks extension version   Liquibase version required   Databricks SQL versions verified
1.4.0                                    4.30.0+                      Cloud
1.3.0                                    4.29.1+
1.2.0                                    4.27.0+
1.1.3                                    4.26.0+
1.1.0–1.1.2                              4.25.0+
1.0.0–1.0.1                              4.23.2+

Features

Supported Liquibase Pro features include:

  • Policy Checks: automatically analyze your changelogs for desired format and behavior to increase deployment success rates and uphold security best practices
  • Secrets Management: keep your authentication data secure by integrating with third-party secrets vaults
  • Structured Logging: improve your database observability by easily reading Liquibase data in your favorite analytics tool
  • Operation Reports: generate reports of operations you perform on your database
  • Flow Files: create repeatable, portable, and platform-independent Liquibase workflows to run in any of your CI/CD tools
  • DATABASECHANGELOGHISTORY table (DBCLH): record a history of all changes you make to the database, including changes that are not tracked by the DATABASECHANGELOG table (DBCL)
  • Remote file access: centralize file management with AWS S3 to build a reusable repository of Liquibase files you can update and retrieve
  • Targeted rollback: avoid collateral damage by specifying which changesets in your changelog to undo
  • Stored Logic: capture stored logic objects like procedures, functions, packages, and triggers
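
As a rough illustration of the Flow Files feature listed above, the following is a minimal sketch of a flow file that validates the changelog, reports pending changesets, and then deploys them. The stage name and the choice of commands are illustrative assumptions, not a recommended workflow; the connection details are assumed to come from your existing Liquibase configuration.

```yaml
# Minimal flow file sketch (for example, saved as liquibase.flowfile.yaml).
# The stage name "Default" and the sequence of commands are illustrative only.
stages:
  Default:
    actions:
      # Check the changelog for errors before touching the database
      - type: liquibase
        command: validate
      # List changesets that have not been deployed yet
      - type: liquibase
        command: status
        cmdArgs: { verbose: true }
      # Deploy the pending changesets to Databricks
      - type: liquibase
        command: update
```

A file like this would typically be run with the flow command, so the same sequence can be repeated from any CI/CD tool.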

Supported Change Types

Databricks-specific Change Types available in Liquibase Open Source and Liquibase Pro:

Liquibase Change Types that accept Databricks attributes or sub-tags:

  • createTable: create a table 
    • Databricks sub-tag: extendedTableProperties: specify additional properties on a table, including clusters and partitions
  • createView: create a view on a Databricks table
    • Databricks attribute: tblProperties: similar to extendedTableProperties, but for a view
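
The following YAML changelog is a minimal sketch of how these two Change Types might look with their Databricks additions. The table, view, and column names are hypothetical, and the exact nesting and value formats of extendedTableProperties and tblProperties are assumptions to check against the extension's Change Type reference before use.

```yaml
databaseChangeLog:
  - changeSet:
      id: create-events-table        # hypothetical changeset id
      author: your.name
      changes:
        - createTable:
            tableName: events        # hypothetical table name
            columns:
              - column:
                  name: event_id
                  type: int
              - column:
                  name: event_date
                  type: date
            # Databricks sub-tag; attribute names are taken from this page,
            # but their placement and value formats are assumptions
            extendedTableProperties:
              tableFormat: delta
              partitionColumns: event_date
  - changeSet:
      id: create-events-view         # hypothetical changeset id
      author: your.name
      changes:
        - createView:
            viewName: recent_events  # hypothetical view name
            selectQuery: SELECT event_id, event_date FROM events
            # Databricks attribute; the key/value format shown is an assumption
            tblProperties: "'purpose'='reporting'"
```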

Liquibase Change Types:

Supported commands

You can use the Liquibase CLI commands or Liquibase Maven goals with Databricks:

Data type handling

For information about how Liquibase handles data types in Databricks, see Liquibase Data Type Handling.

Limitations

  • You can deploy changes to Databricks using YAML, JSON, and XML changelogs, but not SQL changelogs. To write SQL directly, you can use the sql and sqlFile Change Types.
  • Liquibase does not include the values of tableFormat and partitionColumns in files generated by the database inspection commands diff-changelog and generate-changelog. Databricks cannot add new partition columns to an existing table, nor can it change the table's format or location. Applying these attributes to existing tables would cause failures, so Liquibase omits them from the changesets it generates.
    • However, the outputs of diff and snapshot do include tableFormat and partitionColumns.
  • Liquibase does not include all auto-increment (identity) information in files generated by diff-changelog. This information is omitted when a table has the same name in both the source and target databases, but different values for autoIncrement. Also, the addAutoIncrement Change Type is not supported for Databricks.  
    • However, files generated by the diff command do include auto-increment information even when the tables have different values for autoIncrement.
  • Databricks does not support directly adding a new column with a default value to a table that already contains columns with default values. In Liquibase Databricks 1.4.0, you cannot use the addDefaultValue Change Type in a YAML, JSON, or XML changelog to add a default value to an existing Databricks table column. Instead, you must specify the SQL queries for addDefaultValue in a Formatted SQL changelog or use the sql or sqlFile Change Types in a YAML, JSON, or XML changelog. For syntax examples, see the "Troubleshooting" section of the addDefaultValue page.
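
The sql and sqlFile workaround mentioned in the limitations above could look something like the following YAML changelog. This is a sketch only: the table name orders, the column name status, the default value, and the file path are hypothetical, and the ALTER TABLE statements assume that your Databricks table already allows column defaults.

```yaml
databaseChangeLog:
  - changeSet:
      id: set-default-status         # hypothetical changeset id
      author: your.name
      changes:
        # Raw SQL used in place of the addDefaultValue Change Type
        - sql:
            sql: ALTER TABLE orders ALTER COLUMN status SET DEFAULT 'new'
      rollback:
        # Liquibase cannot auto-generate rollback for raw SQL, so state it explicitly
        - sql:
            sql: ALTER TABLE orders ALTER COLUMN status DROP DEFAULT
  - changeSet:
      id: run-sql-from-file          # hypothetical changeset id
      author: your.name
      changes:
        # Same idea, with the statements kept in a separate file
        - sqlFile:
            path: sql/set_defaults.sql   # hypothetical path
            relativeToChangelogFile: true
```

Keeping the statements in the changelog with sql suits one-off changes, while sqlFile may be easier to maintain when the SQL is longer or shared between changesets.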

Related links