extendedTableProperties

extendedTableProperties is a tag in the Liquibase Open Source Databricks extension that lets you specify additional properties on a table you're creating. It is a sub-tag of the Liquibase createTable Change Type.

To alter existing properties on a table, see alterTableProperties.

Note: To specify additional properties on a view, you can use tblProperties directly on the Liquibase createView Change Type instead of on the extendedTableProperties sub-tag. To alter existing properties on a view, see alterViewProperties.

Uses

You can use this Change Type to specify additional properties about a table you're creating. You can create a table in your database using createTable and then specify Databricks-specific behavior using extendedTableProperties. Also, Liquibase does not have built-in fields for every possible key you can use in Databricks, so the tblProperties attribute lets you specify your keys freely.

Run extendedTableProperties

Note: This tag is a sub-tag of the createTable Change Type. It cannot be used outside of createTable.

To run this Change Type, follow these steps:

  1. Add the Change Type to your changeset, as shown in the examples on this page.
  2. Specify any required attributes. Use the table on this page to see which ones your database requires.
  3. Deploy your changeset by running the update command:
  4. liquibase update

Available attributes

Tip: You must specify all top-level attributes marked as required. If you specify an optional attribute, you must also specify any nested attributes that it requires.

Name Type Description Requirement
clusterColumns String

The columns to cluster. Clusters are an alternative to partitions. The column order does not matter. Using clusters can help optimize performance for some database queries, such as read operations on some table filters. Databricks recommends using clustering for all new tables you create. For more information, see Use liquid clustering for Delta tables and CLUSTER BY clause (TABLE).

Note: You can either specify clusterColumns or partitionColumns, but not both.

Tip: Databricks does not allow you to drop tables containing clustered columns. You must use alterCluster to remove clustering before you can drop the table.

Optional
tableFormat String The format of the table to create in Databricks. Valid values are: avro, binaryfile, csv, delta, json, orc, parquet, and text. For more information, see What is a table? and CREATE TABLE [USING]. Default: delta. Optional
tableLocation String The file path to the external location where you want to create the table. You must first create an external location for your table in Databricks. Using the Databricks interface, you can do this at Catalog -> + ["add" icon] -> Add an external location. For more information, see External locations and Create an external location to connect cloud storage to Databricks. Optional
tblProperties String

The table properties you want to specify. Specify properties using the format 'key'='value'. Separate multiple properties using commas.

Optional
partitionColumns String

The columns to partition. The column order does not matter. Using partitions can speed up table queries and data manipulation. Partitions are an alternative to clusters. For more information, see Partitions and When to partition tables on Databricks.

Note: You can either specify clusterColumns or partitionColumns, but not both.

Optional

Examples

With clustered columns:

databaseChangeLog:
  - changeSet:
      id: 1
      author: your.name
      changes:
        - createTable:
            tableName: test_table_properties
            columns:
              - column:
                  name: id
                  type: int
                  constraints:
                    primaryKey: true
                    nullable: false
              - column:
                  name: some_column
                  type: int
            extendedTableProperties:
              clusterColumns: id, some_column
              tableFormat: delta
              tableLocation: s3://databricks-external-folder/test_table_properties
              tblProperties: 'this.is.my.key'=12,'this.is.my.key2'=true
      rollback:
        dropTable:
          tableName: test_table_properties

With partitioned columns:

databaseChangeLog:
  - changeSet:
      id: 1
      author: your.name
      changes:
        - createTable:
            tableName: test_table_properties
            columns:
              - column:
                  name: id
                  type: int
                  constraints:
                    primaryKey: true
                    nullable: false
              - column:
                  name: some_column
                  type: int
              - column
                  name: some_other_column
                  type: int
            extendedTableProperties:
              partitionColumns: id, some_column
              tableFormat: delta
              tableLocation: s3://databricks-external-folder/test_table_properties
              tblProperties: 'this.is.my.key'=12,'this.is.my.key2'=true
      rollback:
        dropTable:
          tableName: test_table_properties

With clustered columns:

{
  "databaseChangeLog": [
    {
      "changeSet": {
        "id": "1",
        "author": "your.name",
        "changes": [
          {
            "createTable": {
              "tableName": "test_table_properties",
              "columns": [
                {
                  "column": {
                    "name": "id",
                    "type": "int",
                    "constraints": {
                      "primaryKey": "true",
                      "nullable": "false"
                    }
                  }
                },
                {
                  "column": {
                    "name": "some_column",
                    "type": "int"
                  }
                }
              ],
              "extendedTableProperties": {
                "clusterColumns": "id, some_column",
                "tableFormat": "delta",
                "tableLocation": "s3://databricks-external-folder/test_table_properties",
                "tblProperties": "'this.is.my.key'=12,'this.is.my.key2'=true"
              }
            }
          }
        ],
        "rollback": [
          {
            "dropTable": {
              "tableName": "test_table_properties"
            }
          }
        ]
      }
    }
  ]
}

With partitioned columns:

{
  "databaseChangeLog": [
    {
      "changeSet": {
        "id": "1",
        "author": "your.name",
        "changes": [
          {
            "createTable": {
              "tableName": "test_table_properties",
              "columns": [
                {
                  "column": {
                    "name": "id",
                    "type": "int",
                    "constraints": {
                      "primaryKey": "true",
                      "nullable": "false"
                    }
                  }
                },
                {
                  "column": {
                    "name": "some_column",
                    "type": "int"
                  }
                },
                {
                  "column": {
                    "name": "some_other_column",
                    "type": "int"
                  }
                }
              ],
              "extendedTableProperties": {
                "partitionColumns": "id, some_column",
                "tableFormat": "delta",
                "tableLocation": "s3://databricks-external-folder/test_table_properties",
                "tblProperties": "'this.is.my.key'=12,'this.is.my.key2'=true"
              }
            }
          }
        ],
        "rollback": [
          {
            "dropTable": {
              "tableName": "test_table_properties"
            }
          }
        ]
      }
    }
  ]
}

With clustered columns:

<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:databricks="http://www.liquibase.org/xml/ns/databricks"
    xmlns:pro="http://www.liquibase.org/xml/ns/pro"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd
        http://www.liquibase.org/xml/ns/databricks
        http://www.liquibase.org/xml/ns/databricks/liquibase-databricks-latest.xsd
        http://www.liquibase.org/xml/ns/pro
        http://www.liquibase.org/xml/ns/pro/liquibase-pro-latest.xsd">

    <changeSet id="1" author="your.name">
        <createTable tableName="test_table_properties">
            <column name="id" type="int" >
                <constraints primaryKey="true" nullable="false"/>
            </column>
            <column name="some_column" type="int"/>

            <databricks:extendedTableProperties clusterColumns="id, some_column"
                                                tableFormat="delta"
                                                tableLocation="s3://databricks-external-folder/test_table_properties"
                                                tblProperties="'this.is.my.key'=12,'this.is.my.key2'=true"/>
        </createTable>

        <rollback>
            <dropTable tableName="test_table_properties"/>
        </rollback>
    </changeSet>

</databaseChangeLog>

With partitioned columns:

<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:databricks="http://www.liquibase.org/xml/ns/databricks"
    xmlns:pro="http://www.liquibase.org/xml/ns/pro"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd
        http://www.liquibase.org/xml/ns/databricks
        http://www.liquibase.org/xml/ns/databricks/liquibase-databricks-latest.xsd
        http://www.liquibase.org/xml/ns/pro
        http://www.liquibase.org/xml/ns/pro/liquibase-pro-latest.xsd">

    <changeSet id="1" author="your.name">
        <createTable tableName="test_table_properties">
            <column name="id" type="int" >
                <constraints primaryKey="true" nullable="false"/>
            </column>
            <column name="some_column" type="int"/>
            <column name="some_other_column" type="int"/>

            <databricks:extendedTableProperties partitionColumns="id, some_column"
                                                tableFormat="delta"
                                                tableLocation="s3://databricks-external-folder/test_table_properties"
                                                tblProperties="'this.is.my.key'=12,'this.is.my.key2'=true"/>
        </createTable>

        <rollback>
            <dropTable tableName="test_table_properties"/>
        </rollback>
    </changeSet>

</databaseChangeLog>

Troubleshooting

Clustered and partitioned columns collision

If you specify values for clusterColumns and partitionColumns on the same table, Liquibase throws this error:

Databricks does not support CLUSTER columns AND PARTITION BY columns, please pick one.

Instead, you must specify either clusterColumns or partitionColumns, but not both.

Extended table properties double initialization

It is technically possible to specify some Databricks-specific attributes directly in createTable instead of in extendedTableProperties. This is not a best practice. If you specify Databricks-specific attributes in both places, Liquibase throws this error:

Double initialization of extended table properties is not allowed. Please avoid using both EXT createTable attributes and Databricks specific extendedTableProperties element. Element databricks:extendedTableProperties is preferred way to set databricks specific configurations.

Instead, you must specify all Databricks-specific attributes in extendedTableProperties.

All columns in table are partitioned

If you try to create a partition using every column in a table, Databricks throws an error like:

Cannot use all columns for partition columns.
Data written into Delta needs to contain at least one non-partitioned column.

To resolve this, ensure that your table contains at least column that does not have a partition. For more information, see When to partition tables on Databricks.

Database support

This tag is only supported for Databricks. The createTable Change Type supports auto rollback.

Related links