extendedTableProperties
extendedTableProperties
is a tag in the Liquibase Open Source Databricks extension that lets you specify additional properties on a table you're creating. It is a sub-tag of the Liquibase createTable
Change Type.
To alter existing properties on a table, see alterTableProperties
.
Note: To specify additional properties on a view, you can use tblProperties
directly on the Liquibase createView
Change Type instead of on the extendedTableProperties
sub-tag. To alter existing properties on a view, see alterViewProperties
.
Uses
You can use this Change Type to specify additional properties about a table you're creating. You can create a table in your database using createTable
and then specify Databricks-specific behavior using extendedTableProperties
. Also, Liquibase does not have built-in fields for every possible key you can use in Databricks, so the tblProperties
attribute lets you specify your keys freely.
Run extendedTableProperties
Note: This tag is a sub-tag of the createTable
Change Type. It cannot be used outside of createTable
.
To run this Change Type, follow these steps:
- Add the Change Type to your changeset, as shown in the examples on this page.
- Specify any required attributes. Use the table on this page to see which ones your database requires.
- Deploy your changeset by running the
update
command:
liquibase update
Available attributes
Tip: You must specify all top-level attributes marked as required. If you specify an optional attribute, you must also specify any nested attributes that it requires.
Name | Type | Description | Requirement |
---|---|---|---|
clusterColumns
|
String |
The columns to cluster. Clusters are an alternative to partitions. The column order does not matter. Using clusters can help optimize performance for some database queries, such as read operations on some table filters. Databricks recommends using clustering for all new tables you create. For more information, see Use liquid clustering for Delta tables and CLUSTER BY clause (TABLE). Note: You can either specify Tip: Databricks does not allow you to drop tables containing clustered columns. You must use |
Optional |
tableFormat
|
String | The format of the table to create in Databricks. Valid values are: avro , binaryfile , csv , delta , json , orc , parquet , and text . For more information, see What is a table? and CREATE TABLE [USING]. Default: delta . |
Optional |
tableLocation
|
String | The file path to the external location where you want to create the table. You must first create an external location for your table in Databricks. Using the Databricks interface, you can do this at Catalog -> + ["add" icon] -> Add an external location. For more information, see External locations and Create an external location to connect cloud storage to Databricks. | Optional |
tblProperties
|
String |
The table properties you want to specify. Specify properties using the format |
Optional |
partitionColumns
|
String |
The columns to partition. The column order does not matter. Using partitions can speed up table queries and data manipulation. Partitions are an alternative to clusters. For more information, see Partitions and When to partition tables on Databricks. Note: You can either specify |
Optional |
Examples
With clustered columns:
databaseChangeLog:
- changeSet:
id: 1
author: your.name
changes:
- createTable:
tableName: test_table_properties
columns:
- column:
name: id
type: int
constraints:
primaryKey: true
nullable: false
- column:
name: some_column
type: int
extendedTableProperties:
clusterColumns: id, some_column
tableFormat: delta
tableLocation: s3://databricks-external-folder/test_table_properties
tblProperties: 'this.is.my.key'=12,'this.is.my.key2'=true
rollback:
dropTable:
tableName: test_table_properties
With partitioned columns:
databaseChangeLog:
- changeSet:
id: 1
author: your.name
changes:
- createTable:
tableName: test_table_properties
columns:
- column:
name: id
type: int
constraints:
primaryKey: true
nullable: false
- column:
name: some_column
type: int
- column
name: some_other_column
type: int
extendedTableProperties:
partitionColumns: id, some_column
tableFormat: delta
tableLocation: s3://databricks-external-folder/test_table_properties
tblProperties: 'this.is.my.key'=12,'this.is.my.key2'=true
rollback:
dropTable:
tableName: test_table_properties
With clustered columns:
{
"databaseChangeLog": [
{
"changeSet": {
"id": "1",
"author": "your.name",
"changes": [
{
"createTable": {
"tableName": "test_table_properties",
"columns": [
{
"column": {
"name": "id",
"type": "int",
"constraints": {
"primaryKey": "true",
"nullable": "false"
}
}
},
{
"column": {
"name": "some_column",
"type": "int"
}
}
],
"extendedTableProperties": {
"clusterColumns": "id, some_column",
"tableFormat": "delta",
"tableLocation": "s3://databricks-external-folder/test_table_properties",
"tblProperties": "'this.is.my.key'=12,'this.is.my.key2'=true"
}
}
}
],
"rollback": [
{
"dropTable": {
"tableName": "test_table_properties"
}
}
]
}
}
]
}
With partitioned columns:
{
"databaseChangeLog": [
{
"changeSet": {
"id": "1",
"author": "your.name",
"changes": [
{
"createTable": {
"tableName": "test_table_properties",
"columns": [
{
"column": {
"name": "id",
"type": "int",
"constraints": {
"primaryKey": "true",
"nullable": "false"
}
}
},
{
"column": {
"name": "some_column",
"type": "int"
}
},
{
"column": {
"name": "some_other_column",
"type": "int"
}
}
],
"extendedTableProperties": {
"partitionColumns": "id, some_column",
"tableFormat": "delta",
"tableLocation": "s3://databricks-external-folder/test_table_properties",
"tblProperties": "'this.is.my.key'=12,'this.is.my.key2'=true"
}
}
}
],
"rollback": [
{
"dropTable": {
"tableName": "test_table_properties"
}
}
]
}
}
]
}
With clustered columns:
<databaseChangeLog
xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:databricks="http://www.liquibase.org/xml/ns/databricks"
xmlns:pro="http://www.liquibase.org/xml/ns/pro"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd
http://www.liquibase.org/xml/ns/databricks
http://www.liquibase.org/xml/ns/databricks/liquibase-databricks-latest.xsd
http://www.liquibase.org/xml/ns/pro
http://www.liquibase.org/xml/ns/pro/liquibase-pro-latest.xsd">
<changeSet id="1" author="your.name">
<createTable tableName="test_table_properties">
<column name="id" type="int" >
<constraints primaryKey="true" nullable="false"/>
</column>
<column name="some_column" type="int"/>
<databricks:extendedTableProperties clusterColumns="id, some_column"
tableFormat="delta"
tableLocation="s3://databricks-external-folder/test_table_properties"
tblProperties="'this.is.my.key'=12,'this.is.my.key2'=true"/>
</createTable>
<rollback>
<dropTable tableName="test_table_properties"/>
</rollback>
</changeSet>
</databaseChangeLog>
With partitioned columns:
<databaseChangeLog
xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:databricks="http://www.liquibase.org/xml/ns/databricks"
xmlns:pro="http://www.liquibase.org/xml/ns/pro"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd
http://www.liquibase.org/xml/ns/databricks
http://www.liquibase.org/xml/ns/databricks/liquibase-databricks-latest.xsd
http://www.liquibase.org/xml/ns/pro
http://www.liquibase.org/xml/ns/pro/liquibase-pro-latest.xsd">
<changeSet id="1" author="your.name">
<createTable tableName="test_table_properties">
<column name="id" type="int" >
<constraints primaryKey="true" nullable="false"/>
</column>
<column name="some_column" type="int"/>
<column name="some_other_column" type="int"/>
<databricks:extendedTableProperties partitionColumns="id, some_column"
tableFormat="delta"
tableLocation="s3://databricks-external-folder/test_table_properties"
tblProperties="'this.is.my.key'=12,'this.is.my.key2'=true"/>
</createTable>
<rollback>
<dropTable tableName="test_table_properties"/>
</rollback>
</changeSet>
</databaseChangeLog>
Troubleshooting
Clustered and partitioned columns collision
If you specify values for clusterColumns
and partitionColumns
on the same table, Liquibase throws this error:
Databricks does not support CLUSTER columns AND PARTITION BY columns, please pick one.
Instead, you must specify either clusterColumns
or partitionColumns
, but not both.
Extended table properties double initialization
It is technically possible to specify some Databricks-specific attributes directly in createTable
instead of in extendedTableProperties
. This is not a best practice. If you specify Databricks-specific attributes in both places, Liquibase throws this error:
Double initialization of extended table properties is not allowed. Please avoid using both EXT createTable attributes and Databricks specific extendedTableProperties element. Element databricks:extendedTableProperties is preferred way to set databricks specific configurations.
Instead, you must specify all Databricks-specific attributes in extendedTableProperties
.
All columns in table are partitioned
If you try to create a partition using every column in a table, Databricks throws an error like:
Cannot use all columns for partition columns.
Data written into Delta needs to contain at least one non-partitioned column.
To resolve this, ensure that your table contains at least column that does not have a partition. For more information, see When to partition tables on Databricks.
Database support
This tag is only supported for Databricks. The createTable
Change Type supports auto rollback.