alterCluster

alterCluster is a Change Type in the Liquibase Open Source Databricks extension that alters a cluster on a table.

To create a new table with a cluster, see extendedTableProperties.

Uses

Clustered columns can help optimize performance for some database queries. If you have previously created a table with one or more clustered columns, you can modify which columns are clustered using alterCluster. Specify which columns to de-cluster by using clusterBy. You can also specify a new column to override existing clustering logic.

Changing which columns are clustered can be useful if your data changes significantly or if you begin using different filters to query your data. Better clustering can improve the read efficiency of the new queries.

Databricks does not allow you to drop tables containing clustered columns. You can use alterCluster to remove clustering and then drop the table.

For more information, see Use liquid clustering for Delta tables and ALTER TABLE.

Run alterCluster

To run this Change Type, follow these steps:

  1. Add the Change Type to your changeset, as shown in the examples on this page.
  2. Specify any required attributes. Use the table on this page to see which ones your database requires.
  3. Deploy your changeset by running the update command:
  4. liquibase update

Available attributes

Tip: You must specify all top-level attributes marked as required. If you specify an optional attribute, you must also specify any nested attributes that it requires.

Name Type Description Requirement
tableName String Name of the table to alter the cluster on. Required

clusterBy

Optional.

Specifies how to cluster the table. Use this only to remove clustering from a column, not add clustering.

clusterBy has the following nested attributes:

  • none (Boolean) (required): if true, turns off clustering for the table being altered. If false, Liquibase throws an error.

columns/column

Optional.

An array of column objects that describes columns in the table. The column order does not matter. Use this to overwrite an existing CLUSTER BY SQL clause or add clustering to a column.

column has the following nested attributes:

  • name (string) (required): the name of the column to alter.

Examples

databaseChangeLog:
  - changeSet:
      id: 2
      author: your.name
      changes:
        - alterCluster:
            tableName: test_table_alter_cluster
            columns:
              - column:
                  name: test_id
      rollback:
        - alterCluster:
            tableName: test_table_alter_cluster
            clusterBy:
              none: "true"
{
  "databaseChangeLog": [
    {
      "changeSet": {
        "id": "2",
        "author": "your.name",
        "changes": [
          {
            "alterCluster": {
              "tableName": "test_table_alter_cluster",
              "columns": [
                {
                  "column": {
                    "name": "test_id"
                  }
                }
              ]
            }
          }
        ],
        "rollback": [
          {
            "alterCluster": {
              "tableName": "test_table_alter_cluster",
              "clusterBy": {
                "none": "true"
              }
            }
          }
        ]
      }
    }
  ]
}
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:databricks="http://www.liquibase.org/xml/ns/databricks"
    xmlns:pro="http://www.liquibase.org/xml/ns/pro"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd
        http://www.liquibase.org/xml/ns/databricks
        http://www.liquibase.org/xml/ns/databricks/liquibase-databricks-latest.xsd
        http://www.liquibase.org/xml/ns/pro
        http://www.liquibase.org/xml/ns/pro/liquibase-pro-latest.xsd">

    <changeSet id="2" author="your.name">
        <databricks:alterCluster tableName="test_table_alter_cluster">
            <databricks:column name="test_id"/>
        </databricks:alterCluster>

        <rollback>
            <databricks:alterCluster tableName="test_table_alter_cluster">
                <databricks:clusterBy none="true"/>
            </databricks:alterCluster>
        </rollback>
    </changeSet>

</databaseChangeLog>

Troubleshooting

clusterBy parsing error

If you set the clusterBy attribute none=false, Liquibase throws this error:

Unexpected error running Liquibase: Error parsing line 13 column 49 of generated.xml: cvc-enumeration-valid: Value 'false' is not facet-valid with respect to enumeration '[true]'. It must be a value from the enumeration.

The purpose of this attribute is to remove clustering from a column, so it only accepts none=true.

If you were trying to add a clustered column to an existing table, you must simply use alterCluster to specify column. In this case, you must omit clusterBy.

If you were trying to create a new table with clustering, you must use createTable to specify extendedTableProperties. Then, you can use clusterColumns to specify the columns you want to cluster.

clusterBy and columns both null

If you specify neither clusterBy nor columns, Liquibase throws this error:

Alter Cluster change require list of columns or element 'ClusterBy', please add at least one option.

If you were trying to change the name of your table, you must use renameTable instead.

Database support

This Change Type is only supported for Databricks. It does not support auto rollback.

Related links