Trimming changelog files
Large changelog files may become unwieldy over time. This guide offers best practices on how to address this common occurrence.
You add individual change sets to your changelog file to make changes to your database. Over time, those changes can build up to thousands of entries, many of which are now redundant or inefficient. For example, a table was created and then deleted, or a table was created and then columns were added individually instead of creating a table with all the columns.
Warning: Before continuing, verify whether simplification is necessary. Your changelog was built, run, and tested over a long period of time. Manual edits to the changelog file risk database stability. Consider whether the performance or file size concerns you have outweigh the risk of altering a functional script.
Scenario 1: changelog file size
Often the changelog file is so large that it causes performance issues in your editor, or there are too many merge conflicts. The best way to handle this is to divide your changelog file into multiple files. Instead of having a single changelog.xml
file with everything in it, create a main.changelog.xml
file which uses the include tag to reference other changelog files.
<databaseChangeLog
xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ext="http://www.liquibase.org/xml/ns/dbchangelog-ext"
xmlns:pro="http://www.liquibase.org/xml/ns/pro"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd
http://www.liquibase.org/xml/ns/dbchangelog-ext
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-ext.xsd
http://www.liquibase.org/xml/ns/pro
http://www.liquibase.org/xml/ns/pro/liquibase-pro-latest.xsd">
<include file="com/example/news/news.changelog.xml"/>
<include file="com/example/directory/directory.changelog.xml"/>
</databaseChangeLog>
When you run liquibase update
against the main.changelog.xml
file, changesets in com/example/news/news.changelog.xml
will run and then the changesets in com/example/directory/directory.changelog.xml
will run. You can break up changesets by feature, release, or other ways.
Scenario 2: liquibase update
time efficiency
Other times, the problem is that liquibase update
takes too long. Liquibase tries to be as efficient as possible when comparing the contents of the DATABASECHANGELOG table with the current changelog file and even if there are thousands of already ran changesets, an update
command should take just seconds to run. If you find that update
takes longer than it should, watch the Liquibase log to determine why.
Perhaps there is an old runAlways="true"
changeset that no longer needs to run or there are preconditions which are no longer needed. Running Liquibase with --log-level=INFO
or even --log-level=DEBUG
can give additional output which can help you determine which changesets are slow. Once you know what is slowing down your update, try to alter just those changesets rather than throwing out the whole changelog and starting from scratch. You will still want to retest your changelog in-depth, but it is a far less risky change.
Scenario 3: surgically alter existing changesets
Some users find that liquibase update
works well for incremental updates, but creating a database from scratch takes far too long. Are you re-creating databases often enough that the risk of a change to the creation script makes sense? If you are, look for problem changesets as described above. Indexes often cause performance bottlenecks during database creation, so start with those. If you are creating and updating indexes frequently in your creation process, you may be able to combine those changesets into something more efficient.
Note: When you need to surgically alter your existing changesets, remember how Liquibase works: each changeset has an id
, an author
, and a file path which together uniquely identifies it. If the DATABASECHANGELOG table has an entry for that changeset it will not run it. If it has an entry, it throws an error if the checksum for the changeset in the file doesn't match what was stored on the last run.
Modify
How you modify your existing changesets will also depend on your environment and where in the changelog the problem changesets are. If you are modifying changesets that have been applied to all of your environments and are now only used on fresh database builds you can treat them differently than if they have been applied to some databases but not yet to others.
Merge
To merge or modify existing changesets you will be doing a combination of editing existing changesets, removing old changesets, and creating new ones.
Remove
Removing unneeded changesets is easy because Liquibase doesn't care about DATABASECHANGELOG rows with no corresponding changesets. Just delete out of date changesets and you are done. For example, if you have a changeset that creates the table cart
and then another that drops it, just remove both changesets from the file. You must make sure, however, that there are no changesets between the create and the delete that make use of that table or they will fail on a fresh database build. That is an example of how you are introducing risk when changing your changelog file.
Example scenario
Suppose instead you have a cart
table that is created in one changeset, then a promo_code
column is created in another and an abandoned
flag is created in another.
<databaseChangeLog
xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ext="http://www.liquibase.org/xml/ns/dbchangelog-ext"
xmlns:pro="http://www.liquibase.org/xml/ns/pro"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd
http://www.liquibase.org/xml/ns/dbchangelog-ext
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-ext.xsd
http://www.liquibase.org/xml/ns/pro
http://www.liquibase.org/xml/ns/pro/liquibase-pro-latest.xsd">
<changeSet author="nvoxland" id="1">
<createTable tableName="cart">
<column name="id" type="int"/>
</createTable>
</changeSet>
<changeSet author="nvoxland" id="2">
<addColumn tableName="cart">
<column name="promo_code" type="varchar(10)"/>
</addColumn>
</changeSet>
<changeSet author="nvoxland" id="3">
<addColumn tableName="cart">
<column name="abandoned" type="boolean"/>
</addColumn>
</changeSet>
</databaseChangeLog>
Option 1
This will work well if all existing databases have the cart table with the promo_code
and abandoned columns already added.
- Combine everything into a new changeset using the existing
id="1"
. - Delete the other changesets.
<databaseChangeLog
xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ext="http://www.liquibase.org/xml/ns/dbchangelog-ext"
xmlns:pro="http://www.liquibase.org/xml/ns/pro"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd
http://www.liquibase.org/xml/ns/dbchangelog-ext
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-ext.xsd
http://www.liquibase.org/xml/ns/pro
http://www.liquibase.org/xml/ns/pro/liquibase-pro-latest.xsd">
<changeSet author="nvoxland" id="1">
<validCheckSum>7:f24b25ba0fea451728ffbade634f791d</validCheckSum>
<createTable tableName="cart">
<column name="id" type="int"/>
<column name="promo_code" type="varchar(10)"/>
<column name="abandoned" type="boolean"/>
</createTable>
</changeSet>
</databaseChangeLog>
Note: Running Liquibase against existing databases just sees that id="1"
already ran and does not commit any new script.
Option 2
Running Liquibase against a blank database will create the cart
table with all the columns right away.
You must apply the validCheckSum
tag in the error message to notify Liquibase that you know it changed and the new value is valid. If you omit the tag, existing databases will throw an error stating that id="1"
has changed since it was run.
Option 3
If you have some databases where the promo_code
and/or abandoned columns have not yet been added, follow the below instructions:
- Update the original
createTable
as before. - Use preconditions with
onFail="MARK_RAN"
to handle cases where the old changesets ran. - Do not add the columns again if the new changesets ran.
<databaseChangeLog
xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ext="http://www.liquibase.org/xml/ns/dbchangelog-ext"
xmlns:pro="http://www.liquibase.org/xml/ns/pro"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd
http://www.liquibase.org/xml/ns/dbchangelog-ext
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-ext.xsd
http://www.liquibase.org/xml/ns/pro
http://www.liquibase.org/xml/ns/pro/liquibase-pro-latest.xsd">
<changeSet author="nvoxland" id="1">
<validCheckSum>7:f24b25ba0fea451728ffbade634f791d</validCheckSum>
<createTable tableName="cart">
<column name="id" type="int"/>
<column name="promo_code" type="varchar(10)"/>
<column name="abandoned" type="boolean"/>
</createTable>
</changeSet>
<changeSet author="nvoxland" id="2">
<preConditions onFail="MARK_RAN">
<not><columnExists tableName="cart" columnName="promo_code"/></not>
</preConditions>
<addColumn tableName="cart">
<column name="promo_code" type="varchar(10)"/>
</addColumn>
</changeSet>
<changeSet author="nvoxland" id="3">
<preConditions onFail="MARK_RAN">
<not><columnExists tableName="cart" columnName="abandoned"/></not>
</preConditions>
<addColumn tableName="cart">
<column name="abandoned" type="boolean"/>
</addColumn>
</changeSet>
</databaseChangeLog>
Option 4
On existing databases that have all three changesets already ran, Liquibase will just continue on as before. For existing databases that have the old cart definition, it will see that the columns don't exist for id="2"
and id="3"
and execute as usual. For blank databases, it will create the table with the promo_code
and abandoned columns and then in id="2"
and id="3"
it will see that they are already there and mark that they have ran without re-adding the columns.
Preconditions
Note: Preconditions add a performance overhead to your update executions and are ignored in update-sql
mode because Liquibase cannot know how applicable they are when changesets have not actually executed. It is best to use them only when needed.
Preconditions also add complexity to your changelog which will require additional testing so keep that in mind when deciding whether to modify your changelog logic. Sometimes it is easiest and safest to wait until all your databases have the columns and then modify the changesets to avoid the preconditions.
The cart
/promo_code
/abandoned
example shows some basic patterns you can use when modifying existing changesets. Similar patters can be used to optimize different types of bottleneck issues. Just remember when one changeset is changed, it can affect other changesets below which may need to be modified as well. This can easily spiral out of control so be mindful of what you are doing.
If you end up finding that it will work best to completely restart your changelog, see How to set up Liquibase with an Existing Project and Multiple Environments, which describes how to add Liquibase to an existing project (even if that project was previously managed by Liquibase).