loadData

Loads data from a CSV file into an existing table when you add it to your changelog.

Uses

After you create a table in your database using the Liquibase createTable Change Type, you can populate it with data from an external CSV (comma-separated value) file using loadData. This may be useful when setting up a new application or migrating data from another system.

Syntax rules

  • A value of NULL (regardless of capitalization) in a cell is converted to a database NULL rather than the string "NULL".

  • Lines starting with a number sign (#) are treated as comments. You can change the comment pattern by using the commentLineStartsWith attribute. To disable comments, set commentLineStartsWith to an empty value.
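As an illustration, a CSV like the following (file contents and column names are hypothetical) mixes a comment line with a NULL cell:

```csv
# this line is ignored by loadData
id,username,nickname
1,jdoe,NULL
2,esmith,Rick
```

Here the NULL cell in the first data row is inserted as a database NULL, and the line starting with # is skipped entirely.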

If the data type for a load column is set to NUMERIC, numbers are parsed in the US locale (for example: 123.45). Date/Time values included in the CSV file should be in ISO format to be parsed correctly by Liquibase.

Liquibase initially sets the date format to yyyy-MM-dd'T'HH:mm:ss and then checks for two special cases, which override the date format string:

  • If the string representing the date/time includes a period (.), then the date format is changed to yyyy-MM-dd'T'HH:mm:ss.SSS.

  • If the string representing the date/time includes a space, then the date format is changed to yyyy-MM-dd HH:mm:ss.

Once the date format string is set, Liquibase calls the SimpleDateFormat.parse() method to parse the input string into a date/time value. If parsing fails, a ParseException is thrown and the input string is treated as a plain string in the generated INSERT command. If the UUID type is used, UUID values are stored as strings, and NULL cell values are supported.
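For instance, a hypothetical events.csv could hold timestamp values such as 2023-05-01T14:30:00 (matches the default pattern), 2023-05-01T14:30:00.123 (the period triggers the .SSS pattern), or 2023-05-01 14:30:00 (the space triggers the space-separated pattern). A minimal changeset sketch loading such a file (file, table, and column names are assumptions):

```xml
<changeSet author="example" id="load-iso-dates">
    <loadData file="events.csv" tableName="events" relativeToChangelogFile="true">
        <column header="id" name="id" type="NUMERIC"/>
        <column header="created_at" name="created_at" type="DATE"/>
    </loadData>
</changeSet>
```

Because the values are in ISO format, Liquibase parses them into date/time objects rather than passing them through as strings.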

Non-ISO date format

To use a non-ISO format for your dates:

  1. Ensure you are using Liquibase 4.4.0 or later.

  2. Add an ALTER SESSION command in SQL to your changeset.

  3. Specify the DATE type in the affected columns of your loadData changeset.

For example, in an Oracle database using MM/DD/YYYY HH:MI:SS AM:

<changeSet author="Liquibase" id="0">
    <sql>
        ALTER SESSION SET NLS_DATE_FORMAT = 'MM/DD/YYYY HH:MI:SS AM';
    </sql>
    <loadData>
        <column name="EXPIRATION_DATE" type="DATE 'MM/DD/YYYY HH:MI:SS AM'"/>
    </loadData>
</changeSet>


Load data with the loadData tag

Including specific columns

All CSV columns are used by default while generating SQL, even if they are not described in the columns property. If you want to skip specific headers in the CSV file, set the value of the type property to skip.

Imagine that you have a table with columns col_1, col_2, and col_3. To load only columns col_1 and col_2, specify col_3 in a column tag and set its type to skip:

<column name="col_3" header="col_3" type="skip"/>
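In context, a full loadData sketch for that table might look like this (the file name and the STRING types are assumptions for illustration):

```xml
<loadData file="data.csv" tableName="example_table">
    <column name="col_1" header="col_1" type="STRING"/>
    <column name="col_2" header="col_2" type="STRING"/>
    <column name="col_3" header="col_3" type="skip"/>
</loadData>
```

Only col_1 and col_2 appear in the generated INSERT statements; col_3 is read from the CSV header but never loaded.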

Creating a loadable CSV

You can load data from an existing CSV file (names.csv):

id,first,last
0,john,doe
1,eric,smith
2,cat,jones

In your changelog, create a table called populated. Then create a loadData changeset to insert data from the CSV into that table. For example:

<changeSet author="your.name" id="1::emptyTable">
    <createTable tableName="populated">
        <column name="id" type="int" autoIncrement="true">
            <constraints primaryKey="true" nullable="false"/>
        </column>
        <column name="first" type="varchar(50)"/>
        <column name="last" type="varchar(50)"/>
    </createTable>
</changeSet>
<changeSet author="your.name" id="2::dataForTable" labels="data">
    <loadData file="names.csv" tableName="populated"/>
</changeSet>

In your command line, deploy your change:

liquibase update

Note: Liquibase 4.4.0+ assumes that spaces in the CSV header are part of the column names and does not strip them. If your header contains spaces and the column names in your changeset don't, Liquibase throws an error. To resolve this, use the --trim-load-data-file-header parameter, available in Liquibase 4.29.0 and later.

Generating loadData changesets and a CSV from your current database

You can also capture the current state of your database as a CSV with the generate-changelog command and dataOutputDirectory attribute. This generates the loadData changesets with the columns from the CSV, and formats your data as a CSV in the folder you specified for dataOutputDirectory:

liquibase generate-changelog --diffTypes=tables,columns,data --dataOutputDirectory=myData --changelog-file=output_changelog.xml

Example generated changelog (with dataOutputDirectory)

<?xml version="1.1" encoding="UTF-8" standalone="no"?>
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ext="http://www.liquibase.org/xml/ns/dbchangelog-ext"
    xmlns:pro="http://www.liquibase.org/xml/ns/pro"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd http://www.liquibase.org/xml/ns/dbchangelog-ext http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-ext.xsd http://www.liquibase.org/xml/ns/pro http://www.liquibase.org/xml/ns/pro/liquibase-pro-latest.xsd">
    <changeSet author="adrian (generated)" id="1667231983827-1">
        <loadData commentLineStartsWith="#" encoding="UTF-8" file="myData/populated.csv" quotchar="&quot;" separator="," tableName="POPULATED">
            <column header="ID" name="ID" type="NUMERIC"/>
            <column header="FIRST" name="FIRST" type="STRING"/>
            <column header="LAST" name="LAST" type="STRING"/>
        </loadData>
    </changeSet>
</databaseChangeLog>

If you don't use the --dataOutputDirectory flag while running the command, Liquibase writes insert statements into your generated changelog instead of loadData changesets that reference a CSV file:

liquibase generate-changelog --diffTypes=tables,columns,data --changelog-file=myChangelog.xml

Example generated changelog (without dataOutputDirectory)

<?xml version="1.1" encoding="UTF-8" standalone="no"?>
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ext="http://www.liquibase.org/xml/ns/dbchangelog-ext"
    xmlns:pro="http://www.liquibase.org/xml/ns/pro"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd http://www.liquibase.org/xml/ns/dbchangelog-ext http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-ext.xsd http://www.liquibase.org/xml/ns/pro http://www.liquibase.org/xml/ns/pro/liquibase-pro-latest.xsd">
    <changeSet author="adrian (generated)" id="1667315146413-1">
        <insert tableName="POPULATED">
            <column name="ID" valueNumeric="0"/>
            <column name="FIRST" value="john"/>
            <column name="LAST" value="doe"/>
        </insert>
        <insert tableName="POPULATED">
            <column name="ID" valueNumeric="1"/>
            <column name="FIRST" value="eric"/>
            <column name="LAST" value="smith"/>
        </insert>
        <insert tableName="POPULATED">
            <column name="ID" valueNumeric="2"/>
            <column name="FIRST" value="cat"/>
            <column name="LAST" value="jones"/>
        </insert>
    </changeSet>
</databaseChangeLog>


Placeholder values

In Liquibase 4.31.0 and later, you can specify a null placeholder value in your loadData changeset. This allows Liquibase to translate corresponding values in your CSV into NULL values. This is an alternative to leaving a CSV cell blank, which Liquibase translates as an empty string.

For example, consider this changeset:

<changeSet id="nullPlaceHolderTest" author="your.name">
    <loadData tableName="nullPlaceholderTable" file="nullPlaceHolderTestData.csv" relativeToChangelogFile="true">
        <column name="test_col" type="COMPUTED" nullPlaceholder="testNullPlaceholder"/>
        <column name="other_col" type="varchar(100)"/>
    </loadData>
</changeSet>

And this CSV file nullPlaceHolderTestData.csv:

colKey,col2
1+1,testNullPlaceholder
5,there

When you deploy this changeset, Liquibase searches for testNullPlaceholder in your CSV file and replaces any instances of that string with a NULL SQL reference.


Available attributes

Name | Description | Required for | Supports | Since
catalogName | Name of the catalog | — | all | 3.0
commentLineStartsWith | Lines starting with this are treated as comments and ignored | — | all | —
encoding | Encoding of the CSV file (defaults to UTF-8) | — | all | —
file | CSV file to load | all | all | —
quotchar | The quote character for string fields containing the separator character | — | all | —
relativeToChangelogFile | Specifies whether the file path is relative to the changelog file rather than looked up in the search path (default: false) | — | all | —
schemaName | Name of the schema | — | all | —
separator | Character separating the fields | — | all | —
tableName | Name of the table to insert or update data in | all | all | —
usePreparedStatements | Use prepared statements instead of insert statement strings if the database supports it | — | all | —

Nested tags

Name | Description | Required for | Supports | Multiple allowed
column | Column mapping and defaults can be defined. Either the header or the index attribute must be set. If the header name in the CSV differs from the column name, the header attribute must be specified. If no column is defined at all, the type is taken from the database. Otherwise, for non-string columns, a type definition may be required. Note: YAML and JSON changelogs using the column tag must nest it within a columns tag. | — | all | yes

Nested property attributes

Name | Description
name | Name of the column (required)
header | Name of the column in the CSV file from which the value for the column is taken, if it differs from the column name. Ignored if index is also defined.
index | Index of the column in the CSV file from which the value for the column is taken.
nullPlaceholder | Liquibase 4.31.0+. Specifies a placeholder string for Liquibase to search for in a loadData CSV file. If the CSV contains the value you specified for nullPlaceholder, Liquibase replaces that value with a NULL SQL reference.
type | Data type of the column. Its value has to be one of the accepted values of LOAD_DATA_TYPE. If you want to skip loading a specific column, use the skip data type described earlier. Otherwise, all columns in the CSV file will be used.
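As a hedged illustration, the LOAD_DATA_TYPE values accepted by recent Liquibase versions include BOOLEAN, NUMERIC, DATE, STRING, COMPUTED, SEQUENCE, BLOB, CLOB, SKIP, UUID, OTHER, and UNKNOWN (check your version's documentation for the exact list). A sketch combining several of them, with hypothetical file, table, and column names:

```xml
<loadData file="people.csv" tableName="people">
    <column header="id" name="id" type="NUMERIC"/>
    <column header="active" name="active" type="BOOLEAN"/>
    <column header="joined" name="joined" type="DATE"/>
    <column header="internal_notes" name="internal_notes" type="skip"/>
</loadData>
```

Typed columns tell Liquibase how to render each CSV value in the generated INSERT statements; the skip type excludes the column entirely.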

Database support

Database | Notes | Auto Rollback
DB2/LUW | Supported | No
DB2/z | Supported | No
Derby | Supported | No
Firebird | Supported | No
Google BigQuery | Supported | No
H2 | Supported | No
HyperSQL | Supported | No
INGRES | Supported | No
Informix | Supported | No
MariaDB | Supported | No
MySQL | Supported | No
Oracle | Supported | No
PostgreSQL | Supported | No
Snowflake | Supported | No
SQL Server | Supported | No
SQLite | Supported | No
Sybase | Supported | No
Sybase Anywhere | Supported | No

Troubleshooting

CLOB data type handling

In Liquibase 4.29.2 and earlier, all CSV entries in a clob type column are assumed to be a path to a file. If Liquibase finds that file, it inserts the contents of the file into the table you're updating. If not, the execution stops with an error:

Caused by: java.io.FileNotFoundException: The file ExampleFileName was not found in the configured search path:

In Liquibase 4.30.0 and later, all CSV entries in a clob type column are assumed to be a path to a file by default. If Liquibase finds that file, it inserts the contents of the file into the table you're updating. If not, Liquibase inserts the string value as-is.
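For example, a hypothetical documents.csv whose body column targets a CLOB-typed column might mix file paths and literal text:

```csv
id,body
1,contracts/terms.txt
2,inline text that is not a path
```

Assuming contracts/terms.txt exists on the search path, Liquibase inserts that file's contents for the first row. For the second row, Liquibase 4.30.0 and later inserts the value as-is, while 4.29.2 and earlier stops with the FileNotFoundException shown above.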

loadData examples

databaseChangeLog:
  - changeSet:
      id: loadData-example
      author: liquibase-docs
      changes:
        - loadData:
            catalogName: cat
            columns:
              - column:
                  header: header1
                  name: id
                  type: NUMERIC
              - column:
                  index: 3
                  name: name
                  type: BOOLEAN
            commentLineStartsWith: //
            encoding: UTF-8
            file: example/users.csv
            quotchar: ''''
            relativeToChangelogFile: true
            schemaName: public
            separator: ;
            tableName: person
            usePreparedStatements: true
