Introduction
SQLite is not a real database system. It’s more like a big file of text. Using SQLite to manage any kind of system can be dangerous when your database grows. The best advise is to loose a bit more time to learn how to set up a real database system like MySQL or PostgreSQL.
However, sometimes you started with a SQLite database. Then you may want to migrate to a real database system. Let’s do that.
In this article, a SQLite 3 database owncloud.db
will be migrated into a
PostgreSQL 9.1 database (with username owncloud
and database name owncloud
).
The use case is migrating the database of a instance of OwnCloud 6.0.3 with
basic functionalities. Some applications have also been installed and the only
noticeable one is News (allows you
to subscribe to RSS flux).
Export the SQLite database
This is a pretty easy step since sqlite3
tool can give you a dump of your
database as SQL commands. Let’s put the output into a file dump.sql
.
sqlite3 owncloud.db .dump > dump.sql
Seems easy, right? Let’s go to the next step.
Import the dump into PostgreSQL
The nomal way
Since SQL is standard, we can try to feed PostgreSQL with this dump.sql
file
with the following command.
psql -U owncloud -d owncloud < owncloud.db
If it works for you, great! However, as we will see, each database system… has its own SQL standard.
Converting the dump
There is a few things to adapt between SQLite and PostgreSQL. An entry point can be this article which explains what should be adapted. Let’s sum up everything that should be adapted:
-
Remove
PRAGMA
calls -
Convert
DATETIME
intoTIMESTAMP
(DATETIME
is for SQLite whenTIMESTAMP
seems to be the one for PostgreSQL, see documentation) -
Remove
UNSIGNED
(every integer is unsigned in PostgreSQL, see documentation) -
Remove
AUTOINCREMENT
then replaceINTEGER
withSERIAL
(SERIAL
is the autoincrement integer in PostgreSQL) -
Convert
LONGTEXT
andCLOB
intoTEXT
orVARCHAR
(see documentation) -
Convert
BLOB
intoBYTEA
(and you might need more work for these ones) -
Convert boolean values from respectively
0
and1
to'0'
and'1'
orTRUE
andFALSE
(see documentation)
Except the last 2 points that requires more attention, we can do everything with
sed
. Let’s do a sed
script migrate.sed
for automatically adapting our
dump dump.sql
.
# Remove every line starting with `PRAGMA' /^PRAGMA/d # Change types only in the scope of a table creation /^CREATE TABLE/ { # Remove `UNSIGNED' keyword and spaces following s/\<UNSIGNED\> *//g # Replace `DATETIME' type (SQLite) with `TIMESTAMP' type (PostgreSQL) s/\<DATETIME\>/TIMESTAMP/g # `LONGTEXT' and `CLOB' replaced by `TEXT' (could use `VARCHAR' too) s/\<LONGTEXT\|CLOB\>/TEXT/g # Replace any integer with `AUTOINCREMENT' with `SERIAL' s/\<INTEGER\>\([^,]*\)[ ]*AUTOINCREMENT/SERIAL\1/g # Replace `BLOB' with `BYTEA' (binary blob in PostgreSQL) s/\<BLOB\>/BYTEA/g }
Some notes about that script.
Note
|
|
Note
|
We want to replace types only for table creation. For example, we want to avoid
to remove |
Note
|
|
Note
|
|
After these few notes about the sed
script, let’s apply it.
sed -f migrate.sed dump.sql > dump-adapted.sql
BLOB
and BOOLEAN
adptations
This is a more tricky problem since we will need to replace or encapsulate value
that will be inserted into our database (lines that starts with INSERT INTO
).
For this purpose, I made a awk
script to parse lines with CREATE TABLE
and
transform them, for each table, into sed
replacement command that will target
only the needed field (for example, the third one if the third one is a
boolean, and the fifth one if it’s a BLOB
).
#!/bin/awk -f
# Before program
BEGIN {
}
# Program
{
print "# " $0;
header = 0;
lastpatt = "";
for(k=1; k<=NF; k++) {
if($k ~ /BLOB|BOOLEAN/) {
subin="\\(^INSERT INTO \042\\?" ENVIRON["DB"] "\042\\? VALUES(";
subout = "";
can_be_null=0;
subnum=1;
for(i=1; i<=NF; i++) {
patt = "";
if(($i !~ /KEY/) && (i != 1)) {
subin = subin ",";
}
if($i ~ /INTEGER/) {
patt = "[[:digit:]]\\+";
}
if($i ~ /NUMERIC/) {
patt = "[[:digit:].-]\\+";
}
if(($i ~ /VARCHAR|(LONG)?TEXT|BLOB/)) {
patt = "\047.*\047";
}
if(($i !~ /NOT NULL|KEY/)) {
patt = "\\(NULL\\|" patt "\\)";
subnum = subnum + 1;
}
if($i ~ /BLOB/) {
if (i < k) {
patt = "decode(" patt ",\047hex\047)";
}
if(i == k) {
if(patt ~ /NULL/) {
bdecnum = 1;
decnum = subnum;
adecnum = decnum + 1;
patt = "\\)" patt "\\(";
} else {
bdecnum = 1;
decnum = subnum + 1;
adecnum = decnum + 1;
patt = "\\)\\(" patt "\\)\\(";
}
subout = "\\" bdecnum "decode(\\" decnum ",\047hex\047)\\" adecnum;
}
}
if($i ~ /BOOLEAN/) {
patt = "[01]";
if(i == k) {
bdecnum = 1;
decnum = subnum + 1;
adecnum = decnum + 1;
patt = "\\)\\(" patt "\\)\\(";
subout = "\\" bdecnum "\047\\" decnum "\047\\" adecnum;
}
}
if(($i !~ /KEY/)) {
lastpatt = patt;
}
subin = subin patt;
can_be_null = 0;
}
subin = subin ");$\\)";
if(header == 0) {
startpatt = ENVIRON["START_PATT"];
endpatt = lastpatt ENVIRON["END_PATT"];
print "/" startpatt "/,/" endpatt "/ {";
print "/" startpatt "/h";
print "/" startpatt "/!H";
print "/" endpatt "/ {";
print "g";
header = 1;
}
print "s/" subin "/" subout "/g";
}
}
print "p";
print "}";
print "d";
print "}";
}
# After program
END {
}
It’s a pretty unreadable awk
and sed
collaboration so you may try to understand only if you needed to.
Otherwise, you may want to try the bash
script I wrote which will run the
awk
script to produce the sed
script then run the sed
script to adapt the
dump file.
#!/bin/sh
# The original database
SRC_FILE=owncloud.db
# A list of SQL commands that should create a copy or the source database
DUMP_FILE=dump.sql
# A list of `sed` actions to convert the SQLite dump into PostgreSQL
SED_FILE=migrate.sed
# Awk program to create sed actions for the BYTEA
AWK_FILE=migrate.awk
if [ -f $DUMP_FILE ]
then
rm $DUMP_FILE
fi
if [ -f $SED_FILE ]
then
rm $SED_FILE
fi
sqlite3 $SRC_FILE .dump > $DUMP_FILE
function to_sed() {
echo $1 >> $SED_FILE
}
to_sed '#!sed -f'
to_sed '/^PRAGMA/d'
to_sed '/^CREATE TABLE/ {'
to_sed 's/\<UNSIGNED\> *//g'
to_sed 's/\<DATETIME\>/TIMESTAMP/g'
to_sed 's/\<LONGTEXT\>/TEXT/g'
to_sed 's/\<INTEGER\>\([^,]*\)[ ]*AUTOINCREMENT/SERIAL\1/g'
to_sed 's/\<CLOB\>/TEXT/g'
to_sed 's/\<BLOB\>/BYTEA/g'
to_sed '}'
cat $DUMP_FILE | grep '^CREATE TABLE.*\<\(BLOB\|BOOLEAN\)\>'| while read LINE
do
DB=$( echo $LINE | awk '{print $3}' | sed -e 's/"//g' -e "s/'//g" )
START_PATT='^INSERT INTO "\?\<'$DB'\>"\?'
END_PATT=');$'
export DB START_PATT END_PATT
echo $LINE | sed -e 's/^CREATE TABLE "\?\<'$DB'\>"\? (//g' -e 's/);$//g' -e 's/([^)]*)//g' | awk -F ' *, *' -f $AWK_FILE >> $SED_FILE
done
to_sed "s/decode(NULL,'hex')/NULL/g"
to_sed 'p'
sed -n -f $SED_FILE -i.bak $DUMP_FILE
Remember, you may need these scripts if you have either BOOLEAN
and/or BLOB
.
If not, then you probably don’t wwant them and just use the simple sed
script
I provided in the previous section.
Conclusion
To conclude to this article, even if I do a lot of research and it tooks me time
to write these scripts, I didn’t success well in migrating from SQLite to
PostgreSQL. In the meanwhile, I learned a lot of interesting stuff about awk
and sed
scripting, about playing with PostgreSQL. I also learned that you
want probably to use PostgreSQL or MySQL from the start and avoid any migration.
This article may helps you to understand some important things about the
migration process, to find information but it will probably not be enough to
make it well. It’s can only be a starting point. Good luck!