Tuesday, 21 August 2007

Database sources management plans

We think that the database source management plans for Openbravo ERP will offer great enhancements to our community of users and developers. Database structure and data development is very complicated: you cannot develop a database the same way you do with java source code, you cannot use a version control system like CVS or Subversion to track the changes, and you cannot deploy the changes you made into a working database easily. For those reasons we are working to simplify the development and deployment of database changes.

Our main target is to get rid of the two database dumps we provide for Oracle and for PostgreSql in each version and to provide the structure and data of the database in several XML text files that define the sources of the Openbravo ERP database. These XML files must be database independent (they must be the same for Oracle and for PostgreSQL) and must create, when "imported" to an empty database, a working Openbravo ERP environment, in the same way that the current database dumps do.

This way, we will be able to manage database changes in a Subversion repository in the same way as the rest of the Openbravo ERP source files, because XML are, in fact, plain text files. With a Subversion repository, we will be able to integrate the development of new Openbravo ERP functionality in an standard process, commit all changes - including those done in the database - to the Subversion repository, create log changes based on commit comments, link a commit to a tracker entry, create nightly builds based on the last commits of a day, etc, etc...

As an additional benefit, if we have the database schema of two versions of Openbravo ERP described in XML format, with the right tool we can generate a SQL script that upgrades the schema from one version to the next. By that, I mean a SQL script with the appropriate DDL and DML statements (i.e. "ALTER TABLE ...", "INSERT ... ", "UPDATE ..." commands) that upgrade the database.

There are many issues related to database migration/versioning problems, but we do not need to solve them all; we just want to solve the major issue. The problem we are going to focus on is to include all the database changes we make from one version to another, and all the database changes our developers and users do when developing new Openbravo ERP functionality, when fixing bugs, and when customizing for customers... We will release documentation for developers about what this tool can do and what this tool cannot do, and methodology about how to implement the most common database changes a developer needs when developing Openbravo ERP functionality.

Once we are able to create database scripts that upgrade a database from one version to another, it will be easier to upgrade Openbravo ERP installations in production environments, everything at one time: sources and database. To create patches of Openbravo ERP functionality. And to keep a development environment updated with the latest commits from the Subversion repository of Openbravo ERP.

And now, what tool do we think is the best? We are using a tool by the Apache Software Foundation called DDLUtils: http://db.apache.org/ddlutils/ . This is the definition of DDLUtils taken from its home page: "DDLUtils is a small, easy-to-use component for working with Database Definition (DDL) files. These are XML files that contain the definition of a database schema, e.g. tables and columns. These files can be fed into DDLUtils via its Ant task or programmatically in order to create the corresponding database or alter it so that it corresponds to the DDL. Likewise, DDLUtils can generate a DDL file for an existing database.". Exactly the tool we need.

Since DDLUtils only supports tables and not altering data - it only works for the database schema - we are hacking this tool to include support for all the database objects included in an Openbravo ERP database: views, check constraints, procedures, functions, triggers and sequences. DDLUtils support a lot of database engines but we are only adding support of these database objects only for Oracle and PostgreSQL.

We are also adding support to merge data allowing insert, update and delete of data in the corresponding database. This will allow us to upgrade the metadata model of Openbravo ERP from one version to another: Tables, Windows, Reports, etc.

We made a lot of progress hacking DDLUtils and creating the sources XML text files and we expect to release a first preview of the database versioning project soon. We know that there will be a lot of problems and things to polish before the tool we are building becomes production quality, but we are confident that we will be done it in time and that this will be a great improvement for all the Openbravo community.

There is an open discussion of this topic in the following thread of the Openbravo forums: http://sourceforge.net/forum/forum.php?thread_id=1790871&forum_id=549512

Everybody is invited to participate.