Migrations in v8

Wednesday, December 12, 2018 4:23 PM, edited Saturday, December 29, 2018 12:14 PM umbraco

(update: continues in Migrations in v8 (2))

One area of Umbraco that has been refactored in v8 is migrations, for the same reason other areas have been refactored: to make things simpler. Migrations in v7 are relatively easy, but have some "interesting" aspects that can make things quite complicated.

In v7, migrations are automatically discovered, and ordered by "weight". They are tied to Umbraco's versions (e.g. 7.12.4) and run when upgrading from one version to another. When developing Umbraco, the only way to run migrations is to force the version in web.config back to an older version, and then all migrations for that version run again. Because of this, migrations have to figure out whether, for instance, the column they are supposed to delete still exists or not.

In v8, migrations need to be explicitly registered in a plan, and the plan defines the order of execution. The plan is a state machine: each state is identified by a string, and each transition from one state to another is accomplished via a migration. The current state is kept in a database table, and is totally independent from the application's version.

Each migration runs once and only once, and they run in a deterministic order.
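To make the "state machine" idea concrete, here is a minimal, self-contained sketch. It is not Umbraco's code: SketchPlan, Add and Execute are hypothetical names, and a plain dictionary stands in for the database table that stores the current state. It only assumes what is described above, namely that each state has one outgoing transition and that the state is persisted after each migration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a plan: NOT Umbraco's MigrationPlan class.
public class SketchPlan
{
    // source state -> (migration to run, target state)
    private readonly Dictionary<string, (Action Migration, string Target)> _transitions
        = new Dictionary<string, (Action Migration, string Target)>();

    public string Name { get; }

    public SketchPlan(string name) => Name = name;

    public SketchPlan Add(string source, Action migration, string target)
    {
        // Dictionary.Add throws if 'source' already has an outgoing transition.
        _transitions.Add(source, (migration, target));
        return this;
    }

    // Runs every pending migration, persisting the state after each one.
    // 'store' stands in for the database table that holds the current state.
    // Note: this sketch does not guard against cycles in the plan.
    public string Execute(IDictionary<string, string> store)
    {
        var state = store.TryGetValue(Name, out var saved) ? saved : string.Empty;
        while (_transitions.TryGetValue(state, out var transition))
        {
            transition.Migration();
            state = transition.Target;
            store[Name] = state;
        }
        return state;
    }
}
```

Because the reached state is saved under the plan's name, calling Execute a second time with the same store finds no pending transition and runs nothing, which is the "once and only once" behavior.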

Migrations

Migration classes have not changed a lot. A migration skeleton would look like:

public class Migration1 : MigrationBase
{
    public Migration1(IMigrationContext context)
        : base(context)
    { }
    
    public override void Migrate()
    { }
}

First, note that there are no more "up" and "down" methods: migrations are one-way only and do not support "going back" or "reverting" anymore. Also, migrations are instantiated by the DI container and it is totally fine to add more parameters to the constructor. Finally, the MigrationBase class provides various useful services such as an ILogger, a Database, etc. along with all the convenient helpers that were already available in v7.

For instance, one could write:

public override void Migrate()
{
    Logger.Debug<Migration1>("Running my Migration1");

    if (ColumnExists("myTable", "myColumn"))
        Delete.Column("myColumn").FromTable("myTable").Do();
        
    Database.Execute(Sql("DELETE FROM myOtherTable"));
}

One key difference is that expression builders such as Delete.Column(...)... used to queue SQL statements, which were only executed after all migrations had run. This caused various odd situations including, for instance, conflicts with immediate statements such as the Database.Execute() call above. In v8, they are executed immediately, when the Do() method executes.

This means that one must not forget to conclude the expression with Do(), otherwise it would not execute. Umbraco would detect this, and throw.
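As an illustration of "execute on Do(), and complain when Do() is missing", here is a small sketch. It is an assumption about how such detection could work, not Umbraco's implementation: DeleteColumnSketch and MigrationRunnerSketch are hypothetical names, the generated SQL string is only illustrative, and a list of strings stands in for the database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical builder: mimics the shape of Delete.Column(...).FromTable(...).Do()
// but is NOT Umbraco's implementation.
public class DeleteColumnSketch
{
    private readonly List<string> _executed;   // stands in for the database
    private readonly string _column;
    private string _table;

    public bool Done { get; private set; }

    public DeleteColumnSketch(List<string> executed, string column)
    {
        _executed = executed;
        _column = column;
    }

    public DeleteColumnSketch FromTable(string table)
    {
        _table = table;
        return this;
    }

    // The statement runs here, immediately, not at the end of all migrations.
    public void Do()
    {
        _executed.Add($"ALTER TABLE {_table} DROP COLUMN {_column}");
        Done = true;
    }
}

public static class MigrationRunnerSketch
{
    // Runs a migration body, then fails loudly if a builder was never concluded.
    public static void Run(Action<Func<string, DeleteColumnSketch>> migrate, List<string> executed)
    {
        var builders = new List<DeleteColumnSketch>();
        migrate(column =>
        {
            var builder = new DeleteColumnSketch(executed, column);
            builders.Add(builder);
            return builder;
        });
        if (builders.Exists(b => !b.Done))
            throw new InvalidOperationException("An expression was built but Do() was never called.");
    }
}
```

The runner hands the migration a factory for builders and remembers each one it creates, so after the migration body returns it can tell whether any expression was left unconcluded.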

Migration Plan

As explained in the introduction, a migration plan is a state machine. Each state is a string and each transition is a migration. The easiest way to define a plan is to directly use the built-in MigrationPlan class:

// instantiate a plan named "MyProject"
var plan = new MigrationPlan("MyProject");

// define the transitions
plan.From(string.Empty)
    .To<Migration1>("state-1")
    .To<Migration2>("state-2")
    .To<Migration3>("state-3");

The name of the plan is used as a key to store the current state in the database. At any time, Umbraco knows which state the plan has reached.

By default, a plan that has never executed starts with the string.Empty state. Here, we define that, from the string.Empty state, we want to go to the state-1 state by running Migration1, and then to state-2, and so on.

A plan is executed by an upgrader. The upgrader compares the final state of the plan against the current state, and executes all migrations until it reaches the final state. Again, the easiest solution is to use the built-in Upgrader class, in a component that executes when Umbraco boots:

public class MyProjectUpgradeComponent : UmbracoUserComponent
{
    public override void Initialize(IScopeProvider scopeProvider, IMigrationBuilder migrationBuilder, IKeyValueService keyValueService, ILogger logger)
    {
        var plan = new MigrationPlan("MyProject");
        plan.From(string.Empty)
            .To<Migration1>("state-1")
            .To<Migration2>("state-2")
            .To<Migration3>("state-3");
            
        var upgrader = new Upgrader(plan);
        upgrader.Execute(scopeProvider, migrationBuilder, keyValueService, logger);
    }
}

You may want to create your own plan class, to encapsulate it all:

public class MyProjectPlan : MigrationPlan
{
    public MyProjectPlan()
        : base("MyProject")
    {
        From(string.Empty)
            .To<Migration1>("state-1")
            .To<Migration2>("state-2")
            .To<Migration3>("state-3");
    }
}

And then, creating the upgrader becomes:

var upgrader = new Upgrader(new MyProjectPlan());

Creating your own plan also allows you to take control of the initial state. This can be interesting when upgrading from old, pre-v8 versions. For instance:

public override string InitialState
{
    get
    {
        var currentVersion = DetectCurrentVersion(); // based on files, or...
        if (currentVersion == 2) return "init-v2";
        return string.Empty;
    }
}

And in the constructor...

// default path
From(string.Empty)
    .To<Migration1>("state-1")
    .To<Migration2>("state-2")
    .To<Migration3>("state-3");
        
// from v2, no need to run Migration1    
From("init-v2")
    .To("state-2"); // and, from state-2, it's the default path

The plan above defines the graph through two paths, one from string.Empty to state-3, and one from init-v2 to state-2, where it joins the other path to reach state-3.

Managing the Plan

The plan is a directed graph. It must lead to one unique final state. And, from any given state, there has to be only one way to reach the next state. The following plan is valid: each state has only one outgoing migration, and C is the only final state.

A -> B -> C
     D -> C

The following plan is invalid, because B has two outgoing migrations (and then, what shall we do?) and there are two final states:

A -> B -> C
     B -> D

The same goes for the following one. Although there is only one final state, there is more than one way to get there:

A -> B -> C
A -> D -> C
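The two rules illustrated by these three plans can be checked mechanically. The following is a minimal sketch of such a check, written from the rules as stated here and not taken from Umbraco's own validation code; PlanValidatorSketch and Validate are hypothetical names, and a plan is represented simply as a list of (source, target) pairs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlanValidatorSketch
{
    // Each edge is one transition: (source state, target state).
    // Returns null when the graph is acceptable, otherwise a description of the problem.
    public static string Validate(IEnumerable<(string Source, string Target)> edges)
    {
        var list = edges.ToList();

        // Rule 1: at most one outgoing transition per state.
        var ambiguous = list.GroupBy(e => e.Source).FirstOrDefault(g => g.Count() > 1);
        if (ambiguous != null)
            return $"State '{ambiguous.Key}' has more than one outgoing transition.";

        // Rule 2: exactly one final state, i.e. one state with no outgoing transition.
        var sources = new HashSet<string>(list.Select(e => e.Source));
        var finals = list.Select(e => e.Target).Where(t => !sources.Contains(t)).Distinct().ToList();
        if (finals.Count != 1)
            return $"Expected one final state, found {finals.Count}.";

        return null;
    }
}
```

With this sketch, the first plan above passes, the second is rejected because B has two outgoing transitions, and the third is rejected because A has two outgoing transitions, which is how "more than one way to get there" shows up in the graph.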

A benefit of explicitly defining a plan appears in multi-developer teams. Imagine Alice adds a migration:

.To<Migration3>("state-3")
.To<AliceChanges>("state-alice-4");

And Bob adds:

.To<Migration3>("state-3")
.To<BobChanges>("state-bob-4");

In v7, nobody would notice, and both migrations would run in an unspecified order, and hopefully it would be ok. Or not. In v8, these two changes will trigger a merge conflict, and this is a good thing. You have to deal with the situation, figure out whether the migrations are compatible, and... what to do with them.

In the simple case where migrations are indeed compatible, and order does not matter, you would probably merge as:

.To<Migration3>("state-3")
.To<AliceChanges>("state-alice-4")
.To<BobChanges>("state-5"); // execute Bob's changes after Alice's

From("state-bob-4") // provide a way for Bob to reach the final state
.To<AliceChanges>("state-5"); // execute Alice's changes after Bob's

And both Bob and Alice would nicely migrate to state-5 on the next run.
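To check that claim, the merged plan can be walked by hand or with a few lines of code. The sketch below is illustrative only: MergedPlanSketch and Follow are hypothetical names, and the transitions are copied from the merged plan (starting at state-3) rather than read from a real MigrationPlan.

```csharp
using System;
using System.Collections.Generic;

public static class MergedPlanSketch
{
    // source state -> (migration name, target state), mirroring the merged plan.
    private static readonly Dictionary<string, (string Migration, string Target)> Transitions
        = new Dictionary<string, (string Migration, string Target)>
        {
            ["state-3"] = ("AliceChanges", "state-alice-4"),
            ["state-alice-4"] = ("BobChanges", "state-5"),
            ["state-bob-4"] = ("AliceChanges", "state-5"),
        };

    // Returns the migrations that would run from 'state', and the state reached.
    public static (List<string> Migrations, string Final) Follow(string state)
    {
        var migrations = new List<string>();
        while (Transitions.TryGetValue(state, out var next))
        {
            migrations.Add(next.Migration);
            state = next.Target;
        }
        return (migrations, state);
    }
}
```

Starting from state-alice-4 (Alice's database), only BobChanges runs. Starting from state-bob-4 (Bob's database), only AliceChanges runs. A database still at state-3 runs AliceChanges and then BobChanges. All three end at state-5.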

Naming states can be hard, especially when you want to be sure that developers don't share the same names. This is why Umbraco uses Guids for its states.

What's simple?

At this point, what do you think? Was it easier in v7? Maybe. Was it simpler? Maybe not. It was easier to just throw some migrations around, but hard to get them to work correctly. v8 is going to require a bit more work, but in return things should work.

At least that is what we think. But as always, the community is probably going to see some obvious flaws and potential gains. We tried our best, but are eager to receive feedback!

Happy migrations!
