justtodolist updates and appengine stuff

I made some enhancements again to justtodolist. There's a couple of interesting points I came across so I thought I would jot it down.

The stateful interactive shell sample app.

This is a must for any app engine developer! It's leaps and bounds closer to the python shell than the default interactive shell. Get it from here. It needs some tweaking to get it working along side your app. Hint, aside from modifying your app.yaml, you will likely have to modify the app config in shell.py as well.

Again, simplicity works!

I made a simplification in my model in the in workings of the ordering of the lists, and the result was much less/simpler and less buggy code.

Data model migration.

So as I pointed out before, migration in app engine is not really well supported, essentially, you are on your own. If you want to change the type of a property - for example, I wanted to change a ListProperty(int) to a ListProperty(db.Key) - you will likely completely break your app, because the data will no validate. There is no way to migrate the data, because there's no way to access the existing data. The only workaround that I could come up with was to rename the property(well, actually, adding a new one and leaving the old one, at least temporarily). After that I could write a migration script. Now that I had the interactive shell on production as well(the default shell was only available for development), I could run it easily through the shell instead of writing a handler with the sole purpose of running the migration script.

My co-worker pointed out to me that an automatic migration scheme might be the future for the appengine data store and similar object oriented datastores. The idea was very interesting. Basically, we could come up with a scheme for migrating data models using renaming mechanisms like what I did above. But, we could do it all behind the scenes inside a framework. I can see how it would work and how it could be done:

  1. you would have a global schema version number. Everytime you modify the schema the version number would increment, ideally, the framework would somehow detect the change you made and up the version automatically.
  2. you would have a sort of version control for every property of every object, and have a numbered naming scheme for them, so: name_1, name_2, for example. Every change to the property would up the property version number.
  3. Each object would have a version property. This means that the system can operate with objects that have different schema versions at the same time without breaking.
  4. Removing a property is easy, nothing more really needs to be done. Unless you want to provide a way derive a value to the old missing property, in which case you can provide a property descriptor of the existing object to do it.
  5. Adding a property is easy too, you can provide a function of the existing object to provide an initial value if you want, which will be invoked the first time that the property is used.
  6. Renaming a property can be done by specifying the property with an old_name directive like: new_name = db.StringProperty(old_name="old_name") then the code just has to create the new property and initialize it to the value of the old property the first time it is used.
  7. Change the type of a property can be done by using an old_type directive much like the old_name directive for renaming, and then you may provide a conversion function of the existing object that returns the new initial value. Behind the scenes, the property version number will be incremented, so you are really adding a new property, which is aliased somehow back to the same name later.

Sounds good...There's one problem though, as you edit the schema in your .py file with directives like old_name and initial value functions which only apply to the most recent migration. But, since as we've seen, migration happens on the fly(as the data is accessed), there needs to be a historical archive of all migration directives, i.e. real version control. One way to do it would be to write explicit migration definitions as in rails migration. So something like:

add_property(Person, 'age', db.IntegerProperty())

instead of editing the models directly. I think this way is sufficient. I was kinda hoping I could just edit the models directly willy-nilly though. Maybe you can write tool that inspects your model definition and diffs it with the last version and generate a migration definition?

Anyway, better get some sleep. This is definitely an interesting idea and a very concrete one for a project.

blog comments powered by Disqus