Updaters

I blogged about picking up Tim’s new installer branch last summer. I worked on it with a lot of help from Max and demo’d it in Berlin. Since Berlin, Ævar, Siebrand and Mark have joined in and it’s getting a whole lot closer to being done. It’s in trunk, and the installation runs pretty smoothly. Updates are still a pain in the ass.

Database upgrades suck when you have to maintain them for multiple DBMSes. In working on the new installer (and associated updater refactoring), we’ve come up against a very annoying wall: maintaining versioned updates for databases gets to be difficult. With varying levels of support, the only schema that is guaranteed to work (provided someone didn’t break trunk) is MySQL. We maintain a list of patch files to be applied, in order, on each upgrade. This list is kept up to date for MySQL; SQLite generally just works with MySQL’s syntax so it’s pretty close behind in support. Postgres tends to lag a few hours or days (and maintains its own set of update sequences, actually). Oracle doesn’t do patch files, but does update its default schema…so updates aren’t really possible except manually. Ibm_Db2 is all but abandoned, and Mssql was removed.

I briefly experimented with abstracting the whole schema to a big array of table definitions, but this was going to take too much work, so it was split off into a branch. One day I’d like to go back to this; done properly, it would cure our schema woes forever. Left with no other solution, Max forced the current updaters to run from the web, grabbing the output and throwing it back at the user. This is bad for a couple of reasons.

  1. Updates can take a long time to run, especially if you have big tables. The fact that we run every update every time you try to upgrade wastes cycles.
  2. It makes it impossible to localize the output along with the rest of the installer, since all of the updaters’ output is hardcoded.
  3. Without a clearly defined interface, it’s hard to get information back to the user and they’re left reading the output from update.php. Did the script warn or fail completely? Can we proceed?

So I decided to refactor things a bit. First I changed the updatelog table to contain an optional ul_value column (it already had ul_key). I moved the array of updates into its own class, so we can at least visually separate the list from the implementation. I then moved some of the core logic out of updaters.inc into a new Update class. Issue number one is pretty much mitigated now, since the new class logs which updates have been performed, so it won’t needlessly repeat them. Issues 2 and 3 are more tractable now, since we could (but haven’t yet) implement an output callback system like I did for Installer::performInstallation().
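To make the idea a bit more concrete, here is a minimal sketch of the two pieces described above: an update log keyed like ul_key (with an optional ul_value), and an output callback that gets a status code instead of a hardcoded string. It’s written in Python rather than PHP purely for illustration, and every name in it (UpdateLog, run_updates, the update keys) is made up for the example; it is not the actual Update class or the real updatelog table.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types: an update is a (key, function) pair. The function may
# return an optional value to store alongside the key (like ul_value).
Update = Tuple[str, Callable[[], Optional[str]]]
OutputCallback = Callable[[str, str], None]


class UpdateLog:
    """In-memory stand-in for an update log table (key plus optional value)."""

    def __init__(self) -> None:
        self.rows: Dict[str, Optional[str]] = {}

    def has(self, key: str) -> bool:
        return key in self.rows

    def record(self, key: str, value: Optional[str] = None) -> None:
        self.rows[key] = value


def run_updates(updates: List[Update], log: UpdateLog,
                output: OutputCallback) -> List[str]:
    """Run each update once, in order, and report progress via the callback."""
    applied: List[str] = []
    for key, func in updates:
        if log.has(key):
            # Already logged on an earlier run, so don't repeat the work.
            output("skipped", key)
            continue
        value = func()
        log.record(key, value)
        applied.append(key)
        output("applied", key)
    return applied


# Demo: the callback collects (status, key) pairs; a real caller could map
# the status codes to localized messages instead of printing raw text.
messages: List[Tuple[str, str]] = []
log = UpdateLog()
updates: List[Update] = [
    ("add-ul_value", lambda: None),
    ("rebuild-index", lambda: "42 rows"),
]
first = run_updates(updates, log, lambda status, key: messages.append((status, key)))
second = run_updates(updates, log, lambda status, key: messages.append((status, key)))
```

The second call does no work because both keys are already in the log. Passing status codes to the callback, rather than finished strings, is what would let the installer localize the messages and decide for itself whether it is safe to proceed.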

SQLite was easy, and Oracle just has a dummy implementation for now. Postgres is another beast entirely. Like I mentioned above, it maintains its own update sequence. Parts of it make sense, parts of it don’t, but none of it is designed like the others. It basically needs to be rewritten entirely.

For the new installer to be complete by the time 1.17 rolls out, we need to pick up the pace. The time for new features is pretty much over and we need to start closing up the last few bugs. Updates are a blocker still. They’re a bit better than they were, but we’ve still got a lonnnggg way to go before we’re done maintaining patch files for each DBMS. As always, bugs in the new installer can go here or in Bugzilla.

What wikis mean to me

I haven’t blogged in a while, but I’ve got some free time today and a lot on my mind.

Sometime in 2004 or 2005, I first heard of Wikipedia. It was this online encyclopedia with a lot of interesting articles. And the links! Oh the links; they transport you quickly to topics you’ve never even heard about. I soon learned that this marvelous resource was editable by people just like you and me. I was already familiar with free software, so the crowd-sourcing aspect was nothing new to me.

Becoming more involved, I joined the ranks of the editors and then administrators seeking to organize, revert and discuss the ever-growing content that formed the encyclopedia. My use of e-mail skyrocketed, as I found myself participating in long-winded mailing-list discussions about the intricacies of fair-use media and whether or not joke articles should be preserved. I also tried poking at the software, since it too was open source and asking for contributions.

Somewhere around this time, I became disenchanted with Wikipedia. Perhaps the administrative processes had taken their toll, or maybe I was just tired of arguing. I had never been a good writer (on or offline) so I couldn’t “get back to writing articles” or something of that nature. Facing a void, I turned to Veropedia.

Veropedia was a now-defunct venture by Danny Wool, myself and a few others to showcase the very best of what Wikipedia had to offer. We scraped (yuck) static versions that had been proofread and then used those for display to the end user. Vero was looking for a new developer, so I stepped up. I spent about a year or so working on the project and actually hit some decent milestones. I managed to get a Lucene-based search going and ported our entire article validation script to PHP from Perl. Our technology was a hodgepodge of scripts, mostly because the original developer thought MediaWiki had too much overhead for our needs. Planning out a MediaWiki-based phase 2 of our software became the new goal.

So I started getting more involved with MediaWiki development. I got commit access, and started working on various bugs that were hindering Veropedia development. (Sidebar: I originally discovered the libxml2 entity bug while working on some customizations for Special:Import.) And so Veropedia kind of just stayed the same while I chugged away at MediaWiki. Somewhere along the line I stopped really caring about Veropedia. I was busy with work and school, and now I’d taken up MediaWiki development in my spare time.

Veropedia is dead. At least Veropedia as I’ve known it is dead. Actually, the server I had it living on just got shut down today. I’ve still got the leftover backups lying around, but the site itself is down. Danny still has the domain names if he ever wants to use them again, but I’m done with it. It’s not a wiki.

And that’s really what it’s all about to me at this point. At some point while working on MediaWiki, I realized that that was what I really wanted to do. Work on wikis. Veropedia wasn’t a wiki. Arguments on mailing lists and talk pages aren’t wikis. Collaboratively editing text is what MediaWiki does and what I’m proud to help support. I’ve started working with the Foundation on a contract to support FlaggedRevs/Pending Changes and it’s exciting to make the move from being a volunteer.

I think that workflow systems like the one I’m helping to support really improve the wiki model. They help pages be as open as possible to editing while still allowing editorial control. And producing free, high-quality content is what MediaWiki is designed to do. I think it’s pretty cool that I can help make that happen.

New host, new look

I’ve switched hosts (you may have noticed a bit of downtime). I’m now with some awesome guys over at vps.net.

I also got tired of my Android theme, so I picked this one with the fishy.