Notes from Mike Ryan, the Migrate guy, at the bottom. Make sure you read them before copying any of this code.
The Migrate module has some of the best documentation I’ve ever seen in DrupalLand, but there are still just a couple of things that I’ve figured out over the last month that I wish had been clearly explained to me up front. This is an attempt to explain those things to myself.
Clearly, you’re going to be writing code to perform this migration.
There is a module - migrate_d2d - that is specifically for stuff like this. It’s aware of the basic schema of Drupal content types, so it’ll save you a LOT of SQL writing to join each node’s fields onto the base table.
You’ll write a class for each migration that you need to perform.
You’ll need to write a separate class for each content type that you have.
You’ll need to write classes for roles, users, files, and each taxonomy vocabulary that you have on the site.
You’ll tell Drupal about these migrations by writing an implementation of hook_flush_caches() that’ll “register” the migrations and make them show up in the GUI and in drush. This looks basically like this:
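The original snippet isn't reproduced here, so this is only a minimal sketch of that kind of registration, assuming a Drupal 7 source site. The module name `example`, the `legacy` database key, and the machine names and class name `ExampleArticleMigration` are all made up for illustration. Per Mike's notes at the bottom, hook_migrate_api() plus drush migrate-register is now preferred over this hook.

```php
<?php
/**
 * Implements hook_flush_caches().
 *
 * Registers (or re-registers) the migrations whenever caches are cleared.
 */
function example_flush_caches() {
  // Arguments shared by every migration. 'legacy' is an assumed key in
  // the $databases array in settings.php that points at the old site.
  $common_arguments = array(
    'source_connection' => 'legacy',
    'source_version' => 7,
  );

  // One taxonomy vocabulary, using the stock migrate_d2d class.
  $term_arguments = $common_arguments + array(
    'machine_name' => 'ExampleTags',
    'description' => t('Import tags from the old site'),
    'source_vocabulary' => 'tags',
    'destination_vocabulary' => 'tags',
  );
  Migration::registerMigration('DrupalTerm7Migration',
    $term_arguments['machine_name'], $term_arguments);

  // One content type, using a custom class (hypothetical name).
  $node_arguments = $common_arguments + array(
    'machine_name' => 'ExampleArticle',
    'description' => t('Import articles from the old site'),
    'source_type' => 'article',
    'destination_type' => 'article',
  );
  Migration::registerMigration('ExampleArticleMigration',
    $node_arguments['machine_name'], $node_arguments);

  // This hook is expected to return cache tables to clear; none here.
  return array();
}
```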
Registering the migration also creates a set of database tables for each migration, the most interesting of which is migrate_map_xxx, where “xxx” is the machine_name of your migration, lowercased. It records which source ID became which destination ID.
Since Migrate is an OOP thing, you can write a parent class for a generic “Node” migration that all of the other specific content types can inherit from. Most of the node migration classes that I wrote look like this, due to most of the fields being set up in the parent class —
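As a rough sketch of that parent/child arrangement (not the original code), assuming a Drupal 7 source and hypothetical class and field names:

```php
<?php
/**
 * Shared setup for every node migration (hypothetical class name).
 *
 * The migrate_d2d base class already maps core node properties, so only
 * custom fields need attention here.
 */
abstract class ExampleNodeMigration extends DrupalNode7Migration {
  public function __construct(array $arguments) {
    parent::__construct($arguments);
    // Fields present on every content type, with the same name on the
    // source and destination sides (assumed field names).
    $this->addSimpleMappings(array('field_summary', 'field_legacy_id'));
  }
}

/**
 * A specific content type: almost nothing left to do.
 */
class ExampleArticleMigration extends ExampleNodeMigration {
  public function __construct(array $arguments) {
    parent::__construct($arguments);
    // Only the fields unique to this content type.
    $this->addSimpleMappings(array('field_subtitle'));
  }
}
```

The source and destination content types are not hard-coded in the classes; they arrive through the arguments passed at registration time.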
Most fields in a Drupal to Drupal migration will come over easily with Migration::addSimpleMappings(), but some require a little more coddling. These are often fields that represent a relationship to another entity - Taxonomy term references, other node references, etc. These will require something like this —
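A minimal sketch of such mappings, assuming a term migration registered as `ExampleTags` and a node migration registered as `ExamplePage` (both hypothetical, as are the class and field names). The `source_type` subfield is the one Mike explains in his notes at the bottom.

```php
<?php
class ExampleBlogPostMigration extends DrupalNode7Migration {
  public function __construct(array $arguments) {
    parent::__construct($arguments);

    // Term reference: the incoming values are tids from the old site.
    // sourceMigration() makes Migrate translate each one to the tid
    // that the ExampleTags migration created on the new site.
    $this->addFieldMapping('field_tags', 'field_tags')
         ->sourceMigration('ExampleTags');
    // Without this, the term field handler treats the incoming value
    // as a term name rather than a tid.
    $this->addFieldMapping('field_tags:source_type')
         ->defaultValue('tid');

    // Node reference: old nids are translated to new nids the same way.
    $this->addFieldMapping('field_related_page', 'field_related_page')
         ->sourceMigration('ExamplePage');
  }
}
```

The referenced migrations need to have run first, so that their map tables already hold the IDs being looked up.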
Speaking of that: before I finally put the pieces together about how related entities keep their relationships, I did lots of “clever” coding to maintain the relationships between imported entities myself. It wasn’t that complicated, but I was manually looking into the migrate_map_xxx tables to pull destination IDs out. This is obviously wrong, and it felt wrong when I was doing it, but it didn’t all click until I was chasing down vague error messages about “field validation errors” in later migrations. Migrate doesn’t tell you which fields are in error; it just throws an Exception on these nodes and doesn’t save them. I finally ended up dumping $errors in field_attach_validate() and saw that it was always a related-entity field that was erroring. It was easy to figure out after that, but it took me several weeks of getting my head around the rest of it all to get to that very simple point.
I missed all of that for so long because the user migration has this tidy little line about 'role_migration' that establishes the relationship, so I thought it would/should be something along those lines for other entities too. I spent a long time in the module code tracing down default arguments and the like before finally just doing it the hard way, which, again, was the wrong way.
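For context, that tidy line is an argument passed when registering the user migration. A sketch, assuming a Drupal 7 source, with the function name, machine names, and `legacy` connection key all hypothetical:

```php
<?php
/**
 * Registers the role and user migrations (hypothetical helper name).
 */
function example_register_user_migrations() {
  $common_arguments = array(
    'source_connection' => 'legacy',
    'source_version' => 7,
  );

  // Roles first, so the user migration has something to look up.
  $role_arguments = $common_arguments + array(
    'machine_name' => 'ExampleRole',
    'description' => t('Import roles from the old site'),
  );
  Migration::registerMigration('DrupalRole7Migration',
    $role_arguments['machine_name'], $role_arguments);

  // 'role_migration' names the migration whose map table is used to
  // translate old role IDs into new ones.
  $user_arguments = $common_arguments + array(
    'machine_name' => 'ExampleUser',
    'description' => t('Import users from the old site'),
    'role_migration' => 'ExampleRole',
  );
  Migration::registerMigration('DrupalUser7Migration',
    $user_arguments['machine_name'], $user_arguments);
}
```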
Oh, by the way, USE THE LATEST VERSION OF ALL OF THESE MODULES. Migrate finally released 2.6, years in the making apparently, a couple of weeks ago, as I was in the middle of all this. I’d been using the previous stable release, which is of course missing years of work; the new release solves almost all of these problems out of the box.
Here’s another little gem regarding files, and making those relationships tie out —
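The original snippet isn't reproduced here; a minimal sketch of the idea, assuming a separate file migration registered as `ExampleFile` and hypothetical class and field names:

```php
<?php
class ExampleGalleryMigration extends DrupalNode7Migration {
  public function __construct(array $arguments) {
    parent::__construct($arguments);

    // The incoming value is an old fid; sourceMigration() translates it
    // to the fid that the ExampleFile migration created.
    $this->addFieldMapping('field_image', 'field_image')
         ->sourceMigration('ExampleFile');
    // Tell the file field handler that the value is a fid for a file
    // that has already been migrated, not a URL to fetch.
    $this->addFieldMapping('field_image:file_class')
         ->defaultValue('MigrateFileFid');
  }
}
```

MigrateFileFid ships with Migrate; as Mike's notes below explain, any class implementing MigrateFileInterface can be named here instead.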
A review from the guy himself —
The blog post looks like a good intro to migrate_d2d 2.0, but I’m afraid now it’s a bit dated (as you point out towards the bottom).
hook_flush_caches() hasn’t been considered a good practice for a while (defining migrations in hook_migrate_api() and using drush migrate-register is preferred - https://www.drupal.org/node/1824884), but I see that migrate_d2d_example still does it - I’ll need to update that before the imminent new release so people aren’t misled.
Setting the source_type to ‘tid’ is covered at https://www.drupal.org/node/1224042 - by default the incoming value for a term field is assumed to be the term name, when you’re making use of a separate term migration via sourceMigration, the incoming value is a tid and you need to set the source_type so the field handler knows what to expect.
The file_class is similar - normally the value coming in to the file field is assumed to be a URL, but when using a separate file migration and referencing it via sourceMigration it’s a fid. The “class” in file_class is a PHP class - the name of any class implementing MigrateFileInterface can be used here.