One of my customers needed to build a new, large object-storage array using different software and hardware from the now-obsolescent product they had. That meant they had to migrate everything.
Fortunately, the array is accessed via REST calls over https, so we have web-server logs. This is a story about how we used them to migrate everything.
The web server used for the REST interface logs every PUT or GET from the old system. The low-level mechanism includes a database of all the objects, and can list every object in the system.
It’s simple, therefore, to write a query to the low-level database that will list all the objects in the system, so we can migrate them. It doesn’t, however, account for changes to the system. Our migration would be a copy of the system, but only up to the date at which we did the query.
As we expect the migration to take some weeks, That Would Be Bad.
Fortunately, we have all the PUTs in the web-server log, and it is up to date. If we read the log with tail -f and send a notice of every PUT to the new system, those objects can be migrated too.
In effect, we are treating the logs as a variant of a database commit journal, and replaying it at another site to keep the new up to date with the old.
At the old site, we set up a program that is little more than tail -f | awk, which picks out every PUT to the old system and ships it off to the new one.
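In Python rather than awk, the filter is a few lines. This is a sketch, not the actual program: it assumes the request appears as a quoted field like "PUT /path HTTP/1.1", as in the common and combined log formats, and the function name is mine.

```python
def put_paths(lines):
    """Yield the object path of every PUT found in web-server log
    lines. Assumes the quoted request field of the common/combined
    log formats: ... "PUT /bucket/obj HTTP/1.1" ..."""
    for line in lines:
        try:
            request = line.split('"')[1]       # the quoted request field
            method, path, _protocol = request.split()
        except (IndexError, ValueError):
            continue                           # not a line we understand
        if method == "PUT":
            yield path
```

Fed from the followed log (tail -f access.log piped into a script that prints put_paths(sys.stdin)), it emits the name of each newly written object as it arrives.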
We also set up another, independent program that reads the log and reports how many requests arrive each second. It also uses a web-server option to log how long each request takes, so we can tell if the migration program is slowing down the old system.
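The request-rate half of that monitor can be sketched by tallying lines against the log's own timestamp (the per-request timing would come from a server option such as Apache's %D or nginx's $request_time; the parsing here again assumes common-log-format lines, and the function name is mine):

```python
from collections import Counter

def requests_per_second(lines):
    """Count requests per second, keyed by the log's own timestamp.
    Assumes common-log-format lines such as:
    ... [10/Oct/2023:13:55:36 +0000] "GET /x HTTP/1.1" 200 99"""
    counts = Counter()
    for line in lines:
        try:
            stamp = line.split("[", 1)[1].split("]", 1)[0]
        except IndexError:
            continue                   # no bracketed timestamp; skip
        second = stamp.split()[0]      # drop the timezone offset
        counts[second] += 1
    return counts
```

A sudden dip in this count, or a rise in the per-request times, tells us the migration readers are hurting the old system and we should throttle them.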
On the new system, we have a program that reads a list of objects to migrate, requests them from the old system, and copies them to the new. Several copies run at the same time, one of which migrates only new objects. As those are new, they will probably be used soon, so it is to our advantage to make sure we have them.
The others migrate objects from the list we got from the database, and we can slow them down, speed them up, start them or stop them, so as to control how much load we put on the old system. And ensure we don't overload the new system, of course: a read from the old system is fairly "light", but a write is heavier than a read, and we're strictly doing writes to the new system.
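The core loop of such a worker is small. This is a hedged sketch of the shape, not the real program: fetch and store stand in for the REST GET against the old array and the PUT against the new one, and delay is one crude way to implement the slow-down/speed-up control described above.

```python
import time

def migrate(object_names, fetch, store, delay=0.0):
    """Copy each named object from the old array to the new one.
    `fetch(name)` stands in for a GET on the old system (light);
    `store(name, data)` for a PUT on the new system (heavier).
    `delay` is a crude throttle: raise it to ease the load on the
    old system, lower it to speed the migration up."""
    done = 0
    for name in object_names:
        data = fetch(name)
        store(name, data)
        done += 1
        if delay:
            time.sleep(delay)
    return done
```

Running several of these workers over disjoint slices of the object list, plus one fed by the live PUT stream, gives the mix of catch-up and keep-up copying described above.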
If we have to stop or restart, we can see how far we got by looking at the logs (the journal) of the new system. They're even in the same format. We can restart the migration with the next object to be transferred.
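That restart step can be sketched too: read the new system's own log, collect everything already PUT there, and skip it. As before, this is an illustration of the idea, assuming the quoted request field of the common log format, with a function name of my own choosing.

```python
def remaining(all_objects, new_log_lines):
    """Work out which objects still need migrating after a restart,
    by reading the new system's own web-server log (its journal):
    anything already PUT there can be skipped. Assumes a quoted
    request field like "PUT /path HTTP/1.1"."""
    migrated = set()
    for line in new_log_lines:
        parts = line.split('"')
        if len(parts) > 1 and parts[1].startswith("PUT "):
            migrated.add(parts[1].split()[1])
    # Preserve the original migration order for what's left.
    return [name for name in all_objects if name not in migrated]
```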
Logs aren’t (just) collections of information that the programmers cared about. In many cases, they can be used as database-like journals, and played or replayed to reproduce behaviour at another site or at another time.
If you ever need to migrate work, replay the steps that led to a bug, or generate a representative load test of an existing system, look to your logs.