Monkey Patching Vagrant LXC Issue in Ubuntu 20.04


Ubuntu 20.04 Focal Fossa was officially launched just a couple of weeks ago, and I thought it was the best time to say goodbye to my Ubuntu 18.04. Looking back, I was so determined to enter the world of Ubuntu 20.04. Little did I know that I would be losing sleep figuring out why my vagrant-lxc machines wouldn't "vagrant up".

It started with my attempt to upgrade the operating system. I opened the Software Updater, but there was no upgrade offer anywhere in the prompt. I checked the settings in the Software & Updates app. In the "Updates" tab, the "Notify me of a new Ubuntu version:" option said: "For long term support version". Well, 20.04 is an LTS, isn't it? The offer should have been there, so why wasn't it?

Changing the option to "For any new version" prompted me with a 19.10 upgrade, but no thanks, 19.10 is so yesterday. What had I missed? I googled it and found that Software Updater won't offer the upgrade until the first point release of 20.04, aka 20.04.1, is out. And that could be another 3 months. Luckily, "sudo do-release-upgrade -d" is here to make the upgrade happen right here, right now.

Typing that mantra in my terminal warped me to a time when Ubuntu 20.04 was running as my shiny, beautiful new OS. Everything seemed perfect. None of the updates required manual intervention except for PostgreSQL. It couldn't automatically upgrade my v.10 cluster to a v.12 cluster; there was a command I had to run. OK, I'd do that later. It wasn't that urgent.

My first to-do then was to re-enable the third-party repositories, which were disabled during the upgrade. Enabling them is just a matter of ticking some check boxes and removing the "disabled on upgrade to 20.04" comments. One weird thing, though: the Docker repo somehow got a "focal" entry added to it, but it didn't work. There was no focal release for Docker yet. Another round of googling pointed out that, instead of using the Docker repo, one can install the "docker.io" package from Canonical, which serves the same purpose.

Now came the moment I had been waiting for: running my Rails development server inside Ubuntu 20.04 vagrant-lxc. But wait... there was a problem. None of the vagrant-lxc machines would even start. This was a nightmare!

When I started troubleshooting, my first suspicion was that my Vagrant installation had been broken by the upgrade. I downloaded and installed the latest Vagrant and reinstalled vagrant-lxc. I also reinstalled lxd and lxc along with all their supporting packages. But the problem didn't go away. The error was still there, saying:



I tried "VAGRANT_LOG=DEBUG vagrant up" and, out of the dark, found this somewhat cryptic message:



Having seemingly hit a dead end, I tried running vagrant-lxc in a VBoxed 20.04 and got the same error. Running vagrant-lxc in VBoxed 19.10 and VBoxed 19.04 didn't give me any good results either. The error messages were different, though I don't remember them. There was a moment when dropping back to Ubuntu 18.04 seemed like my only option. Fortunately, I didn't take it.

The next day, I googled the name of a file mentioned in the error message: ".vagrant.d/gems/2.6.6/gems/vagrant-lxc-1.4.3/scripts/pipework".

And this article showed up. This might be the answer! Fransisco Soto described his findings about a change in the "latest" lxc folder structure that breaks the "pipework" script used by vagrant-lxc.

I began looking into the things he mentions in that post. First, the pipework script itself. Fransisco's case was a little different, though. He got the "Found more than one container matching $GUESTNAME." error, but I didn't. As you can see above, mine was "Invalid arguments ...". For me, the offending block was this one.



My suspicion was that the "docker inspect" thing could be masking what was actually going on. So I removed my Docker installation. And it turned out my guess was right! I got the message from line 178, saying "Container $GUESTNAME not found, and Docker not installed".
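The masking effect can be pictured with a simplified sketch. This is not the pipework source: describe_lookup is a hypothetical helper, its two arguments (the number of cgroup matches and whether the docker CLI is present) are stand-ins, and the branch for exactly one match is an assumption. Only the quoted error messages come from the post.

```shell
#!/bin/sh
# Simplified sketch, NOT the actual pipework code.
# describe_lookup is a hypothetical helper:
#   $1 = number of cgroup entries matching the guest name
#   $2 = "yes" if the docker CLI is installed, anything else otherwise
describe_lookup() {
  case "$1" in
    0)
      if [ "$2" = "yes" ]; then
        # With Docker present, the script falls through to "docker inspect",
        # so the failure reported comes from Docker, not from the lookup.
        echo "no cgroup match: falling back to docker inspect"
      else
        echo "Container not found, and Docker not installed"
      fi
      ;;
    1) echo "exactly one match: proceed" ;;   # assumed happy path
    *) echo "Found more than one container matching" ;;
  esac
}

describe_lookup 0 yes
describe_lookup 0 no
```

With Docker installed, a failed lookup never reaches the "not found" message, which would be consistent with the real cause only becoming visible after Docker was removed.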

From there, I went into the folder that the "$CGROUPMNT" variable points to: "/sys/fs/cgroup/devices". This is what I found:

 

Again, this is a bit different from what Fransisco had. The "lxc.monitor" and "lxc.payload" are not parent folders of the $GUESTNAME folder; they appear as prefixes of the folder name!
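A minimal sketch of why a lookup by exact name comes up empty with this layout. It assumes the lookup is a "find" by name under $CGROUPMNT; a temporary directory stands in for the real cgroup mount, and "mybox" and count_matches are made-up names used only for illustration.

```shell
#!/bin/sh
# Hypothetical stand-ins: a temp directory plays the role of $CGROUPMNT
# and "mybox" is a made-up container name.
CGROUPMNT=$(mktemp -d)
GUESTNAME="mybox"

# Recreate the flat layout: the guest name is a suffix of the directory
# name, not a directory of its own.
mkdir "$CGROUPMNT/lxc.monitor.$GUESTNAME" "$CGROUPMNT/lxc.payload.$GUESTNAME"

# Count directories under $1 whose name matches the pattern $2.
count_matches() {
  find "$1" -type d -name "$2" | wc -l | tr -d ' '
}

EXACT=$(count_matches "$CGROUPMNT" "$GUESTNAME")
PREFIXED=$(count_matches "$CGROUPMNT" "lxc.*.$GUESTNAME")

echo "exact name:  $EXACT match(es)"
echo "with prefix: $PREFIXED match(es)"

# Clean up the temporary directory.
rm -rf "$CGROUPMNT"
```

A search for the bare name matches nothing, while a pattern that allows for the prefix matches both directories.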

With a little "hack", I tried to fix the "broken pipe". Combining two of the latest fixes from the pipework GitHub repository with my own customization, I came up with this:


Yep, I added that "lxc.*." before the "$GUESTNAME" string at line 157. But was that all it needed? No, I had to add the "lxc.*." prefix again at line 245, as you can see below, or else the "Could not find a process inside container $GUESTNAME" error would break "vagrant up" again.



A little note about those lines:
I'm not a shell script expert, but if you want to be more careful, replacing "lxc.*." with "lxc.payload." might be a better move. With "lxc.*." you get "lxc.payload.$GUESTNAME" as the first item and "lxc.monitor.$GUESTNAME" as the second, but that order could be reversed, and that might cause a failure.
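The difference between the two patterns can be sketched as follows, again against a temporary directory rather than the real cgroup mount. The container name "mybox" and the variable names are assumptions for illustration, and the "find" calls are a stand-in for whatever lookup the patched script performs.

```shell
#!/bin/sh
# Hypothetical stand-ins: temp directory for $CGROUPMNT, made-up guest name.
CGROUPMNT=$(mktemp -d)
GUESTNAME="mybox"
mkdir "$CGROUPMNT/lxc.monitor.$GUESTNAME" "$CGROUPMNT/lxc.payload.$GUESTNAME"

# Broad pattern: both directories match, and find makes no promise about
# which one is listed first.
BROAD_COUNT=$(find "$CGROUPMNT" -type d -name "lxc.*.$GUESTNAME" | wc -l | tr -d ' ')

# Narrow pattern: only the payload directory can match, so taking the
# "first" result is no longer ambiguous.
NARROW=$(find "$CGROUPMNT" -type d -name "lxc.payload.$GUESTNAME")

echo "lxc.*.$GUESTNAME matches $BROAD_COUNT directories"
echo "lxc.payload.$GUESTNAME matches: $NARROW"

# Clean up the temporary directory.
rm -rf "$CGROUPMNT"
```

Any code that picks the first line of the broad result depends on directory listing order, which the narrow pattern avoids.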

This is the final diff between the original pipework script inside the vagrant-lxc 1.4.3 plugin and the one that I monkey patched:

https://github.com/chrishadi/vagrant-lxc/commit/969f50b1aa31f5f07cfa64d114ca60aaf127f160

Happy hacking.
