Devoured By Lions

the eternal struggle to tame complexity

Fixing a Corrupted Encrypted LVM Partition

If you are desperate to recover your system skip down a few paragraphs to cut to the chase.

I just went through this ordeal so I thought I would pass these notes along for the next traveler. I have an MSI CR620 running Fedora 14. Everything has pretty much worked flawlessly on this machine (webcam, audio, wireless), except suspend/resume/hibernate has become flaky with the Fedora 14 kernels (2.6.35+). I have therefore been running the last Fedora 13 kernel (2.6.34.mumble), with which I’ve had no suspend/resume/hibernate issues. On the newer kernels, the machine will not resume correctly, and I have to hard-shutdown and start back up. That’s the background.

Enter a new UPS I purchased recently due to a suspicion that I have less than perfect utility input or wiring in the computer room. So far I like this unit a lot. It comes with a USB cable and (Windows) software that displays UPS status and allows you to change configuration. Great.

Well, I decided since it’s USB, I’ll just hook it up to my KVM and see what Linux thinks. Initially it was great. Fedora just recognized it, it showed up in the Gnome power management applet with battery status and all sorts of geeky statistics (graphs!).

Well. At some point that changed, and I’m not sure why (possibly because both my Windows and Linux machines were on?). The power applet seemed to always show 0.0% for battery charge. Unfortunately, the power management scheme (at least what is accessible via a UI) has built-in action triggers for UPS status. Want to guess what happens when it thinks your UPS battery is going to run out? Yeah. Hibernate. No. You can’t tell it not to (at least via the UI).

Unfortunately I discovered this while Shotwell was importing a bunch of my images (note: this does a massive amount of disk reads/writes). Apparently this completely consumes CPU and memory and was a problem alone, without the machine trying to hibernate every few seconds.

The machine hibernated on me (I was using a more recent kernel because VirtualBox-OSE will not run on the old kernel; apparently only the newer kernel modules are available). I went through the cold shutdown and reboot process (I had been doing it so frequently at this point that this did not faze me at all). When it booted, I was presented with this horrifying message:

/var: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)
[FAILED]
*** An error occurred …
*** Dropping you to a shell …
Give root password …
My face: D:

This would be disconcerting enough. But a little more background. I use whole-drive encryption. Since I dual-boot Windows and Linux, I use Truecrypt for the Windows half of my “whole drive” (I would consider just cannibalizing this partition, but, hey, I paid for that Windows license), and an LVM volume group with a LUKS-encrypted root and home partition. I also set up sudo so that I can sudo to root. I change the root password to a random strong password with mkpasswd (this is in the ‘expect’ package of all places). Which I do not keep. I figure I don’t ever really need to log in as root right? (Needless to say I have reconsidered this).

In any case, I don’t have my root password.

Useful info starts here.

All is not lost; there actually isn’t anything that magical about LVM or LUKS encryption. I owe a debt of gratitude to nteon and hannes in the #fedora channel on Freenode, because it is possible to recover in a fairly straightforward fashion from a situation like this:

  • Boot into a LiveCD (or some other CD that has LVM and LUKS support). If you don’t have the media, well, you better have a machine with access to the net to get it.
  • Run ’cryptsetup luksOpen /dev/<encrypted partition> <name>’ to “unlock” the LUKS partition on your LVM volume group. The <name> argument is any label you choose; the unlocked device then appears as /dev/mapper/<name>. This should prompt you for the encryption password. You can also use the file browser (at least Nautilus on Gnome), or the Gnome Disk Utility app, to “unlock” the partition. I believe these all mean and do the same thing, although I use the command line to be absolutely sure.
  • Since the volume group was mounted manually, you will need to run a couple of other arcane commands to tell the system that no, really they are actually there:
  • ’vgscan --mknodes’. This scans the LVM volume and makes device nodes for partitions. You need device nodes to do anything with the partitions (like, say, fsck them).
  • ’vgchange -ay’. This tells the system to make all the volumes/partitions “active” (I don’t totally understand it, but I infer that volumes can be active and inactive)
  • Now you should have partition devices under ’/dev/<volgroup>/<partition names>’. They should be recognizable, because you set them up and presumably named them. My volume group was named after the machine (msicr620) and my partitions were named conventionally based on what they did (lv_root, lv_home; I think the install did that).
Now you can ’fsck’ your partitions to figure out what is wrong:

  • fsck /dev/vg_msicr620/lv_root
  • fsck /dev/vg_msicr620/lv_home
  • fsck /dev/non_volgroup_partition
  • etc.
(if you know the file system type, you may want to try using the ’-t <fstype>’ flag. I fscked several times, and I think fsck was able to figure out the type when I omitted it.).
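
The steps above can be strung together roughly like this. It is only a sketch under assumptions: recover_lvm and run are hypothetical helper names, and /dev/sda3 and cryptlvm in the usage line are made-up examples (your device and mapping name will differ). It prints the commands by default and only executes them, as root, when DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of the recovery sequence, meant to be run as root from a live CD.
# DRY_RUN defaults to 1, so commands are only printed; set DRY_RUN=0 to execute.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# recover_lvm LUKS_DEVICE MAPPING_NAME VOLGROUP LV...   (hypothetical helper)
recover_lvm() {
    luks_dev=$1
    map_name=$2
    vg=$3
    shift 3
    run cryptsetup luksOpen "$luks_dev" "$map_name" || return 1
    run vgscan --mknodes || return 1
    run vgchange -ay || return 1
    # fsck exits non-zero when it repairs something, so keep going regardless.
    for lv in "$@"; do
        run fsck "/dev/$vg/$lv"
    done
}

# Example (device and mapping name are assumptions):
#   recover_lvm /dev/sda3 cryptlvm vg_msicr620 lv_root lv_home
```

Keeping the dry run as the default means you can read the exact commands before letting anything touch the disk.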

In my case, apparently the sole problem with the partitions, and the cause of this entire ordeal, was that the superblock had a future date as its last mounted date. I assume this was because I had enabled power management tracing (with ’echo 1 > /sys/power/pm_trace’) as described in the Fedora common problems guide. This is noted to change the clock, and some sequence of events must have left the last mounted date with the wrong time.
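
If you want to see those superblock dates for yourself on an ext2/3/4 volume, ’dumpe2fs -h’ prints them. A small sketch, assuming the e2fsprogs tools are on the live CD; superblock_times is a hypothetical helper name that just filters the timestamp lines:

```shell
#!/bin/sh
# Keep only the timestamp fields from a superblock dump read on stdin.
superblock_times() {
    grep -E '^(Last mount time|Last write time|Last checked):'
}

# Example (run as root; the volume path is the one from this post):
#   dumpe2fs -h /dev/vg_msicr620/lv_root 2>/dev/null | superblock_times
#   date    # compare against the current clock
```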

Once you are done fscking around you can reboot and boot from the HD. If you did not have serious problems with the partitions, everything will be fine and you will boot into your system. If, like me, you either do not have your root password, or have forgotten it, you can reset it by mounting your root volume when booted into the LiveCD, and editing your ‘/etc/shadow’ file. Obviously do this with care. There are some different ways to reset your root password here. Booting into single user mode did not work for me (probably related to corruption on the root volume).
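
As an alternative to hand-editing ‘/etc/shadow’, you can chroot into the mounted root volume and let passwd do the edit. This is a sketch of that variation, not what I actually did: reset_root_password and run are hypothetical helper names and /mnt/sysroot is an arbitrary mount point. It prints the commands unless DRY_RUN=0.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 (as root, from the live CD) to execute.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# reset_root_password ROOT_VOLUME [MOUNTPOINT]   (hypothetical helper)
reset_root_password() {
    root_lv=$1
    mnt=${2:-/mnt/sysroot}
    run mkdir -p "$mnt" || return 1
    run mount "$root_lv" "$mnt" || return 1
    # passwd runs inside the installed system and rewrites its /etc/shadow.
    run chroot "$mnt" passwd root
    status=$?
    run umount "$mnt"
    return $status
}

# Example: reset_root_password /dev/vg_msicr620/lv_root
```

On a system with SELinux enforcing, files changed from outside the running system may need relabeling before logins work again, so treat that as something to check.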

To take forward:

  • recovering from filesystem corruption of encrypted partitions is really no different than any other type of partition, there are just a few more steps to “open” the encrypted volume. after that it’s pretty much the same process. although it’s no less stressful.
  • if you set a synthetic root password, record it somewhere (like KeePass) because you may in fact need it
  • I’m glad I had established a frequent and comprehensive backup schedule (via Deja Dup). It would not have been fun to lose this disk, but at least I had reliable backups.
  • investigate options that can make ext4 more reliable. there has been some controversy over ext4, and after this I am totally not in a mood to sacrifice reliability for…anything, really
  • #fedora is good people. without them I probably would have rocked myself to sleep in a corner crying.
  • For now I have just disabled sleep and hibernate while running F14 kernels. This is not a great situation but it’s better than constantly locking up the machine. Since this worked basically flawlessly in the past, I assume it’s just a matter of time before the regression gets fixed.

Duplicity on Centos 5.5

There are a few good articles out there describing how to set up duplicity to back up to Amazon S3. If you are on CentOS, however, chances are you have an older version of duplicity (0.6.09 on CentOS 5.5), and unfortunately this version of duplicity, and a dependency library called python-boto, have problems using S3 as a backend. (There is a patch for one half of the problem, but there is also a problem with the underlying boto library.) Newer versions that fix this (duplicity 0.6.11 and python-boto 1.9b-6) are in the epel-testing repo.

If you are using Puppet, an easy way to grab only these packages from epel-testing, without enabling all packages, is to specify a ‘yumrepo’ resource with an ‘includepkgs’ parameter, like:

class yum-repo {
  yumrepo { 'epel-testing':
    enabled     => '1',
    # only pull duplicity and python-boto from this repo
    includepkgs => 'duplicity python-boto',
  }
}


This will ensure you only get the duplicity and python-boto updates, and don’t slam your entire server with test packages.

It would be more convenient to be able to enable a repository for a specific Puppet package definition (just like you can use --enablerepo on the yum command line), however it’s not possible at the moment. There is a feature request for this.
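
For comparison, the one-off yum equivalent looks something like this. It is a sketch that assumes the epel-testing repo is already defined on the machine but left disabled; install_from_testing is a hypothetical wrapper that prints the command unless DRY_RUN=0.

```shell
#!/bin/sh
# install_from_testing PACKAGE...   (hypothetical helper)
# Enables epel-testing for this single transaction only.
install_from_testing() {
    set -- yum --enablerepo=epel-testing install "$@"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        sudo "$@"
    fi
}

# Example: install_from_testing duplicity python-boto
```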

Fedora 13, PolicyKit and Sudo

If you are like me and randomize your root password and rely on sudo to gain administrative privileges, then you might be annoyed at Fedora’s switch to PolicyKit. While on the whole I think it’s a good framework and the right step forward (this opinion was earned after hours of time debugging this problem unfortunately), by default Fedora’s PolicyKit is configured to prompt for the root password. If you have added your own account to the wheel group and granted sudo privileges to that group, this can be quite an annoyance.

You can configure PolicyKit to treat the wheel group as administrator by:


[you@localhost /]$ sudo tee /etc/polkit-1/localauthority.conf.d/99-wheel-policy.conf > /dev/null <<EOF
[Configuration]
AdminIdentities=unix-user:0;unix-group:wheel
EOF


This policy will override other policies (well, provided you have no policies numbered greater than ‘99’!).

Unfortunately PolicyKit support requires explicit cooperation from applications, and many applications have not yet been updated to integrate with PolicyKit. Notably, the gnome system control applets (system-config-*) all still go through the old ‘consolehelper’ utility. This utility appears to effectively always prompt for the root password. This distinction is not clear at all to the casual user, and it took me hours to realize that no amount of PolicyKit reconfiguration was going to make these apps work.

ActiveRecord Types

If you run into an error like this from Rails:

undefined method `amount’ for #


…check that you have your ActiveRecord column types correct in your migration. For example, :integer, not :int. Apparently db:migrate does not warn about using invalid column types.

More OS Xisms

In the Democratic Republic of Apple windows are only resized at the bottom right corner, and shortcuts definitely do not accept command line parameters. This is only to safeguard the moral purity of the citizenry of course.

If however you want to fight the system, and, say, use an Eclipse installation with multiple workspaces, as you can in other states, you can try this workaround.

Of course, commands are second class citizens and cannot live in the application section of the dock, and must be exiled to the documents ghetto (and marked with an ugly command shell icon).

Configuring Bash on OS X

I’m moving to OS X for my work machine and one of the first things I like to set up is my terminal. OS X has thankfully moved to bash. I initially tried adding aliases to my ~/.bashrc but they were not getting picked up. This is because the Terminal application spawns shells as login shells.

For login shells Bash reads /etc/profile and then the first of the following files that exists and is readable:

1. ~/.bash_profile
2. ~/.bash_login
3. ~/.profile

For non-login interactive shells Bash will read the following file:

1. ~/.bashrc

See http://en.wikipedia.org/wiki/Bash#Startup_scripts

I have always used ~/.bashrc and just assumed this file was sourced under all circumstances. However I now suspect that this perception was due to the fact that typical Linux setups will provide a login script (e.g. ~/.bash_profile, or the system default) which itself sources ~/.bashrc. So I suppose this behavior is more of a convention. Which OS X does not follow.

The solution is simple enough (philosophical issues on how to properly structure these files aside): simply link the login script to the ~/.bashrc script.

ln -s ~/.bashrc ~/.bash_profile
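
An alternative to the symlink, and roughly what those typical Linux setups do, is to keep a separate ~/.bash_profile that sources ~/.bashrc. A minimal sketch of such a file:

```shell
# ~/.bash_profile: read by login shells (which is what Terminal starts).
# Pull in ~/.bashrc so aliases and settings live in one place.
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi
```

This keeps room for login-only settings in ~/.bash_profile later, which the symlink does not.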


The next step is to get rid of the annoying alert “bell” that is enabled by default in bash (and by the terminal application? the *nix terminal/character-device layer is complicated enough that I can’t pretend to understand how it actually works under the hood). Bash uses a library called readline to read line input (surprise!). Readline uses ~/.inputrc for various settings. You can add “set bell-style visible” (or “set bell-style none” if you don’t want it at all) to quell the annoying beep emitted whenever you hit tab for a completion (which for me is every half second). This did not quite do the job for me: the terminal was still beeping. So I had to go into the Terminal application settings and change the bell style to visible under the shell tab.

Hope that helps.

WEP Sucks

WEP sucks. I lost probably around four hours last weekend trying to configure WEP on a router with a repeater. Typically setting up just WEP is not that difficult, but introduce a repeater, from a different vendor, and now the complexity multiplies. It did not help that the machine that needed access via the repeater was physically located on a different floor, making debugging very time consuming.

Every time I need to configure WEP, which happens to be so infrequently that I forget the particular rites necessary, I’m mystified by the inept and complicated protocol and user interfaces.

The most important thing to realize is that if your router vendor supplies a “passphrase”-based key generation option, this has nothing, nothing to do with ASCII keys. Hex and ASCII of course are just different encodings of data. The “passphrase” is converted via a “de facto” (although undocumented as far as I can tell) algorithm into the WEP key. I suppose the rationale behind this is that users cannot fathom providing the key directly, which may be fair. However this distinction between “passphrase” and generated key is generally completely obscured. To further complicate things, the key generation algorithm apparently generates not one key, but FOUR keys. How helpful!
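
To make the hex versus ASCII point concrete: a 40-bit WEP key is 5 bytes, so it can be typed either as 5 ASCII characters or as the 10 hex digits that encode the same bytes (13 characters or 26 hex digits for a 104-bit key). A passphrase is neither of these; it has to go through the generator first. A minimal sketch, where ascii_to_hex is a hypothetical helper name and 'abcde' is a made-up key:

```shell
#!/bin/sh
# Print the bytes of the argument as lowercase hex digits.
ascii_to_hex() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \t\n'
    echo
}

ascii_to_hex 'abcde'    # prints 6162636465
```

So the ASCII key abcde and the hex key 6162636465 are the same key, and a client that accepts one form should accept the other if you tell it which you are typing.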

Here are a couple of independent WEP key generators:

* http://www.powerdog.com/wepkey.cgi
* http://www.csgnetwork.com/wepgeneratorcalc.html

(note, with a sample pass phrase I tested, these disagreed on the generated 128 bit key!)

Now, let’s say you’ve generated or selected such a key. Now comes the fun of entering it into the wireless client. What do you think you are supposed to enter? Of course it depends! You can enter your passphrase, or you can enter the hex key. Depending on client one or the other will magically work. You may have a client which obscures this distinction from you so again, it’s not really clear what you are supposed to enter.

I discovered that on Windows XP, if you are entering the WEP key directly as hex, you need to prefix the key with $. How intuitive! It is also rumored that if you are entering the passphrase, it must be quoted, although I don’t remember ever having to quote a passphrase before.

http://www.justanswer.com/questions/1yczg-trying-to-set-up-a-wireless-net-work-at-home-desktop-pc-has

Oh, and that is not to mention that (surprise!) there are two styles of WEP (both suck), “open” and “shared”. If you are lucky this will be irrelevant. If you are not, you will get to find out.

Linux and FLOSS

I came across xwinman.org recently, which dredged up memories of the amateurish, but in their way charming (a time when somebody would actually attempt to copy the look of Windows ‘95) X window managers of yore. I had just started getting into Linux in the late 90s, and my enthusiasm and hope led me to spend many tedious nights installing and configuring Linux (anybody remember TurboLinux?) on the underpowered machine I had at the time. Although it was a great learning experience, using Linux as a desktop was underwhelming.

While I have used Linux on a daily basis since then, both on the job and for personal use, it has been as a server OS. In the last year or so I have been using Linux more heavily on the desktop via virtualization (VMWare, and then VirtualBox), and I am really impressed. While the mantra has always been that Linux is “not yet ready” for the desktop, I think those days are solidly over. Hardware support is excellent, and UIs are stable, robust and featureful. Linux is becoming the desktop of choice on netbooks. The amount of software that is available is amazing, it all integrates with the web applications that people care about these days, and updates, bug fixes and new features are frequent.

Simultaneously I have been listening to various Linux (e.g. Linux Outlaws) and open source podcasts, and have gotten a sense of the vast self-supporting open source community that exists these days. Not the lone renegades of yesteryear, but a varied community that includes graphic and music artists, enthusiasts, and everyday users in addition to developers. This free culture movement has grown to include its own social networking, music and graphic art outlets in addition to operating systems.

From a developer’s point of view, Linux is an eden. If there is a technology I want to learn or experiment with, it’s just there. There is nothing to pay for and no permission to request. And if I need help, I can typically talk to somebody who has a similar personal interest, and often the developers themselves. There are no silos. It’s just open. In retrospect I’m sorry I did not get aboard sooner, although I am glad that I didn’t spend all that much time investing in the gated community that is proprietary software in the first place.

At this point I’m considering switching (once again) to a Linux desktop. Really the only things in the way to using Linux as my main desktop environment are:

1) Windows games
2) data backup/migration solution
3) one or two windows apps I may still want to run

I am currently running Linux in a VM only because (I assume) doing the opposite - running Linux as the main OS and Windows in a VM, which I would prefer - imposes performance issues that undercut gaming. This could be a non-issue given the amount of gaming I actually do (or should do), and a solution would be to just permanently give up Windows-based gaming. It may also be the case that some time in the future (or today for all I know) Windows games may perform adequately either in virtualization, or via WINE. The last solution is to just purchase this convenience with faster hardware; it may be worth the money.

As far as a backup solution, or data durability in general, I know this is actually much better under Linux (nothing can be worse than the situation on Windows)…it’s just something I would need to learn a little bit about (preserving home dirs and application settings across machine upgrades and OS distributions). The one Windows program I do use extensively is KeePass, and while this nominally runs under Linux (mono), I have not had much success with it. That said, I’m sure there is a Linux alternative (perhaps the built-in desktop keychains/stores?).

At this point, I’m eager to start doing some open source development, and actively participate instead of simply freeloading. On several occasions in the past I’ve been interested in working, for example, on projects like Firefox or OpenOffice, but it is effectively impossible (or just excruciatingly painful) to attempt this on Windows, and I have turned away. Now there is no excuse: sure, the build systems aren’t that great, but if you follow the instructions, the tools are all there and it will work. No more chasing down hacky Windows ports, dealing with “batch” files, etc. There just is no excuse not to participate.

Ariba Web Framework

Oh man.

Despite productivity benefits in the back end of any framework, there is still always tedious and mostly formulaic work to do generating all the interfaces for CRUD operations (of course, depending on your application, those interfaces may be where you are adding value; for me they are not). So having automatically generated interfaces is really useful (e.g. Django’s admin UI). In Java, JPA is becoming the ORM leader, and is used to explicitly mark up the classes that compose the model (as opposed to implicit inference, e.g. db4o). Well, that type information can be used to generate user interfaces, and it’s rather pointless to have to specify it multiple times. So I decided to just check what is out there in Java land regarding driving UI from JPA and stumbled into the Ariba Web framework. While their standard template mechanism looks like the typical component XML vomit, they also provide:

MetaUI-JPA: Binds MetaUI to a Java JPA-provided persistence engine (powered, by default, by Hibernate). Business objects can be annotated with JPA annotations (e.g. @Entity, @OneToMany, etc) and JPA will generate a database schema and support persisting and retrieving object instances. MetaUI-JPA further processes these annotations (and those for the Compass search framework) to create MetaUI rules that influence the generated UI.


There are a bunch of screencasts, which hopefully will clear up whether this really does what I hope it does.