LVM recovery tale.

Over the weekend I had the worrying experience of losing my LVM settings and potentially all my data… a quick search on the web showed a confusing set of information, much of it for older versions of LVM and therefore rather suspect.

Well, I recovered all my data and it was really quite simple, so I’ve written up what I did in the hope that someone else, in a similar situation, will find it useful. It’s a scary thing, losing the whole hard disk and knowing that, in reality, it’s all there.

First the situation

I’ve got a small /boot partition as ext2, and a larger one for the root directory (2Gb, also ext2). The rest of the hard disks (nearly 120Gb) are assigned to a volume group called, descriptively, system… (which is SuSE’s idea of a default name).
More accurately, they were supposed to be assigned to it. At first I had added just 60Gb to the volume group; it was my first use of LVM and I was hedging my bets. After 6 months of trouble-free operation I decided to add another 60Gb of disk, which I did 3 months ago. Except that, although the physical volume and volume group managers all agreed that the volume group had 120Gb, the logical volume manager insisted that there was only 60Gb. I’d used Yast2 to create and add the volumes.
I tried every combination of commands I could think of to get the logical volume manager to recognise the additional space, but it wouldn’t.
At the time I was busy, so I forgot about it. Then last week I realised that I wanted to use the space, so I settled down to do something about it.

The problem

So, it seemed the best solution would be to remove the second partition that I had added (/dev/hdd1) from the physical volume manager and then add it back.
It wasn’t recognised so wouldn’t be missed, right?
Wrong!
pvremove /dev/hdd1 removed the label from /dev/hdd1 but also from /dev/hda7 (which was the original partition and full of data).
pvscan and pvs reported no physical volumes on the disk.
vgscan and vgs couldn’t find any volume groups.
lvscan and lvs were non-starters obviously.

The rather surreal thing was, the whole system kept on running quite nicely, X Server and KDE desktop and all, but I knew that as soon as I rebooted the system would be toast.

First I tried adding the partition back to the volume group system, but the system couldn’t find the ‘system’ group. I tried creating the physical volume again (pvcreate) but that told me that the volume already existed. It became clear that I would need to reboot and hope that the system sorted itself out, flushed the disks, resynced, whatever.

The solution

After rebooting, the system wouldn’t come up, which is kind of what I had expected, so I had to reboot from the SuSE Rescue disks. So now I had to think about how to recreate the physical volumes, volume group and logical volume, and do it with the data intact. (I have daily backups, but the thought of restoring the whole system, applications and data, was not too exciting, especially as I knew all the data was there and intact.) With a ‘regular’ hard disk partition that had got lost I could scan the disk for potential disk partitions and restore them. But that wouldn’t work with LVM.

On a search through various sites, I found one that mentioned the importance of saving a copy of the volume group parameters to a file using vgcfgbackup. This file could then be used to restore the parameters later, assuming that the underlying physical structure hadn’t changed. Well, the physical layout hadn’t changed, but unfortunately I hadn’t created a backup of the volume group parameters (the ‘descriptor area’, to use the technical term), so that didn’t seem too hopeful. I poked around in the /etc directory (I still had the ‘/’ partition, remember, as that was on its own ext2 partition) and noticed that there was a /etc/lvm/backup/ directory and a /etc/lvm/archive/ directory. After further investigation I found that these are automatically created by LVM whenever changes are made to the system.
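
To choose between those files, it helps to see when and why each one was written. The following is a minimal sketch, not something I ran at the time: it assumes the files carry the usual `description = ...` and `creation_time = ...` header lines, and the function name `describe_lvm_backups` is made up for illustration.

```shell
#!/bin/sh
# Summarise LVM metadata files so you can pick one written before the damage.
# Usage: describe_lvm_backups DIR...
describe_lvm_backups() {
    for dir in "$@"; do
        for file in "$dir"/*; do
            [ -f "$file" ] || continue
            printf '%s\n' "$file"
            # Assumed header fields: description = "..." and
            # creation_time = <seconds since epoch>
            grep -E '^(description|creation_time)[[:space:]]*=' "$file" |
                sed 's/^/    /'
        done
    done
}
```

Calling `describe_lvm_backups /etc/lvm/backup /etc/lvm/archive` from the rescue system (with the root partition mounted) prints each file name followed by its two header lines. `vgcfgrestore --list system` should give a similar overview where the LVM tools are available.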

Unfortunately, all the messing around I had done had created a non-working version of the system file, and the archive files didn’t seem to be recent enough. But I remembered that I had a backup of the system files (going back 6 months, in fact), so I dug out a copy of the /etc/lvm/backup/system file and used that.

Here is what I did: First find out the old UUIDs of the partitions; these are in the /etc/lvm/backup/system file. They are quite long… make sure you get the UUIDs for the physical volumes.
$ pvcreate -u sdSD-2343-SD939-adIda2 /dev/hda7
$ pvcreate -u dk33kd-929293nd-adfja298a /dev/hdd1
$ vgcreate -v system /dev/hda7 /dev/hdd1
$ vgcfgrestore -f /etc/lvm/backup/system system

and lo!, all data present and correct!

In fact, I just rebooted the system and was back where I had started with the additional benefit of an extra 60Gb of disk space, because now I had the extra partition properly included.

[Note: in the lines using pvcreate above I could also have added --restorefile, for example:
$ pvcreate -u sdSD-2343-SD939-adIda2 --restorefile /etc/lvm/backup/system /dev/hda7
which, together with the UUID, reads the location and size of the data on the physical volume from the backup file, but I hadn’t realized that at the time. Without the UUIDs, vgcfgrestore will not find the physical volumes that it needs to recreate the volume group.]
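
Picking the UUIDs out of the backup file by eye is error-prone, so here is a hedged sketch of how it could be scripted. It assumes the usual layout of the file, with a `physical_volumes { ... }` section containing one `id = "..."` and one `device = "..."` line per PV. The function names are made up for illustration. It only prints the commands, so nothing is written to disk. It also leaves out the vgcreate step, on the assumption that vgcfgrestore recreates the volume group from the file; I ran vgcreate first.

```shell
#!/bin/sh
# Print "UUID DEVICE" for every physical volume recorded in an LVM
# metadata backup file (e.g. /etc/lvm/backup/system).
list_pv_uuids() {
    awk '
        /physical_volumes[ \t]*[{]/ { in_pvs = 1; depth = 0 }
        in_pvs {
            if ($1 == "id")     { gsub(/"/, "", $3); id = $3 }
            if ($1 == "device") { gsub(/"/, "", $3); print id, $3 }
            # Track brace depth so we stop at the end of the section.
            depth += gsub(/[{]/, "{") - gsub(/[}]/, "}")
            if (depth <= 0) in_pvs = 0
        }
    ' "$1"
}

# Dry run: print the restore commands instead of executing them.
# The device names are only the hints stored in the file; check them
# against the current system before running anything.
print_restore_commands() {
    backup=$1
    vg=$2
    list_pv_uuids "$backup" | while read -r uuid device; do
        echo "pvcreate --uuid $uuid --restorefile $backup $device"
    done
    echo "vgcfgrestore -f $backup $vg"
    echo "vgchange -ay $vg"
}
```

Running `print_restore_commands /etc/lvm/backup/system system` gives a list of commands to review and then paste one at a time.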

The lesson

Don’t panic!
Keep a safe copy of your /etc/lvm/ files!
Make sure that you have a Rescue disk that understands the LVM system!
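
For the second lesson, a small script run from cron would do. This is only a sketch under assumptions: the function name `snapshot_lvm_config` and the default destination `/boot/lvm-config` are invented for illustration. The point is simply that the copy should live somewhere outside the volume group.

```shell
#!/bin/sh
# Archive an LVM config directory (normally /etc/lvm) into DEST and print
# the path of the tarball that was written.
snapshot_lvm_config() {
    src=${1:-/etc/lvm}
    dest=${2:-/boot/lvm-config}   # hypothetical location outside the VG
    stamp=$(date +%Y%m%d-%H%M%S)
    out="$dest/lvm-config-$stamp.tar.gz"
    mkdir -p "$dest" || return 1
    # -C keeps the paths in the archive relative to the parent of SRC.
    tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    printf '%s\n' "$out"
}
```

Running `vgcfgbackup` just before `snapshot_lvm_config` should make sure the files under /etc/lvm/backup/ are current. Copying the tarball to another machine as well is safer still.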

Apart from the above disaster, which seems to have sorted itself out very easily, I have had no trouble with the LVM system. At first I was worried that if there was a failure it would lose everything. There is something very comforting about a simple ext2 (or FAT) partition in that I know it can just be hacked at the bit level and rebuilt. Something like LVM, which is logical volumes on top of volume groups on top of physical volumes, is impossible to rebuild ‘by hand’, so I’m learning to trust technology a bit.

70 thoughts on “LVM recovery tale.”

  1. Well, finally I came to this page. It was a disaster for me today: I accidentally issued a GRUB command, setup (hd0), onto my (whole 5.2TB RAID6) VG. I cannot access any of the data on it any more; Redhat ES4 LVM2 lists it as a “raw” disk, no VG/file system. I believe this overwrote the beginning of the disk, since less -f /dev/sda shows the GRUB text in the first few pages. I have the VG backup file and will try to recover the data tomorrow. God bless my data!

  2. I cannot express my thankfulness to you, man! You are a life saver. I recovered everything on this 5.2TB. Here is what I did. The PV is located on /dev/sdb (the whole RAID was used):

    pvcreate -f --uuid "Z4lh8H-G0e8-K8q1-4WB6-faac-39hk-b82MWU" --restorefile /etc/lvm/backup/recover /dev/sdb
    Physical volume "/dev/sdb" successfully created

    pvdisplay
    --- Physical volume ---
    PV Name               /dev/sda2
    VG Name               VolGroup00
    PV Size               232.72 GB / not usable 0
    Allocatable           yes
    PE Size (KByte)       32768
    Total PE              7447
    Free PE               1
    Allocated PE          7446
    PV UUID               RF27gp-ESK3-LhLv-Y91B-8NOl-6tmG-42Op0X

    --- NEW Physical volume ---
    PV Name               /dev/sdb
    VG Name
    PV Size               4.73 TB
    Allocatable           NO
    PE Size (KByte)       0
    Total PE              0
    Free PE               0
    Allocated PE          0
    PV UUID               Z4lh8H-G0e8-K8q1-4WB6-faac-39hk-b82MWU

    vgcreate sbsraid1 /dev/sdb
    Volume group "sbsraid1" successfully created
    vgcfgrestore -f /etc/lvm/backup/recover sbsraid1
    Restored volume group sbsraid1

    vgdisplay
    --- Volume group ---
    VG Name               sbsraid1
    System ID
    Format                lvm2
    Metadata Areas        1
    Metadata Sequence No  3
    VG Access             read/write
    VG Status             resizable
    MAX LV                0
    Cur LV                1
    Open LV               0
    Max PV                0
    Cur PV                1
    Act PV                1
    VG Size               4.73 TB
    PE Size               4.00 MB
    Total PE              1240049
    Alloc PE / Size       1240049 / 4.73 TB
    Free PE / Size        0 / 0
    VG UUID               Cl4u2r-MqQe-ws9w-uOT1-K8GO-RO5h-4iimfj

    --- Volume group ---
    VG Name               VolGroup00
    System ID
    Format                lvm2
    Metadata Areas        1
    Metadata Sequence No  4
    VG Access             read/write
    VG Status             resizable
    MAX LV                0
    Cur LV                3
    Open LV               3
    Max PV                0
    Cur PV                1
    Act PV                1
    VG Size               232.72 GB
    PE Size               32.00 MB
    Total PE              7447
    Alloc PE / Size       7446 / 232.69 GB
    Free PE / Size        1 / 32.00 MB
    VG UUID               LtkJbi-2Lxr-QIBz-ltd5-FH3D-WEuc-FhlymW

    vgchange -ay
    1 logical volume(s) in volume group "sbsraid1" now active
    3 logical volume(s) in volume group "VolGroup00" now active

    Then change /etc/fstab, run mount -a, and everything is recovered.

  3. Joe, congratulations on saving your data.

    And many thanks from me and those who follow on, for sharing the details of what worked for you.

    cheers

  4. I don’t get it.

    sudo pvcreate -u /dev/hdd2
    Software RAID md superblock detected on /dev/hdd2. Wipe it? [y/n] n
    Physical volume "/dev/hdd2" successfully created

    scary enough, then:

    sudo vgcreate -v test /dev/hdd2
    Wiping cache of LVM-capable devices
    Adding physical volume '/dev/hdd2' to volume group 'test'
    /dev/hdd2 not identified as an existing physical volume
    Unable to add physical volume '/dev/hdd2' to volume group 'test'.

    Maybe I don’t pass the interpretation test from your site to my shell prompt.

    It’s a foreign drive to this system.

  5. Further investigation:

    sudo pvcreate -u 6oVVgd-WIVK-qB0y-PXk2-Acu5-ZxcI-gMj5JD /dev/hdd2
    Software RAID md superblock detected on /dev/hdd2. Wipe it? [y/n] n
    Physical volume "/dev/hdd2" successfully created

    pvdisplay
    --- NEW Physical volume ---
    PV Name               /dev/hdd2
    VG Name
    PV Size               74.43 GB
    Allocatable           NO
    PE Size (KByte)       0
    Total PE              0
    Free PE               0
    Allocated PE          0
    PV UUID               6oVVgd-WIVK-qB0y-PXk2-Acu5-ZxcI-gMj5JD

    Then pvdisplay again; no output, no pv found. Weird eh? You can pvcreate then pvdisplay once, then it’s gone!

  6. Hi, all

    I have some questions to ask you, please help me, please send the mail to daobangw@promisechina.com.

    1. How to resume the data from snapshot?

    2. Does LVM support rollback?

    3. If I create a snapshot for LV on PC1, and I copy the snapshot volume data to PC2, how to resume the LV data on PC1 from the snapshot volume data on PC2?

    Thank you very much.

  7. Hi q

    Maybe it’s because it’s a RAID drive? Sounds like it’s writing to one drive and trying to read another, or syncing to the ‘un-lvm’ed’ drive.

    Not something I’ve come across though.

  8. Hello,

    This procedure really saved me. I was working on an IBM z990-2084 under zVM 5.1 when I needed to add more disk space. You should have seen my face when I rebooted. Linux asked me to enter the root password. I knew then I was in deep trouble. I had spent a week trying to recover my data when I found your procedure. It’s a lifesaver. Thanks a lot.

    Miguel

  9. Thanks, thanks and thanks and thanks, thanks and thanks and ….. I needed to change my hard disk, and something went wrong. No, no, never my fault ;-)))) I just read and did what you said; more than a year later, it’s a very good thing you exist. For other people: keep in mind never to write to your partition if you want to recover it properly. All my thanks to your mother for inventing you!

  10. Jalal, dude you rock! Your instructions saved my team from certain destruction! We were migrating data from a Xiotech SAN to a NetApp filer and something got screwed up along the way. Basically three physical volumes on the Xiotech as a single logical volume migrated to one physical volume on the NetApp – but somehow it kept the original partition information on the new logical volume (this was on a new iSCSi LUN). Anyway, our admin had decided to remove this volume from a volume group as there were some issues booting and it lost the volume information for the new logical volume.

    Well, I read up on LVM for about an hour and a half and it dawned on me that the data should still be on the original physical volumes on the Xiotech SAN – and all I would need to do is roll back to the config before the migration. Thanks to your instructions this process became a lot easier and I had it back and running within an hour.

    We have a research operation (drug discovery) which depends on this server too, so you have my thanks for making the process of finding the correct sequence of commands that much easier.

    Muchos Gracias Amigo!

  11. Many thx for this guide, and thx to Mr. John (http://codeworks.gnomedia.com/archives/2005/general/lvm_recovery/#comment-2803). I was able to restore my data on 2 HDDs, which are connected to a HW RAID 0 on an FC6 server. I found that both HDDs have the same info at the beginning of the partition. So after booting into Knoppix I did this (dd if=/dev/"first/second hdd" of=VolGroup00 bs=100000 count=1) and, after some editing in gvim, I got a file for vgcfgrestore. After that I ran fsck.ext3 -y and, as far as I can see, most of my data is OK.

    THX again and sorry for my english.

  12. I ran pvcreate /dev/sda9 /dev/sda5. Before that I had a VG there. Then I created a new VG in the same place. I lost my data. After

    $ pvcreate -u sdSD-2343-SD939-adIda2 /dev/hda6
    $ pvcreate -u dk33kd-929293nd-adfja298a /dev/hdd1
    $ vgcreate -v system /dev/hda7 /dev/hdd1
    $ vgcfgrestore -f /etc/lvm/backup/system system

    I restored my volumes and data. Thank you very much.

Comments are closed.