Canonical Voices

Posts tagged with 'canonical'

Louis

Introduction

Once in a while, I get to tackle issues that have little or no documentation other than the official documentation of the product and the product’s source code.  You may know from experience that product documentation is not always sufficient to get a complete configuration working. This article intend to flesh out a solution to customizing disk configurations using Curtin.

This article take for granted that you are familiar with Maas install mechanisms, that you already know how to customize installations and deploy workloads using Juju.

While my colleagues in the Maas development team have done a tremendous job at keeping the Maas documentation accurate (see Maas documentation), it does only cover the basics when it comes to Maas’s preseed customization, especially when it comes to Curtin’s customization.

Curtin is Maas’s fastpath installer which is meant to replace Debian’s installer (familiarly known as d-i). It does a complete machine installation much faster than with the standard debian method.  But while d-i is well known and it is easy to find example of its use on the web, Curtin does not have the same notoriety and, hence, not as much documentation.

Theory of operation

When the fastpath installer is used to install a maas unit (which is now the default), it will send the content of the files prefixed with curtin_ to the unit being installed.  The curtin_userdata contains cloud-config type commands that will be applied by cloud-init when the unit is installed. If we want to apply a specific partitioning scheme to all of our unit, we can modify this file and every unit will get those commands applied to it when it installs.

But what if we only have one or a few servers that have specific disk layout that require partitioning ?  In the following example, I will suppose that we have one server, named curtintest which has a one terabyte disk (1 TB) and that we want to partition this disk with the following partition table :

  • Partition #1 has the /boot file system and is bootable
  • Partition #2 has the root (/) file system
  • Partition #3 has a 31 Gb file system
  • Partition #4 has 32 Gb of swap space
  • Partition #5 has the remaining disk space

Since only one server has such a disk, the partitioning should be specific to that curtintest server only.

Setting up Curtin development environment

To get to a working Maas partitioning setup, it is preferable to use Curtin’s development environment to test the curtin commands. Using Maas deployment to test each command quickly becomes tedious and time consuming.  There is a description on how to set it up in the README.txt but here are more details here.

Aside from putting all the files under one single directory, the steps described here are the same as the one in the README.txt file :

$ mkdir -p download
$ DLDIR=$(pwd)/download
$ rel="trusty"
$ arch=amd64
$ burl="http://cloud-images.ubuntu.com/$rel/current/"
$ for f in $rel-server-cloudimg-${arch}-root.tar.gz $rel-server-cloudimg-{arch}-disk1.img; do wget "$burl/$f" -O $DLDIR/$f; done
$ ( cd $DLDIR && qemu-img convert -O qcow $rel-server-cloudimg-${arch}-disk1.img $rel-server-cloudimg-${arch}-disk1.qcow2)
$ BOOTIMG="$DLDIR/$rel-server-cloudimg-${arch}-disk1.qcow2"
$ ROOTTGZ="$DLDIR/$rel-server-cloudimg-${arch}-root.tar.gz"
$ mkdir src
$ bzr init-repo src/curtin
$ (cd src/curtin && bzr  branch lp:curtin trunk.dist )
$ (cd src/curtin && bzr  branch trunk.dist trunk)
$ cd src/curtin/trunk

You now have an environment you can use with Curtin to automate installations. You can test it by using the following command which will start a VM and run « curtin install » in it.  Once you get the prompt, login with :

username : ubuntu
password : passw0rd

$ sudo ./tools/launch $BOOTIMG --publish $ROOTTGZ -- curtin install "PUBURL/${ROOTTGZ##*/}"

Using Curtin in the development environment

To test Curtin in its environment, simply remove  — curtin install « PUBURL/${ROOTTGZ##*/} » at the end of the statement. Once logged in, you will find the Curtin executable in /curtin/bin :

ubuntu@ubuntu:~$ sudo -s
root@ubuntu:~# /curtin/bin/curtin --help
usage: main.py [-h] [--showtrace] [--verbose] [--log-file LOG_FILE]
{block-meta,curthooks,extract,hook,in-target,install,net-meta,pac
k,swap}
...

positional arguments:
{block-meta,curthooks,extract,hook,in-target,install,net-meta,pack,swap}

optional arguments:
-h, --help            show this help message and exit
--showtrace
--verbose, -v
--log-file LOG_FILE

Each of Curtin’s commands have their own help :

ubuntu@ubuntu:~$ sudo -s
root@ubuntu:~# /curtin/bin/curtin install --help
usage: main.py install [-h] [-c FILE] [--set key=val] [source [source ...]]

positional arguments:
source what to install

optional arguments:
-h, --help show this help message and exit
-c FILE, --config FILE
read configuration from cfg
--set key=val define a config variable

 

Creating Maas’s Curtin preseed commands

Now that we have our Curtin development environment available, we can use it to come up with a set of commands that will be fed to Curtin by Maas when a unit is created.

Maas uses preseed files located in /etc/maas/preseeds on the Maas server. The curtin_userdata preseed file is the one that we will use as a reference to build our set of partitioning commands.  During the testing phase, we will use the -c option of curtin install along with a configuration file that will mimic the behavior of curtin_userdata.

We will also need to add a fake 1TB disk to Curtin’s development environment so we can use it as a partitioning target. So in the development environment, issue the following command :

$ qemu-img create -f qcow2 boot.disk 1000G Formatting ‘boot.disk’, fmt=qcow2 size=1073741824000 encryption=off cluster_size=65536 lazy_refcounts=off

sudo ./tools/launch $BOOTIMG –publish $ROOTTGZ

ubuntu: ubuntu password: passw0rd

ubuntu@ubuntu:~$ sudo -s root@ubuntu:~# cat /proc/partitions

major minor  #blocks  name

253        0    2306048 vda 253        1    2305024 vda1 253       16        426 vdb 253       32 1048576000 vdc 11        0    1048575 sr0

We can see that the 1000G /dev/vdc is indeed present.  Let’s now start to craft the conffile that will receive our partitioning commands. To test the syntax, we will use two simple commands :

root@ubuntu:~# cat << EOF > conffile 
partitioning_commands:
  builtin: []
  01_partition_make_label: ["/sbin/parted", "/dev/vdc", "-s", "'","mklabel","msdos","'"]
  02_partition_make_part: ["/sbin/parted", "/dev/vdc", "-s", "'","mkpart","primary","1049K","538M","'"] 
sources:
  01_primary: http://192.168.0.13:9923//trusty-server-cloudimg-amd64-root.tar.gz
  EOF

The sources: statement is only there to avoid having to repeat the SOURCE portion of the curtin command and is not to be used in the final Maas configuration. The URL is the address of the server from which you are running the Curtin development environment.

WARNING

The builtin [] statement is VERY important. It is there to override Curtin’s native builtin statement which is to partition the disk using « block-meta simple ».  If it is removed, Curtin will overwrite he partitioning with its default configuration. This comes straight from Scott Moser, the main developer behind Curtin.

Now let’s run the Curtin command :

root@ubuntu:~# /curtin/bin/curtin install -c conffile

Curtin will run its installation sequence and you will see a display which you should be familiar with if you installed units with Maas previously.  The command will most probably exit on error, comlaining about the fact that install-grub received an argument that was not a block device. We do not need to worry about that at the motent.

Once completed, have a look at the partitioning of the /dev/vdc device :

root@ubuntu:~# parted /dev/vdc print
Model: Virtio Block Device (virtblk)
Disk /dev/vdc: 1074GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type      File system  Flags
1      1049kB  538MB   537MB   primary   ext4

The partitioning commands were successful and we have the /dev/vdc disk properly configured.  Now that we know that the mechanism works, let try with a complete configuration file. I have found that it was preferable to start with a fresh 1TB disk :

root@ubuntu:~# poweroff

$ rm -f boot.img

$ qemu-img create -f qcow2 boot.disk 1000G
Formatting ‘boot.disk’, fmt=qcow2 size=1073741824000 encryption=off cluster_size=65536 lazy_refcounts=off

sudo ./tools/launch $BOOTIMG –publish $ROOTTGZ

ubuntu@ubuntu:~$ sudo -s

root@ubuntu:~# cat << EOF > conffile 
partitioning_commands:
  builtin: [] 
  01_partition_announce: ["echo", "'### Partitioning disk ###'"]
  01_partition_make_label: ["/sbin/parted", "/dev/vda", "-s", "'","mklabel","msdos","'"]
  02_partition_make_part: ["/sbin/parted", "/dev/vda", "-s", "'","mkpart","primary","1049k","538M","'"]
  02_partition_set_flag: ["/sbin/parted", "/dev/vda", "-s", "'","set","1","boot","on","'"]
  04_partition_make_part: ["/sbin/parted", "/dev/vda", "-s", "'","mkpart","primary","538M","4538M","'"]
  05_partition_make_part: ["/sbin/parted", "/dev/vda", "-s", "'","mkpart","extended","4538M","1000G","'"]
  06_partition_make_part: ["/sbin/parted", "/dev/vda", "-s", "'","mkpart","logical","25.5G","57G","'"]
  07_partition_make_part: ["/sbin/parted", "/dev/vda", "-s", "'","mkpart","logical","57G","89G","'"]
  08_partition_make_part: ["/sbin/parted", "/dev/vda", "-s", "'","mkpart","logical","89G","1000G","'"]
  09_partition_announce: ["echo", "'### Creating filesystems ###'"]
  10_partition_make_fs: ["/sbin/mkfs", "-t", "ext4", "/dev/vda1"]
  11_partition_label_fs: ["/sbin/e2label", "/dev/vda1", "cloudimg-boot"]
  12_partition_make_fs: ["/sbin/mkfs", "-t", "ext4", "/dev/vda2"]
  13_partition_label_fs: ["/sbin/e2label", "/dev/vda2", "cloudimg-rootfs"]
  14_partition_mount_fs: ["sh", "-c", "mount /dev/vda2 $TARGET_MOUNT_POINT"]
  15_partition_mkdir: ["sh", "-c", "mkdir $TARGET_MOUNT_POINT/boot"]
  16_partition_mount_fs: ["sh", "-c", "mount /dev/vda1 $TARGET_MOUNT_POINT/boot"]
  17_partition_announce: ["echo", "'### Filling /etc/fstab ###'"]
  18_partition_make_fstab: ["sh", "-c", "echo 'LABEL=cloudimg-rootfs / ext4 defaults 0 0' >> $OUTPUT_FSTAB"]
  19_partition_make_fstab: ["sh", "-c", "echo 'LABEL=cloudimg-boot /boot ext4 defaults 0 0' >> $OUTPUT_FSTAB"]
  20_partition_make_swap: ["sh", "-c", "mkswap /dev/vda6"]
  21_partition_make_fstab: ["sh", "-c", "echo '/dev/vda6 none swap sw 0 0' >> $OUTPUT_FSTAB"]
sources: 01_primary: http://192.168.0.13:9923//trusty-server-cloudimg-amd64-root.tar.gz EOF

You will note that I have added a few statement like [« echo », « ‘### Partitioning disk ###' »] that will display some logs during the execution. Those are not necessary.
Now let’s try a second test with the complete configuration file :

root@ubuntu:~# /curtin/bin/curtin install -c conffile

root@ubuntu:~# parted /dev/vdc print
Model: Virtio Block Device (virtblk)
Disk /dev/vdc: 1074GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type      File system  Flags
1      1049kB  538MB   537MB   primary   ext4         boot
2      538MB   4538MB  4000MB  primary   ext4
3      4538MB  1000GB  995GB   extended               lba
5      25.5GB  57.0GB  31.5GB  logical
6      57.0GB  89.0GB  32.0GB  logical
7      89.0GB  1000GB  911GB   logical

We now have a correctly partitioned disk in our development environment. All we need to do now is to carry that over to Maas to see if it works as expected.

Customization of Curtin execution in Maas

The section « How preseeds work in MAAS » give a good outline on how to select the name of the a preseed file to restrict its usage to specific sub-groups of nodes.  In our case, we want our partitioning to apply to only one node : curtintest.  So by following the description in the section « User provided preseeds« , we need to use the following template :

{prefix}_{node_arch}_{node_subarch}_{release}_{node_name}

The fileneme that we need to choose needs to end with our hostname, curtintest. The other elements are :

  • prefix : curtin_userdata
  • osystem : amd64
  • node_subarch : generic
  • release : trusty
  • node_name : curtintest

So according to that, our filename must be curtin_userdata_amd64_generic_trusty_curtintest

On the MAAS server, we do the following :

root@maas17:~# cd /etc/maas/preseeds

root@maas17:~# cp curtin_userdata curtin_userdata_amd64_generic_trusty_curtintest

We now edit this newly created file and add our previously crafted Curtin configuration file just after the following block :

{{if third_party_drivers and driver}}
  early_commands:
  {{py: key_string = ''.join(['\\x%x' % x for x in map(ord, driver['key_binary'])])}}
  driver_00_get_key: /bin/echo -en '{{key_string}}' > /tmp/maas-{{driver['package']}}.gpg
  driver_01_add_key: ["apt-key", "add", "/tmp/maas-{{driver['package']}}.gpg"]
  driver_02_add: ["add-apt-repository", "-y", "deb {{driver['repository']}} {{node.get_distro_series()}} main"]
  driver_03_update_install: ["sh", "-c", "apt-get update --quiet && apt-get --assume-yes install {{driver['package']}}"]
  driver_04_load: ["sh", "-c", "depmod && modprobe {{driver['module']}}"]
  {{endif}}

The complete section should look just like this :

{{if third_party_drivers and driver}}
  early_commands:
  {{py: key_string = ''.join(['\\x%x' % x for x in map(ord, driver['key_binary'])])}}
   driver_00_get_key: /bin/echo -en '{{key_string}}' > /tmp/maas-{{driver['package']}}.gpg
   driver_01_add_key: ["apt-key", "add", "/tmp/maas-{{driver['package']}}.gpg"]
   driver_02_add: ["add-apt-repository", "-y", "deb {{driver['repository']}} {{node.get_distro_series()}} main"]
   driver_03_update_install: ["sh", "-c", "apt-get update --quiet && apt-get --assume-yes install {{driver['package']}}"]
   driver_04_load: ["sh", "-c", "depmod && modprobe {{driver['module']}}"]
  {{endif}}
  partitioning_commands:
   builtin: []
   01_partition_announce: ["echo", "'### Partitioning disk ###'"]
   01_partition_make_label: ["/sbin/parted", "/dev/vda", "-s", "'","mklabel","msdos","'"]
   02_partition_make_part: ["/sbin/parted", "/dev/vda", "-s", "'","mkpart","primary","1049k","538M","'"]
   02_partition_set_flag: ["/sbin/parted", "/dev/vda", "-s", "'","set","1","boot","on","'"]
   04_partition_make_part: ["/sbin/parted", "/dev/vda", "-s", "'","mkpart","primary","538M","4538M","'"]
   05_partition_make_part: ["/sbin/parted", "/dev/vda", "-s", "'","mkpart","extended","4538M","1000G","'"]
   06_partition_make_part: ["/sbin/parted", "/dev/vda", "-s", "'","mkpart","logical","25.5G","57G","'"]
   07_partition_make_part: ["/sbin/parted", "/dev/vda", "-s", "'","mkpart","logical","57G","89G","'"]
   08_partition_make_part: ["/sbin/parted", "/dev/vda", "-s", "'","mkpart","logical","89G","1000G","'"]
   09_partition_announce: ["echo", "'### Creating filesystems ###'"]
   10_partition_make_fs: ["/sbin/mkfs", "-t", "ext4", "/dev/vda1"]
   11_partition_label_fs: ["/sbin/e2label", "/dev/vda1", "cloudimg-boot"]
   12_partition_make_fs: ["/sbin/mkfs", "-t", "ext4", "/dev/vda2"]
   13_partition_label_fs: ["/sbin/e2label", "/dev/vda2", "cloudimg-rootfs"]
   14_partition_mount_fs: ["sh", "-c", "mount /dev/vda2 $TARGET_MOUNT_POINT"]
   15_partition_mkdir: ["sh", "-c", "mkdir $TARGET_MOUNT_POINT/boot"]
   16_partition_mount_fs: ["sh", "-c", "mount /dev/vda1 $TARGET_MOUNT_POINT/boot"]
   17_partition_announce: ["echo", "'### Filling /etc/fstab ###'"]
   18_partition_make_fstab: ["sh", "-c", "echo 'LABEL=cloudimg-rootfs / ext4 defaults 0 0' >> $OUTPUT_FSTAB"]
   19_partition_make_fstab: ["sh", "-c", "echo 'LABEL=cloudimg-boot /boot ext4 defaults 0 0' >> $OUTPUT_FSTAB"]
   20_partition_make_swap: ["sh", "-c", "mkswap /dev/vda6"]
   21_partition_make_fstab: ["sh", "-c", "echo '/dev/vda6 none swap sw 0 0' >> $OUTPUT_FSTAB"]

Now that maas is properly configured for curtintest, complete the test by deploying a charm in a Juju environment where curtintest is properly comissionned.  In that example, curtintest is the only available node so maas will systematically pick it up :

caribou@avogadro:~$ juju status
environment: maas17
machines:
« 0 »:
agent-state: started
agent-version: 1.24.0
dns-name: state-server.maas
instance-id: /MAAS/api/1.0/nodes/node-2555c398-1bf9-11e5-a7c4-525400214658/
series: trusty
hardware: arch=amd64 cpu-cores=1 mem=1024M
state-server-member-status: has-vote
services: {}
networks:
maas-eth0:
provider-id: maas-eth0
cidr: 192.168.100.0/24

caribou@avogadro:~$ juju deploy mysql
Added charm « cs:trusty/mysql-25″ to the environment.

Once the mysql charm has been deployed, connect to the unit to confirm that the partitioning was successful

caribou@avogadro:~$ juju ssh mysql/0
ubuntu@curtintest:~$ sudo -s
root@curtintest:~# parted /dev/vda print
Model: Virtio Block Device (virtblk)
Disk /dev/vda: 1074GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
 
Number  Start   End     Size    Type      File system  Flags
1      1049kB  538MB   537MB   primary   ext4         boot
2      538MB   4538MB  4000MB  primary   ext4
3      4538MB  1000GB  995GB   extended               lba
5      25.5GB  57.0GB  31.5GB  logical
6      57.0GB  89.0GB  32.0GB  logical
7      89.0GB  1000GB  911GB   logical
ubuntu@curtintest:~$ swapon -s
Filename Type Size Used Priority
/dev/vda6 partition 31249404 0 -1

Conclusion

Customizing disks and partition using curtin is possible but currently not sufficiently documented. I hope that this write up will be helpful.  Sustained development on Curtin is currently done to improve these functionalities so things will definitively get better.

Read more
Louis

I have seen this setup documented a few places, but not for Ubuntu so here it goes.

I have used this many time to verify or diagnose Device Mapper Multipath (DM-MPIO) since it is rather easy to fail a path by switching off one of the network interfaces. Nowaday, I use two KVM virtual machines with two NIC each.

Those steps have been tested on Ubuntu 12.04 (Precise) and Ubuntu 14.04 (Trusty). The DM-MPIO section is mostly a cut and paste of the Ubuntu Server Guide

The virtual machine that will act as the iSCSI target provider is called PreciseS-iscsitarget. The VM that will connect to the target is called PreciseS-iscsi. Each one is configured with two network interfaces (NIC) that get their IP addresses from DHCP. Here is an example of the network configuration file :

$ cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet dhcp
#
auto eth1
iface eth1 inet dhcp

The second NIC resolves to the same hostname with a « 2 » appended (i.e. PreciseS-iscsitarget2 and PreciseS-iscsi2)

Setting up the iSCSI Target VM

This is done by installing the following packages :

$ sudo apt-get install iscsitarget iscsitarget-dkms

Edit /etc/default/iscsitarget and change the following line to enable the service :

ISCSITARGET_ENABLE=true

We now proceed to create an iSCSI target (aka disk). This is done by creating a 50 Gb sparse file that will act as our disk :

$ sudo dd if=/dev/zero of=/home/ubuntu/iscsi_disk.img count=0 obs=1 seek=50G

This container is used in the definition of the iSCSI target. Edit the file /etc/iet/ietd.conf. At the bottom, add :

Target iqn.2014-09.PreciseS-iscsitarget:storage.sys0
        Lun 0 Path=/home/ubuntu/iscsi_disk.img,Type=fileio,ScsiId=lun0,ScsiSN=lun0

The iSCSI target service must be restarted for the new target to be accessible

$ sudo service iscsitarget restart


Setting up the iSCSI initiator

To be able to access the iSCSI target, only one package is required :

$ sudo apt-get install open-iscsi

Edit /etc/iscsi/iscsid.conf changing the following:

node.startup = automatic

This will ensure that the iSCSI targets that we discover are enabled automatically upon reboot.

Now we will proceed to discover and connect to the device that we setup in the previous section

$ sudo iscsiadm -m discovery -t st -p PreciseS-iscsitarget
$ sudo iscsiadm -m node --login
$ dmesg | tail
[   68.461405] iscsid (1458): /proc/1458/oom_adj is deprecated, please use /proc/1458/oom_score_adj instead.
[  189.989399] scsi2 : iSCSI Initiator over TCP/IP
[  190.245529] scsi 2:0:0:0: Direct-Access     IET      VIRTUAL-DISK     0    PQ: 0 ANSI: 4
[  190.245785] sd 2:0:0:0: Attached scsi generic sg0 type 0
[  190.249413] sd 2:0:0:0: [sda] 104857600 512-byte logical blocks: (53.6 GB/50.0 GiB)
[  190.250487] sd 2:0:0:0: [sda] Write Protect is off
[  190.250495] sd 2:0:0:0: [sda] Mode Sense: 77 00 00 08
[  190.251998] sd 2:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[  190.257341]  sda: unknown partition table
[  190.258535] sd 2:0:0:0: [sda] Attached SCSI disk

We can see in the dmesg output that the new device /dev/sda has been discovered. Format the new disk & create a file system. Then verify that everything is correct by mounting and unmounting the new file system.

$ fdisk /dev/sda
n
p
1
<ret>
<ret>
w
$  mkfs -t ext4 /dev/sda1
$ mount /dev/sda1 /mnt
$ umount /mnt

 

Setting up DM-MPIO

Since each of our virtual machines have been configured with two network interfaces, it is possible to reach the iSCSI target through the second interface :

$ iscsiadm -m discovery -t st -p
192.168.1.193:3260,1 iqn.2014-09.PreciseS-iscsitarget:storage.sys0
192.168.1.43:3260,1 iqn.2014-09.PreciseS-iscsitarget:storage.sys0
$ iscsiadm -m node -T iqn.2014-09.PreciseS-iscsitarget:storage.sys0 --login

Now that we have two paths toward our iSCSI target, we can proceed to setup DM-MPIO.

First of all, a /etc/multipath.conf file must exist.  Then we install the needed package :

$ sudo -s
# cat << EOF > /etc/multipath.conf
defaults {
        user_friendly_names yes
}
EOF
# exit
$ sudo apt-get -y install multipath-tools

Two paths to the iSCSI device created previously need to exist for the multipath device to be seen.

# multipath -ll
mpath0 (149455400000000006c756e30000000000000000000000000) dm-2 IET,VIRTUAL-DISK
size=50G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 4:0:0:0 sda 8:0   active ready  running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 5:0:0:0 sdb 8:16  active ready  running

The two paths are indeed visible. We can move forward and verify that the partition table created previously is accessible :

$ sudo fdisk -l /dev/mapper/mpath0

Disk /dev/mapper/mpath0: 53.7 GB, 53687091200 bytes
64 heads, 32 sectors/track, 51200 cylinders, total 104857600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0e5e5db1

              Device Boot      Start         End      Blocks   Id  System
/dev/mapper/mpath0p1            2048   104857599    52427776   83  Linux

All that is remaining is to add an entry to the /etc/fstab file so the file system that we created is mounted automatically at boot.  Notice the _netdev entry : this is required otherwise the iSCSI device will not be mounted.

$ sudo -s
# cat << EOF >> /etc/fstab
/dev/mapper/mpath0-part1        /mnt    ext4    defaults,_netdev        0 0
EOF
exit
$ sudo mount -a
$ df /mnt
Filesystem               1K-blocks   Used Available Use% Mounted on
/dev/mapper/mpath0-part1  51605116 184136  48799592   1% /mnt

Read more
Louis

A few years ago, while I started to participate to the packaging of makedumpfile and kdump-tools for Debian and ubuntu. I am currently applying for the formal status of Debian Maintainer to continue that task.

For a while now, I have been noticing that our version of the kernel dump mechanism was lacking from a functionality that has been available on RHEL & SLES for a long time : remote kernel crash dumps. On those distribution, it is possible to define a remote server to be the receptacle of the kernel dumps of other systems. This can be useful for centralization or to capture dumps on systems with limited or no local disk space.

So I am proud to announce the first functional beta-release of kdump-tools with remote kernel crash dump functionality for Debian and Ubuntu !

For those of you eager to test or not interested in the details, you can find a packaged version of this work in a Personal Package Archive (PPA) here :

https://launchpad.net/~louis-bouchard/+archive/networked-kdump

New functionality : remote SSH and NFS

In the current version available in Debian and Ubuntu, the kernel crash dumps are stored on local filesystems. Starting with version 1.5.1, they are stored in a timestamped directory under /var/crash. The new functionality allow to either define a remote host accessible through SSH or an NFS mount point to be the receptacle for the kernel crash dumps.

A new section of the /etc/default/kdump-tools file has been added :

# ---------------------------------------------------------------------------
# Remote dump facilities:
# SSH - username and hostname of the remote server that will receive the dump
# and dmesg files.
# SSH_KEY - Full path of the ssh private key to be used to login to the remote
# server. use kdump-config propagate to send the public key to the
# remote server
# HOSTTAG - Select if hostname of IP address will be used as a prefix to the
# timestamped directory when sending files to the remote server.
# 'ip' is the default.
# NFS - Hostname and mount point of the NFS server configured to receive
# the crash dump. The syntax must be {HOSTNAME}:{MOUNTPOINT} 
# (e.g. remote:/var/crash)
#
# SSH="<user@server>"
#
# SSH_KEY="<path>"
#
# HOSTTAG="hostname|[ip]"
# 
# NFS="<nfs mount>"
#

The kdump-config command also gains a new option : propagate which is used to send a public ssh key to the remote server so passwordless ssh commands can be issued to the remote SSH host.

Those options and commands are nothing new : I simply based my work on existing functionality from RHEL & SLES. So if you are well acquainted with RHEL remote kernel crash dump mechanisms, you will not be lost on Debian and Ubuntu. So I want to thank those who built the functionality on those distributions; it was a great help in getting them ported to Debian.

Testing on Debian

First of all, you must enable the kernel crash dump mechanism at the kernel level. I will not go in details as it is slightly off topic but you should :

  1. Add crashkernel=128M to /etc/default/grub in GRUB_CMDLINE_LINUX_DEFAULT
  2. Run udpate-grub
  3. reboot

Install the beta packages

The package in the PPA can be installed on Debian with add-apt-repository. This command is in the software-properties-common package so you will have to install it first :

$ apt-get install software-properties-common
$ add-apt-repository ppa:louis-bouchard/networked-kdump

Since you are on Debian, the result of the last command will be wrong, as the serie defined in the PPA is for Utopic. Just use the following command to fix that :

$ sed -i -e 's/sid/utopic/g' /etc/apt/sources.list.d/louis-bouchard-networked-kdump-sid.list 
$ apt-get update
$ apt-get install kdump-tools makedumpfile

Configure kdump-tools for remote SSH capture

Edit the file /etc/default/kdump-tools and enable the kdump mechanism by setting USE_KDUMP to 1 . Then set the SSH variable to the remote hostname & credentials that you want to use to send the kernel crash dump. Here is an example :

USE_KDUMP=1
...
SSH="ubuntu@TrustyS-netcrash"

You will need to propagate the ssh key to the remote SSH host, so make sure that you have the password of the remote server’s user you defined (ubuntu in my case) for this command :

root@sid:~# kdump-config propagate
Need to generate a new ssh key...
The authenticity of host 'trustys-netcrash (192.168.122.70)' can't be established.
ECDSA key fingerprint is 04:eb:54:de:20:7f:e4:6a:cc:66:77:d0:7c:3b:90:7c.
Are you sure you want to continue connecting (yes/no)? yes
ubuntu@trustys-netcrash's password: 
propagated ssh key /root/.ssh/kdump_id_rsa to server ubuntu@TrustyS-netcrash

If you have an existing ssh key that you want to use, you can use the SSH_KEY option to point to your own key in /etc/default/kdump-tools :

SSH_KEY="/root/.ssh/mykey_id_rsa"

Then run the propagate command as previously :

root@sid:~/.ssh# kdump-config propagate
Using existing key /root/.ssh/mykey_id_rsa
ubuntu@trustys-netcrash's password: 
propagated ssh key /root/.ssh/mykey_id_rsa to server ubuntu@TrustyS-netcrash

It is a safe practice to verify that the remote SSH host can be accessed without password. You can use the following command to test (with your own remote server as defined in the SSH variable in /etc/default/kdump-tools) :

root@sid:~/.ssh# ssh -i /root/.ssh/mykey_id_rsa ubuntu@TrustyS-netcrash pwd
/home/ubuntu

If the passwordless connection can be achieved, then everything should be all set. You can proceed with a real crash dump test if your setup allows for it (not a production environment for instance).

Configure kdump-tools for remote NFS capture

Edit the /etc/default/kdump-tools file and set the NFS variable with the NFS mount point that will be used to transfer the crash dump :

NFS="TrustyS-netcrash:/var/crash"

The format needs to be the syntax that normally would be used to mount the NFS filesystem. You should test that your NFS filesystem is indeed accessible by mounting it manually :

root@sid:~/.ssh# mount -t nfs TrustyS-netcrash:/var/crash /mnt
root@sid:~/.ssh# df /mnt
Filesystem 1K-blocks Used Available Use% Mounted on
TrustyS-netcrash:/var/crash 6815488 1167360 5278848 19% /mnt
root@sid:~/.ssh# umount /mnt

Once you are sure that your NFS setup is correct, then you can proceed with a real crash dump test.

Testing on Ubuntu

As you would expect, setting things on Ubuntu is quite similar to Debian.

Install the beta packages

The package in the PPA can be installed on Debian with add-apt-repository. This command is in the software-properties-common package so you will have to install it first :

$ sudo add-apt-repository ppa:louis-bouchard/networked-kdump

Packages are available for Trusty and Utopic.

$ sudo apt-get update
$ sudo apt-get -y install linux-crashdump

Configure kdump-tools for remote SSH capture

Edit the file /etc/default/kdump-tools and enable the kdump mechanism by setting USE_KDUMP to 1 . Then set the SSH variable to the remote hostname & credentials that you want to use to send the kernel crash dump. Here is an example :

USE_KDUMP=1
...
SSH="ubuntu@TrustyS-netcrash"

You will need to propagate the ssh key to the remote SSH host, so make sure that you have the password of the remote server’s user you defined (ubuntu in my case) for this command :

ubuntu@TrustyS-testdump:~$ sudo kdump-config propagate
[sudo] password for ubuntu: 
Need to generate a new ssh key...
The authenticity of host 'trustys-netcrash (192.168.122.70)' can't be established.
ECDSA key fingerprint is 04:eb:54:de:20:7f:e4:6a:cc:66:77:d0:7c:3b:90:7c.
Are you sure you want to continue connecting (yes/no)? yes
ubuntu@trustys-netcrash's password: 
propagated ssh key /root/.ssh/kdump_id_rsa to server ubuntu@TrustyS-netcrash
ubuntu@TrustyS-testdump:~$
If you have an existing ssh key that you want to use, you can use the SSH_KEY option to point to your own key in /etc/default/kdump-tools :
SSH_KEY="/root/.ssh/mykey_id_rsa"

Then run the propagate command as previously :

ubuntu@TrustyS-testdump:~$ kdump-config propagate
Using existing key /root/.ssh/mykey_id_rsa
ubuntu@trustys-netcrash's password: 
propagated ssh key /root/.ssh/mykey_id_rsa to server ubuntu@TrustyS-netcrash

It is a safe practice to verify that the remote SSH host can be accessed without password. You can use the following command to test (with your own remote server as defined in the SSH variable in /etc/default/kdump-tools) :

ubuntu@TrustyS-testdump:~$sudo ssh -i /root/.ssh/mykey_id_rsa ubuntu@TrustyS-netcrash pwd
/home/ubuntu

If the passwordless connection can be achieved, then everything should be all set.

Configure kdump-tools for remote NFS capture

Edit the /etc/default/kdump-tools file and set the NFS variable with the NFS mount point that will be used to transfer the crash dump :

NFS="TrustyS-netcrash:/var/crash"

The format needs to be the syntax that normally would be used to mount the NFS filesystem. You should test that your NFS filesystem is indeed accessible by mounting it manually (you might need to install the nfs-common package) :

ubuntu@TrustyS-testdump:~$ sudo mount -t nfs TrustyS-netcrash:/var/crash /mnt 
ubuntu@TrustyS-testdump:~$ df /mnt
Filesystem 1K-blocks Used Available Use% Mounted on
TrustyS-netcrash:/var/crash 6815488 1167488 5278720 19% /mnt
ubuntu@TrustyS-testdump:~$ sudo umount /mnt

Once you are sure that your NFS setup is correct, then you can proceed with a real crash dump test.

 Miscellaneous commands and options

A few other things are under the control of the administrator

The HOSTTAG modifier

When sending the kernel crash dump, kdump-config will use the IP address of the server to as a prefix to the timestamped directory on the remote host. You can use the HOSTTAG variable to change that default. Simply define in /etc/default/kdump-tools :

HOSTTAG="hostname"

The hostname of the server will be used as a prefix instead of the IP address.

Currently, this is only implemented for the SSH method, but it will be available for NFS as well in the final version.

kdump-config show

To verify the configuration that you have defined in /etc/default/kdump-tools, you can use kdump-config’s show command to review your options.

ubuntu@TrustyS-testdump:~$ sudo kdump-config show
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr: 0x2d000000
SSH: ubuntu@TrustyS-netcrash
SSH_KEY: /root/.ssh/kdump_id_rsa
HOSTTAG: ip
current state: ready to kdump
kexec command:
 /sbin/kexec -p --command-line="BOOT_IMAGE=/vmlinuz-3.13.0-24-generic root=/dev/mapper/TrustyS--vg-root ro console=ttyS0,115200 irqpoll maxcpus=1 nousb" --initrd=/boot/initrd.img-3.13.0-24-generic /boot/vmlinuz-3.13.0-24-generic

If the remote crash kernel dump functionality is setup, you will see the options listed in the output of the commands.

Conclusion

As outlined at the beginning, this is the first functional beta version of the code. If you are curious, you can find the code I am working on here :

http://anonscm.debian.org/gitweb/?p=collab-maint/makedumpfile.git;a=shortlog;h=refs/heads/networked_kdump_beta1

Don’t hesitate to test & let me know if you find issues

Read more
caribou

One year ago, I had done my last day after thirteen years with Digital Equipment Corp which became Compaq, then HP.  After starting on Digital Unix/Tru64, I had evolved to a second level support position in the Linux Global Competency Center.

In a few days, on the 18th, I will have completed my first full year as a Canonical employee. I think it is time to take a few minutes to look back at that year.

Coming from a RHEL/SLES environment with a bit of Debian, my main asset was the fact that I  had been an Ubuntu user since 5.04, using it as my sole operating system on my corporate laptop. The first week in the new job was also a peculiar experience, as it brought me back to my native country and to Montréal, a city that I love and where I lived for three years.  So I was not totally lost in my new environment. I also had the chance of ramping up my knowledge of Ubuntu Server, which was an easy task.  What was more surprizing and became one of the most exciting part of the new job is to work in a completely dedicated opensource environment from day one.

Rapidly I became aware of the fact that, participating in the Ubuntu community was not only possible, but it was expected.  That if I were to find a bug, I needed to report it and, if possible find ways to fix it.  In my previous job I was looking for existing solutions, or bringing in enough elements to my L3 counterpart that they would be able to request a fix to Red Hat or Novell.  Here if I was able to identify the problem and suggest a solution, I was encouraged to propose it as the final fix.  I also rapidly found out that the developpers were no longer the remote engineers in some changelog file, but IRC nicks that I could chat with and eventually meet.

Then came about Openstack in the summer : a full week of work with colleagues aimed at getting to know the technology, trying to master concepts that were very vague back then and making things work.  Getting Swift Object Store up and running and trying to figure out how best this could be used.  Here I was asked to do one of the think I like best : learning by getting things to work. This lead to a better understanding of what a cloud architecture was all about and really made me understand how useful and interesting a cloud infrastructure can be. Oh, and I did get to build my first openstack cloud.

This was another of this past year’s great experience : UDS-P. I had heard of UDS-O when I joined but it was too early for me to attend.  But after six months around it was time for UDS-P and, this time, I would be there.  Not only I had time to meet a good chunk of developpers, but I also got a lot of work done.  Like helping Michael Terry fix a bug on Deja-Dup that would only appear on localized systems, get advices on fixing kdump with the kernel team and some of the foundation engineers and a whole lot more.

Then came back the normal work for our customers, fixing their issues, trying to help improve their support experience and get better at what we do. And also seeing some of my fixes make it into our upcoming distribution and also back to the existing ones.  This was a great thrill and an objective that I did not think would come by so fast.

Being part of the Ubuntu community has been a great addition to my career. This makes me want to do even more and get the best out of our collective efforts.

This was a great year. Sure hope that the next one will be even better.

Read more
caribou

Recently, I have realized a major difference in how customer support is done on Ubuntu.

As you know, Canonical provides official customer support for Ubuntu both on server and desktop. This is the work I do : provice customer with the best level of support on the Ubuntu distribution.  This is also what I was doing on my previous job, but for the Red Hat Enterprise Linux and SuSE Linux Enterprise Server distributions.

The major difference that I recently realized is that, unlike my previous work with RHEL & SLES, the result of my work is now available to the whole Ubuntu community, not just to the customers that may for our support.

Here is an example. Recently one of our customer identified a bug with vm-builder in a very specific case.  The work that I did on this bug resulted in a patch that I submitted to the developers who accepted its inclusion in the code. In my previous life, this fix would have been made available only to customers paying a subscription to the vendors through their official update or service pack services.

With Ubuntu, through Launchpad and the regular community activity, this fix will become available to the whole community through the standard -updates channel of our public archives.

This is true for the vast majority of the fixes that are provided to our customers. As a matter of fact, the public archives are almost the only channel that we have to provide fixes to our customers, hence making them available to the whole Ubuntu community at the same time.  This is different behavior and something that makes me a bit prouder of the work I’m doing.

Read more
caribou

In my job as a support engineer, I spend my day trying to fix problem. So it is refreshing when, sometimes, the technology that we’re providing simply works.

Like this morning when I decided to test a total autonomous laptop setup. By that I mean that I will on the road tonight and might need internet access for m y laptop.  So I went ahead and configured my Android HTC Desire to act as a WiFi Hotspot.  Then I switched my laptop’s Wireless connection to the SSID that was broadcast by my phone; it connected right away.

Now since I’ll be in a car, I need to have some kind of headset to avoid street noises. So I powered my bluetooth  earphone that I had previously configured to be recognized by my laptop. I went in the notification area and selected the Bluetooth applet & saw that my earphone was been seen. So I connected it.  Now the only thing that doesn’t totally work as I would expect is that I needed to change the whole sound setup of my laptop to send all input and output to my earphone. I’m expecting to use Twinkle for phone calls, and it is the only way I found to have the earphone work properly. If someone knows how to be more selective and have Twinkle use the Bluetooth earphone, I’m a taker.

But this is not a show stopper.  So I fired up Twinkle, connected to my SIP account and dialed my home phone : Drriiiinngg!!!

So I’m there, with my Android phone connecting me to the Internet, my laptop acting as a telephone, while I still have full laptop functionality. I must say that I’m impressed. Don’t get me wrong : I’ve been using Ubuntu since Hoary, and Debian before that.  But nowaday, with all the great work done on those tools, it make rather complex setups work smoothly.

Bravo !

Read more
caribou

As many around me know, I have recently moved to a new job with Canonical.  This journey started quite funnily on April 18th by a Support Sprint event in our Montréal’s office. I say funnily because, while I now lives in France, near Versailles, I was born in northern Québec and I have lived in Montréal for three years before moving to France.

This Sprint week was a great occasion to meet with my new colleagues and to jumpstart my exposure to my new job by learning as much as I could about Natty, both on server and desktop with those responsible for providing top class support to our customers.  But most of all, it was an occasion to meet face to face, to interact with colleagues, learn to know them and understand a bit more about the context of how things are done at Canonical.

Then a little while after coming back home, I got that email from Jono Bacon, encouraging us to do more blogging. Talk about a cultural shock ! For years, I pushed myself into maintaining an internal blog at my prior job, worrying about the possibility that it would be seen as a waste of my time by some manager.  I even seen a few occurrences where very good “internal” bloggers were forced to abandon their blogging activities because of some politically incorrect statement that did not please some high ranking manager.  So being encouraged to go out on the open and talk about what I do, how it is to work here at Canonical, how I think I can help improve our customer’s experience did made my day.

So why did it took me so long to react, to start writing ? Call it paranoia, historical worries, I don’t know. I also wanted to take the opportunity to take out some of the english posts out of my french speaking blog, which required some reworking of my own personal WordPress infrastructure.  But that’s all done now. It’s time to go ahead and start sharing how it is like to “really” be a part of the Open Source effort.

Read more