Autocluster is a set of scripts for building virtual clusters to test
clustered Samba. It uses Linux's libvirt and KVM virtualisation.

Autocluster is a collection of scripts, templates and configuration
files that allow you to create a cluster of virtual nodes very
quickly. You can create a cluster from scratch in less than 30
minutes. Once you have a base image you can then recreate a cluster
or create new virtual clusters in minutes.

The current implementation creates virtual clusters of RHEL5 nodes.
* INSTALLING AUTOCLUSTER

INSTALLING AUTOCLUSTER
======================
Before you start, make sure you have the latest version of
autocluster. To download autocluster do this:

  git clone git://git.samba.org/tridge/autocluster.git autocluster

Or to update it, run "git pull" in the autocluster directory.

You probably want to add the directory where autocluster is installed
to your PATH, otherwise things may quickly become tedious.
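For example, assuming you cloned autocluster into your home directory
(adjust the path to suit your setup), you could add this to your
shell profile:

  export PATH=$PATH:$HOME/autocluster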
This section explains how to set up a host machine to run virtual
clusters generated by autocluster.
1) Install and configure required software.

   a) Install kvm, libvirt and expect.

      Autocluster creates virtual machines that use libvirt to run
      under KVM. This means that you will need to install both KVM
      and libvirt on your host machine. Expect is used by the
      "waitfor" script and should be available for installation from
      your distribution.
      Autocluster should work with the standard RHEL6 qemu-kvm and
      libvirt packages. However, RHEL's KVM doesn't support SCSI
      block device emulation, so you will need these settings:

        SHARED_DISK_TYPE=virtio
        KVM=/usr/libexec/qemu-kvm
      For RHEL5/CentOS5, useful packages for both kvm and libvirt
      used to be available at:

        http://www.lfarkas.org/linux/packages/centos/5/x86_64/

      However, since recent versions of RHEL5 ship with KVM, 3rd
      party KVM RPMs for RHEL5 are now scarce.
      RHEL5.4's KVM also has problems when autocluster uses virtio
      shared disks, since multipath doesn't notice virtio disks.
      This is fixed in RHEL5.6 and in a recent RHEL5.5 update - you
      should be able to use the settings recommended above for RHEL6.

      If you're still running RHEL5.4, have lots of time, have lots
      of disk space and like complexity, then see the sections below
      on "iSCSI shared disks" and "Raw IDE system disks".
      Useful packages ship with Fedora Core 10 (Cambridge) and
      later. Some of the above notes on RHEL might apply to Fedora
      Core's packages.
      Useful packages ship with Ubuntu 8.10 (Intrepid Ibex) and
      later. In recent Ubuntu versions (e.g. 10.10 Maverick Meerkat)
      the KVM package is called "qemu-kvm". Older versions have a
      package called "kvm".
      For other distributions you'll have to backport distro sources
      or compile from upstream source as described below.

      * For KVM see the "Downloads" and "Code" sections at:

          http://www.linux-kvm.org/
   b) Install guestfish or qemu-nbd and nbd-client.

      Recent Linux distributions, including RHEL6.0, contain
      guestfish. Guestfish (see http://libguestfs.org/ - there are
      binary packages for several distros here) is a CLI for
      manipulating KVM/QEMU disk images. Autocluster supports
      guestfish, so if guestfish is available then you should use
      it. It should be more reliable than the NBD method described
      below.
      Guestfish isn't yet the default autocluster method for disk
      image manipulation. To use it, put this in your configuration
      file:

        SYSTEM_DISK_ACCESS_METHOD=guestfish
      Note that autocluster's guestfish support is new and was
      written to work around some bugs in RHEL6.0's version of
      guestfish... so it might not work well with newer, non-buggy
      versions. If so, please report the problem.
      If you can't use guestfish then you'll have to use NBD. For
      this you will need the qemu-nbd and nbd-client programs, which
      autocluster uses to loopback-nbd-mount the disk images when
      configuring each node.
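      For orientation, the loopback NBD pattern looks roughly like
      this (a sketch only - the image path, port and partition
      layout are illustrative, not autocluster's exact invocation):

        modprobe nbd max_part=8                      # NBD devices with partition support
        qemu-nbd --port 10809 /virtual/ac-base.img & # serve the image over NBD
        nbd-client localhost 10809 /dev/nbd0         # attach it to /dev/nbd0
        mount /dev/nbd0p1 /mnt                       # mount the first partition
        ...                                          # edit files under /mnt
        umount /mnt
        nbd-client -d /dev/nbd0                      # detach when done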
      NBD for various distros:
      * RHEL:

        qemu-nbd is only available in the old packages from
        lfarkas.org. Recompiling the RHEL5 kvm package to support
        NBD is quite straightforward. RHEL6 doesn't have an NBD
        kernel module, so it is harder to retrofit for NBD support -
        use guestfish instead.

        Unless you can find an RPM for nbd-client, you will need to
        download the source from:

          http://sourceforge.net/projects/nbd/
      * Fedora:

        qemu-nbd is in the qemu-kvm or kvm package.

        nbd-client is in the nbd package.
      * Ubuntu:

        qemu-nbd is in the qemu-kvm or kvm package. In older
        releases it is called kvm-nbd, so you need to set the
        QEMU_NBD configuration variable.

        nbd-client is in the nbd-client package.
      * As mentioned above, nbd can be found at:

          http://sourceforge.net/projects/nbd/
   c) Environment and libvirt virtual networks

      You will need to add the autocluster directory to your PATH.

      You will need to configure the right kvm networking setup. The
      files in host_setup/etc/libvirt/qemu/networks/ should help.
      This command will install the right networks for kvm:

        rsync -av --delete host_setup/etc/libvirt/qemu/networks/ /etc/libvirt/qemu/networks/
      Note that you'll need to edit the installed files to reflect
      any changes to IPBASE, IPNET0, IPNET1, IPNET2 away from the
      defaults. This is also true for named.conf.local and
      squid.conf (see below).

      After this you might need to reload libvirt:

        /etc/init.d/libvirtd reload
      You might also need to set:

        VIRSH_DEFAULT_CONNECT_URI=qemu:///system

      in your environment so that virsh does KVM/QEMU things by
      default.
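      For example, if your login shell is bash (an assumption), you
      could add this to ~/.bashrc:

        export VIRSH_DEFAULT_CONNECT_URI=qemu:///system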
2) You need a caching web proxy on your local network. If you don't
   have one, then install a squid proxy on your host. See
   host_setup/etc/squid/squid.conf for a sample config suitable for a
   virtual cluster. Make sure it caches large objects and has plenty
   of space. This will be needed to make downloading all the RPMs to
   each node reasonably fast.

   To test your squid setup, run a command like this:

     http_proxy=http://10.0.0.1:3128/ wget <some-url>
   Check your firewall setup. If you have problems accessing the
   proxy from your nodes (including from kickstart postinstall) then
   check it again! Some distributions install nice "convenient"
   firewalls by default that might block access to the squid port
   from the nodes. On a current version of Fedora Core you may be
   able to run system-config-firewall-tui to reconfigure the
   firewall.
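   If you maintain the firewall rules yourself then something like
   this sketch should open the proxy up to the nodes (it assumes the
   default 10.0.0.0/24 node network and squid's usual port 3128):

     iptables -I INPUT -s 10.0.0.0/24 -p tcp --dport 3128 -j ACCEPT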
3) Set up a DNS server on your host. See host_setup/etc/bind/ for a
   sample config that is suitable. It needs to redirect DNS queries
   for your virtual domain to your Windows domain controller.
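   In BIND terms that redirection is a forward zone, along these
   lines (the domain name and domain controller address below are
   placeholders, not autocluster defaults):

     zone "virtual.example.com" {
         type forward;
         forwarders { 10.0.0.10; };
     };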
4) Download a RHEL install ISO.
A cluster comprises a single base disk image, a copy-on-write disk
image for each node and some XML files that tell libvirt about each
node's virtual hardware configuration. The copy-on-write disk images
save a lot of disk space on the host machine because they each share
the base disk image - without them the disk image for each cluster
node would need to contain the entire RHEL install.

The cluster creation process can be broken down into 2 main steps:

1) Creating the base disk image.

2) Creating the per-node disk images and corresponding XML files.

However, before you do this you will need to create a configuration
file. See the "CONFIGURATION" section below for more details.
Here are more details on the "create cluster" process. Note that
unless you have done something extra special then you'll need to run
these commands as root.

1) Create the base disk image using:

     ./autocluster create base

   The first thing this step does is to check that it can connect to
   the YUM server. If this fails, make sure that there are no
   firewalls blocking your access to the server.

   The install will take about 10 to 15 minutes and you will see the
   packages installing in your terminal.
   The installation process uses kickstart. If your configuration
   uses a SoFS release then the last stage of the kickstart
   configuration will be a postinstall script that installs and
   configures packages related to SoFS. The choice of postinstall
   script is set using the POSTINSTALL_TEMPLATE variable, allowing
   you to adapt the installation process for different types of
   clusters.

   It makes sense to install packages that will be common to all
   nodes into the base image. This saves time later when you're
   setting up the cluster nodes. However, you don't have to do this
   - you can set POSTINSTALL_TEMPLATE to "" instead - but then you
   will lose the quick cluster creation/setup that is a major
   feature of autocluster.
   When that has finished you should mark that base image immutable
   like this:

     chattr +i /virtual/ac-base.img

   That will ensure it won't change. This is a precaution, as the
   image will be used as a backing file for the per-node images, and
   if it changes your cluster will become corrupt.
2) Now run "autocluster create cluster", specifying a cluster name.
   For example:

     autocluster create cluster c1

   This will create and install the XML node descriptions and the
   disk images for your cluster nodes, and any other nodes you have
   configured. Each disk image is initially created as an "empty"
   copy-on-write image, which is linked to the base image. Those
   images are then accessed using guestfish or loopback-nbd-mounted,
   and populated with system configuration files and other
   potentially useful things (such as scripts).
At this point the cluster has been created but isn't yet running.
Autocluster provides a command called "vircmd", which is a thin
wrapper around libvirt's virsh command. vircmd takes a cluster name
instead of a node/domain name and runs the requested command on all
nodes in the cluster.

1) Now boot your cluster nodes like this:

     vircmd start c1

   The most useful vircmd commands are:

     start    : boot a node
     shutdown : graceful shutdown of a node
     destroy  : power off a node immediately
2) You can watch boot progress like this:

     tail -f /var/log/kvm/serial.c1*

   All the nodes have serial consoles, making it easier to capture
   kernel panic messages and watch the nodes via ssh.
Now you have a cluster of nodes, which might have a variety of
packages installed and configured in a common way. Once the cluster
is up and running you might need to configure specialised subsystems
like GPFS or Samba. You can do this by hand or use the sample
scripts/configurations that are provided.

1) Now you can ssh into your nodes. You may like to look at the
   small set of scripts in /root/scripts on the nodes. In
   particular:
     mknsd.sh           : sets up the local shared disks as GPFS NSDs
     setup_gpfs.sh      : sets up GPFS, creates a filesystem etc
     setup_samba.sh     : sets up Samba and many other system components
     setup_tsm_server.sh: run this on the TSM node to set up the TSM
                          server
     setup_tsm_client.sh: run this on the GPFS nodes to set up HSM

   To set up a SoFS system you will normally need to run
   setup_gpfs.sh and setup_samba.sh.
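   For example, assuming standard settings with FIRSTIP=35 (so the
   first node is 10.0.0.35), the basic SoFS setup might look like:

     ssh root@10.0.0.35 /root/scripts/setup_gpfs.sh
     ssh root@10.0.0.35 /root/scripts/setup_samba.sh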
2) If using the SoFS GUI, then you may want to lower the memory it
   uses so that it fits easily on the first node. Just edit this
   file on the first node:

     /opt/IBM/sofs/conf/overrides/sofs.javaopt
3) For automating the SoFS GUI, you may wish to install the iMacros
   extension to Firefox, and look at some sample macros I have put
   in the imacros/ directory of autocluster. They will need editing
   for your environment, but they should give you some hints on how
   to automate the final GUI stage of the installation of a SoFS
   cluster.
CONFIGURATION
=============

Autocluster uses configuration files containing Unix shell style
variables. For example,

  FIRSTIP=30

indicates that the last octet of the first IP address in the cluster
will be 30. If an option contains multiple words then they will be
separated by underscores ('_'), as in ISO_DIR.
All options have an equivalent command-line option, such as:

  --firstip=30

Command-line options are lowercase. Words are separated by dashes
('-') - for example, ISO_DIR becomes --iso-dir.
Normally you would use a configuration file with variables so that
you can repeat steps easily. The command-line equivalents are useful
for trying things out without resorting to an editor. You can
specify a configuration file to use on the autocluster command-line
using the -c option. For example:

  autocluster -c config-foo create base

If you don't provide a configuration file then autocluster will look
for a file called "config" in the current directory.
You can also use environment variables to override the default
values of configuration variables. However, both command-line
options and configuration file entries will override environment
variables.
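For example, this (hypothetical) invocation sets FIRSTIP via the
environment; the value shown by --dump will be 40 unless FIRSTIP is
also set in the configuration file or on the command line:

  FIRSTIP=40 autocluster --dump | grep FIRSTIP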
Potentially useful information:

* Use "autocluster --help" to list all available command-line
  options - all the items listed under "configuration options:" are
  the equivalents of the settings for config files. This output
  also shows descriptions of the options.
* You can use the --dump option to check the current value of
  configuration variables. This is most useful when used in
  combination with grep:

    autocluster --dump | grep ISO_DIR

  In the past we recommended using --dump to create an initial
  configuration file. Don't do this - it is a bad idea! There are a
  lot of options and you'll create a huge file that you don't
  understand and can't debug!
* Configuration options are defined in config.d/*.defconf. You
  shouldn't need to look in these files... but sometimes they
  contain comments about options that are too long to fit into help
  strings.
* I recommend that you aim for the smallest possible configuration
  file. Start with just the options you need and move on from
  there.
* Use the --with-release option on the command-line or the
  with_release function in a configuration file to get default
  values for building virtual clusters for releases of particular
  "products". Currently there are only release definitions for
  SoFS.

  For example, you can set up default values for SoFS-1.5.3 by
  running:

    autocluster --with-release=SoFS-1.5.3 ...
  Equivalently you can use the following syntax in a configuration
  file:

    with_release "SoFS-1.5.3"

  So the smallest possible config file would have something like
  this as the first line and would then set FIRSTIP:

    with_release "SoFS-1.5.3"
    FIRSTIP=30

  Add other options as you need them.
  The release definitions are stored in releases/*.release. The
  available releases are listed in the output of "autocluster
  --help".
  NOTE: Occasionally you will need to consider the position of
  with_release in your configuration. If you want to override
  options handled by a release definition then you will obviously
  need to set them later in your configuration. This will be the
  case for most options you will want to set. However, some options
  will need to appear before with_release so that they can be used
  within a release definition - the most obvious one is the (rarely
  used) RHEL_ARCH option, which is used in the default ISO setting
  for each release. If things don't work as expected, use --dump to
  confirm that configuration variables have the values that you
  expect.
* The NODES configuration variable controls the types of nodes that
  are created. At the time of writing, the default value is:

    NODES="rhel_base:0-3"

  This means that you get 4 nodes, at IP offsets 0, 1, 2 & 3 from
  FIRSTIP, all part of the CTDB cluster. That is, with standard
  settings and FIRSTIP=35, 4 nodes will be created in the IP range
  10.0.0.35 to 10.0.0.38.
  The SoFS releases use a default of:

    NODES="tsm_server:0 sofs_gui:1 sofs_front:2-4"

  which should produce a set of nodes the same as the old SoFS
  default. You can add extra rhel_base nodes if you need them for
  test clients or some other purpose:

    NODES="$NODES rhel_base:7,8"

  This produces an additional 2 base RHEL nodes at IP offsets 7 & 8
  from FIRSTIP. Since sofs_* nodes are present, these base nodes
  will not be part of the CTDB cluster - they're just extra.
  For many standard use cases the nodes specified by NODES can be
  modified by setting NUMNODES, WITH_SOFS_GUI and WITH_TSM_NODE.
  However, these options can't be used to create nodes without
  specifying IP offsets - except WITH_TSM_NODE, which checks to see
  if IP offset 0 is vacant. Therefore, for many uses you can ignore
  the details of NODES.

  That said, NODES is the recommended mechanism for specifying the
  nodes that you want in your cluster. It is powerful, easy to read
  and centralises the information in a single line of your
  configuration file.

ISCSI SHARED DISKS
==================
The RHEL5 version of KVM does not support SCSI block device
emulation. Therefore, you can use either virtio or iSCSI shared
disks. Unfortunately, in RHEL5.4 and early versions of RHEL5.5,
virtio block devices are not supported by the version of multipath
in RHEL5. So this leaves iSCSI as the only choice.

The main configuration options you need for iSCSI disks are:

  SHARED_DISK_TYPE=iscsi
  NICMODEL=virtio        # Recommended for performance
  add_extra_package iscsi-initiator-utils

Note that SHARED_DISK_PREFIX and SHARED_DISK_CACHE are ignored for
iSCSI shared disks because KVM doesn't (need to) know about them.
You will need to install the scsi-target-utils package on the host
system. After creating a cluster, autocluster will print a message
that points you to a file tmp/iscsi.$CLUSTER - you need to run the
commands in this file (probably via: sh tmp/iscsi.$CLUSTER) before
booting your cluster. This will remove any old target with the same
ID, and create the new target, LUNs and ACLs.
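To give a feel for what the generated file does, the commands in it
are of this general shape (a sketch only - the target name, TID and
backing store path are illustrative):

  tgtadm --lld iscsi --mode target --op new --tid 1 \
         --targetname iqn.2010-01.org.example:c1
  tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 1 \
         --backing-store /virtual/c1/shared1.img
  tgtadm --lld iscsi --mode target --op bind --tid 1 \
         --initiator-address ALL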
You can use the following command to list information about the
iSCSI targets on the host:

  tgtadm --lld iscsi --mode target --op show

If you need multiple clusters using iSCSI on the same host then each
cluster will need to have a different setting for ISCSI_TID.
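For example, a second cluster's configuration file might contain:

  ISCSI_TID=2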
RAW IDE SYSTEM DISKS
====================

The RHEL5 version of KVM does not support SCSI block device
emulation. Therefore, you can use virtio or IDE system disks.
However, writeback caching, qcow2 and virtio are incompatible and
result in I/O corruption. So, you can use either virtio system disks
without any caching, accepting reduced performance, or you can use
IDE system disks with writeback caching, with nice performance.

For IDE disks, here are the required settings:

  SYSTEM_DISK_PREFIX=hd
  SYSTEM_DISK_CACHE=writeback
The next problem is that RHEL5's KVM does not include qemu-nbd. The
best solution is to build your own qemu-nbd and stop reading this
section.

If, for whatever reason, you're unable to build your own qemu-nbd,
then you can use raw, rather than qcow2, system disks. If you do
this then you need significantly more disk space (since the system
disks will be *copies* of the base image) and cluster creation time
will no longer be pleasantly snappy (due to the copying time - the
images are large and a single copy can take several minutes). So,
having tried to warn you off this option, if you really want to do
this then you'll need this setting:

  SYSTEM_DISK_FORMAT=raw
Note that if you're testing cluster creation with iSCSI shared disks
then you should find a way of switching off raw disks. This avoids
every iSCSI glitch costing you a lot of time while raw disks are
copied.
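Since configuration files are just shell, one sketch of such a
switch is a conditional like this (RAW_DISKS is a hypothetical
variable of your own, not an autocluster option):

  if [ -n "$RAW_DISKS" ] ; then
      SYSTEM_DISK_FORMAT=raw
  else
      SYSTEM_DISK_FORMAT=qcow2
  fi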
The -e option provides support for executing arbitrary bash code.
This is useful for testing and debugging.

One good use of this option is to test template substitution using
the function substitute_vars(). For example:

  ./autocluster --with-release=SoFS-1.5.3 -e 'CLUSTER=foo; DISK=foo.qcow2; UUID=abcdef; NAME=foon1; set_macaddrs; substitute_vars templates/node.xml'

This prints templates/node.xml with all appropriate substitutions
done. Some internal variables (e.g. CLUSTER, DISK, UUID, NAME) are
given fairly arbitrary values but the various MAC address strings
are set using the function set_macaddrs().
The -e option is also useful when writing scripts that use
autocluster. Given the complexities of the configuration system you
probably don't want to parse configuration files yourself to
determine the current settings. Instead, you can ask autocluster to
tell you useful pieces of information. For example, say you want to
script creating a base disk image and you want to ensure the image
is marked immutable afterwards:

  base_image=$(autocluster -c $CONFIG -e 'echo $VIRTBASE/$BASENAME.img')
  chattr -V -i "$base_image"

  if autocluster -c $CONFIG create base ; then
      chattr -V +i "$base_image"
  fi
Note that the command that autocluster should run is enclosed in
single quotes. This means that $VIRTBASE and $BASENAME will be
expanded within autocluster after the configuration file has been
loaded.
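By contrast, if you used double quotes then your own shell would
expand $VIRTBASE and $BASENAME (probably to empty strings) before
autocluster even started:

  autocluster -c $CONFIG -e "echo $VIRTBASE/$BASENAME.img"   # wrong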