Red Hat cluster suite in Debian GNU / Linux 4.0 Etch

The Red Hat cluster suite packages in Debian 4.0 Etch are partially broken.

See how to patch them to get your cluster up and running.

I presume you are an experienced system administrator if you are going to get a high availability cluster running.

Red Hat cluster suite is intended for High Availability clustering, and load balancing of virtual servers.

You will have to patch source files and repackage them the Debian way. This should not be a mystery to you.

Download the Debian source files for Red Hat cluster from the Debian repository [6] and place them in /usr/src.

It is also much better to add your user name to the group "src".

This group has special privileges over source file operations, so you can work without being root, which eases the steps ahead without risking your system.

It has to be the /usr/src directory because you will have to rebuild some other packages, and they assume this directory.
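The download step can be sketched roughly as follows. This is only a minimal sketch, assuming the deb-src lines for Etch are present in /etc/apt/sources.list. The function name fetch_cluster_source and the SRC_DIR variable are illustrative and not part of the original setup.

```shell
#!/bin/sh
# Sketch only: fetch and unpack the redhat-cluster source package.
# Prerequisite, done once as root (log in again afterwards so the
# new group membership takes effect):
#   adduser YOUR_USER src

# SRC_DIR is a hypothetical override; the article requires /usr/src.
fetch_cluster_source() {
    cd "${SRC_DIR:-/usr/src}" || return 1
    # Downloads the source package files and unpacks them in this directory
    apt-get source redhat-cluster
}

# Usage, as a member of group "src":
#   fetch_cluster_source
```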

rgmanager is not shipped as a binary package in Debian, although it is present in the source package.

So you have to follow two bug reports for it and apply the patch.

The missing dependency [0] is solved along with the missing control definition for rgmanager [1].

Apply the patch shown at the end of the bug report to solve both problems.
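Applying a patch saved from a bug report and rebuilding the packages could look like the sketch below. The function name patch_and_build, the patch file name, the -p1 strip level and the source tree name in the usage line are assumptions made for illustration. Check the patch headers to find the strip level that actually applies.

```shell
#!/bin/sh
# Sketch only: apply a patch saved from a bug report, then rebuild
# the binary packages from the patched source tree.
# $1 = unpacked source tree
# $2 = patch file, given as an absolute path (hypothetical name)
patch_and_build() {
    tree=$1
    patch_file=$2
    cd "$tree" || return 1
    # Try with --dry-run first, so a patch that does not apply
    # leaves the tree untouched
    patch -p1 --dry-run < "$patch_file" || return 1
    patch -p1 < "$patch_file" || return 1
    # Build unsigned binary packages; the .deb files land in the
    # parent directory of the tree
    dpkg-buildpackage -rfakeroot -us -uc -b
}

# Usage (both paths are assumptions):
#   patch_and_build /usr/src/redhat-cluster-1.03.00 /usr/src/rgmanager.diff
```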

A more serious problem is that the gnbd kernel module is not shipped as a binary package in Debian.

You will need to carefully apply the three patches shown in the bug report [2].

First you have to patch the redhat-cluster source package in order to get a correct redhat-cluster-source package instead of the original [7].

The resulting redhat-cluster-source package will be used to generate a correct linux-modules-extra-2.6 for your kernel instead of those from the repositories [8].

Verify that the resulting redhat-cluster-source package is installed in /usr/src.

Pay special attention to the displayed command lines. They show the exact files affected.

You will end up with about a dozen binary packages in /usr/src.

Move or copy most of them to another directory you will use for installation.

Select the ones suitable for the cluster you want to configure.

DLM and GULM are mutually exclusive.

Development packages are not needed on all machines.

You will need good clvm [9] and fence [4] init scripts if the defaults are not good enough.

The installation will fail at the configuration phase if you do not have an /etc/cluster/cluster.conf [5].

I would rather not have a default file shipped for this task.

If you install without the file, you will have to repeat the installation to force reconfiguration of all pending packages (dpkg -i *.deb).

Since you will have many files under /etc/default/* and /etc to configure on each machine, and will have to carefully craft YOUR /etc/cluster/cluster.conf anyway, this is not a real blocking problem.
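One way to avoid the failed configuration phase is to check for the file before installing. The sketch below assumes the selected .deb files sit together in one directory. The function name install_cluster_debs and its arguments are illustrative only.

```shell
#!/bin/sh
# Sketch only: refuse to install until cluster.conf is in place.
# $1 = directory holding the selected .deb files
# $2 = path to cluster.conf (defaults to the location named above)
install_cluster_debs() {
    deb_dir=$1
    conf=${2:-/etc/cluster/cluster.conf}
    if [ ! -s "$conf" ]; then
        echo "missing or empty $conf; create it before installing" >&2
        return 1
    fi
    # Repeating this same command later also forces reconfiguration
    # of packages left pending by an earlier failed run
    dpkg -i "$deb_dir"/*.deb
}

# Usage (the directory name is an assumption), as root:
#   install_cluster_debs /root/cluster-debs
```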

The linux-modules-extra-2.6 source package, when recompiled and repackaged, generates many binary packages for different kernels. You will have to select the ones suitable for your machines.

Also, if you upgrade the kernel, you will have to regenerate them.
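Picking the packages that match a given kernel can be done by file name. The sketch below assumes the generated module packages carry the kernel release at the end of the package name (for example a file such as redhat-cluster-modules-2.6.18-4-686_..._i386.deb). Verify this against the files you actually get. The function name debs_for_kernel is illustrative.

```shell
#!/bin/sh
# Sketch only: list the module packages built for one kernel release.
# $1 = directory with the generated .deb files
# $2 = kernel release (defaults to the running kernel, from uname -r)
debs_for_kernel() {
    dir=$1
    krel=${2:-$(uname -r)}
    for f in "$dir"/*-"$krel"_*.deb; do
        # An unmatched glob stays literal, so skip names that do not exist
        [ -e "$f" ] && echo "$f"
    done
    return 0
}

# Usage:
#   debs_for_kernel /usr/src
```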

The cluster configuration and set up is a subject for a future article.

How to recompile linux-modules-extra-2.6 the Debian way

This source package contains some kernel modules and needs to be recompiled whenever you upgrade the kernel of your machine or patch the source, as we did with the redhat-cluster source package.

GFS over GNBD quirks

Freezes when writing

Red Hat cluster source 1.03.00-2 with kernel 2.6.18-4 has a subtle problem that leads to seemingly random filesystem I/O freezes during file writes when you mount GFS over GNBD devices imported from GNBD servers.

This problem could be masked by a SAN with clever caching and connections, and by GNBD servers interconnected into the cluster through a really fast dedicated fiber optic network.

The GNBD servers and cluster coordination involve a large network overhead on top of the actual filesystem operations.

The problem becomes evident on slow, shared (saturated?) networks.

The filesystem I/O calls are uninterruptible, so the only way out is to shut down or even hardware-reset the machine importing the GNBD device.

The workaround is to use an undocumented feature of GFS.

Mount on the GFS client machines with the "sync" option, which disables filesystem caching by using synchronous transactions.

This has a severe impact on performance but gives reliable operation.
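The sync workaround amounts to a mount call like the one sketched here. The function name mount_gfs_sync, the device name and the mount point are hypothetical. The sketch also assumes that imported GNBD devices show up under /dev/gnbd/.

```shell
#!/bin/sh
# Sketch only: mount a GFS volume with synchronous writes.
# $1 = block device (an imported GNBD device)
# $2 = mount point
mount_gfs_sync() {
    mount -t gfs -o sync "$1" "$2"
}

# Usage (device and mount point names are assumptions), as root:
#   mount_gfs_sync /dev/gnbd/shared0 /mnt/shared0
```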

Newer Red Hat cluster source and kernel versions should be tested for this problem.

Concurrent accesses

During further tests, I spotted a problem with concurrent access to GFS over GNBD imported devices.

Even with the sync mount option, random freezes happen when the same device is read while it is being written.

I tried various other configurations, but nothing solved the problem.

This kind of problem is sometimes difficult to reproduce. You have to impose a load that is significant relative to your combined hardware and network capacity.

So it is unlikely to be seen on high-end hardware (servers and disks) and private Fibre Channel networks under comfortable loads.

On low-end hardware and a shared (loaded / crowded?) network, it is much easier to reproduce.

Even if you use Red Hat, you should investigate your setup for this problem under heavy load.

It seems to be a GNBD problem, because it does not happen when you use local GFS.

I investigated the issue further and realized that the GNBD kernel module seems to get caught in an I/O deadlock.

As Red Hat supports it in its kernel releases, I guess that it is an unfortunate combination of kernel 2.6.18 and the Red Hat Cluster Suite 1.03.00-2 source packaged in Debian 4.0 Etch.

Examining the CVS commits in the gnbd-kernel source subtree, I realized that the developers were already aware of some potential race conditions and deadlocks, which are addressed in recent development versions.

But as GFS2 (included in the newer redhat-cluster-suite 2.x series) is considered beta even by Red Hat (as of May 2007), and I would like not to diverge too much from the official Debian 4.0 packages (which use the redhat-cluster-suite 1.x series), I decided to discard GNBD because of its reliability issues.

I will miss its built-in fencing code and its straightforward integration with Red Hat cluster suite.

It is also easier to deploy than other solutions in this environment.

GFS over iSCSI

I am currently testing iSCSI as a substitute for GNBD, which is confirmed as the Achilles heel of the cluster.

You need a target device (the device "server") and a SCSI initiator (the "client").

I am using software-only implementations of both.

For the target, I chose iSCSI Enterprise Target [10].

For the SCSI initiator, I chose Open-iSCSI [11].

For 3 days, I tried hard to crash the 3 GFS volumes.

I mounted them on only one cluster "client", running a GNOME desktop (although it is possible to mount GFS volumes locally on their respective servers), and ran 6 simultaneous bonnie++ hard disk test suites [12] and 2 scp sessions. Running scp on the local machine lets you see the transfer speed.

With the humble GFS iSCSI targets, the 100 Mb/s network, and the 1 GB RAM P4 desktop used, this was enough to saturate the network and make the desktop almost unresponsive.

  • Not a single crash in 3 days.
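A load similar to the one described above could be generated with something like the following sketch. It is not the exact procedure used in the test. The function name stress_volume, the directory and the user name are assumptions, and only the -d and -u options of bonnie++ are used.

```shell
#!/bin/sh
# Sketch only: run several bonnie++ instances at once against one directory.
# $1 = directory on the mounted GFS volume
# $2 = unprivileged user to run as (bonnie++ wants -u when started by root)
# $3 = number of parallel instances (6 in the test described above)
stress_volume() {
    dir=$1
    user=$2
    count=${3:-6}
    i=0
    while [ "$i" -lt "$count" ]; do
        bonnie++ -d "$dir" -u "$user" &
        i=$((i + 1))
    done
    # Block until every background instance has finished
    wait
}

# Usage (mount point and user are assumptions):
#   stress_volume /mnt/gfs0 testuser
```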

I will stay with iSCSI Enterprise Target and Open-iSCSI. See [14], [15], [16], [17].

The Open-iSCSI initiator and bonnie++ are officially packaged in Debian.

iSCSI Enterprise Target is unofficially packaged for Debian Etch [13] as of May 2007. It is already available in the Unstable repository. You may try to backport it from there, because the official package is newer.

Updates to this article

I will update this article when more information becomes available for publication.

Update, October 9, 2007:

The kernel received a security update [20], but the redhat-cluster source package did not. So you will still have to apply the patches yourself.

Update, November 7, 2007:

We prepared a text with instructions [21] for recompiling the redhat-cluster-modules needed after you upgrade your kernel.

The stock packages in the repository do not contain the needed patches, so you must patch the sources and recompile the packages before installing.

References

[0] rgmanager use arping options from iputils-arping package, not arping's one - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=417813

[1] redhat-cluster: rgmanager not build, missing debian/control definition - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=400204

[2] gnbd.ko block device not packaged - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=426769

[3] clvm needs init script - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=400205

[4] updated init script for fence - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=336259

[5] ccs: cluster.conf(5) missing - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=406364

[6] Debian source for redhat-cluster http://packages.debian.org/stable/source/redhat-cluster

[7] redhat-cluster-source (1.03.00-2) http://packages.debian.org/stable/admin/redhat-cluster-source

[8] Source package: linux-modules-extra-2.6 (2.6.18-7+etch2) http://packages.debian.org/stable/source/linux-modules-extra-2.6

[9] improved clvmd init script compliant with Debian Policy 9.3 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=427499

[10] iSCSI Enterprise Target (the "server") http://iscsitarget.sourceforge.net

[11] Open-iSCSI project (the "client") http://www.open-iscsi.org

[12] bonnie++ hard disk benchmarking sw http://www.coker.com.au/bonnie++/

[13] iSCSI Enterprise Target unofficial packages for Debian http://iscsitarget.sourceforge.net/wiki/index.php/Unoffical_DEBs

[14] A quick guide to iSCSI in Linux (a bit outdated, but still excellent and crystal clear basics) http://www.cuddletech.com/articles/iscsi/index.html

[15] Open-iSCSI readme http://www.open-iscsi.org/docs/README

[16] iSCSI target and initiator setup http://docs.solstice.nl/index.php/Infrastructure_virtualisation_with_Xen_advisory#iSCSI

[17] Open-iSCSI and SuSe Linux http://en.opensuse.org/Open-iSCSI_and_SUSE_Linux

[18] iSCSI brazilian portuguese hands on article http://www.vivaolinux.com.br/artigos/verArtigo.php?codigo=5809

[19] redhat-cluster-source: does not repackage gnbd-kernel http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=401073

[20] kernel images http://www.debian.org/security/2007/dsa-1381

[21] how to recompile linux-modules-extra-2.6 http://www.techforce.com.br/index.php/news/linux_blog/recompile_debian_linux_modules_extra_2_6
