Yenya's World

Fri, 06 Feb 2009

Filesystem Round-trip Time

It is not long since I moved most of my personal machines (home computer, workstation at work, and both laptops) to the ext4 filesystem, which I had already been using for backup partitions for some time. However, development in the filesystem area is really fast these days (and I am not even going into networked filesystems). So, what is next?

If you are thinking BTRFS, you are right. I have grabbed a fresh-from-git version of btrfs-progs (BTW, the release included in the Fedora repository is well behind the kernel, so get the git one), compiled it, and now my backup partition runs BTRFS.

The next task is to figure out how to add another partition to the existing filesystem in a mirror-like way, and how to use the FS should either of the disks/partitions crash.

Section: /computers (RSS feed) | Permanent link | 7 writebacks

Tue, 03 Feb 2009

Client-side Redundancy

Many network protocols have some kind of redundancy built into the client side: for example, DNS can ask the second nameserver from /etc/resolv.conf, should the first one be too slow to reply in time. For LDAP, multiple LDAP servers can be set up in /etc/ldap.conf. The same goes for Kerberos, SMTP, and many others. Nevertheless, I think depending (solely) on client-side redundancy in network protocols should be considered harmful. There are problems with it:

The information about server availability is not shared even within the same computer. Should the first nameserver in resolv.conf die, all programs on the same computer will try to contact it first, wait 5 seconds, and then fall back to the second entry in resolv.conf. This was not a problem 10 years ago, but these days, users are not willing to wait five seconds for every DNS request while you reboot the DNS server for a kernel upgrade.
The problem is much worse when the primary server is "almost" dead. Yesterday our primary LDAP server died in such a strange way that it still accepted TCP connections, but the userland was dead. So all nscd(8) daemons in our network just tried to connect, and when the connection succeeded, did not even attempt to contact the secondary LDAP server. No LDAP replies until the primary server was restarted.
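To make the two failure modes concrete, here is a toy model in Python. It is only a sketch: the `Server` class, the `lookup` function, and the host names are made up for illustration, and it does not reproduce what the resolver or nscd actually do internally. It just counts how long a naive client waits when it walks its server list in order.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    accepts_connection: bool  # does the host still accept the connection?
    replies: bool             # does the daemon actually answer?


def lookup(servers, timeout=5.0):
    """Toy model of naive client-side failover.

    Returns (name of the server that answered or None, seconds spent waiting).
    """
    waited = 0.0
    for server in servers:
        if not server.accepts_connection:
            # Fully dead server: wait out the timeout, then try the next one.
            waited += timeout
            continue
        if not server.replies:
            # Half-dead server: the connection succeeded, so this naive
            # client never moves on to the next entry.
            waited += timeout
            return None, waited
        return server.name, waited
    return None, waited


# Dead primary: every single lookup pays the full timeout before falling back.
print(lookup([Server("ns1", False, False), Server("ns2", True, True)]))
# Half-dead primary: the healthy secondary is never asked.
print(lookup([Server("ldap1", True, False), Server("ldap2", True, True)]))
```

Because nothing is remembered between calls, the first case costs the full timeout on every lookup, and the second case never reaches the working server at all.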

Therefore I think the redundancy for latency-sensitive services such as DNS, Kerberos, or LDAP should be maintained on the server side using things like Heartbeat and a STONITH device. This avoids the "half-dead" server state, and gives the clients a single IP address to talk to. Fortunately, many client-side protocol libraries allow a separate server to be set for write access (such as changing the Kerberos password). So the writes can be redirected to a master server, and reads can be done from a set of two heartbeat-redundant servers.
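Whatever does the monitoring on the server side has to check more than "the port is open", otherwise it would miss exactly the half-dead state described above. A minimal sketch of such a check in Python follows. The function name `service_replies` and the probe bytes are hypothetical, and this is not how Heartbeat itself is configured; a real check would send a valid request in the protocol being monitored (an LDAP search, a DNS query) and look at the answer.

```python
import socket


def service_replies(host, port, probe, timeout=2.0):
    """Return True only if the service sends back at least one byte.

    A successful TCP connect alone is not treated as a sign of life.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(probe)
            # An empty result means the peer closed without answering.
            return len(conn.recv(1)) > 0
    except OSError:
        # Refused, unreachable, or timed out while waiting for the reply.
        return False
```

A server whose kernel still accepts connections but whose daemon is stuck makes this return False after the timeout, which is the signal the failover machinery needs.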

This is what we currently do for DNS and DHCP, and I am thinking about doing the same for LDAP and Kerberos. Client-side redundancy can be an added bonus, but not the primary solution. How do you handle the redundancy of your network services?

Section: /computers (RSS feed) | Permanent link | 0 writebacks
