Wed, 15 Nov 2006
Hunting Ghosts
Today I worked on synchronizing filesystems on some of our high-availability
systems. We use a custom-made rsync-based setup for checking for differences
between filesystems in a cluster.
One of the hosts in an H-A pair had been down for a while because of faulty
hardware, so I had to manually check whether the changes on the active system
could be propagated to the backup as well. I synchronized the filesystems
and switched the load to the newly plugged-in host (because it is faster
than the other one). Just to be sure, I re-ran the checks,
and was surprised: some files were now different on the new host.
What was worse, the set of files which were different was a bit suspicious:
bash, login, tcpdump, some other utils and libraries, including those which
are run every time the system boots (such as heartbeat and its libraries).
I ran "rpm -V", just to be sure the files were different from those recorded
in the RPM database, but it reported that all files were OK and matched
the database. I took the clean RPMs from the FTP file repository,
and the files in question were shorter in the package than on my filesystem.
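The same size comparison can be made against the RPM database instead of a downloaded package. Here is a minimal sketch, assuming a POSIX shell; report_size_mismatches is a hypothetical helper name invented for this example, and the commented-out query relies on rpm's FILESIZES and FILENAMES query tags.

```shell
#!/bin/sh
# Hypothetical helper (not part of rpm): read "SIZE PATH" lines on stdin and
# print every regular file whose on-disk size differs from the recorded size.
report_size_mismatches() {
    while read -r recorded path; do
        # Skip symlinks and anything that is not a regular file.
        [ -L "$path" ] && continue
        [ -f "$path" ] || continue
        # wc -c may pad its output with spaces; $(( )) normalizes it.
        actual=$(($(wc -c < "$path")))
        if [ "$actual" -ne "$recorded" ]; then
            echo "$path: recorded $recorded, on disk $actual"
        fi
    done
}

# Usage on an RPM-based host (commented out so the sketch has no side effects):
# rpm -q --qf '[%{FILESIZES} %{FILENAMES}\n]' bash | report_size_mismatches
```

Any line it prints is only a hint that the file changed after installation; it says nothing about why.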
I thought: are current rootkits so smart that they modify the RPM database,
and so stupid that "ls -l" can still tell the difference?
"rpm -qlv bash|grep /bin/bash" displayed a different size in the RPM database
than the file itself had, yet "rpm -V bash" said the package was perfectly OK.
Strange. So I suspected the rpm program had been modified
as well (even though it did not show up in the list of modified files).
To prove this, I used strace. On a clean system its output
was shorter, and the difference was that on a modified system
rpm spawned some more threads/processes.
"strace -f" then showed the guilty party: the rpm command executed prelink
on each modified binary.
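To spot such helper programs quickly, the strace output can be reduced to the programs passed to execve(). Here is a minimal sketch, assuming strace's usual `execve("path", ...)` line format; list_execs is a hypothetical helper name, not part of strace or rpm.

```shell
#!/bin/sh
# Hypothetical helper: read strace output on stdin and print each distinct
# program path passed to execve(), one per line (failed attempts included).
list_execs() {
    sed -n 's/.*execve("\([^"]*\)".*/\1/p' | sort -u
}

# Usage (commented out; needs strace and rpm). strace writes its trace to
# stderr, hence the redirection:
# strace -f -e trace=execve rpm -V bash 2>&1 | list_execs
```

Comparing this list between a clean and a "modified" host shows which extra programs were started, without reading the whole trace.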
So I have been hunting ghosts all the time: the files in question had simply
not been prelinked yet, or the prelinking info had been overwritten (or not
overwritten, I don't know) by my synchronization scripts.
After running "/etc/cron.daily/prelink" on
the "modified" system, both filesystems look the same. Problem solved.
For a long time I wondered
how prelinking can be done without modifying the binary (and thus breaking
the packaging system). The answer for rpm appears to be:
the package manager needs to know about prelinking as well. I have to find
some time to read Jakub's prelink paper (PDF).
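One plausible reading of the strace output is that rpm asks prelink for the original, un-prelinked content of each file and checksums that instead of the bytes on disk. Here is a minimal sketch of the same idea done by hand, assuming "prelink -y" (--verify) writes the un-prelinked file to standard output; original_md5 and the UNDO_CMD variable are hypothetical names invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the MD5 of FILE as it was before prelinking.
# UNDO_CMD is a knob invented for this sketch; by default it runs
# "prelink -y", which is assumed to write the un-prelinked content of a
# single file to standard output.
original_md5() {
    ${UNDO_CMD:-prelink -y} "$1" | md5sum | cut -d' ' -f1
}

# Usage (commented out; needs prelink and rpm). If the assumption holds, the
# first digest equals the one recorded in the RPM database for /bin/bash:
# original_md5 /bin/bash
# rpm -q --qf '[%{FILEMD5S} %{FILENAMES}\n]' bash | grep ' /bin/bash$'
```

If the undo command fails, the helper prints the digest of empty input, so a mismatch should be checked by running the undo command on its own first.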
Back to serious work now.
3 replies for this story:
Vasek Stodulka wrote:
Yes, prelink. It should be written at the top of the rpm man page (maybe also bold and red) that prelink modifies binaries and RPM knows about it. Most people find differences and then hunt ghosts - just like you. I only wonder how a guru like you did not know this. :-)
Yenya wrote: Too much knowledge
I think hunting ghosts is not bad per se, provided that I find the right answer in the end. It is probably that I know about many things that can go wrong, so it takes time to find the right one. Last week one of my students asked me to find out why he could not log in via KDM any more after installing some completely unrelated package. I traced the X startup scripts and so on, and it took me at least a quarter of an hour before I ran "df /" and discovered that his root filesystem was full...
Peter Kruty wrote: knowledge
Right, there is so much to know about a live Linux system that it sometimes takes time to find the right source of a problem.