Tue, 16 Mar 2010
Fedora Mcelog Maintainer Sucks
Welcome to a new episode of the "bashing Fedora bug handling" series; this is your grumpy Fedora user Yenya speaking. I hereby declare that the maintainer of the Fedora mcelog package sucks:
At some point during the Fedora 11 development cycle, the kernel changed the way mcelog events are reported, so the userland package needed an upgrade. The problem was reported in June 2009 and is still unfixed in F11 as of now. In the meantime, the same problem was found in RHEL: reported in November 2009, fixed in January 2010.
Today, after installing a new F12 server, I found another problem, which the Fedora package maintainer has handled in a strikingly similar way: it is still not fixed as of today, even though the corresponding RHEL bug was fixed in September 2009.
Last time I checked, Fedora was supposed to be a bleeding-edge, roll-out-fixes-fast distribution. They are even discussing whether upgrades to new versions of software are feasible inside a single Fedora release. I think such a discussion is premature when some maintainers of packages installed by default are not even able to pull existing RHEL fixes into their packages.
That's all from today's "bashing Fedora bug handling", see you next time!
Tue, 02 Mar 2010
Aisa
In today's mailbox:
Date: Mon, 1 Mar 2010 20:40:32 +0100
From: arpwatch-at-our-domain
Subject: changed ethernet address (aisa.fi.muni.cz)
hostname: aisa.fi.muni.cz
ip address: 147.251.48.1
ethernet address: 0:25:b3:xx:xx:xx
ethernet vendor: Hewlett Packard
old ethernet address: 8:0:69:yy:yy:yy
old ethernet vendor: SILICON GRAPHICS INC.
timestamp: Monday, March 1, 2010 20:39:50 +0100
previous timestamp: Monday, March 1, 2010 20:20:35 +0100
delta: 19 minutes
hw: SGI Origin 2000, irix6.5
Aisa has been running on SGI Origin 2000 hardware with an Origin Vault external disk box since 2000, and was the last non-Linux UNIX server in faculty-wide use (there may be a few OpenSolaris and other UNIX boxes here and there, used inside some laboratory, but that's all).
The hardware is still relatively up-to-date even in today's terms: 8 CPUs (at 350 MHz), 2 GB RAM, 100 Mbit Ethernet, etc. It has been extremely stable: e.g. no disk has crashed in the last few years (actually, I don't remember any disk crash in Aisa at all, but my memory might be fading, especially concerning the first few years of production use of this server).
Wed, 24 Feb 2010
Image Scaling
It does not happen very often that a software bug is present in nearly all implementations, even ones that do not share a common ancestor in terms of source code.
The image scaling bug is one of these exceptions. I wonder how many programs simply assume that the luminosity of a pixel created by combining two pixels with luminosities of 0 and 255 (e.g. by downscaling the image) is somewhere around 128.
There are definitely several programs written by yours truly that are built around this assumption, although I remember reading the NetPBM source code, seeing those odd calculations using a lookup table, and wondering why they did not simply use the arithmetic mean.
I even think (but my memory is fading, so no strong statement here) that we used the arithmetic mean even in the computer graphics course during my studies.
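To make the effect concrete, here is a minimal sketch in C. It assumes a plain power-law gamma of 2.0 as a rough, integer-only stand-in for the real sRGB transfer curve (which is piecewise, with an exponent of 2.4). The function names are hypothetical and are not taken from NetPBM or any other program.

```c
/* Naive average of two 8-bit pixel values: treats the stored values as
 * if they were proportional to light intensity. */
unsigned naive_average(unsigned a, unsigned b)
{
    return (a + b) / 2;
}

/* Integer square root rounded to the nearest integer; n is at most
 * 255 * 255 here, so the simple loop runs at most 255 times. */
static unsigned isqrt_round(unsigned n)
{
    unsigned r = 0;

    while ((r + 1) * (r + 1) <= n)
        r++;
    /* (r + 0.5)^2 = r^2 + r + 0.25, so round up when n exceeds r^2 + r. */
    if (n - r * r > r)
        r++;
    return r;
}

/* Gamma-aware average, ASSUMING gamma = 2.0: decode both values to
 * (approximately) linear light by squaring, average there, and encode
 * the result back with a square root. */
unsigned gamma2_average(unsigned a, unsigned b)
{
    unsigned linear = (a * a + b * b) / 2;

    return isqrt_round(linear);
}
```

Under these assumptions naive_average(0, 255) gives 127, while gamma2_average(0, 255) gives 180; the exact sRGB curve would put the result somewhat higher still.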
Wed, 10 Feb 2010
Playing with 6to4
We have finally got some time to work on native IPv6 inside the faculty network (which includes rewriting the iptables configuration to be protocol-neutral). In order to test it, I have enabled 6to4 at home.
So now I have IPv6 in my home network, and I can even SSH directly to devices in my home network from the university network, even though the home network is hidden behind a single IPv4 address. Apparently my traffic is routed symmetrically, as both directions use the same 6to4 relay at ip-exchange.de in Nuernberg.
As for the network parameters, the direct IPv4 ping is 13.2 ms, while the ping6 is 27.1 ms. The transfer rate, on the other hand, is limited purely by my ISP (measured by SCPing a large file), and it is the same for both protocols: slightly above 500 KB/s. Now if only I had a nearer 6to4 relay (maybe in NIX.CZ?).
The setup in Fedora is relatively straightforward, except when the outgoing interface has an IPv4 address assigned from DHCP. So I had to add the IPv6 configuration manually, and will have to change it whenever I get a new IPv4 address (which usually happens once every year or two).
Update - Wed, 10 Feb 2010: Fedora problems fixed
My previous statement about problems in Fedora was not true. I must have made a mistake somewhere, but after rechecking my setup and restarting the network, the 6to4 tunnel works as expected.
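As background on why a 6to4 setup is tied to the public IPv4 address: a 6to4 site prefix is 2002::/16 followed by the 32 bits of that IPv4 address, which gives a /48 (RFC 3056). The sketch below derives the prefix in C. The struct and function names are hypothetical, and 192.0.2.1 in the note after the code is a documentation address, not one from the setup described above.

```c
#include <stdio.h>

/* The two variable 16-bit groups of a 6to4 prefix: 2002:hi:lo::/48. */
struct sixto4_prefix {
    unsigned hi;   /* first two octets of the IPv4 address */
    unsigned lo;   /* last two octets of the IPv4 address */
};

/* Parse a dotted-quad IPv4 address and derive its 6to4 prefix.
 * Returns 0 on success, -1 if the input is not a valid dotted quad. */
int sixto4_from_ipv4(const char *ipv4, struct sixto4_prefix *out)
{
    unsigned a, b, c, d;
    char trailing;

    /* Exactly four numbers must match; a fifth conversion means that
     * junk follows the address. */
    if (sscanf(ipv4, "%u.%u.%u.%u%c", &a, &b, &c, &d, &trailing) != 4)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return -1;

    out->hi = (a << 8) | b;
    out->lo = (c << 8) | d;
    return 0;
}

/* Format the prefix as text, e.g. for an interface configuration file.
 * Returns 0 on success, -1 if the buffer is too small. */
int sixto4_format(const struct sixto4_prefix *p, char *buf, size_t len)
{
    int n = snprintf(buf, len, "2002:%x:%x::/48", p->hi, p->lo);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For 192.0.2.1 this yields 2002:c000:201::/48, so a changed IPv4 address means a changed IPv6 prefix as well.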
Sat, 06 Feb 2010
iPhone
Tell me again why anybody should bother to buy an iPhone, when Apple is actively hostile to application developers, the device cannot do Bluetooth, does not keep audio tracks in a portable format accessible as mass storage, and the manufacturer does not want you to customize the device?
Fri, 05 Feb 2010
DHCP Relay
With our new WiFi access points, we run multiple SSIDs, each on its own VLAN. Unfortunately, we are observing an interesting problem: even though the DHCP server can see the WiFi client requesting the address and replies back, the reply sometimes does not reach the client. It vaguely depends on the number of clients on the particular WiFi AP. Today I think I have found what causes this problem.
In an unrelated event, our new 10GbE switches have arrived, and I have been configuring them. While browsing the manuals, I noticed they have a feature called "DHCP relay", which causes DHCP requests to be magically switched to the primary VLAN of the switch, even when they are received on another VLAN. Well, one might think that almost nobody needs this obscure and unintuitive feature, so it would be expected to be switched off by default.
Apparently the HP engineers do not share this point of view. Not only is DHCP relay on by default on all newer ProCurve switches, but as a default setting it is not mentioned in the "show config" command output unless explicitly disabled. So nobody knows a new "feature" has been added, short of re-reading the manuals for every new ProCurve model.
I have found the DHCP relay feature enabled on other recently purchased switches as well. I have promptly disabled it, and we will see what happens with the above problem of missing DHCP replies.
Fri, 20 Nov 2009
Fedora 12
I have been using Fedora 12 on my laptop for a week now, and on my primary workstation for three days. So far I have walked through Bugzilla and checked that most of my bugs are still present in F12. But apart from that, there has not been any unpleasant surprise so far. The new KMS code and X server for Radeon cards work as expected, so I am looking forward to installing F12 on my dual-seat workstation at home as well. So far it is OK. Well, except ...
... except this bug (covered also in fedora-devel and also at Slashdot).
I wonder who could seriously have thought this feature would be an improvement? If Anaconda asked whether this is a single-user workstation, and enabled it only then, it would probably be bearable. But having it on by default is simply insane. The fact that disabling it requires creating a file of 6+ lines in an undocumented format, in a directory four levels deep under /var/lib, just underlines the gross insanity of the whole thing.
I have long been a supporter of using Fedora for purposes other than a single-user workstation, but it seems that some Fedora maintainers either do not care (see my post about GDM), or - what is worse - are actively trying to undermine the other uses of Fedora.
We have been considering returning Fedora to some of our computer labs (to solve some problems with that African-Debian distro), but with this problem I am no longer sure whether this is a good thing to do.
Update - Fri, 20 Nov 2009: Resolved after all
From fedora-devel:
[...] Executive summary
We'll make an update to the F12 PackageKit, so that the root password is required to install packages.
Glad to see this being resolved relatively fast. This was the most voted-for bug in the Fedora bugzilla (by a factor of almost 10).
Thu, 19 Nov 2009
Database Woes
Keeping one's data in an SQL database gives an excellent environment: it maintains data integrity, provides transactional behaviour, provides remote access to the data, and so on. Even the locking properties are something one can get used to. That is, in an ideal world.
However, our world is not ideal. The huge problem of SQL databases is their implementation. For example, after the IS MU mailserver back-end was rewritten to deliver in parallel, it started to generate big load spikes on the Oracle DB server. The problem turned out to be the cache of SQL queries: when several processes tried to do exactly the same query in parallel, the DB server locked up on the access to the SQL cache, and a simple "select row by its primary key" query took as long as three minutes to handle.
Another example is the Oracle problem with foreign key locking which I have recently run into: I have long transactions running in parallel, modifying various rows of a single table (but each session touches a different set of rows, so the access should be deadlock-free). After creating another table with a foreign key to the original one, I started to get "deadlock detected" errors in DELETE commands. Apparently Oracle locks not only the appropriate row in the foreign-key table, but the whole block in this table. So I have been getting deadlocks when trying to delete the row with primary key N while another session added a row to the table with a foreign key referencing primary key N+1 or N-1. Replacing DELETE with UPDATE ... SET status='deleted' and deleting afterwards from a single session fixed the problem for me.
SQL databases are a pile of rubbish, which can always surprise you not only with their by-definition properties, but often also with their implementation-dependent behaviour. Oracle is an excellent example of this.
Wed, 18 Nov 2009
The GDM Fiasco
A short trip into history: for GNOME 2.22 (two years ago, in the Fedora 9 timeframe) someone decided that it would be nice to completely rewrite the GNOME display manager. So far so good, but they decided to include this partially rewritten piece of crap, lacking many important features (a display manager without XDMCP, WTF?), in the official GNOME and thus Fedora releases.
Fast forward to the present time: basically, for two years, GDM has not been usable for anything beyond a single-user desktop (I use xdm on my home dual-seat desktop, and we have replaced Fedora altogether in some of our computer labs partly because of GDM).
- It did not handle XDMCP (at least this one got fixed).
- There is still no way of setting the X server command line, making GDM unusable in multiseat configurations.
- It cannot be configured as an XDMCP-only daemon without starting the local X server.
- The login window cannot be configured, and the way it works is usable on a personal desktop, but definitely not in a computer lab with ~2200 accounts and users logging in on random hosts.
Apparently, somebody has started to work on solving at least some of the problems after all. But guess what? Instead of backing off quickly (say, before Fedora 10 was released), Fedora maintainers have ignored the problem despite many polite and even some profane requests to provide an upgrade to the latest working version (i.e. the Fedora 8 one). And now the answer is "wait for Fedora 13 (another half a year), we are probably going to fix it there". Without any hint of being sorry for forcing an utterly broken package on the users for two years and counting.
Tue, 20 Oct 2009
Framework?
When teaching, questions from the audience provide important feedback to me - a notion of whether I was successful in passing the information to the audience, and what to improve or explain in a different way. There are, however, rare occasions when the question just makes me think "WTF?".
Yesterday I tried to explain the setjmp(3)/longjmp(3) semantics. These two functions are not straightforward, and it probably takes a while to wrap one's mind around them. But after that, the usage is quite simple: the target of the non-local jump is first initialized using setjmp(3), and later the jump itself can be made using longjmp(3). I have written the following code snippet to demonstrate it:
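A minimal sketch of that pattern in C (an illustration, not necessarily the snippet shown in the lecture); the names recovery_point, worker and run_with_recovery are hypothetical:

```c
#include <setjmp.h>
#include <stdio.h>

/* The jump target: setjmp() fills it in, longjmp() jumps back to it. */
static jmp_buf recovery_point;

/* Hypothetical worker somewhere deeper in the call stack. */
static void worker(int fail)
{
    if (fail)
        longjmp(recovery_point, 1);   /* does not return */
    puts("worker finished normally");
}

/* Returns 0 if worker() returned normally, 1 if it bailed out
 * through longjmp(). */
int run_with_recovery(int fail)
{
    /* setjmp() returns 0 when called directly, and the (non-zero)
     * value passed to longjmp() when control arrives via the jump. */
    if (setjmp(recovery_point) != 0) {
        puts("back at the recovery point via longjmp()");
        return 1;
    }

    worker(fail);
    return 0;
}
```

Calling run_with_recovery(0) returns 0 after the worker finishes; run_with_recovery(1) returns 1, because control re-enters the function at the setjmp() call.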
During the lecture, when I asked whether there were any questions, the question was: "But is there any framework for those functions?".
I was totally puzzled: I probably don't know all the meanings of the English word "framework", but I think it means something like a higher-level abstraction or environment to wrap the lower-level things in order to make them simpler to use (often at a cost of freedom of how to do things). But can this fancy goto be made even simpler than it is? It would still be necessary to declare the label somehow (setjmp(3)) and then jump to it (longjmp(3)).
WTF? What framework?
Fri, 16 Oct 2009
Terminal Font
Today I have read an announcement of the Anonymous Pro font, which should be optimized for text terminals and for the programming environment. As this clearly matches my use case, I have decided to try it.
I was soooo disappointed. I may be too used to the font I use (Lucida Typewriter, the upper part of the image), but I think Anonymous Pro is clearly worse.
Which terminal font do you use? And how does it compare to Anonymous Pro or some other fonts?
Fri, 26 Jun 2009
XXXIV EurOpen.CZ
A month ago (wow, I am really slow to update this blog ...) I went to the 34th EurOpen.CZ conference. I did a presentation about Git (paper, slides).
The first day the weather was pretty good and we even went to the Praděd summit to take a few snapshots of the setting sun. The next day was rainy, so I set up my dance pads and we had a lot of fun playing DDR in the evening. One of the most interesting things there was Microsoft Surface, which is in fact an overweight PDA (it weighs about 90 kg). It is pretty addictive, especially for children. Microsoft can really make pretty cool hardware. However, it is in some way a bit, well, Microsoftish :-). For example, they apparently invented their own 2D barcodes, ignoring well-established standards like QR code or Semacode. Also, apparently an external keyboard and mouse are required to boot the Surface.
There were many interesting presentations: Tomáš Košnar talked about logging the network traffic, Ondřej Surý had a brief intro to DNSSEC, Václav Pergl from Kerio Technologies talked about agile development, etc. Anyway, AbcLinuxu.cz did a report from the conference, and also (oh, horror!) an interview with me, mostly focused on Git.
Wed, 17 Jun 2009
How Do I Install ... ?
The funniest page of the day is the page with installation instructions for OpenAIS. But seriously, do you have experience with those clustering suites?
My task is pretty simple: use a clustered LVM from two hosts. I have been using Heartbeat for my HA clusters (and IPVS for load balancing) for ages, but apparently a Heartbeat-based cluster cannot be easily used for CLVM.
Wed, 10 Jun 2009
Fedora 11
OK, after another half a year we now have a new Fedora. I have installed it on my laptop, and found no obvious bugs. It "just works". I haven't got time to read the Release Notes yet, but so far F11 looks good. The minor issues are:
- gnome-terminal now wants to confirm the window closing when something other than the original shell is running inside it. WTF? I guess users are getting more and more stupid and have to be protected from their own stupidity (or maybe Fedora/GNOME is becoming more and more widespread). The only usability problem with this "enhancement" is that it cannot be disabled from the Preferences menu - one has to resort to using gconf-editor, which is not even installed by default.
- Cpufreq on ASUS F3E still does not work (but then, I use my own kernel on F3E and in vanilla it works).
- A nice surprise is that with F11 and its new X.org + WINE, the "In the Groove" game finally renders correctly (in the previous version all the arrows were rendered orange, which made the gameplay more confusing).
- The new boot splash screen is not as minimalistic as the one from F10 (which I consider to be the best boot splash screen ever; its neat visual trick with colors in the progress bar made the boot process feel faster).
- GRUB still does not support ext4, which means that even those who want a full ext4-based system need to have a small ext3 partition for /boot.
Of course, I now have tens of mails from the bugzilla bot forewarning about the closing of bugs filed against F9, which I need to test on F11. I guess many of the gdm >2.20 regressions are still not fixed, and returning 2.20 to Fedora (at least as an optional package) is long overdue.
Thu, 21 May 2009
Weird Hardware
Probably the weirdest piece of hardware I have seen in a while is this. I should probably get one in order to do maintenance of my COSA Linux driver on real hardware.

