Tue, 29 Jun 2010
I had always had the impression that Python's capabilities were more or less similar to Perl's (minus CPAN, DBI, and one-liners). At least this is what Python's proponents are trying to make us believe. Apparently, this is not the case. They are still struggling to make the Python interpreter really multithreaded, while I have been running threaded computations in Perl for at least two or three years now.
Threads in Perl are really simple, and the design - shared nothing unless explicitly declared as such - really helps to write clean and fast programs. Last week I wrote a massive text-processing application (many individual documents, with the results summed up at the end) on our 24-core server, and I observed almost linear scaling as the number of CPUs in use increased.
The case of Python and UTF-8 is similar. Python of course supports it now, but its ease of use is still far behind Perl's. Another case is FreeBSD: for a long time its proponents have tried to make us believe it is faster and more scalable than Linux (their famous ftp.cdrom.com). But apparently this is not the case and probably never has been, at least in terms of SMP support.
I wonder why it is so easy to believe other people's claims, especially when they are experts in a given subject (Python, FreeBSD, etc.).
4 replies for this story:
disorder wrote: perl threads
Since you probably know more about the topic, can you comment on Perl threads?
http://justin.harmonize.fm/index.php/2008/09/threading-model-overview/#perl
http://search.cpan.org/~jesse/perl-5.12.1/pod/perlthrtut.pod#Performance_considerations
http://search.cpan.org/~jesse/perl-5.12.1/pod/perlthrtut.pod#Process-scope_Changes
http://search.cpan.org/~jesse/perl-5.12.1/pod/perlthrtut.pod#Thread-Safety_of_System_Libraries
It seems that those are more native than Python/Ruby, but not without problems.
Yenya wrote: Re: perl threads
I have read the first two links you have posted, and I think they pretty accurately describe the situation: Perl threads are fast, except for creating new threads, which can have a significant overhead (as opposed to clone(2)). As for the system libraries - well, if the underlying library is not thread safe, one probably should not expect the calling interpreter to make it thread safe by some magic. But all these properties are the properties I write code for anyway, so I don't think they are limiting for my code. And on the other hand, they allow me to easily use all available cores without worrying too much about the synchronization issues.
Matěj Cepl wrote:
Yes, Python threads suck for multi-threaded computation ... they are still quite useful for I/O operations. However, more interesting is your comment on the missing DBI ... what's missing from DB-API 2.0 (http://www.python.org/dev/peps/pep-0249/)? AFAIK it is supported for all possible databases.
Yenya wrote: DB-API
Well, I have not been following the Python world lately, but after a quick look at the spec (thanks for the link!): 1. having to deal with cursor objects is IMHO ugly; 2. the prepare() method is missing (and executemany() cannot fully replace it); and 3. the global "paramstyle" option is _really_ ugly. But apart from that, it seems that DB-API is a step in the right direction.
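For readers who have not seen DB-API 2.0, the three points above can be sketched roughly as follows. This is only an illustration using Python's standard sqlite3 module with an in-memory database; the table name "docs", its columns, the sample rows and the function name demo() are all made up for the example and do not come from the discussion above.

```python
import sqlite3


def demo():
    # In-memory database, so the sketch needs no external server or file.
    conn = sqlite3.connect(":memory:")

    # Point 1: statements are executed through a cursor object.
    cur = conn.cursor()
    cur.execute("CREATE TABLE docs (name TEXT, words INTEGER)")

    # Point 2: there is no prepare(); executemany() runs one statement
    # against a sequence of parameter tuples instead.
    rows = [("a.txt", 120), ("b.txt", 80), ("c.txt", 200)]  # hypothetical data
    cur.executemany("INSERT INTO docs (name, words) VALUES (?, ?)", rows)
    conn.commit()

    # Point 3: the placeholder syntax is a module-level setting ("paramstyle").
    # sqlite3 uses "qmark", hence the "?" placeholders here; other drivers
    # may declare a different style.
    cur.execute("SELECT SUM(words) FROM docs WHERE words > ?", (100,))
    total = cur.fetchone()[0]

    conn.close()
    return sqlite3.paramstyle, total


if __name__ == "__main__":
    print(demo())
```

Because paramstyle belongs to the driver module rather than to the connection or the statement, the same SQL text may need different placeholders when moved to another database driver, which is presumably what point 3 objects to.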