Yenya's World

Tue, 19 Feb 2008

The Cost of Flexibility (and Cleanliness)

In the previously mentioned distributed computing project, I am trying to do something like the following code:

sub parse_file {
    my $fh = shift;
    while (my $parsed_data
            = nontrivial_get_data_from($fh)) {
        handle($parsed_data);
    }
}

The nontrivial_get_data_from($fh) code is indeed non-trivial (in terms of lines of code, not necessarily in terms of CPU time), while handle($parsed_data) is pretty straightforward. The problem is that I want to use this non-trivial code with different handle($parsed_data) routines (for example, one that just prints $parsed_data for testing purposes). A natural way would be to implement an abstract (pure virtual) base class whose parse_file() method calls $self->handle($parsed_data), and which the programmer would subclass to provide different $self->handle() implementations.
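To make the subclassing idea concrete, here is a minimal sketch. The package names Parser::Base and Parser::Collect and the get_data() method are made up for illustration; get_data() is only a stand-in for the real nontrivial_get_data_from() and simply returns one line per call.

```perl
use strict;
use warnings;

# Hypothetical base class: parse_file() drives the loop, handle() is abstract.
package Parser::Base;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Stand-in for nontrivial_get_data_from(): one chomped line per call,
# undef at end of file.
sub get_data {
    my ($self, $fh) = @_;
    my $line = <$fh>;
    return unless defined $line;
    chomp $line;
    return $line;
}

sub parse_file {
    my ($self, $fh) = @_;
    while (defined(my $parsed_data = $self->get_data($fh))) {
        $self->handle($parsed_data);    # resolved through @ISA at run time
    }
    return $self;
}

# Subclasses are expected to override this.
sub handle { die "handle() must be implemented by a subclass\n" }

# Hypothetical subclass that just remembers what it was given.
package Parser::Collect;
our @ISA = ('Parser::Base');

sub handle {
    my ($self, $parsed_data) = @_;
    push @{ $self->{seen} }, $parsed_data;
}

package main;

my $input = "alpha\nbeta\n";
open my $fh, '<', \$input or die "cannot open in-memory file: $!";
my $collector = Parser::Collect->new(seen => []);
$collector->parse_file($fh);
print join(',', @{ $collector->{seen} }), "\n";    # prints "alpha,beta"
```

A testing subclass would override handle() to print $parsed_data instead of collecting it.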

I have found that calling a subclassed method $self->handle() instead of putting the handling code directly into parse_file() makes the run about 14 % slower (the dirty inlined code took 35 seconds on the test data set, while the nice and clean subclassed one took 40 seconds).
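A comparison of this kind can be reproduced in miniature with the core Benchmark module. The sketch below is not the original test: the record list, the Summer class and the trivial "add it to a total" handler are assumptions chosen only to isolate the call overhead, so the ratios it prints will not match the 35 s and 40 s figures above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @records = (1 .. 1000);    # made-up stand-in for the parsed data

# Hypothetical class whose handle() method adds a record to a total.
package Summer;
sub new    { my $class = shift; return bless { total => 0 }, $class }
sub handle { $_[0]{total} += $_[1] }

package main;

# Same work as a plain function taking a reference to the total.
sub handle_plain { ${ $_[0] } += $_[1] }

sub run_inline {
    my $total = 0;
    $total += $_ for @records;                  # handler written in the loop
    return $total;
}

sub run_function {
    my $total = 0;
    handle_plain(\$total, $_) for @records;     # one sub call per record
    return $total;
}

sub run_method {
    my $obj = Summer->new;
    $obj->handle($_) for @records;              # one method call per record
    return $obj->{total};
}

# Run each variant 1000 times and print a comparison table.
cmpthese(1000, {
    inline   => \&run_inline,
    function => \&run_function,
    method   => \&run_method,
});
```

With a handler this small the call overhead dominates, so the relative difference is likely to look much larger than 14 %; the more work the handler does per record, the smaller the relative cost of the call.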

So, my dear Perl gurus, how would you implement this? I need to call different code in the innermost loop of the program, and just factoring it out into a subroutine (or a virtual method) costs me about 14 % of the run time. Maybe some clever eval { } and precompiling different instances of parse_file()? In fact I don't really need the flexibility of objects: I need only a single implementation of handle($parsed_data) in a single program run, but I want to be able to use different handle() code with the same parse_file() code base called from different programs.
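One way the "precompiling different instances" idea could look is sketched below. It uses a string eval (not eval { }) to compile a parse_file() with the handler's source text pasted into the loop body, so no sub call happens per record. The make_parse_file() helper and the package variable @seen are hypothetical, nontrivial_get_data_from() is replaced by a trivial line reader, and the handler source has to come from a trusted place because it is compiled as code.

```perl
use strict;
use warnings;

our @seen;    # package variable the example handler writes to

# Stand-in for the real parser: one chomped line per call, undef at EOF.
sub nontrivial_get_data_from {
    my $fh = shift;
    my $line = <$fh>;
    return unless defined $line;
    chomp $line;
    return $line;
}

# Hypothetical helper: compile a parse_file() whose loop body *is* the
# handler source, so the handler is inlined at compile time.
sub make_parse_file {
    my ($handler_source) = @_;
    my $code = <<"END_CODE";
sub {
    my \$fh = shift;
    while (defined(my \$parsed_data = nontrivial_get_data_from(\$fh))) {
        $handler_source
    }
}
END_CODE
    my $parse_file = eval $code;
    die "cannot compile generated parse_file(): $@" unless $parse_file;
    return $parse_file;
}

# The handler is given as source text that refers to $parsed_data.
my $parse_file = make_parse_file('push @seen, uc $parsed_data;');

my $input = "alpha\nbeta\n";
open my $fh, '<', \$input or die "cannot open in-memory file: $!";
$parse_file->($fh);
print "@seen\n";    # prints "ALPHA BETA"
```

The compilation cost is paid once per program run, which fits the stated need of a single handle() implementation per run. The price is that syntax errors in the handler show up only when the string is compiled, and the handler can no longer be tested as an ordinary sub.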


6 replies for this story:

Adelton wrote:

How about taking a reference to that handle function and passing it to the parsing function as an argument? That way you'd still have the liberty of using different handlers, while avoiding the ISA method resolution. Hmmm. I remember that back in 2000, Andreas K. was solving a similar problem (slow method dispatching) with Apache::HeavyCGI. And of course, there is DBI as an example of a module which avoids class inheritance and method dispatching for speed reasons.
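A minimal sketch of that suggestion, assuming parse_file() grows a second argument for the handler. The line-reading nontrivial_get_data_from() here is only a stand-in for the real one. Note that calling through a code reference is still one sub call per record; it avoids method lookup, not the call itself.

```perl
use strict;
use warnings;

# Stand-in for the real parser: one chomped line per call, undef at EOF.
sub nontrivial_get_data_from {
    my $fh = shift;
    my $line = <$fh>;
    return unless defined $line;
    chomp $line;
    return $line;
}

# parse_file() takes the handler as a code reference.
sub parse_file {
    my ($fh, $handle) = @_;
    while (defined(my $parsed_data = nontrivial_get_data_from($fh))) {
        $handle->($parsed_data);    # direct call, no @ISA lookup
    }
}

my $input = "alpha\nbeta\n";
open my $fh, '<', \$input or die "cannot open in-memory file: $!";

my @lengths;
parse_file($fh, sub { push @lengths, length $_[0] });
print "@lengths\n";    # prints "5 4"
```

A different program would pass a different anonymous sub (or \&some_named_sub) while sharing the same parse_file().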

Yenya wrote: Re: Adelton

As far as I can tell, the problem persists even when I use the handle() code as a static function. Just factoring out the handling code into a separate function instead of writing it directly into parse_file() causes this 14 % slowdown. Object inheritance here does not have measurable overhead compared to a plain function call.

Miroslav Suchý wrote: Param length

How big is $parsed_data? Can you use a reference instead?

$a = 'some data' x 10000;
for (1..1000000) { handle($a); }

This takes 27 seconds, whereas

for (1..1000000) { handle(\$a); }

takes only one second.
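Where the copy comes from depends on how handle() reads its argument, which the comment does not show. The three handlers below are hypothetical variants that illustrate the difference. Perl passes arguments as aliases in @_, so the string is copied only when the handler assigns it to its own variable (and newer perls with copy-on-write strings may defer even that copy).

```perl
use strict;
use warnings;

my $big = 'some data' x 10_000;    # 90 000 characters

# Assigning the argument to a lexical may copy the whole string.
sub handle_copy {
    my ($data) = @_;
    return length $data;
}

# Passing a reference copies only the reference.
sub handle_ref {
    my ($ref) = @_;
    return length $$ref;
}

# Reading $_[0] directly uses the caller's scalar through the @_ alias.
sub handle_alias {
    return length $_[0];
}

print join(' ', handle_copy($big), handle_ref(\$big), handle_alias($big)), "\n";
# prints "90000 90000 90000"
```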

errhm..hmm..yesihavesome wrote: I don't want to troll, but..

FOURTEEN PERCENT? Buy a faster CPU, and you're done.

Yenya wrote:

Well, ignoring 14 % in a single part of the code can easily lead to ten percent here, another ten percent there, and the system would be unbearably slow. And there is an upper limit on how fast a CPU you can buy (not to mention other limits, such as memory bandwidth).

errhm..hmm..yesihavesome wrote: I don't want to troll, but..

Don't worry. 10% here + 10% there = 10% overall.

About:

Yenya's World: Linux and beyond - Yenya's blog.

Jan "Yenya" Kasprzak
