Yenya's World

Fri, 30 Jan 2009

"Too late for -CS" Howto

Speaking of portability: I recently came across a problem with the newest Perl. We start most of our scripts with the -CS or -CSD switch on the shebang line. Since perl-5.10 this no longer works; it fails with the error message Too late for "-CS" option. While I don't understand what led the Perl developers to this incompatible change, here is the workaround:

The -CS switch can be substituted with the following code at the beginning of the script:

use open ':std', ':utf8';
use open IO => ':bytes';

and the -CSD switch can be replaced with just

use open ':std', ':utf8';

Moral of the story: the most portable languages (and language features) are those which are sufficiently old (which Unicode support in Perl, or the STL in C++, is not). Apart from this problem, Perl still seems to be a relatively portable language, even for large projects such as IS MU.

Section: /computers (RSS feed) | Permanent link | 3 writebacks

3 replies for this story:

adelton wrote:

$ perl -CS -e 'use utf8; print "křížala\n";'
křížala
$ perl -v

This is perl, v5.10.0 built for x86_64-linux-thread-multi

Copyright 1987-2007, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on this system using "man perl" or "perldoc perl". If you have access to the Internet, point your browser at http://www.perl.org/, the Perl Home Page.

What am I missing?

adelton wrote:

Hmm. The above thing really had newlines in the textarea. Couldn't you retain them?

Yenya wrote: Re: adelton

Sorry about the newlines, I will try to be more liberal about what I allow in comments. The -CS does not work on the shebang line inside the script, but it works on the real command line. See perldiag(1).


Thu, 29 Jan 2009

Questionable Content

When browsing the Web, one can often find content which is - how to put it - questionable. I recently found an excellent example of this :-)

Questionable Content is a web comic written by Jeph Jacques. It is not only interesting and funny, but it also contains humanoid robots, lots of rock music references (not that I get most of them), occasional anime references, and the highest proportion of characters with mental disorders of all the web comic strips. The only annoying part is the occasional filler strips and guest strips. But on the other hand, it actually gets released five times a week (unlike the Order of the Stick these days).

For a light intro, you can read for example the story arc beginning with strip #1322 from a week ago. Or start from the beginning. Favourite character? Hannelore with her obsessive-compulsive disorder, of course. Go forth and waste your time :-)

Section: /world (RSS feed) | Permanent link | 0 writebacks

0 replies for this story:


Wed, 28 Jan 2009

C++ Woes

I wonder why even these days people start new open source projects in C++. C++ is - as far as I know - by far the least portable compiled language. I would understand using C++ in big proprietary projects, where everything including the compile-time environment is fixed. But for open source projects, where compiler versions, header file features, etc. vary greatly? No way.

The project I hate today is named IMMS. While trying to compile it, I had to edit 11 different files, adding #includes all over the code. I wonder how it could have been buildable anywhere at all - prototypes/definitions were missing for things like memcpy(), INT_MAX, abs(), etc.

I cannot imagine how it was even possible for these symbols to end up defined in the author's build environment. One of C++'s many problems is that it is very strict about missing prototypes, but in turn does nothing to prevent namespace pollution, i.e. symbols being defined "accidentally", which leaves the project unbuildable elsewhere. I ran into the same problem a week or so ago when trying to compile a few-months-old version of Valknut.

Recommended reading: Linus' response to a question about why Git is not written in C++. Moral: stay away from C++, or your projects will end up unbuildable after only a few months.

Section: /computers (RSS feed) | Permanent link | 14 writebacks

14 replies for this story:

Hynek (Pichi) Vychodil wrote: Which one?

I wonder which other compiled language you would recommend? ANSI C, Haskell, Forth, Java, Erlang ;-) ??? I agree with you that C++ is one of the worst languages to choose for an open source project, and not only for portability reasons. But which one? D, Objective C, CL, Clojure, Scala, Qi II, Oz, Perl ;-) ?

Yenya wrote: Re: Which one?

I would go with plain old C among compiled languages, or Perl among scripting languages. They are both mature languages with a wide range of available libraries. Of course the surrounding environment matters too - for an OpenStep project, Objective-C is definitely a valid choice, for example. I would not use any "niche" language like Haskell or Forth, and I hope my opinion on Java and Python is widely known :-)

Milan Zamazal wrote:

There is no single universal all-purpose programming language. Just choose a fitting language, or a combination of languages, for each particular project. "Fitting" means one of those that you like, that you understand, that is suitable for the problem you are trying to solve, that allows writing efficient and maintainable code efficiently, that is known to your coworkers, that provides useful libraries and development tools, etc. There is no inherent reason to avoid niche languages as long as a good free compiler and the necessary set of libraries are available. C, Haskell, Forth, Java, Erlang, Objective C, CL, and C++ can all be good languages to choose from. I use several very good applications written in C++, although I wouldn't choose C++ for any of my own projects.

michal wrote: portability

IMHO C++ is a quite portable language, actually. But there is no abstraction layer between the language and the OS (platform), so it is the programmer's job to keep his code portable. He has to know what he can and cannot do on the target platforms. Also, support for namespaces is very good in C++ - guess what std::cout does? And the conclusion? Programmers should use their tools wisely :-)

Yenya wrote: Re: portability

"In theory, there is no difference between theory and practice. But in practice, there is." Yes, it is the programmers' job to keep the code portable. But in fact C++ makes it very hard. How can a programmer know (or even notice) that he uses memcpy() without #including string.h first, when in his own environment string.h is #included from some other system header, and thus everything works for him? C++ is _not_ portable (in practice, I mean).

avakar wrote:

Well, LT's response regarding the quality of C++ programmers is (sadly) pretty accurate. An incompetent programmer is just more likely to choose C++ over C. That's also the reason why C++ projects tend to be crappier than C projects on average. As with everything, it comes down to the quality of the contributors. That C++ may be less suited for open source development is certainly not caused by it being "a horrible language". Admittedly, it is much more difficult and time-consuming to learn to use it correctly (emphasis on correctly). Regarding IMMS, the real problem isn't the missing prototype for memcpy, it's the very use of memcpy in a C++ project. Regarding namespace pollution, C++ provides namespaces to deal with that. In fact, C suffers from exactly the same defect (even worse, since C does not provide namespaces at all). Regarding "varying header file features": those have been fixed for more than 10 years now. The fault is again on the shoulders of programmers whose code uses toolset-dependent features.

avakar wrote: Re: portability

Let me ask you in return: How can a C programmer know (or even notice) that he uses memcpy() in his C project without #including string.h first, when in his own environment string.h is #included from some other system file, and thus everything works for him?

Yenya wrote: Re: avakar

C does not have this problem, because in C, prototypes are optional, so a C compiler would never barf on a missing prototype.

GM wrote:

In my opinion, C++ has had a very poor concept from the beginning, and all its features are the proof (look at templates: you wouldn't be able to work with collections because of C++'s broken design). However, the Trolltech guys have at least shown me with their Qt that it is possible to write good and portable applications even in C++. But I would also never say that Perl is a good scripting language. I see the same lack of design as with C++: so many features, just for the sake of the features themselves (as in C++). The portability of Perl programs is very similar to that of C++ in my eyes; I have struggled hard to get a Perl script running on different systems. Another point: anything that could be called "scripting" is *not* portable at all (web vs. system, one Unix vs. another Unix, etc.). The only advantages of Perl are these two: 1) it is a real programming language, as opposed to the sh/awk/sed combo, and 2) you can write good one-liners, as opposed to other real programming languages. Neither of these is a good reason for such a weird design. Well, everybody has their own candidate for the most shitty popular language ;-)

avakar wrote: Re: missing prototypes

[... in C, prototypes are optional.] Fair enough. Missing prototypes are, however, a disaster waiting to happen (fabs(1)).

Hynek (Pichi) Vychodil wrote: Re: Re: Which one?

I don't know if your opinion on Java and Python is widely known, but it wasn't known to me. Given the way you wrote it, I think it is very similar to mine ;-)

Hynek (Pichi) Vychodil wrote:

@GM: Perl programs are far more portable than C++ ones. I have never run into a memcpy()-like problem. You should just keep to the modules which are in the standard Perl distribution, do your file path manipulation with File::Spec, and a few other things. See perlport - Writing portable Perl: "Be aware of two important points: Not all Perl programs have to be portable. There is no reason you should not use Perl as a language to glue Unix tools together, or to prototype a Macintosh application, or to manage the Windows registry. If it makes no sense to aim for portability for one reason or another in a given program, then don't bother. Nearly all of Perl already is portable. Don't be fooled into thinking that it is hard to create portable Perl code. It isn't. Perl tries its level-best to bridge the gaps between what's available on different platforms, and all the means available to use those features. Thus almost all Perl code runs on any machine without modification. But there are some significant issues in writing portable code, and this document is entirely about those issues."

Sten wrote:

It is not C++'s fault that the programmer of IMMS does not know how to program in it, since there is no such thing as memcpy or INT_MAX in (strict) C++. Namespace pollution can be avoided by using anonymous namespaces - again, it is the programmer's fault not to do so, just as forgetting to write "static" would cause namespace pollution in C.

Yenya wrote:

The point is that with C, the program would compile and run without #including the proper file (or after forgetting "static"), while in C++ it does not even compile. And there is no way for the programmer to know that the program compiles on his own machine only accidentally.


Wed, 14 Jan 2009

Git

I have finally got my feet wet with git. I had played with it before, but now I have done some real work using it. There are plenty of articles comparing git to other SCM systems, mostly pointing out the advantages of a distributed SCM. I will not repeat them here (although that is also an important point); instead, I will write about what I find most appealing about git specifically:

Branches
Unlike Mercurial, branches are first-class citizens in git. With hg, you are often better off cloning the repository than undergoing the pain of using hg branches.
Readability of changesets
git makes it really easy to make your development work readable by others, i.e. it allows the flow of changes to be neatly split into logical pieces which can then be read, reviewed, and individually applied by others. This includes not only git-format-patch(1), which takes the history and turns each commit into a separate RFC822-like file which you can mail to somebody else (with git-am(1) on the other side), but also git commit --amend, which allows you to edit previously committed data. So as long as you do not publish the repository, you can rewrite the history as you like.
Hot-fixes
... AKA git-stash(1): you can postpone all your work just to commit a simple fix to something else, and then switch back to your long-term work.
It is fast!
... although I have yet to find out when to call git gc and when/what to repack to be even more efficient.
No special server-side software to publish your changes
(not git-specific) The repository is a directory structure which can be published e.g. via HTTP by any HTTP server. So when I wanted to contribute to some project, I just cloned its repository, committed my fixes, cloned it again into my public WWW space, and then mailed the developer: "please pull from this URL and let me know what you think". Zero-cost public repository setup (remember how long it took SourceForge to provide public SVN repositories?). Zero "politics" amongst developers (no more "I have commit access").
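The git-stash(1) hot-fix workflow from the list above can be sketched in a throwaway repository (all file names and commit messages here are made up for illustration):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "You"

echo "stable line" > notes.txt
git add notes.txt
git commit -qm "initial"

echo "half-finished long-term work" >> notes.txt   # uncommitted change
git stash                  # park it; the working tree is clean again
echo "the urgent fix" > hotfix.txt
git add hotfix.txt
git commit -qm "hot-fix"
git stash pop              # the half-finished change is back in notes.txt
grep "half-finished" notes.txt
```

The stash is a stack, so several interruptions can be nested; `git stash list` shows what is parked.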

After having used CVS for many years, Subversion for some years, and briefly BitKeeper, Arch/tla, and Mercurial, I think Git is by far the best one (which reminds me: do not bother reading articles comparing Git < 1.6 with Mercurial - present-day Git is very user-friendly and well-documented). So the message is: do not even think about using a centralized VCS for new projects, and amongst distributed VC systems, at least have a look at Git.

What is your own experience with version control systems?

Section: /computers (RSS feed) | Permanent link | 1 writebacks

1 replies for this story:

Milan Zamazal wrote:

Your conclusion is right. git is a clear winner among the new version control systems. I have been using it for about two years, only in simple ways, but it has nevertheless assured me that this is the right system to use. git may not be perfect, but it works, it is well maintained, and its future looks bright. The other distributed version control systems I tried before did not have all these properties (although some of them made important contributions to the process of developing a proper CVS replacement). And Subversion is just a demonstration of a wrong approach to the problem from the beginning, not worth trying at all, even though many people like it.


About:

Yenya's World: Linux and beyond - Yenya's blog.

Jan "Yenya" Kasprzak