Why not just use /bin/sh?
Security
One of the most frequent sources of security problems in programs
is parsing. Parsing is a complex operation, and it is easy to
make mistakes while designing and implementing a parser. (See
what Dan Bernstein says
on the subject, section 5.)
But shells parse all the time. Worse, the essence
of the shell is parsing: the parser and the runner are intimately
interleaved and cannot be clearly separated, thanks to the
specification.
Even worse, the
shell sometimes has to perform double parsing, for instance
after parameter expansion. This can lead to atrocities like
zork="foo ; echo bar"
touch $zork
not doing what you would like them to do, even in that simple
case. (zsh behaves sanely by
default, at the expense of explicitly breaking the specification.)
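To see the double parsing at work, here is a small sketch that counts how many words the shell hands to a command after expanding zork. The count_args function is a hypothetical helper introduced only for this illustration, and the counts assume a POSIX-style sh such as dash or bash with the default IFS, not zsh in its default mode.

```shell
zork="foo ; echo bar"

# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: the result of the expansion is split again into words on whitespace.
unquoted=$(count_args $zork)

# Quoted: the value is passed through as one single word.
quoted=$(count_args "$zork")

echo "unquoted: $unquoted, quoted: $quoted"
```

Under those assumptions, touch $zork is asked to create four files, named foo, ;, echo and bar, instead of one file called "foo ; echo bar".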
execline parses the script only once: when
reading it. The parser has been designed to be simple and systematic,
to reduce the potential for bugs - which you just cannot do
with a shell. After execline has split up the script into
words, it will not be parsed again. Positional parameters, when
used, are never split, even if they contain spaces or newlines
- unless you explicitly ask for it. Script writers control exactly what
is split and how.
execline-0.x used some primitive unquoting and
substitution ("expansion") mechanisms that could not prevent a
security risk when you mixed the two. In that respect,
it was only a minor improvement over the shell.
execline-1.y, on the other hand, comes with new, carefully
designed unquoting and substitution mechanisms that resulted from long
discussions with the users, and that do not suffer from the same weakness.
Unlike /bin/sh,
execline is now a perfectly secure scripting language.
Portability
The shell language was designed to make scripts portable across various
versions of Unix. What a joke! There are dozens of distinct
sh flavours, not even counting the openly incompatible
csh approach and its various tcsh-like followers.
The ash, bash, ksh and zsh shells
all behave differently, even when they are
run in their so-called compatibility mode. From what I have
seen in various experiments, only zsh is able to follow the
specification to the letter, at the expense of being big and complex to
configure. This is a source of endless problems for shell script writers,
who should be able to assume that a script will run everywhere,
but cannot in practice. Even a simple utility like test
cannot be used safely with the standardized options, because most shells
come with a builtin test that does not respect the
specification to the letter.
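A quick way to find out which test a script is really using, sketched under the assumption of a POSIX-style shell with an external test program somewhere in PATH. The variable names are illustrative only.

```shell
# Ask the shell how it resolves the name "test"; most shells report a builtin.
test_kind=$(command -V test 2>&1)
echo "$test_kind"

# env cannot see shell builtins, so this runs the external test program
# found in PATH instead of the shell's own version.
if env test -n "nonempty"; then
    external_ok=yes
else
    external_ok=no
fi
echo "external test works: $external_ok"
```

Going through env costs one extra execve(), and it only swaps one implementation for another: nothing guarantees that the external program follows the specification either.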
execline scripts are portable. There is no
complex syntax that leaves room for undefined or nonportable
behaviour. The execline package is portable across platforms:
there is no reason for vendors or distributors to fork their own
incompatible version. execline is registered in
/package, so the
only "official" execline version is mine. Scripts will
not break from one machine to another; if they do,
it's not a "portability problem", it's a bug. You are then encouraged
to find the program that is responsible for the different behaviour,
and send a bug-report to the program author - including me, if the
relevant program is part of the execline distribution.
Simplicity
I originally wanted a shell that could be used on an embedded system.
Even the ash shell seemed big, so I thought of writing my
own. Hence I had a look at the
sh specification.
Aaagh! I recommend this masterpiece to anyone
who still believes in the virtues of the shell. This specification
is insane. It goes against every good programming
practice; it seems to have been designed only to give headaches
to wannabe sh implementors.
No wonder existing shells are big, complex, slightly incompatible with
the specification, and full of bugs. It's practically impossible
to follow such constraints without turning the code into a huge mess.
An OpenBSD developer said to me, when asked about the OpenBSD /bin/sh:
"It works, but it's far from not being a nightmare".
I don't want nightmare-like software on my system. Unix is simple. Unix
was designed to be simple. And if, as Dennis Ritchie said, "it takes a
genius to understand the simplicity", that's because incompetent people
took advantage of the huge Unix flexibility to write insanely crappy or
complex software. System administrators can only do a decent job when
they understand how the programs they run are supposed to work. People
are slowly starting to grasp this - they are beginning to get fed up with
sendmail and BIND (the two most unbelievably crappy pieces of software
I ever had to deal with). But you don't have to go that far - even
sh, a seemingly simple and basic Unix program, is a threat to
computer engineers' sanity. So, forget about sh: I decided to
take a new approach. For instance, do something obvious
nobody seems to have thought of before: separate interactivity from
scripting. Incredibly brilliant, as you can see.
The execline specification is simple, and,
as I hope to have shown, easy to implement without too many bugs or
glitches.
Performance
Since it was made to run on an embedded system, execline was
designed to be light in memory usage. And it is.
- No overhead due to interactive support.
- No overhead due to unneeded features. Since every command performs
its task then executes another command, all occupied resources are instantly
freed. By contrast, a shell stays in memory during the whole execution
time.
- Very limited use of the C library. Only the C interface to the
kernel's system calls, and some very basic functions like malloc(),
are used from the C library. In addition to avoiding crappy interfaces
like stdio and the usual libc bugs, this approach makes it easy
to link execline statically - you will want to do that on an embedded
system, or just to gain performance.
You can have hundreds of execline scripts running simultaneously on an
embedded box. Just try that with a shell.
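The "performs its task then executes another command" model can be imitated from a shell with exec, which may help to visualize it. This is only a sketch in sh, not execline code: the first process prints its PID, then replaces itself with a second program that prints its own.

```shell
# The outer sh prints its PID, then uses exec to replace itself with a new sh.
# No fork happens, so the second program runs under the very same PID.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')

first=$(echo "$pids" | sed -n 1p)
second=$(echo "$pids" | sed -n 2p)

if [ "$first" = "$second" ]; then
    same_process=yes
else
    same_process=no
fi
echo "same process: $same_process"
```

Once the second program has started, nothing of the first one remains in memory, which is the property the list above relies on.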
execline is faster than the shell. Unlike the sh
parser, the execline parser is simple and
straightforward; actually, it's more of a lexer than a parser.
The execline language has been designed to be LL(1): keep it simple, stupid.
So the script gets analysed and launched practically without a delay.
I'm interested in any detailed performance measurement: for instance, set
up a web server that spawns a shell script for every page, benchmark it,
then convert the shell script to an execline script and do the same.
Please send me any
reports you have.
execline limitations
- execline can only handle scripts that fit in one argv.
Unix systems have a limit on the argv+envp size;
execline cannot execute scripts that are bigger than this limit.
- execline commands do not perform signal handling. It is not
possible to trap signals inside an execline script. If you want to trap
signals, write a specific C program, or use a shell.
- Due to the execline design, maintaining a state is
difficult. Information has to transit via environment variables or
temporary files, which makes commands like
loopwhile a bit painful to handle.
- Despite all its flaws, the shell's main advantage (apart from
being available on practically every Unix platform, that is) is that it
is often convenient. Shell constructs can be terse and short,
where execline constructs will be verbose and lengthy.
- An execline script is generally heavier on execve() than
the average shell script. This can lead to a performance loss when
executed programs make numerous calls to the dynamic linker: the system
ends up spending a lot of time resolving dynamic symbols. If this is a
concern to you, try linking the
execline package statically, to eliminate the dynamic resolution costs.
The remaining execve() costs will be negligible.
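Regarding the first limitation above, the argv+envp limit of a system can usually be queried with getconf ARG_MAX. The sketch below compares the size of a script with that limit. script_fits is a hypothetical helper, not part of execline, and the estimate is rough: it ignores per-string overhead and any other limit the kernel may apply.

```shell
# script_fits is a hypothetical helper: it succeeds when the script file
# given as $1, added to the current environment, stays under ARG_MAX.
script_fits() {
    limit=$(getconf ARG_MAX)        # total bytes allowed for argv + envp
    env_size=$(env | wc -c)         # approximate size of the environment
    script_size=$(wc -c < "$1")     # size of the script file in bytes
    [ $((script_size + env_size)) -lt "$limit" ]
}

# Usage example on a throwaway one-line script.
tmp=$(mktemp)
printf 'echo hello\n' > "$tmp"
if script_fits "$tmp"; then
    fits=yes
else
    fits=no
fi
rm -f "$tmp"
echo "fits: $fits"
```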