Typically, when a command is passed to the shell, the shell arranges for an executable file to be loaded into memory and a new process to be created. An executable file is either a binary file (usually produced by the linker when a program is compiled) or a shell script (a text file interpreted by a binary such as sh(1) or perl(1)). The file(1) command can usually determine what is inside a file.
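As a rough illustration of the distinction between binaries and scripts, the following Python sketch inspects the first bytes of a file. It is not how file(1) is implemented, which consults a much larger database of signatures; it only checks the ELF magic bytes and the `#!` script marker. The function names `classify_executable` and `classify_path` are hypothetical and chosen for this example.

```python
ELF_MAGIC = b"\x7fELF"   # first four bytes of an ELF file
SHEBANG = b"#!"          # first two bytes of an interpreted script


def classify_executable(header: bytes) -> str:
    """Guess the kind of executable from the first bytes of a file."""
    if header.startswith(ELF_MAGIC):
        return "ELF binary"
    if header.startswith(SHEBANG):
        # The rest of the first line names the interpreter, e.g. /bin/sh.
        first_line = header[2:].split(b"\n", 1)[0]
        parts = first_line.split()
        if parts:
            return "script for " + parts[0].decode("ascii", "replace")
        return "script"
    # Anything else (a.out, COFF, plain data) is outside this sketch.
    return "unknown"


def classify_path(path: str) -> str:
    """Read a small prefix of the file at path and classify it."""
    with open(path, "rb") as f:
        return classify_executable(f.read(256))
```

Calling `classify_path("/bin/sh")` on a system whose shell is an ELF binary would be expected to report an ELF binary, while a file beginning with `#!/bin/sh` is reported as a script for that interpreter.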
Binary files need a well-defined format for the system to be able to use them properly. Part of the file is executable machine code (the instructions that tell the CPU what to do), part is data space with pre-defined values, part is data space with no pre-defined values, and so on. Over time, different binary file formats have evolved.
To understand why FreeBSD uses the elf(5) format, the three currently “dominant” executable formats for UNIX® must be described:
a.out(5)
The oldest and “classic” UNIX® object format. It uses a short and compact header with a magic(5) number at the beginning that is often used to characterize the format. It contains three loaded segments: .text, .data, and .bss, plus a symbol table and a string table.
COFF
The SVR3 object format. The header comprises a section table which can contain more than just .text, .data, and .bss sections.
elf(5)
The successor to COFF, featuring multiple sections and either 32-bit or 64-bit values. One major drawback is that ELF was designed with the assumption that there would be only one ABI per system architecture. That assumption is incorrect: it does not hold even in the commercial SYSV world, which has at least three ABIs (SVR4, Solaris, SCO).
FreeBSD tries to work around this problem somewhat by providing a utility for branding a known ELF executable with information about its compliant ABI. Refer to brandelf(1) for more information.
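To make the 32-bit or 64-bit distinction and the ABI marking more concrete, here is a minimal Python sketch that decodes a few bytes of the ELF identification array (`e_ident`) at the start of a file. The byte offsets and the numeric values shown come from the ELF specification. That the brand written by brandelf(1) corresponds to the OS/ABI byte is an assumption made for this example, and `describe_elf_ident` is a hypothetical helper name.

```python
# Byte offsets inside the 16-byte e_ident array, per the ELF specification.
EI_CLASS, EI_DATA, EI_OSABI = 4, 5, 7

CLASS_NAMES = {1: "32-bit", 2: "64-bit"}
DATA_NAMES = {1: "little-endian", 2: "big-endian"}
# Only a handful of OS/ABI values are listed; others are reported by number.
OSABI_NAMES = {0: "SYSV", 3: "Linux", 6: "Solaris", 9: "FreeBSD"}


def describe_elf_ident(ident: bytes) -> dict:
    """Decode class, byte order, and OS/ABI from the start of an ELF file."""
    if len(ident) < 16 or ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    osabi = ident[EI_OSABI]
    return {
        "class": CLASS_NAMES.get(ident[EI_CLASS], "unknown"),
        "data": DATA_NAMES.get(ident[EI_DATA], "unknown"),
        # Assumption: this is the byte that records the ABI brand.
        "osabi": OSABI_NAMES.get(osabi, "other (%d)" % osabi),
    }
```

For a real file, the first 16 bytes can be read with `open(path, "rb").read(16)` and passed to the function. A value of `SYSV` in the OS/ABI field is common for binaries that carry no specific brand, which is one reason a separate branding step can be useful.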
FreeBSD comes from the “classic” camp and used the a.out(5) format, a technology tried and proven through many generations of BSD releases, until the beginning of the 3.X branch. Though it was possible to build and run native ELF binaries and kernels on a FreeBSD system for some time before that, FreeBSD initially resisted the “push” to switch to ELF as the default format. Why? Linux made its painful transition to ELF because of its inflexible jump-table based shared library mechanism, which made the construction of shared libraries difficult for vendors and developers. Since the ELF tools offered a solution to the shared library problem and were generally seen as “the way forward”, the migration cost was accepted as necessary and the transition was made. FreeBSD's shared library mechanism, by contrast, is based more closely on the SunOS™ style shared library mechanism and is easy to use, so the same pressure did not apply.
So, why are there so many different formats? Back in the PDP-11 days when simple hardware supported a simple, small system, a.out was adequate for the job of representing binaries. As UNIX® was ported, the a.out format was retained because it was sufficient for the early ports of UNIX® to architectures like the Motorola 68k or VAXen.
Then some hardware engineer decided that if he could force software to do some sleazy tricks, a few gates could be shaved off the design and the CPU core could run faster. a.out was ill-suited for this new kind of hardware, known as RISC. Many formats were developed to get better performance from this hardware than the limited, simple a.out format could offer. COFF, ECOFF, and a few others were invented and their limitations explored before settling on ELF.
In addition, program sizes were getting huge while disks and physical memory were still relatively small, so the concept of a shared library was born. The virtual memory system became more sophisticated. While each advancement was done using the a.out format, its usefulness was stretched with each new feature. People also wanted to dynamically load things at run time, or to discard parts of their program after the init code had run to save core memory and swap space. Languages became more sophisticated and people wanted code called automatically before the main() function. Lots of hacks were done to the a.out format to allow all of these things to happen, and they basically worked for a time. Eventually, a.out was not up to handling all these problems without an ever-increasing overhead in code and complexity. While ELF solved many of these problems, it would be painful to switch from the system that basically worked. So ELF had to wait until it was more painful to remain with a.out than it was to migrate to ELF.
As time passed, the tools from which FreeBSD derived its build tools, especially the assembler and loader, evolved in two parallel trees. The FreeBSD tree added shared libraries and fixed some bugs. The GNU folks who originally wrote these programs rewrote them and added simpler support for building cross compilers and plugging in different formats. Those who wanted to build cross compilers targeting FreeBSD were out of luck, since the older sources that FreeBSD had for as(1) and ld(1) were not up to the task. The new GNU toolchain (binutils) supports cross compiling, ELF, shared libraries, and C++ extensions. In addition, many vendors release ELF binaries, and FreeBSD should be able to run them.
ELF is more expressive than a.out and allows more extensibility in the base system. The ELF tools are better maintained and offer cross compilation support.
ELF may be a little slower than a.out, but trying to measure it can be difficult. There are also numerous details that are different between the two, such as how they map pages and handle init code.