A Tour through the NetBSD Source Tree - Part III: Kernel
This is the third part of our tour through the NetBSD source tree. Having
covered the various components that make up the userland, we now concentrate
on the kernel source. It is located in /usr/src/sys, with the /sys symlink
being a well-known shortcut to reach the system's kernel source.
Let's recall what happens when building a kernel: after editing the kernel
config file located in /sys/arch/<arch>/conf and running config(8) on it,
a number of files are created in /sys/arch/<arch>/compile/KERNELNAME. The
header files describe which devices to include and how many of each, along
with other data for the system's configuration. In addition, a Makefile is
created that is used to build the kernel from source. The interesting point
here is that this single Makefile locates and compiles all the needed sources
and places the object files in the .../compile/KERNELNAME directory. NetBSD
does not walk the source tree recursively, with separate Makefiles building
the various sub-trees of the kernel source. This allows building kernels for
several configurations and platforms from the same source, without different
builds tripping over one another.
Still, the parts of the NetBSD kernel are organized into a number of
subdirectories that we will have a closer look at now. Under /usr/src/sys, there
are:
- adosfs, coda, filecorefs, isofs, msdosfs, nfs, ntfs:
- These are various filesystems used directly by NetBSD to access data. The
primary goal of some of them is to help exchange data with a machine's
native operating system (adosfs for AmigaOS, filecorefs for Acorn Computers'
RISC OS, ...), while others implement filesystems that can be found on many
systems (isofs, nfs, ...).
- ufs:
- The Unix (User) File System is the base of the native filesystem used in
NetBSD. Ancient (AT&T) Unix filesystems had a number of limitations:
filenames could be at most 14 characters long and there were no symlinks,
for example. These problems were solved by the Berkeley computer scientists
implementing BSD Unix. Their filesystem implementation serves as the base
for several filesystems these days, which use various methods of data
layout on the disk.
These filesystems are stored in the "ufs" subdirectory and include:
- ext2fs: Linux's ext2fs
- lfs: Log-structured filesystem
- mfs: Memory filesystem, for things like in-core /tmp
- ffs: The Berkeley Fast File System, the native NetBSD filesystem,
including things like softdeps.
- ufs: General routines shared by the other UFS-based filesystems.
- miscfs:
- This directory contains further filesystems that aren't directly related to
physical storage. Instead they implement various layered filesystems for
services like data translation, or routines for implementing kernel features.
Using the virtual filesystem operations table, it is easy to change the
behaviour of an operation under certain conditions, e.g. mapping operations to
deadfs on a file whose file descriptors were revoke(2)'d.
The filesystems included here are:
- deadfs: Implements operations that don't modify any data and instead
return indications of invalid IO. Used to revoke(2) file
descriptors.
- fdesc: Maps a process' file descriptors into filesystem space,
depending on the accessing process. Can be mounted on /dev/fd using
mount_fdesc(8).
- fifofs: Implements FIFOs using Unix domain sockets internally.
- genfs: Generic filesystem functions that mostly return an error of
some kind (bad file descriptor, bad operation) or do nothing at all.
Used for implementing deadfs etc.
- kernfs: This filesystem is usually mounted under /kern and provides
various information about the running system, like kernel version,
system time, etc.
- nullfs: Used to "mirror" one directory tree onto another
directory, providing the same tree on both mount points. Also known
as loopback mount - see mount_null(8) for more information.
- overlay: The operation of this filesystem is similar to the null
filesystem. However, its implementation defines all VFS operations,
which allows using it as a base for further layered filesystems.
See mount_overlay(8) for more information.
- portal: The portal filesystem provides a service that allows
descriptors such as sockets to be made available in the filesystem
namespace following conversion rules given in a config file. See the
mount_portal(8) manpage for further information.
- procfs: Similar to kernfs, this filesystem is usually mounted on
/proc and allows accessing various data about processes. It is used
by ps(1) and other utilities. See mount_proc(8) for more
information.
- specfs: Implements routines to access special devices. The
filesystem provides a filesystem interface, and calls the
device-specific routines depending on the device's type, major and
minor number.
- syncfs: Operations used to implement the ioflush kernel thread that
writes out modified pages to disk.
- umapfs: A filesystem for re-mapping UIDs/GIDs, useful, e.g., when
mounting an NFS volume from a server that has a different set of
UIDs/GIDs than the local machine.
- union: This layered filesystem allows merging two filesystems,
providing a view as if they were mounted on the same mountpoint.
Modifications go either to the "upper" or to the "lower" layer, which
allows mounting a (read-only) CDROM and then mounting an empty but
writable directory over it, making it possible, for example, to compile
a source tree unpacked on the CDROM. See mount_union(8) for further
details.
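The idea of switching a file's behaviour through its operations table, as deadfs does for revoke(2)'d descriptors, can be illustrated with a small userland sketch. Everything below (the two-slot table, vnode_init(), vnode_revoke(), vn_read(), vn_write()) is a hypothetical simplification for illustration and not the kernel's actual interface, which has many more operations and different argument passing:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct vnode;

/* Hypothetical, heavily simplified operations table: one slot per operation. */
struct vnodeops {
	int (*vop_read)(struct vnode *vp, char *buf, size_t len);
	int (*vop_write)(struct vnode *vp, const char *buf, size_t len);
};

struct vnode {
	const struct vnodeops *v_op;	/* table currently in effect */
	char v_data[16];		/* stand-in for the file's contents */
};

/* Normal operations: copy data in or out, return 0 or an errno value. */
static int
live_read(struct vnode *vp, char *buf, size_t len)
{
	if (len > sizeof(vp->v_data))
		return EINVAL;
	memcpy(buf, vp->v_data, len);
	return 0;
}

static int
live_write(struct vnode *vp, const char *buf, size_t len)
{
	if (len > sizeof(vp->v_data))
		return EINVAL;
	memcpy(vp->v_data, buf, len);
	return 0;
}

/* "Dead" operations: touch no data and report an I/O error instead. */
static int
dead_read(struct vnode *vp, char *buf, size_t len)
{
	(void)vp; (void)buf; (void)len;
	return EIO;
}

static int
dead_write(struct vnode *vp, const char *buf, size_t len)
{
	(void)vp; (void)buf; (void)len;
	return EIO;
}

static const struct vnodeops live_vnodeops = { live_read, live_write };
static const struct vnodeops dead_vnodeops = { dead_read, dead_write };

void
vnode_init(struct vnode *vp)
{
	memset(vp->v_data, 0, sizeof(vp->v_data));
	vp->v_op = &live_vnodeops;
}

/* Revoking access only swaps the table; callers keep using the same vnode. */
void
vnode_revoke(struct vnode *vp)
{
	vp->v_op = &dead_vnodeops;
}

/* Callers never name an implementation, they always go through the table. */
int
vn_read(struct vnode *vp, char *buf, size_t len)
{
	return vp->v_op->vop_read(vp, buf, len);
}

int
vn_write(struct vnode *vp, const char *buf, size_t len)
{
	return vp->v_op->vop_write(vp, buf, len);
}
```

Because every caller goes through v_op, a single pointer assignment in vnode_revoke() is enough to make all later reads and writes fail, without touching the callers.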
- compat:
- This directory contains code for emulating binary compatibility with various
non-NetBSD operating systems as well as with old NetBSD binaries. It
includes:
- aout: This subsystem is used to run native NetBSD a.out binaries on
systems that made the transition to the ELF executable format. As
with most emulations, the shared library loader ld.so, shared libs
etc. are looked for in /emul/aout first.
- common: Various common routines used by all emulations like system
call table translation routines; also contains compat code for prior
NetBSD releases, see the COMPAT_* kernel options in options(4).
- freebsd: Mostly a few glue routines for running FreeBSD/i386 a.out
and ELF binaries; see the compat_freebsd(8) manpage for details on
setting things up.
- hpux: Runs native HP-UX programs on the Motorola-based hp300/hp400
machines. Adjusts a fair number of calls, including terminal IO,
signals, etc.
- ibcs2: This code implements the Intel Binary Compatibility Suite
version 2 used for running SCO programs on i386, but also for general
compatibility with AT&T System V.3, which is used on the VAX
port. Maybe it should have been named COMPAT_SVR3. The
compat_ibcs2(8) manpage contains more information.
- linux: Code to run a.out and ELF Linux binaries for a number of
hardware platforms, including alpha, arm32, i386, powerpc, mips,
m68k, sparc and sparc64. One of the special things about the Linux
emulation is that Linux uses a different system call table on each
port, which makes maintaining things a bit more interesting. The
code is separated into a "common" directory that applies to
all platforms, and various architecture-specific directories for
different CPUs. The compat_linux(8) manpage contains more
information on using the system, and there are also several packages
in pkgsrc that help in setting up the necessary shared libraries
etc. to run Linux binaries like Netscape or Acrobat Reader.
- m68k4k: Some of the m68k ports used to use a pagesize of 4k instead
of the 8k common today. This code helps in maintaining binary
compatibility with old binaries that still use 4k.
- netbsd32: Used by 64bit systems like sparc64 to run native 32bit
binaries. Maps the programs' 32bit args to the 64bit args used by
LP64 systems' kernels.
- osf1: The compat_osf1(8) system allows running OSF/1 (AKA Digital
Unix AKA Tru64) binaries on the Alpha platform.
- ossaudio: This software layer provides Open Sound System compatible
ioctl calls that are then mapped to the native NetBSD audio model by
this code. Enabled when compiling in support for Linux and/or
FreeBSD binary compatibility.
- pecoff: This subsystem allows running programs that are in the
PEcoff executable format, which is found on the Microsoft Windows
platform. Of course, mapping system calls is a real challenge here,
as the API that has to be presented to such programs is not even
remotely similar to the APIs of the Unix-like systems handled by the
other compat code, so there is no easy mapping of the calls to NetBSD
functions. Much of the work is done by libraries in userspace
instead, which then talk to the X server, etc. See the
compat_pecoff(8) manpage for further details.
- sunos: If users still have SPARC or m68k applications built for
SunOS 4.x, this emulation layer will help run them. See
compat_sunos(8) for more information.
- svr4: The System V compat system allows binary compatibility for
several systems, e.g. Solaris (SunOS 5.x) on i386, sparc and
sparc64, Amix on m68k and SCO/Xenix on i386. The compat_svr4(8)
manpage contains further information.
- ultrix: For pmax and other MIPS based systems as well as VAX
systems, to run Ultrix binaries. See compat_ultrix(8).
- vax1k: For VAX binaries that still use 1k pagesizes, this allows
running them. No idea where these originate - probably very
historic. :)
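To illustrate why per-port system call tables matter for an emulation such as Linux, here is a minimal sketch of table-driven system call dispatch. The two "ports", their system call numbers, and the handlers are invented for illustration; the structure is only loosely modelled on the idea of a per-emulation table and is not the kernel's actual code:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type: real handlers take arguments, these take none. */
typedef int (*sy_call_t)(void);

/* Native implementations, shared by every emulated table (made-up values). */
static int sys_getpid(void) { return 42; }
static int sys_getuid(void) { return 1000; }

/*
 * Two invented ports of the same foreign OS: the same calls exist on both,
 * but under different numbers, so each port needs its own table.
 */
static const sy_call_t porta_sysent[] = {
	[20] = sys_getpid,
	[24] = sys_getuid,
};

static const sy_call_t portb_sysent[] = {
	[3] = sys_getpid,
	[7] = sys_getuid,
};

/* Simplified description of one emulation: a name plus its call table. */
struct emul {
	const char *e_name;
	const sy_call_t *e_sysent;
	size_t e_nsysent;
};

const struct emul emul_porta = {
	"porta", porta_sysent, sizeof(porta_sysent) / sizeof(porta_sysent[0])
};

const struct emul emul_portb = {
	"portb", portb_sysent, sizeof(portb_sysent) / sizeof(portb_sysent[0])
};

/* Look up the foreign call number in the emulation's table and run it. */
int
emul_syscall(const struct emul *e, int code, int *retval)
{
	if (code < 0 || (size_t)code >= e->e_nsysent ||
	    e->e_sysent[code] == NULL)
		return ENOSYS;
	*retval = e->e_sysent[code]();
	return 0;
}
```

The handlers are written once; only the small tables differ between ports, which is roughly the split between a "common" directory and architecture-specific directories described above.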
- conf:
- The /sys/conf directory contains the main list of files to include into
kernel builds as well as scripts and files used to update the OS version and
compile it into the kernel. The operating system's version is stored in the
"osrelease.sh" script, which is used from a number of places to
determine the OS version.
- crypto:
- This directory contains code for various data encryption standards (arc4,
blowfish, DES, Rijndael etc.) that is subject to crypto export regulations.
The code is used by the IPSec kernel subsystem.
- ddb:
- The DDB kernel debugger that can be used to do post mortem debugging is
found here. The debugger is used on all NetBSD ports.
- dev:
- This directory contains device drivers that use the machine independent
bus_dma(9) and bus_space(9) interfaces and that work on all platforms that
support the necessary bus glue routines. There are several subdirectories
grouping drivers by various categories:
- bus interface: cardbus, eisa, ieee1394, isa, isapnp, mca, pci,
pcmcia, sbus, tc, usb, vme, qbus, xmi
- functionality: ata, i2c, i2o, mii, ofw, pckbc, raidframe, rasops,
rcons, scsipi, sysmon, wscons, wsfont
- general interfaces that are backed by bus-specific drivers: audio,
midi, rnd
The directory structure is mostly oriented towards the bus system that a
hardware device attaches to, not towards the functionality it provides.
There are no special categories for things like audio, network etc. - these
are in their bus-specific directories like pci, isa etc. containing (only)
the bus-specific attachment routines.
If a chip implements some functionality like audio, network or SCSI, it is
often used on several cards that all have the same chip, but different bus
interfaces - ISA, PCI, etc. To prevent maintaining several drivers that have
identical core functionality, NetBSD drivers are separated into bus-glue
code kept in the bus-specific directories mentioned above, and the core
functionality of the integrated circuit. Naming conventions help identify
e.g. network cards (if_*), but unfortunately they aren't applied
consistently.
The drivers for the core functionality are stored in the "ic"
subdirectory, with the file names indicating the IC's chip numbers:
% ls /sys/dev/ic
CVS                cac.c        isp_target.c   pckbc.c
Makefile           cacreg.h     isp_target.h   pckbcvar.h
README.ncr5380sbc  cacvar.h     isp_tpublic.h  pdq.c
ac97.c             cd1190reg.h  ispmbox.h      pdq_ifsubr.c
ac97reg.h          cd1400reg.h  ispreg.h       pdqreg.h
...
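The split between bus glue and chip core can be sketched as follows. This is a simplified illustration only: the "xyz" chip, its registers, and the callback-based register access are assumptions made for the example. Real NetBSD drivers go through bus_space(9) tags and handles rather than function pointers like these.

```c
#include <stdint.h>

/* Registers of a hypothetical "xyz" chip. */
#define XYZ_REG_ID	0
#define XYZ_REG_CTRL	1
#define XYZ_ID_MAGIC	0x5a
#define XYZ_CTRL_ENABLE	0x01

/* Core driver state: register access is supplied by the bus front-end. */
struct xyz_softc {
	uint8_t (*sc_read)(void *cookie, unsigned reg);
	void (*sc_write)(void *cookie, unsigned reg, uint8_t val);
	void *sc_cookie;
};

/* Chip core (would live in "ic"): knows the chip, nothing about the bus. */
int
xyz_attach(struct xyz_softc *sc)
{
	if (sc->sc_read(sc->sc_cookie, XYZ_REG_ID) != XYZ_ID_MAGIC)
		return -1;
	sc->sc_write(sc->sc_cookie, XYZ_REG_CTRL, XYZ_CTRL_ENABLE);
	return 0;
}

/* "ISA" glue: in this sketch the registers are packed byte by byte. */
static uint8_t
xyz_isa_read(void *cookie, unsigned reg)
{
	return ((uint8_t *)cookie)[reg];
}

static void
xyz_isa_write(void *cookie, unsigned reg, uint8_t val)
{
	((uint8_t *)cookie)[reg] = val;
}

void
xyz_isa_setup(struct xyz_softc *sc, uint8_t *regs)
{
	sc->sc_read = xyz_isa_read;
	sc->sc_write = xyz_isa_write;
	sc->sc_cookie = regs;
}

/* "PCI" glue: in this sketch the registers sit four bytes apart. */
static uint8_t
xyz_pci_read(void *cookie, unsigned reg)
{
	return ((uint8_t *)cookie)[reg * 4];
}

static void
xyz_pci_write(void *cookie, unsigned reg, uint8_t val)
{
	((uint8_t *)cookie)[reg * 4] = val;
}

void
xyz_pci_setup(struct xyz_softc *sc, uint8_t *regs)
{
	sc->sc_read = xyz_pci_read;
	sc->sc_write = xyz_pci_write;
	sc->sc_cookie = regs;
}
```

Only the small setup routines differ per bus; xyz_attach() is written once and shared by both attachments, which is the point of keeping the core in one place.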
- ipkdb:
- An IP-based debugger interface to a remote machine. This is another way to
debug the NetBSD kernel besides the DDB kernel debugger and gdb, which can
be used for debugging both userland and kernel.
- kern:
- This directory contains the core kernel code including a number of
facilities:
- loaders for executables in various formats (a.out, ELF, COFF,
scripts ...)
- process and (kernel) thread management
- signal delivery and handling
- terminal IO subsystem
- sockets and other interprocess communication primitives
- virtual filesystem layer, providing the framework used by the
filesystems in /sys/miscfs.
- many auxiliary routines used from all places
- lib:
- Throughout the NetBSD kernel, many routines are needed in many places.
They are collected in a few libraries that are used only in the
kernel:
- libkern: This is basically what libc is for the userland, with
functions used for providing various arithmetic operations that
can't be inlined by gcc as well as string/memory copy/comparison
operations.
- libsa: The StandAlone library provides functions used for loading
the kernel, when there's no operating system running yet and thus
many of the services provided by the NetBSD operating system are not
available. The library includes code for netbooting (rarp, RPC,
NFS), locating/loading the kernel from a UFS, LFS, ISO 9660 or
tar-structured medium, memory management and others.
- libz: In-kernel decompression library for loading gzip compressed
kernels.
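How a standalone loader can find the kernel on media of different formats may be pictured as a table of filesystem "openers" that are tried in turn. The sketch below is an assumption-laden illustration: the structure, the function names, and the three-letter tags used to recognise an "image" are all made up, and real boot code reads on-disk structures instead.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-filesystem entry: a name and an open routine. */
struct sa_fs_ops {
	const char *name;
	int (*open)(const char *image, const char *path);
};

/* Fake probes: an "image" is recognised by a made-up leading tag. */
static int
fake_ufs_open(const char *image, const char *path)
{
	(void)path;
	return strncmp(image, "UFS", 3) == 0 ? 0 : EINVAL;
}

static int
fake_cd9660_open(const char *image, const char *path)
{
	(void)path;
	return strncmp(image, "ISO", 3) == 0 ? 0 : EINVAL;
}

/* Filesystems compiled into this particular boot program. */
static const struct sa_fs_ops sa_file_system[] = {
	{ "ufs", fake_ufs_open },
	{ "cd9660", fake_cd9660_open },
};

/*
 * Try each known filesystem until one accepts the image.
 * Returns the name of the filesystem that worked, or NULL if none did.
 */
const char *
sa_open(const char *image, const char *path)
{
	size_t i;
	size_t n = sizeof(sa_file_system) / sizeof(sa_file_system[0]);

	for (i = 0; i < n; i++) {
		if (sa_file_system[i].open(image, path) == 0)
			return sa_file_system[i].name;
	}
	return NULL;
}
```

A boot program would only list the filesystems it actually needs in its table, which keeps the resulting binary small.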
- stand:
- This directory contains source for several standalone programs that aren't
used by NetBSD currently.
- lkm:
- NetBSD supports loadable kernel modules, and the sources are in this
directory. LKMs include a floppy driver for mac68k, various binary
emulations, IPFilter logging and several filesystems.
- net:
- NetBSD's networking framework contains many routines that are independent of
any particular protocol, and that are used by several networking
protocols/stacks. These components are collected in this directory. They
include packet filtering (BPF), access routines for all kinds of network
hardware (ARCNET, ATM, Ethernet, FDDI, IEEE 802.11, PPP, Token Ring, etc.)
that hand device access to drivers in the /sys/dev directory, routing code
etc.
- netatalk:
- The code in this directory implements the kernel part of the AppleTalk
protocol stack. The userland part is not included in NetBSD; it can be
installed from pkgsrc/net/netatalk(-sun).
- netccitt, netiso:
- Although not in widespread use these days, an ISO/OSI protocol stack comes
with NetBSD and is located in these directories.
- netinet:
- Internet stuff - the NetBSD TCP/IP (v4) stack. Documentation on this is
available in section 9 of the NetBSD manual pages as well as in Richard
Stevens' "TCP/IP Illustrated" books.
- netinet6:
- Internet, next generation - this directory contains the KAME IPv6 stack that
is shipping with NetBSD. See
http://www.kame.net/ for further
information.
- netkey:
- Key management for IPSec - see the ipsec(4) manpage for more details.
- netnatm:
- The code in this directory implements native mode ATM to transport other
protocols like IP.
- netns:
- NetBSD has support for the Xerox network service protocol, which can be
found in this directory. The protocol is not in widespread use any more
today; it is described in the first edition of Richard Stevens' "UNIX
Network Programming" book.
- sys:
- This directory contains only header files that get installed into
/usr/include/sys.
- uvm:
- The code in this directory implements NetBSD's New Virtual Memory system that
replaced the old Mach-based VM system some time ago. See the uvm(9) manpage
for more information.
- vm:
- This directory has only the header files of the old Mach-based virtual
memory system left, for use with various programs. The VM system itself is
not used any longer.
- arch:
- Code specific to one hardware platform is collected under this directory.
Directories are present for each port as well as for CPU-specific functions
that are shared by several ports that use the same CPU, avoiding redundancy.
Port-specific directories contain several subdirectories, with the following
ones being present for all ports:
- conf: contains kernel config files, a list of files specific to the
port and a template for the Makefile used to build a kernel
- compile: This directory is initially empty. It gets populated by
config(8) with directories that contain a Makefile and header files
to build a kernel.
- <port>: Port-specific functions, CPU/MMU initialisation
code, etc. - all the machine specific code that cannot be shared
across various hardware architectures.
- include: machine specific include files that describe the CPU and
MMU layout, data formats used by the FPU, limits, etc.
- stand: This directory contains sources for loading the kernel into
the system - usually it contains code for bootblocks, secondary
stage bootloaders, netboot miniroots and other facilities used to
boot the system.
Further directories may exist in the arch specific directories that contain
bus-specific or machine-dependent device drivers which don't fit into
/sys/dev as they work on one port only. Ideally, a port only uses machine
independent drivers, of course.
We have now described all the important directories that are available in the
NetBSD source tree. To get used to the directory structure, it is recommended
that you browse the directories and have a look at the various files to fully
explore things.