mirror of
git://projects.qi-hardware.com/iris.git
synced 2025-01-17 09:01:05 +02:00
152 lines
8.2 KiB
Plaintext

This file describes the kernel architecture. It does not go into detail on all
the fields of structs; for that, refer to the source code.

# Overview

Iris is an operating system. The kernel should be called "the Iris kernel",
but sometimes it is simply called "Iris". Where this could cause confusion,
the terms "kernel" and "userspace" are used to clarify.

Iris uses a capability-based microkernel. Being a microkernel means that most
parts that would be part of a monolithic kernel are not part of the Iris
kernel, but of the Iris userspace. Being capability-based means that there is
no public dictionary of running processes; in order to communicate with another
process, the caller must have received a capability to it.


# First class objects

First class objects are implemented by the kernel. They can be used through
capabilities.

- Cap: a single capability. Can be invoked and passed to others.
- Caps: storage container for a fixed number of Cap objects. Every Thread has
  at least one of these so it can communicate with the kernel, its parent, and
  other processes.
- Receiver: an object that can create Cap objects. When those are invoked,
  the Receiver's listener receives the message.
- Thread: an execution context. On creation, a number of slots is specified
  and space is reserved for that many Caps pointers. Only Cap objects in those
  Caps can be invoked from the thread.
- Page: a single page of memory, always 4 kB. A Page can be mapped in a Memory
  and then accessed by a Thread.
- Memory: everything[*] needs a Memory object to be stored in. In addition to
  storing first class objects, a Memory can own Page objects and map them. A
  mapped page is accessible to running Threads that are stored in the Memory.
- List: helper for implementing a list of Cap objects, which are stored by the
  caller. A List allows servers to keep a list of clients without paying for
  its storage. This prevents a denial of service attack. Each item is stored
  with a code that is set by, and only accessible to, the List owner.
- ListItem: an item in a List object.

[*] There is of course one exception to the rule that everything is stored in a
    Memory. All objects form a tree, with Memory objects as nodes and all other
    objects as leaves. The root of the tree is not stored in anything. This
    node is called the "top Memory".

Example: a new process consists of a Memory with one or more mapped Page
objects that hold the code and data, a Receiver, a Thread, and a Caps
that contains a Cap for each of those objects, plus one for its parent
process. That Caps is stored in slot 0 of the Thread.
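
The object tree and the example process can be pictured with a small model.
This is only an illustrative sketch in Python, not kernel code: the class
names, the "owner" and "slots" attributes and the helper depth() are invented
here, and the new process is assumed to be stored directly in the top Memory.

```python
class KernelObject:
    """A leaf of the object tree: Caps, Receiver, Thread, Page, ..."""

    def __init__(self, kind):
        self.kind = kind
        self.owner = None  # the Memory this object is stored in


class Memory(KernelObject):
    """An inner node: stores other objects, including child Memory objects."""

    def __init__(self):
        super().__init__("Memory")
        self.children = []

    def store(self, obj):
        obj.owner = self
        self.children.append(obj)
        return obj


def depth(obj):
    """Count the Memory objects on the path from obj up to the top Memory."""
    n = 0
    while obj.owner is not None:
        obj = obj.owner
        n += 1
    return n


top = Memory()              # the "top Memory": not stored in anything
proc = top.store(Memory())  # Memory of the new process (placement assumed)
pages = [proc.store(KernelObject("Page")) for _ in range(2)]  # code and data
receiver = proc.store(KernelObject("Receiver"))
thread = proc.store(KernelObject("Thread"))
caps = proc.store(KernelObject("Caps"))
thread.slots = [caps]       # the Caps of the process goes in slot 0
```

Here depth(thread) is 2: the Thread is stored in the process Memory, which is
stored in the top Memory.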

Note that the kernel provides system calls through capabilities. If a thread
doesn't hold the capability, it cannot make the system call. The parent Cap is
used to request access to other processes or devices. The Thread has no way
to know whether it is talking to the thing it requested or to something that
simulates it. That is intentional; Threads should not be able to detect that
they are being debugged.


# Capability invocations

When a Cap is invoked, a message is sent to the Receiver that created it (or,
if it was created by the kernel, to the kernel). This message contains three
64-bit numbers (which are usually treated as two 32-bit numbers each) and two
Cap objects. Two of the numbers, named d0 and d1, are passed with the
invocation; the third one is named protected_data and is defined when the Cap
is created. The owner of the Cap cannot see or change protected_data; it is
the target's way of recognizing who is sending the message.

The Cap objects in the message are called arg and reply. By convention, a call
that requires a reply passes a Cap for it, which will be invoked with the
reply. However, this is only a convention; if no reply is required, a program
can use both arg and reply as regular arguments. Normally, though, a Caps is
passed in arg if more than one Cap should be sent.
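
The message layout and the reply convention can be illustrated with a small
model. This is a sketch in Python under stated assumptions, not the kernel
interface: the names Message, create_cap, invoke, queue and split64 are
invented for the example, the message is delivered by appending it to a list,
and which 32-bit half of a number counts as "first" is an arbitrary choice.

```python
from dataclasses import dataclass
from typing import Optional

MASK64 = (1 << 64) - 1


def split64(value):
    """Treat one 64-bit number as two 32-bit numbers: (low, high)."""
    return value & 0xFFFFFFFF, (value >> 32) & 0xFFFFFFFF


@dataclass
class Message:
    d0: int               # passed with the invocation
    d1: int               # passed with the invocation
    protected_data: int   # fixed when the Cap was created
    arg: Optional["Cap"] = None
    reply: Optional["Cap"] = None


class Receiver:
    def __init__(self):
        self.queue = []   # stand-in for delivery to the listener

    def create_cap(self, protected_data):
        return Cap(self, protected_data)


class Cap:
    def __init__(self, receiver, protected_data):
        self._receiver = receiver
        self._protected_data = protected_data & MASK64

    def invoke(self, d0=0, d1=0, arg=None, reply=None):
        # The invoker chooses d0, d1, arg and reply, but not protected_data.
        self._receiver.queue.append(
            Message(d0 & MASK64, d1 & MASK64, self._protected_data, arg, reply))


server = Receiver()
client = Receiver()
service = server.create_cap(protected_data=7)    # handed to the client
reply_cap = client.create_cap(protected_data=1)  # where the answer should go

service.invoke(d0=(2 << 32) | 5, reply=reply_cap)  # call that wants a reply
msg = server.queue.pop(0)
low, high = split64(msg.d0)
msg.reply.invoke(d0=low + high)                    # server answers the call
answer = client.queue.pop(0)
```

The server recognizes the caller by msg.protected_data, and the client
recognizes the answer by the protected_data it chose for reply_cap.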

Cap objects can be passed around. The target of the invocation cannot see
whether the original recipient is calling, or some other process that was given
access. The Receiver can revoke a Cap; after this, invoking it no longer
sends a message to the Receiver. When sending a Cap, a flag specifies whether
it is mapped (the default) or copied. A mapped Cap is revoked when its source
is revoked; a copy is not. To give a Cap to another process and then drop it,
it must be copied. Otherwise the new Cap is revoked as soon as the original is
dropped.
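
The difference between mapping and copying can be modelled as follows. This
is a hedged sketch in Python: the class and the "send" and "revoke" methods
are hypothetical, dropping a Cap is modelled as revoking it, and revocation by
the Receiver itself is left out.

```python
class Cap:
    def __init__(self, source=None):
        self.revoked = False
        self.mapped = []  # Caps derived from this one by mapping
        if source is not None:
            source.mapped.append(self)

    def send(self, copy=False):
        """Return the Cap that the recipient ends up holding."""
        if self.revoked:
            raise ValueError("cannot send a revoked Cap")
        # A copy is independent of this Cap; a mapping stays tied to it.
        return Cap() if copy else Cap(source=self)

    def revoke(self):
        self.revoked = True
        for cap in self.mapped:
            cap.revoke()  # revoking a source revokes everything mapped from it


original = Cap()
mapped = original.send()           # the default: mapped
copied = original.send(copy=True)  # explicit copy
forwarded = mapped.send()          # a mapping of a mapping

original.revoke()                  # the sender drops its Cap
```

After the last line, mapped and forwarded are revoked as well, while copied
is still usable. This is why a Cap that is given away and then dropped must
be sent as a copy.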


# Interrupts

Interrupts are handled by one or a few interrupt handlers. In a microkernel,
it would be ideal to let userspace handle them, but that is not reasonable
given the hardware architecture. However, it is possible for the kernel to
find out who should handle an interrupt, and then pass it to userspace. In
Linux terms: the top half is in kernel space, but the bottom half is not.
(Note that those two halves are highly asymmetrical; the top half is very
small, the bottom half can be very large.) This is what Iris does. A process
can register as an interrupt handler. When the interrupt arrives, the kernel
masks it so that it isn't immediately triggered again, enables all interrupts,
and sends a message to the registered process. That process will normally
clear the interrupt condition and reregister itself as the interrupt handler.
The reregistration is required to avoid queueing of interrupts; an interrupt
whose handler has not reregistered is no longer handled.
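
The register, mask, notify, reregister cycle can be sketched like this. It is
a simplified Python model with invented names (Kernel, register, interrupt):
the message to the handler is modelled as a direct function call, whereas a
real message would be delivered asynchronously, and registration is assumed
to unmask the interrupt.

```python
class Kernel:
    def __init__(self):
        self.handlers = {}   # interrupt number -> registered handler
        self.masked = set()  # interrupts that are currently masked

    def register(self, irq, handler):
        self.handlers[irq] = handler
        self.masked.discard(irq)  # assumption: registering unmasks the irq

    def interrupt(self, irq):
        """Top half: mask the interrupt and pass it on to userspace."""
        if irq in self.masked:
            return False          # masked interrupts are not delivered
        self.masked.add(irq)      # so it is not immediately triggered again
        handler = self.handlers.pop(irq, None)  # a registration is used once
        if handler is None:
            return False
        handler(self, irq)        # stand-in for sending the message
        return True


log = []


def driver(kernel, irq):
    """Bottom half in userspace: handle the interrupt, then reregister."""
    log.append(irq)               # clearing the device condition goes here
    kernel.register(irq, driver)


def one_shot(kernel, irq):
    log.append(irq)               # forgets to reregister


kernel = Kernel()
kernel.register(3, driver)
kernel.register(4, one_shot)
results = [kernel.interrupt(3), kernel.interrupt(3),
           kernel.interrupt(4), kernel.interrupt(4)]
```

Interrupt 3 is delivered both times because the driver reregisters; the second
occurrence of interrupt 4 is not handled, because it stays masked.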


# Userspace

When the system boots, the kernel is started with its first process. This
process sets up userspace. Unlike Linux init systems, the first process does
not continue running; it is hard to change (because the filesystem is not yet
accessible) and so it must be as simple as possible.

As part of the startup, drivers for built-in devices are started. These are
regular userspace programs; most of them handle interrupts and all of them have
access to memory-mapped I/O. Note that this means they are just as critical as
the kernel. In a monolithic system, only the kernel needs to be ultimately
trusted (if it is compromised, all is lost). With a microkernel, it's both the
kernel and some parts of userspace. The total amount of trusted code is likely
smaller in a microkernel design, because it is easier to split parts that don't
need to be critical into their own process.

A user session is a process which can start other processes and switch between
them. For this, it contains the following components:

- A bag of device Cap objects, which can be mapped to the active process (and
  revoked when it is deactivated). What's in the bag can change. For
  example, if the user wants sound to continue playing while switching to
  another user, the sound Cap must not be in the bag.
- An interface for task switching: when the user makes a system request (which
  uses some dedicated hardware, such as a button), the active process is
  deactivated and the session itself (or a designated helper) is activated. It
  allows switching to a different process, starting a new one, or stopping
  or ending running processes. The session can also allow communication
  between certain processes. (The processes need to cooperate to actually make
  the link; they ask the session for a link of a certain type (for example, a
  file system) and the session responds with the Cap or an error.)
- A list of things that can be started. An important one is a shell,
  which allows control over the session. In other words, the shell is able to
  start and end other processes, make and break communication links, and define
  which programs can be started.
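
How the bag of device Cap objects follows the active process can be sketched
as below. This is a rough Python model, not the session program: the names
Session, DeviceCap, activate and granted are invented, and processes and
devices are identified by plain strings.

```python
class DeviceCap:
    def __init__(self, name):
        self.name = name
        self.revoked = False


class Session:
    def __init__(self, devices):
        self.bag = set(devices)  # devices that follow the active process
        self.active = None
        self.granted = {}        # process -> Caps currently mapped to it

    def activate(self, process):
        """Deactivate the current process and hand the bag to another one."""
        if self.active is not None:
            for cap in self.granted.pop(self.active, []):
                cap.revoked = True  # the deactivated process loses the devices
        self.active = process
        self.granted[process] = [DeviceCap(name) for name in sorted(self.bag)]
        return self.granted[process]


session = Session({"display", "keyboard"})  # sound is kept out of the bag
editor_caps = session.activate("editor")
game_caps = session.activate("game")        # task switch
```

After the switch, the Caps held by "editor" are revoked and "game" holds fresh
ones. A device that is not in the bag, such as sound in the example from the
text, is not affected by the switch.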


# Multi user support

For multi user support, a login manager is required which can start user
sessions and switch between them. This is very similar to what a user session
does, and so the same program is used for it. Just a few changes are required,
and those can be implemented by choosing different helper programs.

The login manager lets the user select an identity to log in as. The login
program itself is run by the user session, so that users can change the way
they log in without asking the administrator to set it up for them. For
example, one user may choose to allow logging in only with a physical crypto
device, while a guest login may be set up that doesn't require credentials at
all.