mirror of
git://projects.qi-hardware.com/iris.git
synced 2024-12-28 11:39:53 +02:00
add architecture description
parent 10ea4a0725
commit c705511200
doc/kernel.txt (Normal file, 151 lines)
@@ -0,0 +1,151 @@
This file describes the kernel architecture. It does not go into detail on all
the fields of structs; for that, refer to the source code.

# Overview

Iris is an operating system. The kernel should be called "the Iris kernel",
but sometimes it is simply called "Iris". Where confusion is possible, the
terms "kernel" and "userspace" are used to clarify.

Iris uses a capability-based microkernel. Being a microkernel means that most
parts that would be part of a monolithic kernel are not part of the Iris
kernel, but of the Iris userspace. Being capability-based means that there is
no public dictionary of running processes; in order to communicate with another
process, the caller must have received a capability to it.


# First class objects

First class objects are implemented by the kernel. They can be used through
capabilities.

- Cap: a single capability. Can be invoked and passed to others.
- Caps: storage container for a fixed number of Cap objects. Every Thread has
  at least one of these so it can communicate with the kernel, its parent, and
  other processes.
- Receiver: an object that can create Cap objects. When those are invoked, the
  Receiver's listener receives the message.
- Thread: an execution context. On creation, a number of slots is specified
  and space is reserved for that many Caps pointers. Only Cap objects in those
  Caps can be invoked from the Thread.
- Page: a single page of memory, always 4 kB. A Page can be mapped in a Memory
  and then accessed by a Thread.
- Memory: everything[*] needs a Memory object to be stored in. In addition to
  storing first class objects, a Memory can own Page objects and map them. A
  mapped Page is accessible to running Threads that are stored in the Memory.
- List: helper for implementing a list of Cap objects, which are stored by the
  caller. A List allows servers to keep a list of clients without paying for
  its storage. This prevents a denial of service attack. Each item is stored
  with a code that is set by, and only accessible to, the List owner.
- ListItem: an item in a List object.

[*] There is of course one exception to the rule that everything is stored in a
    Memory. All objects form a tree, with Memory objects as nodes and all other
    objects as leaves. The root of the tree is not stored in anything. This
    node is called the "top Memory".

Example: A new process consists of a Memory with one or more mapped Page
objects that hold the code and data, a Receiver, a Thread, and a Caps
that contains a Cap for each of those objects, plus one for its parent
process. That Caps is stored in slot 0 of the Thread.
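
The object tree and the example process can be sketched in a few lines of
Python. This is only an illustrative model, not Iris code: the class names
KObject and Memory, the helpers new_process and path_to_top, and the
representation of a Cap as a (label, target) pair are all assumptions made
for this sketch.

```python
class KObject:
    """A first class object. `parent` is the Memory it is stored in."""

    def __init__(self, kind):
        self.kind = kind
        self.parent = None


class Memory(KObject):
    """Tree node: stores other objects, including other Memory objects."""

    def __init__(self):
        super().__init__("Memory")
        self.children = []

    def store(self, obj):
        obj.parent = self
        self.children.append(obj)
        return obj


def new_process(parent_memory, parent_cap, pages=1):
    """Build the objects of the example process (hypothetical helper)."""
    memory = parent_memory.store(Memory())
    page_objs = [memory.store(KObject("Page")) for _ in range(pages)]
    receiver = memory.store(KObject("Receiver"))
    thread = memory.store(KObject("Thread"))
    caps = memory.store(KObject("Caps"))
    # One Cap per object of the process, plus one for the parent process.
    # A Cap is modeled here as a plain (label, target) pair.
    caps.entries = [("memory", memory)]
    caps.entries += [("page", page) for page in page_objs]
    caps.entries += [("receiver", receiver), ("thread", thread),
                     ("parent", parent_cap)]
    thread.slots = [caps]  # the Caps sits in slot 0 of the Thread
    return memory, thread, caps


def path_to_top(obj):
    """Walk up the tree; the walk ends at the top Memory (parent is None)."""
    path = [obj.kind]
    while obj.parent is not None:
        obj = obj.parent
        path.append(obj.kind)
    return path


top = Memory()  # the "top Memory": the only object stored in nothing
memory, thread, caps = new_process(top, parent_cap="parent-cap", pages=2)
```

In this model, path_to_top(thread) passes through the process's own Memory
and ends at the top Memory, which mirrors the rule that Memory objects are
the nodes of the tree and everything else is a leaf.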

Note that the kernel provides system calls through capabilities. If a Thread
doesn't hold the capability, it cannot make the system call. The parent Cap is
used to request access to other processes or devices. The Thread has no way
to know if it is talking to the thing it requested, or something that simulates
it. That is intentional; Threads should not be able to detect that they are
being debugged.
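
A small Python model may make both points concrete: a call is impossible
without the Cap, and a Thread cannot tell the real target from a stand-in.
Everything here is hypothetical (the Cap and Thread classes, real_device,
make_simulator); it is a sketch of the idea, not of the kernel interface.

```python
class Cap:
    """Hypothetical Cap: the holder can invoke it but not inspect the target."""

    def __init__(self, target):
        self._target = target

    def invoke(self, d0, d1):
        return self._target(d0, d1)


class Thread:
    def __init__(self, slots):
        # Each slot holds a Caps, modeled as a list of Cap objects, or None.
        self.slots = [None] * slots

    def invoke(self, slot, index, d0=0, d1=0):
        caps = self.slots[slot]
        if caps is None or index >= len(caps) or caps[index] is None:
            raise PermissionError("no capability held: call cannot be made")
        return caps[index].invoke(d0, d1)


def real_device(d0, d1):
    """Stand-in for a real service; it simply adds its arguments."""
    return d0 + d1


def make_simulator(log):
    """A stand-in, e.g. a debugger, that records calls and forwards them."""
    def simulator(d0, d1):
        log.append((d0, d1))
        return real_device(d0, d1)
    return simulator


log = []
normal = Thread(slots=1)
normal.slots[0] = [Cap(real_device)]
debugged = Thread(slots=1)
debugged.slots[0] = [Cap(make_simulator(log))]
isolated = Thread(slots=1)  # holds no capability at all
```

Both normal and debugged see the same results from invoke(0, 0, ...), while
isolated raises PermissionError because it holds nothing to invoke.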


# Capability invocations

When a Cap is invoked, a message is sent to the Receiver that created it (or,
if it was created by the kernel, to the kernel). This message contains three
64 bit numbers (which are usually treated as two 32 bit numbers each) and two
Cap objects. Two of the numbers, named d0 and d1, are passed with the
invocation; the third one is named protected_data and is defined when the Cap
is created. The owner of the Cap cannot see or change protected_data; it is
the target's way of recognizing who is sending the message.
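
The message layout can be modeled as follows. This is a sketch under stated
assumptions, not the kernel's data structure: the names Message, Receiver,
create_cap, invoke and split64 are invented here, and the (low, high) order
of the two 32 bit halves is a guess. Python cannot really hide
protected_data from the holder; the leading underscore only marks the
intent.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional

MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


@dataclass(frozen=True)
class Message:
    d0: int                       # passed with the invocation
    d1: int                       # passed with the invocation
    protected_data: int           # fixed when the Cap was created
    arg: Optional["Cap"] = None
    reply: Optional["Cap"] = None


class Receiver:
    def __init__(self):
        self.queue = deque()

    def create_cap(self, protected_data):
        # The Receiver chooses protected_data; the holder never sets it.
        return Cap(self, protected_data & MASK64)


class Cap:
    def __init__(self, receiver, protected_data):
        self._receiver = receiver
        self._protected_data = protected_data

    def invoke(self, d0=0, d1=0, arg=None, reply=None):
        self._receiver.queue.append(
            Message(d0 & MASK64, d1 & MASK64, self._protected_data,
                    arg, reply))


def split64(value):
    """Return (low, high) 32 bit halves; the ordering is an assumption."""
    return value & MASK32, (value >> 32) & MASK32


server = Receiver()
alice = server.create_cap(protected_data=1)
bob = server.create_cap(protected_data=2)
bob.invoke(d0=7, d1=(3 << 32) | 4)
message = server.queue.popleft()
```

The server can tell that the message came through bob's Cap by looking at
message.protected_data, without bob having any say in that value.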

The Cap objects in the message are called arg and reply. By convention, a call
that requires a reply passes a Cap for it in reply, which the target invokes
with the reply. However, this is only a convention; if no reply is required, a
program can use both arg and reply as regular arguments. If more than one Cap
must be sent, a Caps is normally passed in arg.
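
The call and reply convention can be sketched like this. The model is
hypothetical: Receiver, Cap, call, serve_add and the REPLY_TAG value are
names made up for the example, and messages are plain tuples of
(d0, d1, protected_data, arg, reply).

```python
class Receiver:
    def __init__(self):
        self.inbox = []

    def create_cap(self, protected_data):
        return Cap(self, protected_data)


class Cap:
    def __init__(self, receiver, protected_data):
        self._receiver = receiver
        self._protected_data = protected_data

    def invoke(self, d0=0, d1=0, arg=None, reply=None):
        self._receiver.inbox.append(
            (d0, d1, self._protected_data, arg, reply))


REPLY_TAG = 99  # hypothetical protected_data marking "this is a reply"


def call(target, own_receiver, d0, d1):
    """Invoke target and pass a Cap to our own Receiver for the answer."""
    reply_cap = own_receiver.create_cap(REPLY_TAG)
    target.invoke(d0, d1, reply=reply_cap)


def serve_add(server):
    """Handle one request: answer with d0 + d1 if a reply Cap was passed."""
    d0, d1, _protected, _arg, reply = server.inbox.pop(0)
    if reply is not None:
        reply.invoke(d0 + d1)


server, client = Receiver(), Receiver()
service = server.create_cap(protected_data=1)
call(service, client, 20, 22)
serve_add(server)
```

After serve_add runs, the answer sits in client.inbox, tagged with the
protected_data the client chose for its reply Cap.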

Cap objects can be passed around. The target of the invocation cannot see if
the original recipient is calling, or some other process that was given access.
The Receiver can revoke a Cap; after this, an invocation no longer sends a
message to the Receiver. When sending a Cap, a flag specifies whether it is
mapped (the default) or copied. A mapped Cap is revoked when its source is
revoked; a copy is not. To give a Cap to another process and then drop it,
it must be copied. Otherwise, dropping the source immediately revokes the new
Cap as well.
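
The difference between mapped and copied Caps can be modeled as below. This
is a minimal sketch, assuming that revocation simply cascades along the
"mapped" links; the Cap class and its send, revoke and invoke methods are
hypothetical, and dropping a Cap is modeled as revoking it.

```python
class Cap:
    def __init__(self):
        self.revoked = False
        self._mapped = []  # Caps that were sent from this one as "mapped"

    def send(self, copy=False):
        """Create the Cap the recipient gets; mapped unless copy is set."""
        new = Cap()
        new.revoked = self.revoked
        if not copy:
            self._mapped.append(new)
        return new

    def revoke(self):
        """Revoke this Cap and every Cap that was mapped from it."""
        self.revoked = True
        for cap in self._mapped:
            cap.revoke()

    def invoke(self):
        """Return True if a message would still be delivered."""
        return not self.revoked


original = Cap()
mapped = original.send()            # default: mapped
copied = original.send(copy=True)
original.revoke()                   # e.g. the sender drops its own Cap
```

After original.revoke(), mapped.invoke() no longer delivers anything while
copied.invoke() still does, which is why a Cap that should outlive the
sender's own must be sent as a copy.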


# Interrupts

Interrupts are handled by one or a few interrupt handlers. In a microkernel,
it would be ideal to let userspace handle them, but that is not reasonable
given the hardware architecture. However, it is possible for the kernel to
find out who should handle an interrupt, and then pass it to userspace. In
Linux terms: the top half is in kernel space, but the bottom half is not.
(Note that those two halves are highly asymmetrical; the top half is very
small, the bottom half can be very large.) So this is what Iris does. A
process can register as an interrupt handler. When the interrupt arrives, the
kernel masks it so that it isn't immediately triggered again, enables all
interrupts, and sends a message to the registered process. That process will
normally clear the interrupt condition and reregister itself as the interrupt
handler. The reregistration is required to avoid queueing of interrupts; an
interrupt whose handler does not reregister is no longer handled.
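
The register, mask, deliver, reregister cycle can be sketched as a small
state machine. The Kernel and Driver classes are hypothetical, and the model
assumes that (re)registering is what unmasks the interrupt; the text above
does not say exactly when unmasking happens.

```python
class Kernel:
    def __init__(self):
        self.handlers = {}   # irq number -> registered process
        self.masked = set()  # irq numbers that are currently masked

    def register(self, irq, process):
        self.handlers[irq] = process
        self.masked.discard(irq)  # assumption: (re)registering unmasks

    def interrupt(self, irq):
        """Top half: mask, then hand the interrupt to the registered process.

        Returns True if a message was sent to a handler.
        """
        if irq in self.masked:
            return False          # masked: not triggered again
        self.masked.add(irq)
        handler = self.handlers.pop(irq, None)  # registration is one-shot
        if handler is None:
            return False
        handler.inbox.append(irq)
        return True


class Driver:
    def __init__(self):
        self.inbox = []
        self.handled = 0

    def handle_one(self, kernel, reregister=True):
        """Bottom half: clear the condition, then normally reregister."""
        irq = self.inbox.pop(0)
        self.handled += 1         # stands in for clearing the condition
        if reregister:
            kernel.register(irq, self)


kernel, driver = Kernel(), Driver()
kernel.register(5, driver)
first = kernel.interrupt(5)    # delivered to the driver
second = kernel.interrupt(5)   # masked: the driver has not reregistered yet
driver.handle_one(kernel)
third = kernel.interrupt(5)    # delivered again after reregistration
```

Because the registration is consumed on delivery, at most one message per
interrupt line is ever outstanding, so nothing queues up behind a slow
handler.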


# Userspace

When the system boots, the kernel is started with its first process. This
process sets up userspace. Unlike Linux init systems, the first process does
not continue running; it is hard to change (because the filesystem is not yet
accessible) and so it must be as simple as possible.

As part of the startup, drivers for built in devices are started. These are
regular userspace programs; most of them handle interrupts and all of them have
access to memory mapped I/O. Note that this means they are just as critical as
the kernel. In a monolithic system, only the kernel needs to be ultimately
trusted (if it is compromised, all is lost). With a microkernel, it's both the
kernel and some parts of userspace. The total amount of trusted code is likely
smaller in a microkernel design, because it is easier to split parts that don't
need to be critical into their own process.

A user session is a process which can start other processes and switch between
them. For this, it contains the following components:

- A bag of device Cap objects, which can be mapped to the active process (and
  revoked when that process is deactivated). What's in the bag can change. For
  example, if the user wants sound to continue playing while switching to
  another user, the sound Cap must not be in the bag.
- An interface for task switching: when the user makes a system request (which
  uses some dedicated hardware, such as a button), the active process is
  deactivated and the session itself (or a designated helper) is activated. It
  allows switching to a different process, or starting a new one, or stopping
  or ending running processes. The session can also allow communication
  between certain processes. (The processes need to cooperate to actually make
  the link; they ask the session for a link of a certain type (for example, a
  file system) and the session responds with the Cap or an error.)
- A list of things that can be started; an important one is a shell, which
  allows control over the session. In other words, the shell is able to
  start and end other processes, make and break communication links, and define
  which programs can be started.
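
The bag of device Caps can be sketched as follows. This is a hypothetical
model (the Cap, Process and Session classes and the device names are
invented) that only shows the mapping and revoking on a switch; the sound
example is shown between two processes, although the text describes it for
switching between users.

```python
class Cap:
    def __init__(self, name):
        self.name = name
        self.revoked = False
        self._mapped = []

    def map(self):
        """Hand out a mapped Cap that can be revoked later."""
        child = Cap(self.name)
        self._mapped.append(child)
        return child

    def revoke_mapped(self):
        for child in self._mapped:
            child.revoked = True
        self._mapped.clear()


class Process:
    def __init__(self, name):
        self.name = name
        self.caps = {}


class Session:
    def __init__(self, bag):
        self.bag = {cap.name: cap for cap in bag}
        self.active = None

    def activate(self, process):
        # Deactivate the previous process: revoke what was mapped to it.
        if self.active is not None:
            for cap in self.bag.values():
                cap.revoke_mapped()
        # Map every device Cap in the bag to the newly active process.
        for name, cap in self.bag.items():
            process.caps[name] = cap.map()
        self.active = process


session = Session([Cap("display"), Cap("keyboard")])
player, editor = Process("player"), Process("editor")
player.caps["sound"] = Cap("sound")  # not in the bag: survives switching
session.activate(player)
old_display = player.caps["display"]
session.activate(editor)
```

After the switch, the player's display Cap is revoked and the editor holds a
fresh one, while the sound Cap is untouched because it was never in the bag.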


# Multi user support

For multi user support, a login manager is required which can start user
sessions and switch between them. This is very similar to what a user session
does, and so the same process is used for it. Just a few changes are required,
and those can be implemented by choosing different helper programs.

The login manager lets the user select an identity to log in as. The login
program itself is run by the user session, so that users can change the way
they log in without asking the administrator to set it up for them. For
example, one user may only allow logging in with a physical crypto device,
while a guest login may be set up that doesn't require credentials at all.