2025-04-21 12:27:27 +03:00 · 2009-06-01 01:12:54 +02:00
parent e7f42c6b7e
commit ef1b9bfe10
12 changed files with 1129 additions and 572 deletions
--- a/report/kernel.tex
+++ b/report/kernel.tex
@@ -9,7 +9,8 @@ This document briefly describes the inner workings of my kernel, including the
 reasons for the choices that were made.  It is meant to be understandable (with
 effort) for people who know nothing of operating systems.  On the other hand,
 it should also be readable for people who know about computer architecture, but
-want to know about this kernel.
+want to know about this kernel.  It is probably better suited for the latter
+category.
 \end{abstract}

 \tableofcontents
@@ -126,16 +127,21 @@ any part of the system, except the parts that it really needs to perform its
 operation, it cannot leak or damage the other parts of the system either.  The
 reason that this is relevant is not that users will run programs that try to
 ruin their system (although this may happen as well), but that programs may
-break and damage random parts of the system, or be taken over by crackers.  If
-the broken or malicious process has fewer rights, it will also do less damage
-to the system.
+break and damage random parts of the system, or be taken over by
+crackers\footnote{Crackers are better known by the public as ``hackers''.
+However, I use this word to describe people who like to play with software (or
+sometimes also with other things).  Therefore the malicious people who use
+hacking skills for evil need a different name.}.  If the broken or malicious
+process has fewer rights, it will also do less damage to the system.

 This leads to the goal of giving each process as little rights as possible.
 For this, it is best to have rights in a very fine-grained way.  Every
 operation of a driver (be it a hardware device driver, or just a shared program
 such as a file system) should have its own key, which can be given out without
 giving keys to the entire driver (or even multiple drivers).  Such a key is
-called a capability.
+called a capability.  For example, a capability can allow the holder to access
+a single file, or to use one specific network connection, or to see what keys
+are typed by the user.

 Some operations are performed directly on the kernel itself.  For those, the
 kernel can provide its own capabilities.  Processes can create their own
@@ -143,9 +149,9 @@ objects which can receive capability calls, and capabilities for those can be
 generated by them.  Processes can copy capabilities to other processes, if they
 have a channel to send them (using an existing capability).  This way, any
 operation of the process with the external world goes through a capability, and
-only one system call is needed, namely \textit{invoke}.
+only one system call is needed: \textit{invoke}.

-This has a very nice side-effect, namely that it becomes very easy to tap
+This has a very nice side-effect, which is that it becomes very easy to tap
 communication of a task you control.  This means that a user can redirect
 certain requests from programs which don't do exactly what is desired to do
 nicer things.  For example, a program can be prevented from opening pop-up
@@ -155,64 +161,143 @@ This is a very good thing.

 \section{Kernel objects}
 This section describes all the kernel objects, and the operations that can be
-performed on them.
+performed on them.  One operation is possible on any kernel object (except
+messages, and reply and call Capabilities).  This operation is \textit{degrade}.
+It creates a copy of the capability with some rights removed.  This can be
+useful when giving away a capability.

 \subsection{Memory}
-A memory object is a container for storing things.  All objects live inside a
-memory object.  A memory object can contain other memory objects, capabilities,
-receivers, threads and pages.
+A Memory object is a container for storing things.  All objects live inside a
+Memory object.  A Memory object can contain other Memory objects, Capabilities,
+Receivers, Threads, Pages and Cappages.

-A memory object is also an address space.  Pages can be mapped (and unmapped).
-Any Thread in a memory object uses this address space while it is running.
+A Memory object is also an address space.  Pages can be mapped (and unmapped).
+Any Thread in a Memory object uses this address space while it is running.

-Every memory object has a limit.  When this limit is reached, no more pages can
-be allocated for it (including pages which it uses to store other objects).
-Using a new page in a memory object implies using it in all ancestor memory
+Every Memory object has a limit.  When this limit is reached, no more Pages can
+be allocated for it (including Pages which it uses to store other objects).
+Using a new Page in a Memory object implies using it in all ancestor Memory
 objects.  This means that setting a limit which is higher than the parent's
 limit means that the parent's limit applies anyway.

-Operations on memory objects:
+Operations on Memory objects:
 \begin{itemize}
-\item
+\item Create a new item of type Receiver, Memory, Thread, Page, or Cappage.
+\item Destroy an item of any type, which is owned by the Memory.
+\item List items owned by the Memory, Pages mapped in it, and messages in the
+queues of owned Receivers.
+\item Map a Page at an address.
+\item Get the Page which is mapped at a certain address.
+\item Get and set the limit, which is checked when allocating pages for this
+Memory or any sub-structure.
+\item Drop a capability.  This can only be done by Threads owned by the Memory,
+because only they can present capabilities owned by it.\footnote{The kernel
+checks if presented capabilities are owned by the Thread's Memory.  If they
+aren't, no capability is passed instead.  The destroy operation destroys an
+object that a capability points to.  Drop destroys the capability itself.  If a
+Thread from another Memory tried to drop a capability, the kernel would
+refuse to send it in the message, or it would not be dropped because it would
+be owned by a different Memory.}
 \end{itemize}

-\subsection{Page}
-A page can be used to store user data.  It can be mapped into an address space (a memory object).  Threads can then use the data directly.
-
-A page has no operations of itself; mapping a page is achieved using an
-operation on a memory object.
-
 \subsection{Receiver}
 A receiver object is used for inter-process communication.  Capabilities can be
 created from it.  When those are invoked, the receiver can be used to retrieve
 the message.

-Operations on receiver objects:
+Operations on Receiver objects:
 \begin{itemize}
-\item
+\item Set the owner.  This is the Thread that messages will be sent to when
+they arrive.  Messages are stored in the receiver until the owner is ready to
+accept them.  If it is waiting while the message arrives, it is immediately
+delivered.
+\item Create a capability.  The new capability should be given to Threads who
+need to send a message to the receiver.
+\item Create a call capability.  This is an optimization for \textit{calls},
+which happen a lot: a capability is created, sent in a message, then a reply
+is sent over this new capability, and then it is dropped.  All of this
+can be done using a call capability.  The call capability is invoked instead of
+the target, and the target is specified where the reply capability should be.
+The message is sent to the call capability (which is handled by the Receiver in
+the kernel).  It creates a new reply capability and sends the message to the
+target with it.  When the reply capability is invoked, the message is sent to
+the owner, and the capability is dropped.  This approach reduces the number of
+kernel calls from four (create, call, reply, drop) to two (call, reply).
+\end{itemize}
+
+\subsection{Thread}
+Thread objects hold the information about the current state of a thread.  This
+state is used to continue running the thread.  The address space is used to map
+the memory for the Thread.  Different Threads in the same address space have
+the same memory mapping.  All Threads in one address space (often there is only
+one) together are called a process.
+
+Because all threads have a capability to their own Thread object (for claiming
+Receivers), this is also used to make some calls which don't actually need an
+object.  The reason that these are not operations on some fake object which
+every process implicitly owns, is that for debugging it may be useful to see
+every action of a process.  In that case, all its capabilities must be pointing
+to the watcher, which will send them through to the actual target (or not).
+With such an implicit capability, it would be impossible to intercept these
+calls.
+
+Operations on Thread objects:
+\begin{itemize}
+\item Get information about the thread.  Details of this are
+architecture-specific.  Standard ways are defined for getting and setting some
+flags (whether the process is running or waiting for a message; setting these
+flags is a way to control this for other Threads), the program counter and the
+stack pointer.  This call is also used to get the contents of processor
+registers and possibly other information which is different per Thread.
+\item Let the kernel schedule the next process.  This is not thread-specific.
+\item Get the top Memory object.  This is not thread-specific.  Most Threads
+are not allowed to perform this operation.  It is given to the initial Threads.
+They can pass it on to Threads that need it (mostly device drivers).
+\item In the same category, register a Receiver for an interrupt.  Upon
+registration, the interrupt is enabled.  When the interrupt arrives, the
+registered Receiver gets a message from the kernel and the interrupt is
+disabled again.  After the Thread has handled the interrupt, it must reregister
+it in order to enable it again.
+\item And similarly, allow these privileged operations (or some of them) in
+another thread.  This is a property of the caller, because the target thread
+normally doesn't have the permission to do this (otherwise the call would not
+be needed).  The result of this operation is a new Thread capability with all specified rights set.  Normally this is inserted in a privileged process's address space during setup, before it is run (instead of the capability which is obtained during Thread creation).
+\end{itemize}
+
+\subsection{Page and Cappage}
+A Page can be used to store user data.  It can be mapped into an address space
+(a Memory object).  Threads can then use the data directly.  A Cappage is very
+similar, in that it is owned by the user.  However, the user cannot see its
+contents directly.  It contains a frame with Capabilities.  They can be invoked
+like other owned capabilities.  The main feature of a Cappage, however, is that
+it can be shared.  This is a fast way to copy many capabilities to a different
+address space.  Capabilities in a Cappage are not directly owned by the Memory,
+and thus cannot be dropped.
+
+Operations on Page and Cappage objects:
+\begin{itemize}
+\item Copy or move the frame to a different Page, which is usually in a
+different Memory.  This way, large amounts of data can be copied between
+address spaces without needing to really copy it.
+\item Set or get flags, which contain information on whether the page is
+shared, is writable, has a frame allocated, and is paying for the frame.  Not
+all flags can be set in all cases.
+\item Cappages can also set a capability in the frame (pointed to with an index).
 \end{itemize}

 \subsection{Capability}
 A capability object can be invoked to send a message to a receiver or the
 kernel.  The owner cannot see from the capability where it points.  This is
 important, because the user must be able to substitute the capability for a
-different one, without the program noticing.
+different one, without the program noticing.  In some cases, it is necessary to
+refer to capabilities themselves.  For example, a Memory can list the Capabilities
+owned by it.  In such a case, the list consists of Capabilities which point to
+other Capabilities.  These capabilities can also be used to destroy the target
+capability (using an operation on the owning Memory object), for example.

 Operations or capability objects:
 \begin{itemize}
-\item
-\end{itemize}
-
-\subsection{Thread}
-Thread objects hold the information about the current state of a thread.  This
-state is used to continue running the thread.  The address space is used to map
-the memory for the thread.  Different threads in the same address space have
-the same memory mapping.  All threads in one address space (often just one)
-together are called a process.
-
-Operations on thread objects:
-\begin{itemize}
-\item
+\item Get a copy of the capability.
 \end{itemize}

 \end{document}
--- a/report/making-of.tex
+++ b/report/making-of.tex
@@ -80,6 +80,28 @@ document about how to do this.  Please read that if you don't have a working
 cross-compiler, or if you would like to install libraries for cross-building
 more easily.

+\section{Choosing a language to write in}
+Having a cross-compiler, the next thing to do is choose a language.  I prefer
+to use C++ for most things.  I have used C for a previous kernel, though,
+because it is more low-level.  This time, I decided to try C++.  But since I'm
+not linking any libraries, I need to avoid things like new and delete.  For
+performance reasons I also don't use exceptions.  They might need library
+support as well.  So what I use C++ for is classes with member functions, and
+default function arguments.  I'm not even using these all the time, and the
+whole thing is very much like C anyway.
+
+Except for one change I made: I'm using a \textit{pythonic preprocessor} I
+wrote.  It changes Python-style indented code into something a C compiler
+accepts.  It shouldn't be too hard to understand if you see the kernel source.
+Arguments to flow control instructions (if, while, for) do not need
+parentheses, but instead have a colon at the end of the line.  After a colon at
+the end of a line follows a possibly empty indented block, which is put in
+braces.  Indenting a line with respect to the previous one without a colon
+will not do anything: it makes it a continuation.  Any line which is not empty
+or otherwise special gets a semicolon at the end, so you don't need to type
+those.  When using both spaces and tabs (which I don't recommend), set the tab
+width to 8 spaces.
+
 \section{Making things run}
 For loading a program, it must be a binary executable with a header.  The
 header is inserted by mkimage.  It needs a load address and an entry point.
@@ -414,6 +436,27 @@ personal copy of the data without actually copying anything.  The result for
 the sender is a Page without a frame.  Any mappings it has are remembered, but
 until a new frame is requested, no frame will be mapped at the address.  A Page
 is also able to \textit{forget} its frame, thereby freeing some of its memory
-quota.
+quota (if it stops paying for it as well; a paid-for frame costs quota, and is
+guaranteed to be allocatable at any time).
+
+Another optimization is to specify a minimum number of bytes for a page move.
+If the page needs to be copied, this reduces the time needed to complete that
+operation.  The rest of the page should not contain secret data: it is possible
+that the entire page is copied, for example if it doesn't need to be copied,
+but can be reused.
+
+\section{Copy on write}
+Another nice optimization is \textit{copy on write}: a page is shared
+read-only, and when a page-fault happens, the kernel will copy the contents, so
+that the other owner(s) don't see the changes.  For the moment, I don't
+implement this.  I'm not sure if I want it in the kernel at all.  It can well be
+implemented using an exception handler in user space, and is not used enough to
+spend kernel space on, I think.  But I can change my mind on that later.
+
+\section{Memory listing}
+The last thing to do for now is allowing a memory to be listed.  That is,
+having a suitably privileged capability to a Memory should allow a program to
+see what's in it.  In particular, what objects it holds, and where pages are
+mapped.  Probably also what messages are in a receiver's queue.

 \end{document}