more

2025-04-21 12:27:27 +03:00 · 2009-05-28 23:35:27 +02:00
parent 3fc672f50f
commit e7f42c6b7e
7 changed files with 272 additions and 79 deletions
--- a/report/making-of.tex
+++ b/report/making-of.tex
@@ -54,7 +54,8 @@ format.
 To understand at least something of addresses, it's important to understand the
 memory model of the mips architecture:
 \begin{itemize}
-\item usermode code will never reference anything in the upper half of the memory (above 0x80000000).  If it does, it receives a segmentation fault.
+\item usermode code will never reference anything in the upper half of the
+memory (above 0x80000000).  If it does, it receives a segmentation fault.
 \item access in the lower half is paged and can be cached.  This is called
 kuseg when used from kernel code.  It will access the same pages as non-kernel
 code finds there.
@@ -201,13 +202,16 @@ simple rule in my system: everyone must pay for what they use.  For memory,
 this means that a process brings its own memory where the kernel can write
 things about it.  The kernel does not need its own allocation system, because
 it always works for some process.  If the process doesn't provide the memory,
-the operation will fail.
+the operation will fail.\footnote{There are some functions with \textit{alloc}
+in their name.  However, they allocate pieces of memory that are owned by the
+calling process.  The kernel never allocates anything for itself, except during
+boot.}

 Memory will be organized hierarchically.  It belongs to a container, which I
-shall call \textit{memory}.  The entire memory is the property of another
-memory, its parent.  This is true for all but one, which is the top level
-memory.  The top level memory owns all memory in the system.  Some of it
-directly, most of it through other memories.
+shall call \textit{Memory}.  The entire Memory is the property of another
+Memory, its parent.  This is true for all but one, which is the top level
+Memory.  The top level Memory owns all memory in the system.  Some of it
+directly, most of it through other Memories.

 The kernel will have a list of unclaimed pages.  For optimization, it actually
 has two lists: one with pages containing only zeroes, one with pages containing
@@ -226,15 +230,15 @@ task, and then jumping to it.
 There are two options for the idle task, again with their own drawbacks.  The
 idle task can run in kernel mode.  This is easy, it doesn't need any paging
 machinery then.  However, this means that the kernel must read-modify-write the
-status register of coprocessor 0, which contains the operating mode, on every
+Status register of coprocessor 0, which contains the operating mode, on every
 context switch.  That's quite an expensive operation for such a critical path.

 The other option is to run it in user mode.  The drawback there is that it
 needs a page directory and a page table.  However, since the code is completely
 trusted, it may be possible to sneak that in through some unused space between
 two interrupt handlers.  That means there's no fault when accessing some memory
-owned by others, but the idle task is so trivial that it can be assumed to run
-without affecting them.
+owned by others (which is a security issue), but the idle task is so trivial
+that it can be assumed to run without affecting them.

 \section{Intermezzo: some problems}
 Some problems came up while working.  First, I found that the code sometimes
@@ -246,10 +250,16 @@ In all compiled code, functions are called as \verb+jalr $t9+.  It took quite
 some time to figure this out, but setting t9 to the called function in my
 assembly code does indeed solve the problem.

+I also found that every compiled function starts with setting up gp.  This is
+complete nonsense, since gp is not changed by any code (and it isn't restored
+at the end of a function either).  I'll report this as a bug in the compiler.
+Because it is done for every function, it means a significant performance hit
+for any program.
+
 The other problem is that the machine was still doing unexpected things.
 Apparently, u-boot enables interrupts and handles them.  This is not very nice
 when I'm busy setting up interrupt handlers.  So before doing anything else, I
-first switch off all interrupts by writing 0 to the status register of CP0.
+first switch off all interrupts by writing 0 to the Status register of CP0.

 This also reminded me that I need to flush the cache, so that I can be sure
 everything is correct.  For that reason, I need to start at 0xa0000000, not
@@ -262,12 +272,12 @@ worry about it.

 Finally, I read in the books that k0 and k1 are in fact normal general purpose
 registers.  So while they are by convention used for kernel purposes, and
-compilers will likely not touch them.  However, the kernel can't actually rely
-on them not being changed by user code.  So I'll need to use a different
-approach for saving the processor state.  The solution is trivial: use k1 as
-before, but first load it from a fixed memory location.  To be able to store k1
-itself, a page must be mapped in kseg3 (wired into the tlb), which can then be
-accessed with a negative index to \$zero.
+compilers will likely not touch them, the kernel can't actually rely on them
+not being changed by user code.  So I'll need to use a different approach for
+saving the processor state.  The solution is trivial: use k1 as before, but
+first load it from a fixed memory location.  To be able to store k1 itself, a
+page must be mapped in kseg3 (wired into the tlb), which can then be accessed
+with a negative index to \$zero.

 At this point, I was completely startled by crashes depending on seemingly
 irrelevant changes.  After a lot of investigation, I saw that I had forgotten
@@ -277,11 +287,11 @@ lead to random behaviour.

 \section{Back to the idle task}
 With all this out of the way, I continued to implement the idle task.  I hoped
-to be able to never write to the status register.  However, this is not
+to be able to never write to the Status register.  However, this is not
 possible.  The idle task must be in user mode, and it must call wait.  That
 means it needs the coprocessor 0 usable bit set.  This bit may not be set for
 normal processes, however, or they would be able to change the tlb and all
-protection would be lost.  However, writing to the status register is not a
+protection would be lost.  However, writing to the Status register is not a
 problem.  First of all, it is only needed during a task switch, and they aren't
 as frequent as context switches (every entry to the kernel is a context switch,
 only when a different task is entered from the kernel than exited to the kernel
@@ -289,7 +299,7 @@ is it a task switch).  Furthermore, and more importantly, coprocessor 0 is
 integrated into the cpu, and writing to it is actually a very fast operation and
 not something to be avoided at all.

-So to switch to user mode, I set up the status register so that it looks like
+So to switch to user mode, I set up the Status register so that it looks like
 it's handling an exception, set EPC to the address of the idle task, and use
 eret to ``return'' to it.

@@ -308,4 +318,102 @@ Having a timer is important for preemptive multitasking: a process needs to be
 interrupted in order to be preempted, so there needs to be a periodic interrupt
 source.

+During testing it is not critical to have a timer interrupt.  Without it, the
+system can still do cooperative multitasking, and all other aspects of the
+system can be tested.  So I decided to leave the timer interrupts until I
+write the drivers for the rest of the hardware as well.
+
+\section{Invoke}
+So now I need to accept calls from programs and handle them.  For this, I need
+to decide what such a call looks like.  It will need to send a capability to
+invoke, and a number of capabilities and numbers as arguments.  I chose to send
+four capabilities (so five in total) and also four numbers.  The way to send
+these is by setting registers before making a system call.  Similarly, when the
+kernel returns a message, it sets the registers before returning to the program.
+
+I wrote one file with assembly for receiving interrupts and exceptions
+(including system calls) and one file with functions called from this assembly
+to do most of the work.  For syscall, I call an arch-specific\footnote{I split
+off all arch-specific parts into a limited number of files.  While I am
+currently writing the kernel only for the Trendtac, I'm trying to make it easy
+to port it to other machines later.} invoke function, which reads the message,
+puts it in variables, and calls the real invoke function.
+
+The real invoke function analyzes the called capability: if it is in page 0
+(which is used by the interrupt handlers, and cannot hold real capabilities),
+it must be a kernel-implemented object.  If not, it is a pointer to a Receiver.
+
+Then kernel object calls are handled, and messages to receivers are sent.  When
+all is done, control is returned to the current process, which may or may not
+be the calling process.  If it isn't, the processor state is initialized for
+the new process by setting the coprocessor 0 usable bit in the Status register
+and the asid bits in the EntryHi register of CP0.
+
+\section{Paging}
+While implementing user programs, I needed to think about paging as well.  When
+a TLB miss occurs, the processor must have a fast way to reload it.  For this,
+page tables are needed.  On Intel processors, these need to be in the format
+that Intel considered useful.  On a mips processor, the programmer can choose
+whatever they want.  The Intel format is a page containing the
+\textit{directory}, 1024 pointers to other pages.  Each of those pages contains
+1024 pointers to the actual page.  That way, 10 bits of the virtual address
+come from the directory, 10 bits from the page table, and 12 from the offset
+within the page, leading to a total of 32 bits of virtual memory addressing.
+
+On mips, we need 31 bits, because addresses with the upper bit set will always
+result in an address error.  So using the same format would waste half of the
+page directory.  However, it is often useful to map an address to its page
+information as well.  For this, a shadow page table structure would be needed.
+It seems logical to use the upper half of the directory page for the shadow
+directory.  However, I chose a different approach: I used the directory for
+bits 21 to 30 (as opposed to 22 to 31).  Since the offset within a page still
+takes 12 bits, this leaves 9 bits for the page tables.  I split every page
+table in two, with the data for EntryLo registers in the lower half, and a
+pointer to page information in the upper half of the page.  This way, my page
+tables are smaller, and I waste less space for mostly empty page tables.
+
+To make a TLB refill as fast as possible, I implemented it directly in the
+assembly handler.  First, I check if k0 and k1 are both zero.  If not, I use
+the slow handler.  If they are, I can use them as temporaries, and simply set
+them to zero before returning.  Then I read the current directory (which I save
+during a task switch), get the proper entry from it, get the page table from
+there, get the proper entry from that as well, and put that in the TLB.  Having
+done that, I reset k0 and k1, and return.  No other registers are changed, so
+they need not be saved either.  If anything unexpected happens (there is no
+page table or no page entry at the faulting address), the slow handler is
+called, which will fail as well, but it will handle the failure.  This is
+slightly slower than handling the failure directly, but speed is no issue in
+case of such a failure.
+
+While implementing this, I spent some time searching for a problem.  In the
+end, I found that the value in the EntryLo registers does not have the address
+bits at their normal locations, but shifted right by 6 bits.  I was mapping the
+wrong page in, and thus got invalid data when it was being used.
+
+\section{Sharing}
+The next big issue is sharing memory.  In order to have efficient
+communication, it is important to use shared memory.  The question is how to
+implement it.  A Page can be mapped to memory in the address space that owns
+it.  It can be mapped to multiple locations in that address space.  However, I
+may remove this feature for performance reasons.  It doesn't happen much
+anyway, and it is always possible to map the same frame (a page in physical
+memory) to multiple virtual addresses by creating multiple Pages.
+
+For sharing, a frame must also be mappable in a different address space.  In
+that case, an operation must be used which copies or moves the frame from one
+Page to another.  There is a problem with rights, though: if there is an
+operation which allows a frame to be filled into a Page, then the rights of
+capabilities to that Page may not be appropriate for the frame.  For example,
+if I have a Page with a frame which I am not allowed to write, and a Page which
+I am allowed to write, I should not be able to write to that frame by
+transferring it to the second Page.  So some frame rights must be stored in the
+Page, and they must be updated during copy and move frame operations.
+
+Move frame is only an optimization.  It allows the receiver to request a
+personal copy of the data without actually copying anything.  The result for
+the sender is a Page without a frame.  Any mappings it has are remembered, but
+until a new frame is requested, no frame will be mapped at the address.  A Page
+is also able to \textit{forget} its frame, thereby freeing some of its memory
+quota.
+
 \end{document}