These benchmarks are intended to measure the performance of pthreads. They aren't necessarily complete; see the TODO file for some possible room for expansion. The tests themselves are self-documenting; you can do a make doc to see documentation for all tests. BUILDING THE TESTS ================== Just type make. This will build all tests for all ABIs. Note that all of the tests depend on a dso in the build directory: either libsthrsproc.so, or libsthrpthread.so. By default, the binaries are built to look for these dsos in the directory in which they were built. However, if you want to carry the binaries to other machines, you will need to consider the dsos as well. RUNNING THE TESTS ================= Environment ----------- All of the benchmarks understand two environment variables: THRLIB and HRN_VRB. HRN_VRB controls the verbosity of test output; by default, the tests will produce no output. HRN_VRB can also be set to exactly one of "result", "info", or "vrb", in increasing order of verbosity. result - output only test results; most likely timing information. info - print out everything output by HRN_VRB=result, and any informational messages about the test's progress. vrb - print out everything output by HRN_VRB=info, and any other output in the test; this might include debugging output, printouts in tight inner loops, and other such noise you don't always want to see. If the environment variable HRN_SPROC is set, certain of these benchmarks will use sproc(2) to create threads rather than pthread_create(3P). Such benchmarks will necessarily also use different synchronization constructs when running with sproc's. See the section on the "sthr" library below. Command-line Args ----------------- There is a standard set of command-line arguments that each test accepts. The individual tests also accept their own, idiosyncratic arguments: by convention, standard arguments have a lower-case option letter (-z) and test-defined ones have an upper-case option letter (-Z). The standard command-line arguments are: -t # use # threads. This option is not supported for all tests. Specifically, tests which use more than one type of thread will offer their own command-line arguments to independently set the number of threads of each type. -c # Call pthread_setconcurrency(3P) with # as an argument. This is a no-op when HRN_SPROC is set. -i # Repeat the test # times. -d Print out verbose documentation of this test. -u Print a short usage message, with brief explanations of all command-line options. -h Print a long usage message, with verbose explanations of all command-line options. Use <test> -[uhd] to figure out what other options <test> supports. Results ------- All output from the tests is preceeded with the thread-id of the calling thread. For sprocs, this id is a pid; for pthreads, it is a pthread_t. All timing information is obtained via the SGI cycle counters. Each timer is "named" by the event it times; when a timer completes, if HRN_VRB=result is set, the test will print a message containing the timer's name and the time in microseconds its corresponding event took. For example: $ HRN_VRB=result ./mutex thr0x00010001: single : 1305119.840us thr0x00010000: total : 1362364.000us $ indicates that a single thread took ~1.31 seconds to complete, and that all threads took ~1.36 seconds to complete. Automated Workloads ------------------- There is a somewhat obscure framework in place for specifying workloads; however, if you're in a hurry, cd results and type ./runall to run the standard workload, and record the results. To aid in repetitively running a set of tests, there is a script in the results/ directory called runtests. runtests accepts as an argument a file describing a set of tests you would like to run. Each test's name is given on a line beginning with ':'. Subsequent lines are considered arguments to the command. The script will expand lines like "-n 1,2,3" into 3 separate runs of the command, with the arguments "-n 1", "-n 2", and "-n 3". This could also have been written as "-n 1-3". runtests.pl runs cpp over the file before reading it. This is all easier to see than explain, so look at results/testfile for an example. runtests examines the output of each command, and places a formatted output file in a file named $cmdname.$(hostname).$(date +%H%M%S). runtests.pl. Each line of this file is of the format: <timer-name> <command-line args> <averaged timer value in usecs> For example, running runtests on a file like this: :../mutex -t 1-4 Would produce a file named mutex.babylon.153843, containing something like: single -t 1 467216.8 total -t 1 468403.2 single -t 2 553371.11111111 total -t 2 527496.5 single -t 3 609364.4 total -t 3 711393.72727272 single -t 4 828111.425 total -t 4 850190.56 Graphing Tests -------------- Since the amount of data resulting from a run of runtests can be overwhelming, graphs can help visualize the output. Running ./graphall with an argument that is the directory containing data files from a run of the benchmarks will put the data from the experiments in several postscript files. graphall is a wrapper around grapho. In its simplest form, grapho takes to arguments: a file generated by runtests.pl, and a perl regular expression. The regex is used to match lines in the file that should be considered for the graph. The regex argument to grpaho is slightly different from perl re's in that + a space in the re matches any positive number of whitespace characters; + the symbol '#' in the re indicates that grapho should match a floating point number, and use that number as a parameter to the test. For example, to graph threads vs. time for all threads to complete, using mutex.babylon.153843 above, we could type: $ ./grapho mutex.babylon.153843 "total -t # #" If two '#'s are specified in the re, grapho will make a two-dimensional plot. If more are specified, grapho will make a parametric three-dimensional plot. It is implicit that the first '#' corresponds to the x-axis, the second to the y-axis, and the third (if any) to the z-axis. gnuplot is used to do the graphing; by default, the graph will be in a postscript file called "graph.ps"; the -o option lets you specify a different output file. The -x, -y, and -z flags allow you to label the x, y, and z axes respectively. For example: $ ./grapho -o mutex.ps -x nthreads -y "time (usecs)" mutex.babylon.153843 \ "total -t # #" will produce a graph similar to the one from the previous example, but with labelled axes, in a file called mutex.ps. Special Tests ------------- slock creates real-time threads, and so can only be run as root. affinity uses the mmci interfaces, only available on SN0. It can be run with its -N option on other machines, however. WRITING NEW BENCHMARKS ====================== These tests have been written using a common harness and set of APIs. Any additional tests written for this suite of benchmarks should, for consistency's sake, use the same ones. Use template.c as a starting point for writing new benchmarks; it should be mostly possible to fill in the blanks, using the existing tests as examples. sthr.h defines an interface for tests which can run with both sproc's and pthreads. It restricts itself to the LCD of these two interfaces. Tests which exclusively use the sthr_* interfaces are able to be run with either sproc's or pthreads, depending on the HRN_SPROC environment variable. The harness handles this change transparently to the benchmark writer. All output is done using the trace facilities available in trc.h Things to update ---------------- When you add a benchmark, you should update the following files: localdefs-- to make sure your experiment gets built with everything else results/workload-- find a reasonable set of parameters to your experiment, and include it in the workload file. results/graphall-- add rules to results/graphall if you often want to generate specific graphs from your test runs.