Open policy decisions
=====================

- what to do about cyclic dependencies ?

  A cyclic dependency can be bad news or something perfectly normal,
  depending on how we define the semantics of package A depending on
  package B, and what policy we adopt with respect to the existence of
  cyclic dependencies:

  1) "B must be installed before A"

     In this case, a cyclic dependency means that the package in
     question cannot be installed using the respective sequence of
     installations.

     However, this does not mean that no other sequence can exist in which
     the package could be installed.

     Example:

     A depends on B. There are two versions of B: B_0 depends on nothing
     else while B_1 depends on A.

     If we try to resolve A's dependency with B_1, we enter a circular
     dependency and fail. If we use B_0 instead, there is no problem.

     This means that there are (at least) the following three possible
     policies:

     1A) Cyclic dependencies are tolerated and just mean that the package
         in question may not be installable (for whatever reason).

     1B) A cyclic dependency is always considered an error.

     1C) Cyclic dependencies are tolerated as long as there is a way around
         them, as in the example above.

  2) "B must be installed with A"

     In this case, the cyclic dependency would not be a problem as long as
     all the packages in the cycle are installed together.

     Should an installation get interrupted and cause only part of the
     packages to get installed, the system would then be in an anomalous
     configuration.

     If cyclic dependencies are to be interpreted this way, they are not a
     problem per se. Policy may still discourage their use, though.
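
  Interpretation 1 combined with policy 1C can be sketched as a search
  that skips any version whose dependencies lead back into the chain
  currently being resolved. This is only an illustration: the catalog
  layout, the function name "resolve" and the version "A_0" are
  assumptions made for the example, not part of the existing code.

```python
# Hypothetical catalog: package name -> {version: [names it depends on]}.
CATALOG = {
    "A": {"A_0": ["B"]},
    "B": {"B_1": ["A"], "B_0": []},   # B_1 is listed first on purpose
}


def resolve(name, catalog, in_progress=(), chosen=None):
    """Return versions in install order (prerequisites first), or None."""
    if chosen is None:
        chosen = {}
    if name in chosen:          # already part of the plan
        return []
    if name in in_progress:     # we came back to a package we are resolving
        return None             # -> this alternative is a dead end
    for version, deps in catalog.get(name, {}).items():
        trial = dict(chosen)    # copy, so a failed attempt leaves no trace
        order = []
        for dep in deps:
            sub = resolve(dep, catalog, in_progress + (name,), trial)
            if sub is None:
                break           # try the next version of `name`
            order += sub
        else:                   # no break: every dependency was resolved
            trial[name] = version
            chosen.clear()
            chosen.update(trial)
            return order + [version]
    return None


print(resolve("A", CATALOG))    # prints ['B_0', 'A_0']
```

  Under policy 1B, the "return None" on re-entry would be reported as an
  error instead. Under policy 1A, a final result of None would simply
  mean "not installable".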

- what to do if we need something that's "provided" ?

  When determining prerequisites, we may encounter a dependency on an item
  that only appears in the Provides: field of a package but is not an
  installable package itself.

  Should we

  1) consider installing the package that provides the requested item, or

  2) ignore the package, leaving it to the user to choose what to do, or

  3) if there's only one choice, do 1), else do 2)

  ?

  Policy 1 would make sense if this is merely an alias or if a package
  enumerates its constituents, which were separate packages in the past
  or may become separate packages in the future.

  Example:

  - package "dwarf-pluto" could provide "planet-pluto", for packages that
    haven't been updated yet,

  - "binutils" could provide "as", "ld", etc., to allow packages that only
    need specific parts to depend on them (with the option of breaking
    binutils into its constituents in the future),

  - similarly, if "as", "ld", etc., were individual packages in the past
    but are now combined into "binutils", "binutils" could still provide
    its constituents for compatibility with packages whose dependencies
    have not been updated yet.
  
  Policy 2 would seem more appropriate in the common case of multiple
  choices.


  Example:

  - packages "emacs" and "vim" could both provide "editor", leaving the
    choice to the user.

  - similarly, message packages "foo-en", "foo-zh", etc., could all
    provide "foo-messages".
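
  Policy 3 could be sketched roughly as follows. The dictionary layout
  and the names "providers_of" and "pick_provider" are assumptions made
  for this illustration. They do not describe the existing database.

```python
# Hypothetical view of the database: package name -> set of provided items.
PACKAGES = {
    "dwarf-pluto": {"planet-pluto"},
    "emacs": {"editor"},
    "vim": {"editor"},
}


def providers_of(item, packages):
    """Names of all packages that list `item` in their Provides."""
    return sorted(name for name, provides in packages.items()
                  if item in provides)


def pick_provider(item, packages):
    """Policy 3: select a sole provider, otherwise leave it to the user."""
    candidates = providers_of(item, packages)
    if len(candidates) == 1:
        return candidates[0]    # unambiguous, behave like policy 1
    return None                 # zero or several choices, like policy 2


print(pick_provider("planet-pluto", PACKAGES))  # prints dwarf-pluto
print(pick_provider("editor", PACKAGES))        # prints None
```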

  In the above example, "Provides" could also be used to prioritize choices,
  e.g., if "foo-en" provides "lang-en" and "foo-zh" provides "lang-zh",
  future installations could prefer prerequisites that introduce fewer new
  items. So a package "bar-en" providing "bar-messages" and "lang-en" would
  be chosen over "bar-zh" providing "bar-messages" and "lang-zh" if we have
  already installed "foo-en" but not "foo-zh" (or vice versa).
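
  One possible reading of "introduce fewer new items" is to count, for
  each candidate, the provided items that no installed package provides
  yet, and to prefer the candidate with the smallest count. The sketch
  below assumes this reading. The function names and the data layout are
  made up for the example.

```python
# Hypothetical Provides data, taken from the example above.
PROVIDES = {
    "foo-en": {"foo-messages", "lang-en"},
    "foo-zh": {"foo-messages", "lang-zh"},
    "bar-en": {"bar-messages", "lang-en"},
    "bar-zh": {"bar-messages", "lang-zh"},
}


def new_items(candidate, provides, installed):
    """Items `candidate` provides that no installed package provides yet."""
    already = set()
    for name in installed:
        already |= provides.get(name, set())
    return provides.get(candidate, set()) - already


def prefer(candidates, provides, installed):
    """Sort candidates so that those adding the fewest new items come first."""
    return sorted(
        candidates,
        key=lambda c: (len(new_items(c, provides, installed)), c),
    )


print(prefer(["bar-zh", "bar-en"], PROVIDES, {"foo-en"}))
# prints ['bar-en', 'bar-zh']
```

  Ties are broken by name here, purely to keep the result deterministic.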


Still left to do
================

- consider reducing the size of the lists of conflicts, e.g., by making
  them unique via a red-black tree

- handle Provides:

  Update: Provides data is now parsed and properly integrated in the
  package database, but not yet used to resolve prerequisites.

- sort prerequisites such that they can be installed in the specified order
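
  If "B must be installed before A" is the intended meaning, this amounts
  to a topological sort of the dependency graph. A minimal sketch using
  Python's standard graphlib module (Python 3.9 or later) follows. The
  name "install_order" and the input format are assumptions for the
  example.

```python
from graphlib import CycleError, TopologicalSorter


def install_order(prereqs):
    """Order packages so that every prerequisite precedes its dependents.

    `prereqs` maps a package name to the set of names it depends on.
    Returns None if the dependencies contain a cycle.
    """
    try:
        return list(TopologicalSorter(prereqs).static_order())
    except CycleError:
        return None


print(install_order({"A": {"B"}, "B": {"C"}, "C": set()}))
# prints ['C', 'B', 'A']
print(install_order({"A": {"B"}, "B": {"A"}}))
# prints None
```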

- consider Architecture:

  Update: we parse and record it now but don't use it yet.

- what to do with explicit and implicit replacement ?

- if we can't resolve the prerequisites, give at least a hint of what one
  can do to improve the situation

- check database for internal consistency

  Update: added detection of cyclic dependencies (in progress)

  Update: added test for QPKG_ADDING cleanup bug

- implement keyword search

- consider also supporting the similar but not identical (parent ?) format
  of /var/lib/dpkg/status and /var/lib/apt/lists/*Packages

  Update: added as much as my Ubuntu system can reach before hitting |


Done
====

- optimize the search trees. Right now, we have 81812 calls to make_id
  for 14601 packages, resulting in 7420560 calls to comp_id.

  There can be at most 2 new identifiers per package (package name and
  version), so a perfectly balanced tree should have a depth of no more
  than 14. If we assume that each call to make_id searches to the bottom,
  we'd get 1145368 calls to comp_id, about 15% of the current number.

  So the tree is clearly degenerate.

  Update: after switching to red-black trees, we get only 1497604 calls
  to comp_id. This is about 131% of the "good case" estimate above.
  Insertion of a new node is currently done with two lookups, so we'll
  get rid of some more lookups after further optimization.

  Update: after merging the two lookups per new node into one, we're at
  1172642 calls to comp_id, or 102% of the predicted "good case".
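
  The figures above can be re-derived with a few lines of arithmetic.
  The sketch assumes that the depth of 14 was obtained as the rounded-down
  base-2 logarithm of the maximum number of identifiers. The variable
  names are made up for the example.

```python
import math

packages = 14601
make_id_calls = 81812
comp_id_before = 7420560        # degenerate tree
comp_id_red_black = 1497604     # red-black tree, two lookups per new node
comp_id_merged = 1172642        # red-black tree, one lookup per new node

max_ids = 2 * packages                    # name and version per package
depth = math.floor(math.log2(max_ids))    # assumed origin of the depth of 14
good_case = make_id_calls * depth         # every lookup reaches the bottom


def percent(calls, base):
    """Express `calls` as a percentage of `base`, to one decimal place."""
    return round(100 * calls / base, 1)


print(depth, good_case)                       # prints 14 1145368
print(percent(good_case, comp_id_before))     # share of the original count
print(percent(comp_id_red_black, good_case))  # after switching trees
print(percent(comp_id_merged, good_case))     # after merging the lookups
```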

- if there are multiple choices, try to prefer more recent versions

- check whether introducing a new package would cause a conflict

  Update: conflicts among the packages considered for installation are now
  checked.

- compile the list of conflicts of installed packages