1
0
Files
2022-09-29 17:59:04 +03:00

831 lines
22 KiB
Plaintext

.so common.nr
.nr H1 2
.H 1 "Changes and Additions"
The major additions and changes for the basic services and tools
of the Performance Co-Pilot are described in the following sections.
.P
Refer to the reference pages of the individual utilities for a complete
description of any new functionality.
.H 2 "Infrastructure Changes"
The following changes have been made to the PCP infrastructure that
affect both collector and monitor configurations.
.H 3 "PCP 2.0 to PCP \*(Vn"
.AL 1
.LI
To help with PCP deployments on systems running operating systems other
than IRIX, the Performance Metrics Name Space (PMNS) has been overhauled
to remove the
.B irix.
prefix from the names of the system-centric performance metrics, e.g. \c
.B irix.disk.dev.read_bytes
has become
.BR disk.dev.read_bytes .
In addition to changing the PMNS,
translations are also handled dynamically in the PCP libraries, so
all clients will continue to operate correctly using either the new
or the old names. As a consequence no configuration files will need to
be changed, and monitoring tools will work correctly in environments
with a mixture of new and old style PMNS deployments.
.LI
The PCP inference engine
.BR pmie (1)
has migrated from the
.I pcp.sw.monitor
subsystem to the
.I pcp_eoe.sw.eoe
subsystem,
and the licensing restrictions have been relaxed to allow
.B pmie
to be used to monitor performance on the local host without any
PCP licenses.
.LI
Support for running
.BR pmie (1)
as a daemon has been added.
This
has many similarities to the
.BR pmcd (1)
and
.BR pmlogger (1)
daemon support \-
.BR pmie
can be controlled through the
.BR chkconfig (1)
interface, and
the startup and shutdown script,
.B /etc/init.d/pmie
which supports starting and stopping multiple
.B pmie
instances
monitoring one or more hosts.
This is achieved with the assistance of
another script,
.BR pmie_check (1)
which is similar to the
.B pmlogger
support script
.BR pmlogger_check (1).
.LI
New capabilities have been added to assist in the estimation of PCP
archive sizes.
The
.B \-r
option for
.BR pmlogger (1)
causes the size of the
physical record(s) for each group of metrics and the expected
contribution of the group to the size of the PCP archive for one full
day of collection to be reported in the log file.
The
.B \-s
option to
.BR pmdumplog (1)
will report the size in bytes of each physical record in
the archive.
.LI
Changes to
.BR pmlogger (1)
have greatly reduced the size of the
.I *.meta
files created when logging metrics with
instance domains that change over time.
.LI
As an aid to creating
.B pmlogger
configuration files,
.BR pmlogconf (1)
is a new tool that allows selection of groups of commonly desired
metrics and customization of
.B pmloggger
configurations from a simple interactive dialog.
.LE
.H 3 "PCP 1.3 to PCP 2.0"
.AL 1
.LI
The PCP Performance Metrics Name Space (PMNS) has previously been local
to the application wishing to make reference to PCP metrics by name.
In PCP 2.0 the PMNS operations are directed to the host or archive that is
the source of the desired performance metrics, see
.BR pmns (4).
.P
The default name space is now associated with the PCP collector host
rather than PCP monitor host.
The distributed PMNS involves changes to both the PCP protocols between
client applications and
.BR pmcd ,
and the internal format of PCP archive files (the
.BR pmlogger (1)
utility is able to write PCP archive logs in either the old or
new formats).
.LI
PCP monitor hosts do not require any
name space files, unless the local PCP
client applications need to connect to PCP collector hosts
that have not yet been upgraded to PCP \*(Vn.
The distributed PMNS
is implemented by having the name space functions of the PCP
Performance Metrics Application Programming Interface (PMAPI)
send and receive messages to and from the
collector
.BR pmcd (1),
just like the other PMAPI functions.
.LI
Full interoperability is supported for PCP 2.0, so new PCP components
will operate correctly with either new or old PCP components.
For
example, a PCP client that connects to a PCP 1.x
.B pmcd
or attempts to
process a PCP archive created by a PCP 1.x
.B pmlogger
will revert to using the local PMNS.
.P
The following table describes the supported interoperability
between the client,
.BR pmcd (1)
and
PCP archive log components of PCP 2.0 and PCP 1.x:
.TS
box,center;
l || c | c.
PCP 1.x client PCP 2.0 client
=
PCP 1.x \fIpmcd\fR yes yes
PCP 1.x archive yes yes
PCP 2.0 \fIpmcd\fR yes yes
PCP 2.0 archive no yes
.TE
.LI
The protocols between a PMDA and
.BR pmcd (1)
have also changed, through a new version of the \fIlibpcp_pmda\fR
library.
.P
The following table describes the supported interoperability
between the
.BR pmcd (1)
and PMDA
components of PCP 2.0 and PCP 1.x:
.TS
box,center;
l || c | c | c | c.
PCP 1.x PCP 1.x PCP 2.0 PCP 2.0
daemon PMDA DSO PMDA daemon PMDA DSO PMDA
=
PCP 1.x \fIpmcd\fR yes yes no no
PCP 2.0 \fIpmcd\fR yes no yes yes
.TE
.LI
Product structural changes.
.P
The PCP product images have been re-arranged as follows.
.BL
.LI
.I pcp_eoe
contains all of the pieces that are common to any successful PCP
installation, particularly for a collector system, i.e. a place where
.B pmcd
is running.
.LI
.I pcp_eoe
will ship in IRIX 6.5 and as part of PCP 2.0
for earlier IRIX releases.
.LI
.I pcp_eoe.sw.monitor
includes the
.B oview
monitoring application (for
monitoring the performance of ORIGIN systems, from SC4-PCPORIGIN)
and does not require any PCP licenses.
.LI
.I pcp.sw.base
provides the core PCP components that are outside
.IR pcp_eoe .
.LI
.I pcp.sw.eoe
for IRIX 6.2 or later has been replaced by
.I pcp_eoe.sw.eoe
(more formally the latter updates the former).
For IRIX 5.3,
.I pcp.sw.eoe
is almost empty, but allows for clean upgrades from all earlier PCP versions.
.LI
.I pcp.sw.monitor
provides all of the PCP client tools for monitoring performance.
.LI
The other
.I pcp
subsystems provide optional PCP components.
.LE
.LI
PCP 2.0 introduces some new extensions to the PMAPI, mostly to
accommodate the distributed PMNS and some underlying protocol changes to support
enhanced inter-operability.
The \fIlibpcp\fR and \fIlibpcp_pmda\fR libraries
have been updated to version sgi2.0. The older versions of these
libraries are still supported and may be installed from the
.I pcp.sw.compat
subsystem if required for older PMDAs or customer-developed PCP
applications.
.P
The API has also changed in some ways related to symbol hiding and
changes in function arguments. The
.B xlate_pmapi
script in the
.I pcp_gifts.sw.migrate
subsystem may be used to automate the bulk of the
source code translation from the 1.x
version to the 2.0 version of the PMAPI, i.e. from
.I libpcp.so.1
and
.I libpcp_pmda.so.1
to
.I libpcp.so.2
and
.IR libpcp_pmda.so.2 .
.LI
In the transition from PCP 1.2 to PCP 1.3, the
.I libpcp_lite
library was removed and the functionality replaced by
the new
.B PM_CONTEXT_LOCAL
option to
.BR pmNewContext (3)
in
.IR libpcp .
The old
.I libpcp_lite
library is still available in the
.I pcp.sw.compat
subsystem.
.LI
.BR pmLookupName (3)
now produces an extra error code,
.B PM_ERR_NONLEAF
which means that the given name refers to a non-leaf node in the PMNS
and so no PMID can be returned.
Previously, if a non-leaf
name was given then
.B PM_ERR_NAME
would be returned.
Now the error
.B PM_ERR_NAME
means only that the name does not exist in the name space.
.LI
The function
.BR pmGetChildrenStatus (3)
was added to the PMAPI.
It was introduced mainly to satisfy a new need of
.BR pmchart (1).
As well as getting the names of all the children nodes,
.B pmGetChildrenStatus
returns a parallel array of the status of each node,
indicating if it is a leaf or non-leaf node.
.LI
A new (much smaller) format for PMDA help text files has
been implemented, with support in the
.I libpcp_pmda
library and the
.BR newhelp (1)
utility.
.LI
A
.BR \-L
option was added to
.BR pmdumplog (1)
to produce a more verbose variant of
.B \-l
and report all of the details from a PCP archive label.
.LI
.BR pmlogger (1)
uses the additional
.B pmcd
state change information to embed "mark" records in PCP
archives when new PMDAs are started by
.BR pmcd .
During replay,
this prevents interpolation of values in
the PCP archive across the life of an old and a new instance of the
same PMDA.
.LI
The temporal index file
(``.index'' suffix) is optional when PCP archives are being replayed.
.LI
Changes to track the object format of the booted IRIX kernel, briefly:
.BL
.LI
``MACH'' tag key PCP binaries in the images to ensure installation of
o32, n32 or n64 versions as appropriate, based upon the installation
platform and the IRIX version.
.LI
Only one version of the
.B pmcd
binary will be installed, now in
.IR /usr/etc/pmcd .
.LI
A new
.BR pmobjstyle (1)
command is used to determine the appropriate kernel object style at run
time as required, e.g. when
.BR pmcd (1)
is attaching a DSO PMDA.
.LE
.LE
.H 2 "Collector Changes"
The following changes effect PMCD and the PMDAs that provide the
collection services.
.H 3 "PCP 2.0 to PCP \*(Vn"
.AL 1
.LI
The
.B proc
agent has been changed to use
.I /proc/pinfo/xxxx
if possible and only use
.I /proc/xxxx
if there is no alternative.
Previously this agent always used
.I /proc/xxxx
to extract process information, and this caused unnecessary access
checking to take place and some NFS contention problems were reported
as a result.
.LI
The new
.B espping
PMDA provides quality of service metrics for consumption by the Embedded
Support Partner (ESP) infrastructure (released in IRIX 6.5.5).
This PMDA can be used in conjunction with
.BR pmie (1)
rules generated by the new
.BR pmieconf (1)
tool to detect service failure on either local or remote hosts.
Among the services which can be probed are ICMP, SMTP,
NNTP,
.BR pmcd ,
and local HIPPI interfaces using the new
.BR hipprobe (1)
utility.
.LI
Changes to the way
.BR pmcd (1)
and
.BR pmlogger (1)
are started from
.BR /etc/init.d/pcp .
.AL a
.LI
When
.B pmlogger
is chkconfig'd
.BR on ,
.B pmlogger
instances are launched in the background from
.BR "/etc/init.d/pcp start" ,
as this helps faster system reboots.
In some cases this results in diagnostics from
.B pmlogger
and/or
.B /usr/pcp/bin/pmlogger_check
that previously appeared when
.B /etc/init.d/pcp
was run to now be generated asynchronously \- any such messages
are forwarded to the
.B root
user as e-mail.
These messages are in addition to those
already written to
.I /var/adm/pcp/NOTICES
by
.BR pmpost (1)
from
.BR /etc/init.d/pcp .
.LI
A new utility,
.BR pmcd_wait (1),
provides a more reliable mechanism for detecting that
.B pmcd
is ready to accept client connections.
.LE
.LI
In concert with changes to
.BR pmie ,
the
.B pmcd
PMDA has been extended to export information
about executing
.B pmie
instances and their progress in terms of rule evaluations and action
execution rates.
Refer to the
.B pmcd.pmie.*
metrics.
.LE
.H 3 "PCP 1.3 to PCP 2.0"
.AL 1
.LI
.BR pmcd (1)
has been modified to support the distributed name space.
This has meant the following changes:
.BL
.LI
.BR pmcd (1)
now loads the default name space on startup or an alternative
name space if specified by
.BR \-n .
.LI
On a SIGHUP signal
it will reload the name space if it has been modified since the
last load time.
.LI
.BR pmcd (1)
responds to name space PDU requests using its local name space.
.LE
.LI
.BR pmcd (1)
no longer requires a PCP collector license to start,
however connections will be refused for most PCP clients unless the
PCP collector license is installed and valid.
Some clients (most notably
.BR oview (1))
can connect to
.B pmcd
independent of any PCP licenses.
To support this, the client-pmcd connection protocols have been extended.
.LI
Support has been added for multiple protocol versions for
the client-pmcd IPC and the interaction between
.B pmcd
and the PMDAs.
.LI
The logic used by
.B pmcd
for locating DSO PMDAs of the correct object format has been reworked.
.LI
The diagnostic event tracing options of
.B pmcd
have been extended to include an "unbuffered"
mode where every event is reported as it happens, as opposed to
buffering events in a ring buffer and only reporting on errors and in
response to explicit requests.
.LI
The distributed PMNS changes mean that
.B pmcd
must load and export the PMNS in
response to requests from client applications.
.BR pmcd 's
SIGHUP processing has been extended to include reloading the PMNS if
the external PMNS file has been modified.
.LI
In concert with a change to
.B pmFetch
in
.IR libpcp ,
.B pmcd
is now able to export out-of-band information about
.B pmcd
state changes (specifically PMDA add, drop and
restart) back to client applications.
.LE
.if\n(PC==0\{
.\" IRIX product images
.H 3 "Libirixpmda Changes for IRIX 6.5"
.br
The following incidents were resolved for IRIX 6.5.5.
.VL 8n
.LI 649767
Added a cluster of metrics (\f3stream.\f1*) for streams data,
which is also exported by \f3netstat -m\f1.
.LI 682896
The semantics of the metrics of \f3xbow.\f1{\f3port\f1|\f3total\f1}\f3.\f1{\f3src\f1|\f3dst\f1}
have changed from reporting transfer of bytes to transfer
of micropackets as it is impossible to tell how many
bytes of data are really transferred.
.LE
.P
The following incidents were resolved for IRIX 6.5.4.
.VL 8n
.LI 675673
Added metrics \f3xfs.iflush_count\f1 and \f3xfs.icluster_flushzero\f1
which export xfs inode cluster statistics.
.LE
.P
The following incidents were resolved for IRIX 6.5.3.
.VL 8n
.LI 628012
Added wait I/O metrics \f3kernel.\f1{\f3all\f1|\f3percpu\f1}\f3.waitio.queue\f1 and
\f3kernel.\f1{\f3all\f1|\f3percpu\f1}\f3.waitio.occ\f1 to allow the calculation
of the new \f3sar -q\f1 fields of \f3wioq-sz\f1 and \f3%wioocc\f1.
.LE
.P
The following incidents were resolved for IRIX 6.5.2.
.VL 8n
.LI 558773
Added metrics for the instantaneous disk queue length and for the
running sum of the disk queue lengths.
.LE
.P
The following incidents were resolved for IRIX 6.5.1.
.VL 8n
.LI 588158
A section called "Enabling of Statistics Collection" has
been added to the
.BR libirixpmda (5)
man page.
.LI 603178
Extra diagnostic messages were added to log the state
changes of turning the xlv statistics gathering on or off.
.LE
\} \"PC==0
.H 2 "Monitor Changes"
The major additions and changes for the performance visualization and
analysis tools are described below.
.H 3 "PCP 2.0 to PCP \*(Vn"
.AL 1
.LI
.B pmie
.AL a
.LI
A syntactic restriction in the specification language has been relaxed, and actions may
now have an arbitrary number of quoted arguments (previously at most two arguments
were allowed).
At the same time a problem with the
.B syslog
action was resolved,
allowing the
.B \-p
option to be passed to
.BR logger (1).
For example, this is now valid:
.Ex
some_inst (
(100 * filesys.used / filesys.capacity) > 98 )
-> syslog "-p daemon.info 'file system close to full"
" %h:[%i] %v% " "'";
.Ee
.LI
Metrics with dynamic instance domains are now correctly handled by
.BR pmie .
Previously
.B pmie
instantiated the instance domain when it started, and was oblivious to any
subsequent changes in the instance domain.
This is most useful for rules using the
metrics of the
.B hotproc
PMDA that is part of the
.B pcp
product.
.LI
The
.B pmie
language has been extended to allow two new operators
.B match_inst
and
.B nomatch_inst
that take a regular expression and a boolean expression.
The result is the
boolean AND of the expression and the result of matching (or not matching) the
associated instance name against the regular expression.
.P
For example, this rule evaluates error rates on various 10BaseT Ethernet network
interfaces (e.g. ecN, etN or efN):
.Ex
some_inst
match_inst "^(ec|et|ef)"
network.interface.total.errors > 10 count/sec
-> syslog "Ethernet errors:" " %i";
.Ee
The following rule evaluates available free space for all filesystems except the root
filesystem:
.Ex
some_inst
nomatch_inst "/dev/root"
filesys.free < 10 Mbytes
-> print "Low filesystem free (Mb):" " [%i]:%v";
.Ee
.LI
During rule evaluation,
.B pmie
keeps track of the expected number of rule evaluations,
number of rules actually evaluated, the number of predicates that are true and false, the
number of actions executed, etc.
These statistics are maintained as binary data structures
in the
.BR mmap 'ed
files
.IR /var/tmp/pmie/<pid> .
If
.B pmie
is running on a system with a PCP
collector deployment, the
.B pmcd
PMDA exports these metrics via the new
.B pmcd.pmie.*
group of metrics.
.LI
Some restrictions on the expansion of macros (e.g. $name) have been removed, so macro
expansion can occur anywhere in the
.B pmie
rule specifications.
.LI
There has been some changes to improve the formatting of numeric values reported with
the options
.BR \-v ,
.B \-V
and
.BR \-W ,
and for the expansion of
.B %v
in actions.
In general terms
these have removed extra white space and reduced the likelihood of scientific notation
being used.
.LE
.LI
A set of parameterized
.B pmie
rules have been developed which are applicable to most systems
and will allow
.B pmie
to be used by new users without knowledge of the
.B pmie
syntax.
A new utility,
.BR pmieconf (1)
has been built which allows these rules to be enabled or disabled, or the
parameters and thresholds adjusted for a specific system.
.P
The combination of
.BR pmie ,
.BR pmieconf ,
.B pmie_check
and
.B /etc/init.d/pmie
provides the
infrastructure required for PCP to search for behavior indicative of performance problems in a
fully automated manner with little or no local customization required.
Where customization is
needed,
.BR pmieconf (1)
provides a convenient way of doing this.
.LI
A new utility,
.BR hipprobe (1),
has been added which will check the status of HIPPI interfaces on a
system.
More sophisticated monitoring of HIPPI interfaces will be supported with a
.B hippi
PMDA that will be released as part of the forthcoming PCP for HPC add-on product.
.LE
.H 3 "PCP 1.3 to PCP 2.0"
.AL 1
.LI
Monitor applications no longer use the local default name
space to get PMIDs for metric names unless the name space file
is explicitly given as a
.B \-n
option or the target
.BR pmcd ,
specified by
.BR \-h ,
is a lower revision than PCP \*(Vn.
Monitor applications use the distributed name space by sending PDU
requests to
.BR pmcd .
.LI
To convert an existing user-written monitor application to use
the distributed name space, the following must be done:
.BL
.LI
Only call
.BR pmLoadNameSpace (3)
in the case of being given a
.B \-n
option to specify a particular name space file.
If one only wants to use the distributed name space then
the call to
.BR pmLoadNameSpace (3)
can be deleted altogether.
As soon as
.BR pmLoadNameSpace (3)
is called, the application is in local name space mode.
.LI
All calls to the PMAPI name space functions must be
made in a current PMAPI context.
Previously, since a context
was not used to service name space functions, it
was possible to call these functions before any new
context was created.
Now, if there is no current PMAPI context and
.BR pmLoadNameSpace (3)
has not previously been called
when a name space function is called an error code will be returned.
.LI
The calls to the PMAPI name space functions (in distributed mode) will
take longer to execute (as they must send/receive a PDU to/from a
remote
.BR pmcd ).
This means that the name space calls should not be made
unnecessarily.
.LI
All name space calls should be tested for failure.
This was, of course, always the case, but previously there was
more chance that one could get away without testing some calls
for failure. However, now, every call could have some failure due
to problems with sending or receiving a PDU.
.LE
.LI
The
.BR oview (1)
application for monitoring Origin 200 and Origin 2000 systems has
migrated from the SC4-PCPORIGIN product and is now
included in the
.IR pcp_eoe.sw.monitor
subsystem.
This application is unlicensed and will operate with an unlicensed
.BR pmcd (1).
.LI
Many of the GUI PCP tools have been integrated into the
IRIX Interactive Desktop\(tm environment, including launch
by icon, drag-n-drop behavior for hosts or archives dropped onto PCP
tools and a new
.B pmrun (1)
command to support optional command line arguments for PCP tools
launched from the desktop.
.LI
A
.B PerfTools
page in the Icon Catalog has been created for launching PCP tools.
.LI
The handling of error messages for both GUI and command-line
invocations of most tools has been unified and made more consistent.
.LI
The
.BR pmtime (1)
application that provides time control services to other PCP tools
has been enhanced in a number of ways:
.BL
.LI
A new Archive Time Bounds dialog (available only in archive mode)
allows time window bounds to be constrained or expanded at run-time
(particularly useful when replaying archives that are still growing).
.LI
Extensions to the
.B pmtime
client protocol to allow clients to force VCR state changes and alter
the time window bounds.
.LI
The
.B pmtime
protocol is
.B not
backwards compatible with the PCP 1.x implementation, but all
clients of
.B pmtime
are re-released with the images on the PCP 2.0 CD
and both
.B pmtime
and its clients execute on the same system.
.LE
.LI
The
.BR pmkstat (1)
command has been restructured in the wake of
.IR libpcp_lite 's
demise, and in particular a new
.B \-L
option supports stand alone use without
.BR pmcd (1).
.LE
.H 2 "Features Removed or Deprecated"
In PCP \*(Vn, the following features from earlier PCP versions have been
removed or are deprecated.
.H 3 "PCP 1.3 to PCP 2.0"
.AL 1
.LI
The original intent in the Performance Co-Pilot framework
was to support seamless VCR-replay between a current ``live''
source and a recently created archive source.
.P
In practice, the semantic and operational difficulties associated with
transitions between ``live'' and ``archive'' modes are so significant that
the feature has been abandoned.
.P
Earlier PCP releases included a ``stubbed-out'' application
.BR pmvcr (1)
that did not really do anything
and assorted infrastructure ``hooks''. These have been removed.
.LI
.I libpcp_lite.so
\- replaced by
.B PM_CONTEXT_LOCAL
The standalone (without
.BR pmcd )
library
.I libpcp_lite.so
has been replaced by the
.B PM_CONTEXT_LOCAL
option to the
.BR pmNewContext (3)
routine.