1
0
Files
irix-657m-src/eoe/man/man1/sort.1
2022-09-29 17:59:04 +03:00

344 lines
10 KiB
Groff

'\"macro stdmacro
.if n .pH g1.sort @(#)sort 41.11 of 5/26/91
.\" Copyright 1991 UNIX System Laboratories, Inc.
.\" Copyright 1989, 1990 AT&T
.nr X
.if \nX=0 .ds x} sort 1 "Essential Utilities" "\&"
.if \nX=1 .ds x} sort 1 "Essential Utilities"
.if \nX=2 .ds x} sort 1 "" "\&"
.if \nX=3 .ds x} sort "" "" "\&"
.TH \*(x}
.SH NAME
\f4sort\f1 \- sort and/or merge files
.SH SYNOPSIS
\f4sort\f1
[\f4\-cmu\f1]
[\f4\-o\f2output\f1]
[\f4\-y\f2kmem\f1]
[\f4\-z\f2recsz\f1]
[\f4\-bdfiMnr\f1]
[\f4\-t\f2x\f1]
.br
[\f4\-k\f2keydef\f1]
[\f4+\f2pos1\f1
[\f4\-\f2pos2\f1]]
[\f4\-T\ \f2tdir\f1]
[\f2files\f1]
.SH DESCRIPTION
The \f4sort\fP command
sorts
lines of all the named files together
and writes the result on
the standard output.
The standard input is read if
\f4\-\f1
is used as a filename
or no input files are named.
.PP
Comparisons are based on one or more sort keys extracted
from each line of input.
By default, there is one sort key, the entire input line,
and ordering is lexicographic by bytes in machine
collating sequence.
.PP
\f4sort\f1 processes supplementary code set characters
according to the locale specified in the \f4LC_CTYPE\fP
and \f4LC_COLLATE\f1
environment variables [see \f4LANG\fP on \f4environ\fP(5)],
except as noted below.
Supplementary code set characters are collated in code order.
.PP
The following options alter the default behavior:
.TP 5
\f4\-c\f1
Check that the input file is sorted according to the ordering rules;
give no output unless the file is out of sort.
.TP
\f4\-m\f1
Merge only, the input files are assumed to be already sorted.
.TP
\f4\-u\f1
Unique: suppress all but one in each set of lines having equal keys.
If used with \f4\-c\f1, check that there are no lines with duplicate keys,
in addition to checking that the input file is sorted.
.TP
\f4\-o\f2output\f1
The argument given is the name of an output file
to use instead of the standard output.
This file may be the same as one of the inputs.
.TP
\f4\-y\f2kmem\f1
The amount of main memory used by \f4sort\f1
has a large impact on its performance.
Sorting a small file in a large amount
of memory is a waste.
If this option is omitted,
\f4sort\fP
begins using a system default memory size (64Kb),
and continues to use more space as needed, up to a maximum equal to
one eighth of physical memory.
If this option is presented with a value (\f2kmem\fP),
\f4sort\fP will start
using that number of kilobytes of memory,
unless the administrative minimum or maximum is violated,
in which case the corresponding extremum will be used.
\f4sort\fP will continue to allocate memory until a limit is reached.
This limit is capped at 2Gb and computed to be the larger of the value
given to the \f4\-y\fP option and one half of physical memory.
Thus, \f4\-y\f10
is guaranteed to start with minimum memory.
By convention,
\f4\-y\f1 (with no argument) starts with maximum memory;
on SGI systems, \f4\-y\f1 (with no argument) starts with half
of the total physical memory on the system (and the limit on memory
usage is set to one half of physical memory as well).
.TP
\f4\-z\f2recsz\f1
The size of the longest line read is recorded
in the sort phase so buffers can be allocated
during the merge phase.
If the sort phase is omitted via the
\f4\-c\f1
or
\f4\-m\f1
options, a popular system default size will be used.
Lines longer than the buffer size will cause
\f4sort\fP
to terminate abnormally.
Supplying the actual number of bytes in the longest line
to be merged (or some larger value)
will prevent abnormal termination. If the sort phase is not omitted,
then the maximum line size is calculated
and used as the \f2recsz\fP,
overriding the value of \f4-z\fP.
Thus, the \f4-z\fP option is significant
only when used with \f4-c\fP or \f4-m\fP.
.TP
\f4\-T\ \f2tdir\f1
The argument given is used as a directory where any temporary files
required will be created.
.PP
Sort keys can be specified using the options:
.TP 5
\f4\-k\f2field_start[type][,field_end[type]]\f1
Defines a key field that begins at
\f2field_start\f1
and ends at
\f2field_end\f1
inclusive(provided that
\f2field_end\f1
does not precede
\f2field_start\f1).
A missing
\f2field_end\f1
means the end of the line.
A field comprises a maximal sequence of non-separating characters
and, in the absence of option
\f4\-t\f1
, any preceding field separator.
.PP
The
\f2field_start\f1
portion of the argument has the form:
.IP
field_number[.first_character]
.PP
Fields and characters within fields are numbered starting with 1.
The field_number and first_character pieces, interpreted as positive
decimal integers, specify the first character to be used as part of a
sort key. If first_character is omitted, it refers to the first
character of the field.
.PP
The
\f2field_end\f1
portion of the argument has the form:
.IP
field_number[.last_character]
.PP
The field_number is as described above for
\f2field_start\f1.
The last_character piece, interpreted as a non-negative decimal
integer, specifies
the last character to be used as part of the sort key. If
last_character evaluates to zero or last_character
is omitted, it refers to the last character of the field specified by
field_number.
.PP
type is a modifier from the list of characters \f4bdfinr\f1. Each
modifier behaves like the corresponding options(see below), but apply
only to the key field to which they are attached.
.TP 5
\f4+\f2pos1\f1, \f4\-\f2pos2\f1
This is an obsolescent form of \f4\-k\f1. \f2pos1\f1 corresponds to
\f2field_start\f1, \f2pos2\f1 corresponds to \f2field_end\f1, except
that both fields and characters are numbered from zero instead of
one. The optional type modifiers are the same as in \f4\-k\f1 option.
.PP
The following options override the default ordering rules.
.TP 5
\f4\-d\f1
``Dictionary'' order: only letters, digits, and blanks (spaces and tabs)
are significant in comparisons.
No comparison is performed for multibyte characters.
.TP
\f4\-f\f1
Fold lowercase
letters into uppercase for the purpose of comparison.
Does not apply to multibyte characters.
.TP
\f4\-i\f1
Ignore non-printable characters.
Multibyte and embedded NULL characters are also ignored.
.TP
\f4\-M\f1
Compare as months.
The first three non-blank characters
of the field are folded to uppercase
and compared.
Month names are processed according to the locale specified
in the \f4LC_TIME\f1 environment variable
[see \f4LANG\f1 on \f4environ\f1(5)].
For example, in an English locale the sorting order
would be ``\s-1JAN\s0'' < ``\s-1FEB\s0''
< . . . < ``\s-1DEC\s0.''
Invalid fields compare low to ``\s-1JAN\s0.''
The \f4\-M\f1 option implies the \f4\-b\f1 option
(see below).
.TP
\f4\-n\f1
An initial numeric string,
consisting of optional blanks, an optional minus sign,
and zero or more digits with an optional decimal point,
is sorted by arithmetic value. An empty digit string is treated
as zero. Leading zeros and signs on zeros do not affect ordering.
The \f4\-n\f1 option implies the \f4\-b\f1 option
(see below).
.TP
\f4\-r\f1
Reverse the sense of comparisons.
.TP
\f4\-b\f1
Ignore leading blanks when determining the starting and ending
positions of a restricted sort key.
.TP
\f4\-t\f2x\f1
Use
.I x
as the field separator character;
.I x
is not considered to be part of a field
(although it may be included in a sort key).
Each occurrence of
.I x
is significant
(for example,
.I xx
delimits an empty field).
\f2x\f1 may be a supplementary code set character.
The default field separators are blank characters.
.PP
When ordering options appear before restricted
sort key specifications, the requested ordering rules are
applied globally to all sort keys.
When attached to a specific sort key (described below),
the specified ordering options override all global ordering options
for that key.
.br
.ne 5
.PP
When there are multiple sort keys, later keys
are compared only after all earlier keys
compare equal. Except when the -u option is specified, lines that
otherwise compare equal are ordered as if none of the options -d,
-f, -i, -n or -k were present (but with -r still in effect, if it
was specified) and with all bytes in the lines significant to the
comparison.
.SH EXAMPLES
Sort the contents of
.I infile
with the second field as the sort key:
.IP
\f4sort \-k2 \f2infile\f1
.PP
Sort, in reverse order, the contents of
.I infile1
and
.IR infile2 ,
placing the output in
.I outfile
and using the first character of the second field
as the sort key:
.IP
\f4sort \-r \-o \f2outfile\fP -k 2.1,2.1 \f2infile1 infile2\f1
.PP
Sort, in reverse order, the contents of
.I infile1
and
.I infile2
using the first non-blank character of the second field
as the sort key:
.IP
\f4sort \-r +1.0b \-1.1b \f2infile1 infile2\f1
.PP
Print the password file
[\f4passwd\f1(4)]
sorted by the numeric user
.SM ID
(the third colon-separated field):
.IP
\f4sort \-t: +2n \-3 /etc/passwd\f1
.PP
Sort the contents of the password file using the group ID (third field) as
the primary sort key and the user ID (second field) as the secondary sort
key:
.IP
\f4sort \-t: +3 \-4 +2 \-3 /etc/passwd\f1
.PP
Print the lines of the already sorted file
.IR infile ,
suppressing all but the first occurrence of lines
having the same third field
(the options
\f4\-um\f1
with just one input file make the choice of a unique
representative from a set of equal lines predictable):
.IP
\f4sort \-um +2 \-3 \f2infile\f1
.SH FILES
.PD 0
.TP
\f4/var/tmp/stm???\f1
.TP
\f4/usr/lib/locale/\f2locale\f4/LC_MESSAGES/uxcore.abi\f1
language-specific message file [See \f4LANG\fP on \f4environ\f1 (5).]
.SH SEE ALSO
\f4comm\fP(1),
\f4join\fP(1),
\f4uniq\fP(1)
.SH NOTES
Comments and exits with non-zero status for various trouble
conditions
(for example, when input lines are too long),
and for disorder discovered under the
\f4\-c\f1
option.
.SP
When the last line of an input file
is missing a \f4newline\f1 character,
\f4sort\fP appends one,
prints a warning message, and continues.
\f4sort\fP does not guarantee
preservation of relative line ordering on equal keys.
.PP
If the \f4\-i\f1 and \f4\-f\f1
options are both used, or if the \f4\-i\f1 and \f4\-d\f1
are both used,
the last one given controls the sort behavior;
it is not currently possible to sort with folded case or dictionary order
and non-printing characters
ignored, because of the method used to implement these options.
.\" @(#)sort.1 6.3 of 9/2/83
.Ee