grep
The name grep is short for "global regular expression print,"
after the g/re/p command of the ed editor. The command grep is a UNIX
filter that searches for regular expressions and fixed strings within
ASCII documents.
Regular expressions are patterns or templates that are defined by
a combination of ASCII strings and metacharacters. The metacharacters,
characters that represent something other than their literal meaning,
allow you to specify search tasks such as "Find all strings that start
at the beginning of a line and contain a character sequence with three
g's in it."
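As a minimal sketch of how a metacharacter changes a search, the commands below compare an anchored pattern with an unanchored one. The sample lines and the temporary file are invented for illustration, and `^ggg` is only one possible reading of the search task quoted above (a line that begins with three g's).

```shell
# Hypothetical sample data, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'gggreat start' 'eggg in the middle' 'no match here' > "$tmp"

# ^ is a metacharacter that anchors the match to the beginning of a line;
# ggg is ordinary literal text.
starts=$(grep '^ggg' "$tmp")

# Without the anchor, ggg may appear anywhere in the line.
anywhere=$(grep 'ggg' "$tmp")

printf 'anchored:\n%s\n' "$starts"
printf 'unanchored:\n%s\n' "$anywhere"
rm -f "$tmp"
```

The anchored search reports only the first sample line, while the unanchored search reports the first two.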
There is actually a family of grep commands (grep, fgrep, and egrep), each designed for a different task.
Which grep command you need to use depends on the complexity
of the search task. fgrep is the command to use for strings
that contain no wildcards or other metacharacters, just a single
text pattern that must be matched exactly. grep is the command
to use for general-purpose searching that requires the use of
wildcards and other metacharacters to specify string position,
string size, character class, or closure. egrep is the
extended version of grep. It can handle expressions just like
the grep command, but it allows more variability in the search
by allowing the search pattern to be "string1 or string2,
followed by string3 or string4."
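The difference between the three commands can be sketched with one small file. The sample lines are invented, and the sketch uses `grep -F` and `grep -E`, the POSIX spellings of fgrep and egrep, because some current systems print a warning when the older names are used.

```shell
# Hypothetical sample data.
tmp=$(mktemp)
printf '%s\n' 'a.c compiled' 'abc compiled' 'red apple' 'green pear' 'blue plum' > "$tmp"

# Fixed string (fgrep behavior): the dot is literal, so only "a.c" matches.
fixed=$(grep -F 'a.c' "$tmp")

# Basic regular expression (grep): the dot matches any single character,
# so both "a.c" and "abc" match.
basic=$(grep 'a.c' "$tmp")

# Extended regular expression (egrep behavior): "red or green",
# followed by a space and then "apple or pear".
extended=$(grep -E '(red|green) (apple|pear)' "$tmp")

printf 'fixed:\n%s\n' "$fixed"
printf 'basic:\n%s\n' "$basic"
printf 'extended:\n%s\n' "$extended"
rm -f "$tmp"
```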
The general syntax for grep commands that search for regular
expressions is:
grep <expression> <filename> [<filename> ...]
The fgrep command is similar, except that it searches only for fixed
strings:
fgrep <string(s)> <filename> [<filename> ...]
Grep commands can be restricted to a single filename, or can be told
to search a series of files, either by listing them in order, or by using
wildcard characters.
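The two ways of naming several files might look like the following sketch. The directory, file names, and contents are hypothetical. When more than one file is searched, grep prefixes each matching line with the name of the file it came from.

```shell
# Hypothetical files in a temporary directory.
dir=$(mktemp -d)
printf 'error: disk full\nok\n' > "$dir/a.log"
printf 'ok\nerror: timeout\n' > "$dir/b.log"
printf 'error: ignored\n' > "$dir/notes.txt"

# Listing the files in order on the command line.
listed=$(cd "$dir" && grep 'error' a.log b.log)

# Letting the shell expand a wildcard; notes.txt does not match *.log,
# so it is not searched.
globbed=$(cd "$dir" && grep 'error' *.log)

printf '%s\n' "$listed"
rm -rf "$dir"
```

Both commands produce the same two lines here, because the shell expands `*.log` to `a.log b.log` before grep runs.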
For details on the specific metacharacters used and the options
available for the grep commands, see the online manual page for
grep.
sort
A filter for sorting alpha-numeric text fields is appropriately
called sort. sort accepts input from stdin by default,
so it can be used in a chain of commands, or it can accept
input from a file:
cat <file1> <file2> | sort | more
or the equivalent:
sort <file1> <file2> | more
By default, the sort is done according to the character or numeric
value in the leftmost column of a field. Fields are separated by
tab or space characters by default, but any other field separators
can be used by defining them on the command line.
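A minimal sketch of the two equivalent forms above, using invented file names and contents. `LC_ALL=C` is an assumption added here so that the ordering does not depend on the local language settings.

```shell
# Hypothetical input files.
dir=$(mktemp -d)
printf 'pear\napple\n' > "$dir/file1"
printf 'cherry\nbanana\n' > "$dir/file2"

# sort reading from stdin at the end of a pipeline.
piped=$(cat "$dir/file1" "$dir/file2" | LC_ALL=C sort)

# sort reading the same files directly.
direct=$(LC_ALL=C sort "$dir/file1" "$dir/file2")

printf '%s\n' "$direct"
rm -rf "$dir"
```

Both forms merge the lines of the two files into a single sorted list.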
Useful command-line options for sort are as follows:
-b Ignore leading space characters in the starting and ending positions of a field.
-d Dictionary order. Only letters, digits, space, and tab
are significant in the sort.
-f Treat upper and lower case characters as equivalent.
-n Numeric sort. Sort by arithmetic value.
-r Reverse the order of the sort.
-t<c> Use the character <c> as the field delimiter.
+sp.o sp is the starting position for the sort. +0 is the
leftmost field. .o is the optional character offset
into a field which indicates where the sort should begin.
-ep.o ep specifies the field number before which the
sort is ended. .o is optional; it specifies that the
sort will end at the character just prior to the .o
offset into the ep field.
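The options above can be combined, as in the following sketch. The data is invented. Many current versions of sort no longer accept the +sp and -ep position notation, so the sketch assumes the POSIX -k option instead, where -k2,2 limits the sort key to the second field, counting from 1 rather than 0. `LC_ALL=C` is again an assumption to keep the ordering predictable.

```shell
# Hypothetical name:count records.
tmp=$(mktemp)
printf '%s\n' 'carol:30' 'alice:4' 'bob:100' > "$tmp"

# Default: compare as text, starting at the leftmost character.
by_name=$(LC_ALL=C sort "$tmp")

# -t: makes the colon the field delimiter; -k2,2 sorts on the second field.
# Without -n the counts are compared as text, so "100" sorts before "30".
by_text=$(LC_ALL=C sort -t: -k2,2 "$tmp")

# -n compares the same field by arithmetic value.
by_number=$(LC_ALL=C sort -t: -k2,2 -n "$tmp")

# -r reverses the order, giving the largest count first.
by_number_desc=$(LC_ALL=C sort -t: -k2,2 -n -r "$tmp")

printf 'numeric, descending:\n%s\n' "$by_number_desc"
rm -f "$tmp"
```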