The term pipeline has meaning in electrical and mechanical systems, as well as in software. In general, the term represents the concept of splitting a job into subprocesses in which the output of one subprocess feeds into the next (much like water flows from one pipe segment to the next).
A mechanical example of a pipeline is a washer/dryer system for clothing. Instead of having one unit that both washes and dries, we have two units that together form a pipeline (the output of the washer enters the dryer). If washing takes 1 hour and drying takes 1 hour, the pipeline allows us to finish a full load of laundry every hour, compared to every 2 hours with a single (non-pipelined) unit that washes and then dries. Each item of clothing still requires 2 hours to complete its wash/dry cycle, of course: the pipeline improves throughput, not latency.
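The laundry arithmetic can be checked with a minimal sketch. The function name `hours_to_finish` and its parameters are hypothetical, and the sketch assumes every stage takes the same amount of time and that a new load can enter the washer as soon as the previous one leaves it.

```python
def hours_to_finish(loads, stage_hours=1, stages=2, pipelined=True):
    """Total hours until the last of `loads` loads leaves the final stage.

    Assumption (not from the text): all stages take `stage_hours` each.
    """
    if loads == 0:
        return 0
    if pipelined:
        # The first load needs `stages` steps; each later load finishes
        # one step after the load ahead of it.
        return (stages + loads - 1) * stage_hours
    # A single combined unit handles one load start to finish at a time.
    return loads * stages * stage_hours


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, hours_to_finish(n), hours_to_finish(n, pipelined=False))
```

For a single load both arrangements take 2 hours (the latency is unchanged); for many loads the pipelined time approaches 1 hour per load.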
In electronics, pipelines are used in microprocessors to allow complex logic sequences to execute at faster speeds. Pipelines are closely tied to the engineering concepts of throughput and latency. See Instruction pipeline and Classic RISC pipeline for a fuller discussion.
In computer software, a pipeline is a command line feature prevalent in UNIX and other UNIX-like operating systems. Douglas McIlroy, one of the authors of the early UNIX command shells, noticed that much of the time they were processing the output of one program as the input to another. The UNIX pioneers established a means of chaining the running programs together as co-processes so that the output of the first program becomes the input to the second. This was to become the famous pipes and filters design pattern. A pipeline may be extended to any number of commands with the output of one serving as the input to the next.
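The chaining described above can be sketched in Python with the standard `subprocess` module. This is an illustration only, not how any particular shell implements pipes: the helper `run_pipeline` and the two stage commands `UPPER` and `SORT` are hypothetical, and the stages are small Python one-liners so that the sketch does not depend on any specific UNIX utilities being installed. It is suitable only for small inputs, because the whole input is written before any output is read.

```python
import subprocess
import sys


def run_pipeline(commands, input_text):
    """Start every command at once, wiring each stdout to the next stdin."""
    procs = []
    upstream = None  # stdout of the previous process, if any
    for cmd in commands:
        proc = subprocess.Popen(
            cmd,
            stdin=upstream if upstream is not None else subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        if upstream is not None:
            upstream.close()  # the child holds its own copy of the pipe end
        upstream = proc.stdout
        procs.append(proc)

    procs[0].stdin.write(input_text)   # feed the head of the pipeline
    procs[0].stdin.close()             # signal end of input
    output = procs[-1].stdout.read()   # collect from the tail
    for proc in procs:
        proc.wait()
    return output


# Hypothetical stages: uppercase each line, then sort the lines.
UPPER = [sys.executable, "-c",
         "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())"]
SORT = [sys.executable, "-c",
        "import sys\nsys.stdout.writelines(sorted(sys.stdin))"]

if __name__ == "__main__":
    print(run_pipeline([UPPER, SORT], "pear\napple\n"), end="")
```

The list passed to `run_pipeline` may hold any number of commands, mirroring the way a pipeline may be extended to any number of stages.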
Filter programs are commonly used in a UNIX pipeline, and they usually obey a few conventions: they treat their data as line-structured records, read from the standard input, and write to the standard output.
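A filter that follows these three conventions might look like the sketch below. The function `match_filter` is a hypothetical example, not a standard utility; it defaults to the standard input and output but accepts any file-like objects so that it can be exercised without a shell.

```python
import sys


def match_filter(pattern, infile=None, outfile=None):
    """Copy every line that contains `pattern` from infile to outfile."""
    infile = sys.stdin if infile is None else infile       # read standard input
    outfile = sys.stdout if outfile is None else outfile   # write standard output
    for line in infile:            # one record per line
        if pattern in line:
            outfile.write(line)    # pass the record through unchanged
```

Because the filter neither opens files by name nor prints anything other than records, it can be placed at any position in a pipeline.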
Below is an example of a pipeline that implements a kind of spell checker for this page.
Here is an explanation of the pipeline:
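As an illustration of how a spell-checking pipeline of this kind fits together, here is a minimal sketch written as chained Python functions rather than shell commands. It is not the command line referred to above; the stage names (`words`, `unique_sorted`, `not_in`, `spell_check`) and the tiny word list in the usage example are assumptions made for the example.

```python
import re


def words(lines):
    """Stage 1: break each line into lowercase alphabetic words."""
    for line in lines:
        for word in re.findall(r"[A-Za-z]+", line):
            yield word.lower()


def unique_sorted(items):
    """Stage 2: remove duplicates and sort the remaining words."""
    return sorted(set(items))


def not_in(items, dictionary):
    """Stage 3: keep only the words that the dictionary does not list."""
    known = set(dictionary)
    return [word for word in items if word not in known]


def spell_check(lines, dictionary):
    """Chain the stages: the output of each one is the input of the next."""
    return not_in(unique_sorted(words(lines)), dictionary)


if __name__ == "__main__":
    text = ["The quick brwn fox", "the lazy dgo"]
    dictionary = ["the", "quick", "brown", "fox", "lazy", "dog"]
    print(spell_check(text, dictionary))
```

Each stage does one small job and knows nothing about its neighbours, which is the essence of the pipes and filters approach: the text is split into words, duplicates are removed, and whatever is not found in the dictionary is reported as a possible misspelling.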
John Hartmann, a Danish engineer with IBM, extended the basic pipes and filters paradigm in a number of useful ways. His product, known as CMS Pipelines, is available on a number of IBM platforms.
Some of the salient characteristics that distinguish Hartmann Pipeline from ordinary Unix pipes are:
The utility of the many filters supplied with the program is exemplified by the LOOKUP filter:
LOOKUP matches records in its primary input stream with records in its secondary input stream and writes matched and unmatched records to different output streams. The records are matched on the basis of a key field (the contents of a specified range of columns in the records).
LOOKUP reads records from its primary and secondary input streams and writes records to its primary, secondary, and tertiary output streams, if each is connected. The secondary input stream must be defined and connected.
The records in the secondary input stream are the master records. LOOKUP first reads the master records into a buffer, where records with duplicate key fields are discarded; the first occurrence of a key is retained. The records in the buffer are referred to as the reference.
The records in the primary input stream are the detail records. LOOKUP compares detail records to records in the reference. LOOKUP writes records to three output streams, if each is connected:
key fields. The primary and secondary output streams are severed at the end of file on the primary input stream before records are written to the tertiary output stream.
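The matching behaviour described above can be approximated in a few lines of Python. This is a sketch of the idea only, not the CMS Pipelines implementation or its syntax: the function `lookup` and its parameters are hypothetical, the three output streams are modelled as three returned lists, and the assignment of matched detail records, unmatched detail records, and unreferenced master records to the primary, secondary, and tertiary outputs respectively is an assumption.

```python
def lookup(details, masters, key):
    """Match detail records against master records on a column range.

    `key` is a (start, end) pair of zero-based column offsets (end exclusive).
    Returns (matched_details, unmatched_details, unreferenced_masters).
    """
    start, end = key

    # Read all master records first; only the first record for each key
    # is retained. This buffer is the reference.
    reference = {}
    for record in masters:
        reference.setdefault(record[start:end], record)

    matched, unmatched, used_keys = [], [], set()
    for record in details:
        field = record[start:end]
        if field in reference:
            matched.append(record)      # assumed: primary output
            used_keys.add(field)
        else:
            unmatched.append(record)    # assumed: secondary output
    # Produced only after all detail records have been processed.
    unreferenced = [rec for k, rec in reference.items() if k not in used_keys]
    return matched, unmatched, unreferenced   # last list: assumed tertiary output


if __name__ == "__main__":
    masters = ["001 apple", "002 pear", "001 apricot", "003 plum"]
    details = ["002 x", "004 y", "002 z"]
    print(lookup(details, masters, (0, 3)))
```

In the usage example the duplicate master record `001 apricot` is discarded because a record with key `001` was already in the reference.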
This arrangement allows one to use other filters to prepare the dictionary, or master records, for input to LOOKUP from whatever source is required. The many input/output filters, or drivers, allow a Hartmann pipeline to interact directly with a variety of data sources, from files to the system itself to such things as TCP/IP ports. The repertoire of filters and drivers is rich enough that one could, for example, write a server that consisted solely of a Hartmann pipeline.