Recent Articles



































Linked list



         


In computer science, a linked list is one of the simplest data structures used in computer programming. It consists of a sequence of nodes, each containing arbitrary data fields and one or two references ("links") pointing to the next and/or previous nodes. Linked lists permit insertion and removal of nodes at any point in the list in constant time, but do not allow random access.

[Top]

Variants

[Top]

Singly-linked list

The simplest kind of linked list is a singly linked list, which has one link per node. This link points to the next node in the list, or to a null value or empty list if it is the last node.


A singly linked list containing three integer values

To specify a singly-linked list (e.g. when handling it over to a procedure) it suffices to specify the address of its first element.

[Top]

Doubly-linked list

A more sophisticated kind of linked list is a doubly linked list. Each node has two links, one to the previous node and one to the next.


An example of a doubly linked list.

In some very low level languages, Xor-linking offers a way to implement doubly-linked lists using a single word for both links, although the use of this technique is usually discouraged.

[Top]

Circular list

In a circularly linked list, the first and last nodes are linked together. This can be done for both singly and doubly linked lists. To traverse a circular linked list, you begin at any node and follow the list in either direction until you return to the original node.

Singly linked lists are often handled by giving the address of the last element — because that element also provides quick access to the first one, whereas the converse is not true.

[Top]

Sentinel nodes

Linked lists sometimes have a special dummy or sentinel node at the beginning and/or at the end of the list, which is not used to store data. Its purpose is to simplify or speed up some operations, by ensuring that every data node always has a previous and/or next node, and that every list (even one that contains no data elements) always has a "first" and "last" node.

[Top]

Applications of linked lists

Linked lists are used as a building block for many other data structures, such as stacks, queues and their variations.

The "data" field of a node can be another linked list. By this device, one can construct many linked data structures with lists; this practice originated in the Lisp programming language, where linked lists are a primary (though by no means the only) data structure, and is now a common feature of the functional programming style.

Sometimes, linked lists are used to implement associative arrays, and are in this context called association lists. There is very little good to be said about this use of linked lists; they are easily outperformed by other data structures such as self-balancing binary search trees even on small data sets (see the discussion in associative array). However, sometimes a linked list is dynamically created out of a subset of nodes in such a tree, and used to more efficiently traverse that set.

[Top]

Tradeoffs

As for most choices in computer programming and design, no method is well suited to all circumstances. A linked list data structure might work well in one case, and cause problems in another. This is a list of some of the common tradeoffs involving linked list structures. In general, if you have a dynamic collection, where elements are frequently being added and deleted, and the location of new elements added to the list is significant, then benefits of a linked list increase.

[Top]

Linked lists vs. arrays

Linked lists have several advantages over arrays. For one thing, a linked list allows a new element to be inserted or deleted in any position, with a constant number of operations (changing a couple of pointers). Linked lists can also save memory in applications that need one or more lists of unpredictable size, since the total space used will be proportional to the number of elements actually present in all lists (as opposed to the number of lists times the maximum size of each list).

Further memory savings can be achieved, in certain cases, by sharing the same tail elements among two or more lists. In this way, one can add new elements to the front of the list while keeping a reference to both the new and the old versions — which provides the simplest example of a persistent data structure.

On the other hand, compared to an array, the chief disadvantage of a singly linked list is that it only allows sequential access to elements, and only in one direction (from front to back). In particular, when removing or inserting an element in the middle of the list, one needs to know the address of the preceding element.

Another disadvantage of linked lists is the extra storage needed for the address fields, which often makes them impractical for lists of small data items such as characters or boolean values. Even more significant is the time cost of allocating memory separately for each new element (and, in some systems, of locating and recycling elements that cannot be reached; see garbage collection). There are a number of techniques to reduce this overhead, such as using a free list or memory pool, storing the data itself inside the nodes, and CDR coding.

[Top]

Double-linking vs. single-linking

Double-linked lists require more space per node (unless one uses xor-linking), and their elementary operations are more expensive; but they are often easier to manipulate because they allow sequential access to the list in both directions. In particular, one can insert or delete a node in a constant number of operations given only that node's address. Some algorithms, such as quicksort, require access in both directions. On the other hand, they do not allow tail-sharing, and cannot be used as persistent data structures.

[Top]

Circular lists

Circular linked lists are most useful for describing naturally circular structures, and have the advantage of regular structure and being able to traverse the list starting at any point. They also allow quick access to the first and last records through a single pointer (the address of the last element). Their main disadvantage is the complexity of iteration, which has subtle special cases.

[Top]

Sentinel nodes

Sentinel nodes may simplify certain list operations, by ensuring that the next and/or previous nodes exist for every element. However sentinel nodes use up extra space (especially in applications that use many short lists), and they may complicate other operations.

[Top]

Linked List Operations

When manipulating linked lists in-place, care must be taken to not use values that you have invalidated in previous assignments. This makes algorithms for inserting or deleting linked list nodes somewhat subtle. This section gives pseudocode for adding or removing nodes from singly, doubly, and circularly linked lists in-place. Throughout we will use null to refer to an end-of-list marker or sentinel, which may be implemented in a number of ways.

[Top]

Singly-linked lists

Our node data structure will have two fields. We also keep a variable head which always points to the first node in the list, or is null for an empty list.

The following is wikicode, a proposed pseudocode for BambooWeb articles.

record Node { data // The data being stored in the node; often a reference to the actual data next // A reference to the next node } Node head // points to front of list Node nodeBeingRemoved // used in later examples

Traversal of a singly-linked list is easy:

node := head // start at front of list while node not null { // loop through list (do something with node.data) node := node.next // go to next node in list }

The following code inserts a node after an existing node in a singly linked list. Inserting a node before an existing one cannot be done; instead, you have to locate it while keeping track of the previous node.

function insertAfter(Node node, Node newNode) { // insert newNode after node newNode.next := node.next // point new node to next node node.next  := newNode // add new node to list }

Inserting at the beginning of the list requires a separate function. This requires updating head.

function insertBeginning(Node newNode) { // insert node before current head newNode.next := head // point to rest of list head  := newNode // new front of list }

Similarly, we have functions for removing the node after a given node, and for removing a node from the beginning of the list. To find and remove a particular node, one must again keep track of the previous element.

function removeAfter(Node node) { // remove node past this one nodeBeingRemoved := node.next // this is the node to delete node.next := node.next.next // point past node (the above statement could be node.next = nodeBeingRemoved.next) destroy nodeBeingRemoved // free up deleted node }
function removeBeginning(Node node) { // remove head node nodeBeingRemoved := head // this is the node to delete head := head.next // point past deleted node destroy nodeBeingRemoved // free up deleted node }

Notice that removeBeginning() sets head to null when removing the last node in the list.

[Top]

Doubly-linked lists

With doubly-linked lists there are even more pointers to update, but also less information is needed, since we can use backwards pointers to observe preceding elements in the list. This enables new operations, and eliminates special-case functions. We will add a prev field to our nodes, pointing to the previous element, and a tail variable which always points to the last node in the list. Both head and tail are null for an empty list.

Iterating through a doubly linked list can be done in either direction. In fact, direction can change many times, if desired.

Forwards

node := head while node &neq; null do something with node.value node := node.next

Backwards

node := tail while node ≠ null <do something with node.value> node := node.prev

These symmetric functions add a node either after or before a given node:

function insertAfter(node, newNode) newNode.prev := node newNode.next := node.next if node.next = null tail := newNode else node.next.prev := newNode node.next := newNode
function insertBefore(node, newNode) newNode.prev := node.prev newNode.next := node if node.prev is null head := newNode else node.prev.next := newNode node.prev  := newNode

We also need a function to insert a node at the beginning of a possibly-empty list:

function insertBeginning(newNode) if head = null head := newNode tail := newNode newNode.prev := null newNode.next := null else insertBefore(head, newNode)

A symmetric function inserts at the end:

function insertEnd(newNode) if tail = null insertBeginning(newNode) else insertAfter(tail, newNode)

Removing a node is easier, only requiring care with head and tail:

function remove(node) if node.prev = null head = node.next else node.prev.next = node.next if node.next = null tail := node.prev else node.next.prev = node.prev destroy node

One subtle consequence of this procedure is that deleting the last element of a list sets both head and tail to null.

[Top]

Circularly-linked lists

Circularly-linked lists can be either singly or doubly linked. For both, their circular versions, which link all the nodes into a continuous circle without using null, benefit from the ability to traverse the full list beginning at any given node. This often allows us to avoid storing head and tail, although if the list may be empty we need a special representation for the empty list, such as a head variable which points to some node in the list or is null if it's empty; we use such a head here. This representation significantly simplifies adding and removing nodes with a non-empty list, but empty lists are then a special case.

Assuming that someNode is some node in a non-empty list, this code iterates through that list starting with someNode:

Forwards

node := someNode do do something with node.value node := node.next while node &neq; someNode

Backwards

node := someNode do do something with node.value node := node.prev while node &neq; someNode

Notice the postponing of the test to the end of the loop. This is important for the case where the list contains only the single node someNode.

This simple function inserts a node into a doubly-linked circularly linked list after a given element:

function insertAfter(node, newNode) newNode.next := node.next newNode.prev := node node.next.prev := newNode node.next  := newNode

Inserting an element in a possibly empty list requires a special function:

function insertBeginning(node) if head = null node.prev := node node.next := node else insertAfter(head.prev, node) head := node

Finally, removing a node must deal with the case where the list empties:

function remove(node) if node.next = node head := null else node.next.prev := node.prev node.prev.next := node.next destroy node
[Top]

Linked lists using arrays of nodes

Languages that don't support any type of reference can still create links by replacing pointers with array indices. The approach is to keep an array of records, where each record has integer fields indicating the index of the next and (possibly) previous node in the array. Not all nodes in the array need be used. If even records are not supported, parallel arrays can often be used instead.

As an example, consider the following type structure in Visual Basic:

TYPE foo Next as Integer Prev as Integer ' if a double linked list is desired Name as String Balance as Float END TYPE

By creating an array of these structures, and an integer variable to store the index of the head element, a linked list can be built:

Dim ListHead as Integer Dim FooBar(1000) as foo

Links between elements are formed by placing the array index of the next (or previous) cell into the Next or Prev field within a given element. For example:

IndexNextPrevNameBalance
014Jones, John123.45
1-10Smith, Joseph234.56
24-1Adams, Adam0.00
3Ignore, Ignatius999.99
402Another, Anita876.54
5
6
7

In the above example, ListHead would be set to 2, the location of the first entry in the list. Notice that entry 3 and 5 through 7 are not part of the list. These cells are available for any additions to the list. By creating a ListFree integer variable, a serialized for storage on disk or transfer over a network.

This approach has one main disadvantage, however: it creates and manages a private memory space for its nodes. Not only does this increase complexity of the implementation, but growing it may be difficult or impossible, because it is large, whereas finding space for a new linked list node in a large, general memory pool is easier. This slow growth also affects algorithmic performance, as it causes a few insert operations to unexpectedly take linear (O(n)) instead of constant time (although it's still amortized constant). Using a general memory pool also leaves more memory for other data if the list is smaller than expected or if many nodes are freed.

[Top]

Language Support

Many programming languages, such as Lisp and Scheme have singly linked lists built in. In many functional languages, these lists are constructed from nodes, each called a cons or cons cell. The cons has two fields: the car, a reference to the data for that node, and the cdr, a reference to the next node. Although cons cells can be used to build other data structures, this is their primary purpose.

In other languages, linked list are typically built using references together with records. Here is a complete example in C:

#include <stdlib.h> /* for malloc */ typedef struct ns { int data; struct ns *next; } node; node *list_add(node ** p, int i) { node *n = malloc(sizeof(*n)); n->next = *p; *p = n; n->data = i; return n; } void list_remove(node ** p) { if (p) { node *n = *p; *p = (*p)->next; free(n); } } node **list_search(node ** n, int i) { if (!n) return NULL; while (*n) { if ((*n)->data == i) { return n; } n = &(*n)->next; } return NULL; } void list_print(node * n) { if (!n) printf("list is empty\n"); while (n) { printf("print %p %p %i \n", n, n->next, n->data); n = n->next; } } main() { node *n = NULL; list_add(&n, 0); list_add(&n, 1); list_add(&n, 2); list_add(&n, 3); list_add(&n, 4); list_print(n); list_remove(&n); /* remove head */ list_remove(&n->next); /* remove second */ list_remove(list_search(&n, 1)); list_remove(&n->next); /* remove tail */ list_remove(&n); /* remove last */ list_print(n); }
[Top]

Internal and external storage

When constructing a linked list, one is faced with the choice of whether to store the data of the list directly in the linked list nodes, called internal storage, or to merely store a reference to the data, called external storage. Internal storage has the advantage of making access to the data more efficient, requiring less storage overall, having better locality of reference, and simplifying memory management for the list (its data is allocated and deallocated at the same time as the list nodes).

External storage, on the other hand, has the advantage of being more generic, in that the same data structure and machine code can be used for a linked list no matter what the size of the data is. It also makes it easy to place the same data in multiple linked lists. Note, however, that even with internal storage the same data can be placed in multiple lists, either by including multiple next references in the node data structure, or by having other linked lists store references to the nodes of the linked list containing the data.

A common misconception is that every list using internal storage requires a new set of functions to operate on the linked list, consuming more effort and code space than an approach using external storage. This is true with a naive approach, but as long as the next pointers are located in the same place in each record, it is possible to create a generic implementation of linked lists with internal storage that only needs to be supplied for each type of list a way of creating and destroying nodes.

[Top]

Example

Suppose you wanted to create a linked list of families and their members. Using internal storage, the structure might look like the following:

record member { // structure for a member of a family member next // pointer to other members of family string fname // first name integer age // age of member } record family { // structure for the family itself family next // pointer to other families string lname // last name of family string address // address of family string phone // phone number of family member members // pointer to list of members of family }

To print a complete list of families and their members using internal storage, we could write:

aFamily := Families // start at head of list while aFamily &neq; null { // loop through list of families print information about family aMember := aFamily.members // get list of family members while aMember ≠ null { // loop through list of members print information about member aMember := aMember.next // to to next member } aFamily := aFamily.next // go to next family }

Using external storage, we would create the following structures:

record node { // generic link structure node next // pointer to next node pointer data // generic pointer for data at node } record member { // structure for family member string fname // first name integer age // age of member } record family { // structure for family string lname // last name of family string address // address of family string phone // phone number of family node members // pointer to list of members of family }

To print a complete list of families and their members using external storage, we could write:

famNode := Families // start at head of list while famNode &neq; null { // loop through list of families aFamily = (family)famNode.data // extract family from node print information about family memNode := aFamily.members // get list of family members while memNode &neq; null { // loop through list of members aMember := (member)memNode.data // extract member from node print information about member memNode := memNode.next // get next family member } famNode := famNode.next // get next family in list }

Notice that when using external storage, an extra step is needed to extract the record from the node and cast it into the proper data type.

[Top]

Related data structures

The skip list is a linked list augmented with layers of pointers for quickly jumping over large numbers of elements, and then descending to the next layer. This process continues down to the bottom layer, which is the actual list.

A binary tree can be seen as a type of linked list where the elements are themselves linked lists of the same nature. The result is that each node may include a reference to the head node of one or two other linked lists, which, together with their contents, form the subtrees below that node.

[Top]




  View Live Article   This article is from Wikipedia. All text is available under the terms of the GNU Free Documentation License