Welcome to the Reiser4 Wiki, the Wiki for users and developers of the ReiserFS and Reiser4 filesystems.

For now, most of the documentation is just a snapshot of the old Namesys site (archive.org, 2007-09-29).

There was also a Reiser4 Wiki (archive.org, 2007-07-06) once on pub.namesys.com.

Reiser4 Hybrid transaction model

From Reiser4 FS Wiki
Jump to: navigation, search

Reiser4 transaction model is a high-level block allocator. User can specify (by mount option) any of 3 currently existing transaction models. First and second ones implement classic journaling and write-anywhere (Copy-on-Write) transaction models. They are implemented as a hard-coded models in various file systems. Here we describe a unique Hybrid Transaction Model suggested by Joshua Macdonald and Hans Reiser in 2001.


Relocation Decisions (relocate or overwrite)

In this transaction model a part of atom's blocks is scheduled to be relocated (RELOCATE_SET). Another part - to be overwritten (OVERWRITE_SET). All system blocks (super-block, bitmap blocks, etc) for obvious reasons always fall to OVERWRITE set. All new nodes (which haven't had block numbers yet) always fall to RELOCATE_SET. For other blocks of the atom the relocation decision is based on the statistics accumulated by a special scanning procedure. A positive decision (relocate) is made in the case if enough (FLUSH_RELOCATE_THRESHOLD) nodes were found for flushing (see comments and source code in flush.c for details).


Block allocation policy for RELOCATE_SET

To allocate blocks for some set of nodes means to arrange mapping from that set to the space of disk addresses.

Let m be some mapping of linearly ordered sets A and B (m: A -> B).

Definition 1. Mapping m is said to keep linear order on A, iff the following implication is true:

for all a in A, b in B
(a < b) => m(a) < m(b)

Comment. "<" in the first part of the implication means linear order on A. "< in the second part of implication means linear order on B.

The important property of Hybrid Transaction Model is that this model keeps parent-first order on the RELOCATE_SET considered as a tree.

Definition 2. Parent-first order on a tree is a linear order on tree nodes determined by the following recursive procedure:

void parent_first(node)
{
    print_node (node);
    if (node->level > leaf) {
        for (i = 0; i < num_children; i += 1)
            parent_first (node->child[i]);
    }
}

Specifically, (node1 < node2) in the terms of parent-first order means that the procedure above prints node1 before node2.

Comment. The definition above assumes that all tree nodes on the same level were ordered by some linear manner (by the array child[i]). In our case the mentioned order is determined by their natural enumeration "from left to right" in the tree.

So, Reiser4 hybrid transaction model keeps parent-first order on the RELOCATE_SET. Specifically, it means that the following implication is true:

   (node1 < node2) => (block_nr(node1) < block_nr(node2))

"<" in the second part of implication means usual linear order on the set of disk addresses (block numbers).

It is achieved by finding and setting a "preceder" for the RELOCATE_SET at the very beginning of its relocation. Preceder is the maximal block number (disk address) with the following properties:

1) it is marked "busy" in the space map;

2) it is smaller than all old block numbers of the given RELOCATE_SET.

Further, during allocation of new block numbers for RELOCATE_SET the variable containing the preceder gets updated and is passed as a "hint" to the low-level space allocator.

Personal tools