The Vinum volume manager

Last updated: 16 April 1999

Increased resilience: RAID-5

The alternative approach to resilience is RAID-5. A RAID-5 configuration might look like:
drive e device /dev/da6h
volume raid5
  plex org raid5 512k
    sd length 128m drive a
    sd length 128m drive b
    sd length 128m drive c
    sd length 128m drive d
    sd length 128m drive e
Although this plex has five subdisks, it is the same size as the plexes in the other examples, since the equivalent of one subdisk is used to store parity information. After Vinum processes this configuration, the system configuration is:
Drives:         5 (8 configured)
Volumes:        4 (4 configured)
Plexes:         5 (8 configured)
Subdisks:       12 (16 configured)

D a                     State: up       Device /dev/da3h        Avail: 1293/2573 MB (50%)
D b                     State: up       Device /dev/da4h        Avail: 1805/2573 MB (70%)
D c                     State: up       Device /dev/da5h        Avail: 2317/2573 MB (90%)
D d                     State: up       Device /dev/da6h        Avail: 2317/2573 MB (90%)
D e                     State: up       Device /dev/da6h        Avail: 2445/2573 MB (95%)

V myvol                 State: up       Plexes:       1 Size:        512 MB
V mirror                State: up       Plexes:       2 Size:        512 MB
V striped               State: up       Plexes:       1 Size:        512 MB
V raid5                 State: up       Plexes:       1 Size:        512 MB

P myvol.p0            C State: up       Subdisks:     1 Size:        512 MB
P mirror.p0           C State: up       Subdisks:     1 Size:        512 MB
P mirror.p1           C State: initializing     Subdisks:     1 Size:        512 MB
P striped.p0          S State: up       Subdisks:     4 Size:        512 MB
P raid5.p0            R State: init     Subdisks:     5 Size:        512 MB

S myvol.p0.s0           State: up       PO:        0  B Size:        512 MB
S mirror.p0.s0          State: up       PO:        0  B Size:        512 MB
S mirror.p1.s0          State: empty    PO:        0  B Size:        512 MB
S striped.p0.s0         State: up       PO:        0  B Size:        128 MB
S striped.p0.s1         State: up       PO:      512 kB Size:        128 MB
S striped.p0.s2         State: up       PO:     1024 kB Size:        128 MB
S striped.p0.s3         State: up       PO:     1536 kB Size:        128 MB
S raid5.p0.s0           State: init     PO:        0  B Size:        128 MB
S raid5.p0.s1           State: init     PO:      512 kB Size:        128 MB
S raid5.p0.s2           State: init     PO:     1024 kB Size:        128 MB
S raid5.p0.s3           State: init     PO:     1536 kB Size:        128 MB
S raid5.p0.s4           State: init     PO:     2048 kB Size:        128 MB
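
The sizes and plex offsets (PO) in this listing follow from simple arithmetic, which the short Python sketch below reproduces. It is an illustration only, not part of Vinum: the function names are hypothetical, and it assumes what the listing suggests, namely that each subdisk of a striped plex starts one stripe further into the plex address space than its predecessor.

```python
# Illustrative sketch only: these helpers are hypothetical, not Vinum code.
KB = 1024
MB = 1024 * KB


def subdisk_plex_offsets(subdisk_count, stripe_size):
    """Plex offset (PO) of each subdisk, assuming one stripe per subdisk in turn."""
    return [index * stripe_size for index in range(subdisk_count)]


def raid5_plex_size(subdisk_count, subdisk_size):
    """Usable RAID-5 plex size: the equivalent of one subdisk holds parity."""
    return (subdisk_count - 1) * subdisk_size


# The striped plex from the earlier example: four subdisks, 512 kB stripes.
striped_offsets = subdisk_plex_offsets(4, 512 * KB)
print([offset // KB for offset in striped_offsets])  # [0, 512, 1024, 1536]

# The RAID-5 plex above: five 128 MB subdisks yield 512 MB of usable space.
print(raid5_plex_size(5, 128 * MB) // MB)  # 512
```
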
The following figure represents this volume graphically.

A RAID-5 Vinum volume

As with striped plexes, the darkness of the stripes indicates the position within the plex address space: the lightest stripes come first, the darkest last. The completely black stripes are the parity stripes.

On creation, RAID-5 plexes are in the init state: before they can be used, the parity data must be created. Vinum currently initializes RAID-5 plexes by writing binary zeros to all subdisks, though a conceivable alternative would be to rebuild the parity blocks, which would allow better recovery of crashed plexes.
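
Writing zeros works as an initialization because the parity of an all-zero stripe is itself zero, so data and parity agree from the start. The Python sketch below shows this, and also how parity lets a lost block be rebuilt. It assumes the usual RAID-5 scheme of byte-wise XOR parity. The function name `xor_parity` and the sample blocks are hypothetical and do not come from Vinum.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) groups the bytes found at the same position in every block.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# A freshly zeroed stripe: four zeroed data blocks and a zeroed parity block.
zero_data = [bytes(8) for _ in range(4)]
zero_parity = bytes(8)
print(xor_parity(zero_data) == zero_parity)  # True: the parity is already valid

# Rebuilding a lost block: XOR the surviving data blocks with the parity block.
data = [b"vinum_00", b"vinum_01", b"vinum_02", b"vinum_03"]
parity = xor_parity(data)
lost = 2
survivors = [block for index, block in enumerate(data) if index != lost]
rebuilt = xor_parity(survivors + [parity])
print(rebuilt == data[lost])  # True
```
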

Resilience and performance

With sufficient hardware, it is possible to build volumes that offer both increased resilience and increased performance compared to standard UNIX partitions. Mirrored disks will always give better performance than RAID-5, so a typical configuration file mirrors two striped plexes:
volume raid10
  plex org striped 512k
    sd length 102480k drive a
    sd length 102480k drive b
    sd length 102480k drive c
    sd length 102480k drive d
    sd length 102480k drive e
  plex org striped 512k
    sd length 102480k drive c
    sd length 102480k drive d
    sd length 102480k drive e
    sd length 102480k drive a
    sd length 102480k drive b
The subdisks of the second plex are offset by two drives from those of the first plex: this helps ensure that writes to the two plexes do not go to the same drives, even if a transfer spans two drives.
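
The effect of the two-drive offset can be checked with a small Python sketch. It is an illustration under a stated assumption, not Vinum code: stripes are taken to be assigned to subdisks round-robin in the order given in the configuration file, and the helper `drives_touched` is hypothetical.

```python
# Drive order of each plex, as listed in the configuration above.
PLEX0 = ["a", "b", "c", "d", "e"]
PLEX1 = ["c", "d", "e", "a", "b"]
STRIPE = 512 * 1024  # 512 kB stripe size


def drives_touched(order, offset, length, stripe_size=STRIPE):
    """Drives a striped plex uses for the byte range [offset, offset + length)."""
    if length <= 0:
        raise ValueError("length must be positive")
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    # Assumption: stripe number n lives on subdisk n modulo the subdisk count.
    return {order[stripe % len(order)] for stripe in range(first, last + 1)}


# A 64 kB write that straddles the boundary between the first two stripes.
offset = STRIPE - 32 * 1024
length = 64 * 1024
plex0_drives = drives_touched(PLEX0, offset, length)
plex1_drives = drives_touched(PLEX1, offset, length)
print(sorted(plex0_drives))  # ['a', 'b']
print(sorted(plex1_drives))  # ['c', 'd']

# Any transfer covering two adjacent stripes uses disjoint drives in the two plexes.
two_stripe_disjoint = all(
    drives_touched(PLEX0, k * STRIPE, 2 * STRIPE).isdisjoint(
        drives_touched(PLEX1, k * STRIPE, 2 * STRIPE)
    )
    for k in range(len(PLEX0))
)
print(two_stripe_disjoint)  # True
```

With five drives, a transfer that spans three or more stripes necessarily touches at least one drive in both plexes, so the offset helps only for transfers of up to two stripes.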

The following figure represents the structure of this volume.

A mirrored, striped Vinum volume
