Skip to content

Commit

Permalink
Merge tag 'dm-3.13-changes' of git://git.kernel.org/pub/scm/linux/ker…
Browse files Browse the repository at this point in the history
…nel/git/device-mapper/linux-dm

Pull device mapper changes from Mike Snitzer:
 "A set of device-mapper changes for 3.13.

  Improve reliability of buffer allocations for dm messages with a small
  number of arguments, a couple path group initialization fixes for dm
  multipath, a fix for resizing a dm array, various fixes and
  optimizations for dm cache, a fix for device mapper's Kconfig menu
  indentation.

  Features added include:
   - dm crypt support for activating legacy CBC TrueCrypt containers
     (useful for forensics of these old TCRYPT containers)
   - reduced dm-cache memory requirements for each block in the cache
   - basic support for shrinking a dm-cache's cache (fast) device
   - most notably, dm-cache support for managing cache coherency when
     deploying dm-cache with sophisticated origin volumes (that support
     hardware snapshots and/or clustering): these changes come in the
     form of a new passthrough operation mode and a cache block
     invalidation interface"

* tag 'dm-3.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (32 commits)
  dm cache: resolve small nits and improve Documentation
  dm cache: add cache block invalidation support
  dm cache: add remove_cblock method to policy interface
  dm cache policy mq: reduce memory requirements
  dm cache metadata: check the metadata version when reading the superblock
  dm cache: add passthrough mode
  dm cache: cache shrinking support
  dm cache: promotion optimisation for writes
  dm cache: be much more aggressive about promoting writes to discarded blocks
  dm cache policy mq: implement writeback_work() and mq_{set,clear}_dirty()
  dm cache: optimize commit_if_needed
  dm space map disk: optimise sm_disk_dec_block
  MAINTAINERS: add reference to device-mapper's linux-dm.git tree
  dm: fix Kconfig menu indentation
  dm: allow remove to be deferred
  dm table: print error on preresume failure
  dm crypt: add TCW IV mode for old CBC TCRYPT containers
  dm crypt: properly handle extra key string in initialization
  dm cache: log error message if dm_kcopyd_copy() fails
  dm cache: use cell_defer() boolean argument consistently
  ...
  • Loading branch information
Linus Torvalds committed Nov 14, 2013
2 parents 82cb6ac + 7b6b2bc commit 7f2dc5c
Show file tree
Hide file tree
Showing 21 changed files with 1,544 additions and 467 deletions.
6 changes: 4 additions & 2 deletions Documentation/device-mapper/cache-policies.txt
Original file line number Diff line number Diff line change
Expand Up @@ -30,8 +30,10 @@ multiqueue

This policy is the default.

The multiqueue policy has two sets of 16 queues: one set for entries
waiting for the cache and another one for those in the cache.
The multiqueue policy has three sets of 16 queues: one set for entries
waiting for the cache and another two for those in the cache (a set for
clean entries and a set for dirty entries).

Cache entries in the queues are aged based on logical time. Entry into
the cache is based on variable thresholds and queue selection is based
on hit count on entry. The policy aims to take different cache miss
Expand Down
57 changes: 51 additions & 6 deletions Documentation/device-mapper/cache.txt
Original file line number Diff line number Diff line change
Expand Up @@ -68,10 +68,11 @@ So large block sizes are bad because they waste cache space. And small
block sizes are bad because they increase the amount of metadata (both
in core and on disk).

Writeback/writethrough
----------------------
Cache operating modes
---------------------

The cache has two modes, writeback and writethrough.
The cache has three operating modes: writeback, writethrough and
passthrough.

If writeback, the default, is selected then a write to a block that is
cached will go only to the cache and the block will be marked dirty in
Expand All @@ -81,8 +82,31 @@ If writethrough is selected then a write to a cached block will not
complete until it has hit both the origin and cache devices. Clean
blocks should remain clean.

If passthrough is selected, useful when the cache contents are not known
to be coherent with the origin device, then all reads are served from
the origin device (all reads miss the cache) and all writes are
forwarded to the origin device; additionally, write hits cause cache
block invalidates. To enable passthrough mode the cache must be clean.
Passthrough mode allows a cache device to be activated without having to
worry about coherency. Coherency that exists is maintained, although
the cache will gradually cool as writes take place. If the coherency of
the cache can later be verified, or established through use of the
"invalidate_cblocks" message, the cache device can be transitioned to
writethrough or writeback mode while still warm. Otherwise, the cache
contents can be discarded prior to transitioning to the desired
operating mode.

A simple cleaner policy is provided, which will clean (write back) all
dirty blocks in a cache. Useful for decommissioning a cache.
dirty blocks in a cache. Useful for decommissioning a cache or when
shrinking a cache. Shrinking the cache's fast device requires all cache
blocks, in the area of the cache being removed, to be clean. If the
area being removed from the cache still contains dirty blocks the resize
will fail. Care must be taken to never reduce the volume used for the
cache's fast device until the cache is clean. This is of particular
importance if writeback mode is used. Writethrough and passthrough
modes already maintain a clean cache. Future support to partially clean
the cache, above a specified threshold, will allow for keeping the cache
warm and in writeback mode during resize.

Migration throttling
--------------------
Expand Down Expand Up @@ -161,7 +185,7 @@ Constructor
block size : cache unit size in sectors

#feature args : number of feature arguments passed
feature args : writethrough. (The default is writeback.)
feature args : writethrough or passthrough (The default is writeback.)

policy : the replacement policy to use
#policy args : an even number of arguments corresponding to
Expand All @@ -177,6 +201,13 @@ Optional feature arguments are:
back cache block contents later for performance reasons,
so they may differ from the corresponding origin blocks.

passthrough : a degraded mode useful for various cache coherency
situations (e.g., rolling back snapshots of
underlying storage). Reads and writes always go to
the origin. If a write goes to a cached origin
block, then the cache block is invalidated.
To enable passthrough mode the cache must be clean.

A policy called 'default' is always registered. This is an alias for
the policy we currently think is giving best all round performance.

Expand Down Expand Up @@ -231,12 +262,26 @@ The message format is:
E.g.
dmsetup message my_cache 0 sequential_threshold 1024


Invalidation is removing an entry from the cache without writing it
back. Cache blocks can be invalidated via the invalidate_cblocks
message, which takes an arbitrary number of cblock ranges. Each cblock
must be expressed as a decimal value, in the future a variant message
that takes cblock ranges expressed in hexidecimal may be needed to
better support efficient invalidation of larger caches. The cache must
be in passthrough mode when invalidate_cblocks is used.

invalidate_cblocks [<cblock>|<cblock begin>-<cblock end>]*

E.g.
dmsetup message my_cache 0 invalidate_cblocks 2345 3456-4567 5678-6789

Examples
========

The test suite can be found here:

https://github.com/jthornber/thinp-test-suite
https://github.com/jthornber/device-mapper-test-suite

dmsetup create my_cache --table '0 41943040 cache /dev/mapper/metadata \
/dev/mapper/ssd /dev/mapper/origin 512 1 writeback default 0'
Expand Down
11 changes: 9 additions & 2 deletions Documentation/device-mapper/dm-crypt.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4,12 +4,15 @@ dm-crypt
Device-Mapper's "crypt" target provides transparent encryption of block devices
using the kernel crypto API.

For a more detailed description of supported parameters see:
http://code.google.com/p/cryptsetup/wiki/DMCrypt

Parameters: <cipher> <key> <iv_offset> <device path> \
<offset> [<#opt_params> <opt_params>]

<cipher>
Encryption cipher and an optional IV generation mode.
(In format cipher[:keycount]-chainmode-ivopts:ivmode).
(In format cipher[:keycount]-chainmode-ivmode[:ivopts]).
Examples:
des
aes-cbc-essiv:sha256
Expand All @@ -19,7 +22,11 @@ Parameters: <cipher> <key> <iv_offset> <device path> \

<key>
Key used for encryption. It is encoded as a hexadecimal number.
You can only use key sizes that are valid for the selected cipher.
You can only use key sizes that are valid for the selected cipher
in combination with the selected iv mode.
Note that for some iv modes the key string can contain additional
keys (for example IV seed) so the key contains more parts concatenated
into a single string.

<keycount>
Multi-key compatibility mode. You can define <keycount> keys and
Expand Down
1 change: 1 addition & 0 deletions MAINTAINERS
Original file line number Diff line number Diff line change
Expand Up @@ -2647,6 +2647,7 @@ M: dm-devel@redhat.com
L: dm-devel@redhat.com
W: http://sources.redhat.com/dm
Q: http://patchwork.kernel.org/project/dm-devel/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git
T: quilt http://people.redhat.com/agk/patches/linux/editing/
S: Maintained
F: Documentation/device-mapper/
Expand Down
22 changes: 11 additions & 11 deletions drivers/md/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -297,6 +297,17 @@ config DM_MIRROR
Allow volume managers to mirror logical volumes, also
needed for live data migration tools such as 'pvmove'.

config DM_LOG_USERSPACE
tristate "Mirror userspace logging"
depends on DM_MIRROR && NET
select CONNECTOR
---help---
The userspace logging module provides a mechanism for
relaying the dm-dirty-log API to userspace. Log designs
which are more suited to userspace implementation (e.g.
shared storage logs) or experimental logs can be implemented
by leveraging this framework.

config DM_RAID
tristate "RAID 1/4/5/6/10 target"
depends on BLK_DEV_DM
Expand All @@ -323,17 +334,6 @@ config DM_RAID
RAID-5, RAID-6 distributes the syndromes across the drives
in one of the available parity distribution methods.

config DM_LOG_USERSPACE
tristate "Mirror userspace logging"
depends on DM_MIRROR && NET
select CONNECTOR
---help---
The userspace logging module provides a mechanism for
relaying the dm-dirty-log API to userspace. Log designs
which are more suited to userspace implementation (e.g.
shared storage logs) or experimental logs can be implemented
by leveraging this framework.

config DM_ZERO
tristate "Zero target"
depends on BLK_DEV_DM
Expand Down
104 changes: 97 additions & 7 deletions drivers/md/dm-cache-metadata.c
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,13 @@

#define CACHE_SUPERBLOCK_MAGIC 06142003
#define CACHE_SUPERBLOCK_LOCATION 0
#define CACHE_VERSION 1

/*
* defines a range of metadata versions that this module can handle.
*/
#define MIN_CACHE_VERSION 1
#define MAX_CACHE_VERSION 1

#define CACHE_METADATA_CACHE_SIZE 64

/*
Expand Down Expand Up @@ -134,6 +140,18 @@ static void sb_prepare_for_write(struct dm_block_validator *v,
SUPERBLOCK_CSUM_XOR));
}

static int check_metadata_version(struct cache_disk_superblock *disk_super)
{
uint32_t metadata_version = le32_to_cpu(disk_super->version);
if (metadata_version < MIN_CACHE_VERSION || metadata_version > MAX_CACHE_VERSION) {
DMERR("Cache metadata version %u found, but only versions between %u and %u supported.",
metadata_version, MIN_CACHE_VERSION, MAX_CACHE_VERSION);
return -EINVAL;
}

return 0;
}

static int sb_check(struct dm_block_validator *v,
struct dm_block *b,
size_t sb_block_size)
Expand Down Expand Up @@ -164,7 +182,7 @@ static int sb_check(struct dm_block_validator *v,
return -EILSEQ;
}

return 0;
return check_metadata_version(disk_super);
}

static struct dm_block_validator sb_validator = {
Expand Down Expand Up @@ -198,7 +216,7 @@ static int superblock_lock(struct dm_cache_metadata *cmd,

/*----------------------------------------------------------------*/

static int __superblock_all_zeroes(struct dm_block_manager *bm, int *result)
static int __superblock_all_zeroes(struct dm_block_manager *bm, bool *result)
{
int r;
unsigned i;
Expand All @@ -214,10 +232,10 @@ static int __superblock_all_zeroes(struct dm_block_manager *bm, int *result)
return r;

data_le = dm_block_data(b);
*result = 1;
*result = true;
for (i = 0; i < sb_block_size; i++) {
if (data_le[i] != zero) {
*result = 0;
*result = false;
break;
}
}
Expand Down Expand Up @@ -270,7 +288,7 @@ static int __write_initial_superblock(struct dm_cache_metadata *cmd)
disk_super->flags = 0;
memset(disk_super->uuid, 0, sizeof(disk_super->uuid));
disk_super->magic = cpu_to_le64(CACHE_SUPERBLOCK_MAGIC);
disk_super->version = cpu_to_le32(CACHE_VERSION);
disk_super->version = cpu_to_le32(MAX_CACHE_VERSION);
memset(disk_super->policy_name, 0, sizeof(disk_super->policy_name));
memset(disk_super->policy_version, 0, sizeof(disk_super->policy_version));
disk_super->policy_hint_size = 0;
Expand Down Expand Up @@ -411,7 +429,8 @@ static int __open_metadata(struct dm_cache_metadata *cmd)
static int __open_or_format_metadata(struct dm_cache_metadata *cmd,
bool format_device)
{
int r, unformatted;
int r;
bool unformatted = false;

r = __superblock_all_zeroes(cmd->bm, &unformatted);
if (r)
Expand Down Expand Up @@ -666,19 +685,85 @@ void dm_cache_metadata_close(struct dm_cache_metadata *cmd)
kfree(cmd);
}

/*
* Checks that the given cache block is either unmapped or clean.
*/
static int block_unmapped_or_clean(struct dm_cache_metadata *cmd, dm_cblock_t b,
bool *result)
{
int r;
__le64 value;
dm_oblock_t ob;
unsigned flags;

r = dm_array_get_value(&cmd->info, cmd->root, from_cblock(b), &value);
if (r) {
DMERR("block_unmapped_or_clean failed");
return r;
}

unpack_value(value, &ob, &flags);
*result = !((flags & M_VALID) && (flags & M_DIRTY));

return 0;
}

static int blocks_are_unmapped_or_clean(struct dm_cache_metadata *cmd,
dm_cblock_t begin, dm_cblock_t end,
bool *result)
{
int r;
*result = true;

while (begin != end) {
r = block_unmapped_or_clean(cmd, begin, result);
if (r)
return r;

if (!*result) {
DMERR("cache block %llu is dirty",
(unsigned long long) from_cblock(begin));
return 0;
}

begin = to_cblock(from_cblock(begin) + 1);
}

return 0;
}

int dm_cache_resize(struct dm_cache_metadata *cmd, dm_cblock_t new_cache_size)
{
int r;
bool clean;
__le64 null_mapping = pack_value(0, 0);

down_write(&cmd->root_lock);
__dm_bless_for_disk(&null_mapping);

if (from_cblock(new_cache_size) < from_cblock(cmd->cache_blocks)) {
r = blocks_are_unmapped_or_clean(cmd, new_cache_size, cmd->cache_blocks, &clean);
if (r) {
__dm_unbless_for_disk(&null_mapping);
goto out;
}

if (!clean) {
DMERR("unable to shrink cache due to dirty blocks");
r = -EINVAL;
__dm_unbless_for_disk(&null_mapping);
goto out;
}
}

r = dm_array_resize(&cmd->info, cmd->root, from_cblock(cmd->cache_blocks),
from_cblock(new_cache_size),
&null_mapping, &cmd->root);
if (!r)
cmd->cache_blocks = new_cache_size;
cmd->changed = true;

out:
up_write(&cmd->root_lock);

return r;
Expand Down Expand Up @@ -1182,3 +1267,8 @@ int dm_cache_save_hint(struct dm_cache_metadata *cmd, dm_cblock_t cblock,

return r;
}

int dm_cache_metadata_all_clean(struct dm_cache_metadata *cmd, bool *result)
{
return blocks_are_unmapped_or_clean(cmd, 0, cmd->cache_blocks, result);
}
5 changes: 5 additions & 0 deletions drivers/md/dm-cache-metadata.h
Original file line number Diff line number Diff line change
Expand Up @@ -137,6 +137,11 @@ int dm_cache_begin_hints(struct dm_cache_metadata *cmd, struct dm_cache_policy *
int dm_cache_save_hint(struct dm_cache_metadata *cmd,
dm_cblock_t cblock, uint32_t hint);

/*
* Query method. Are all the blocks in the cache clean?
*/
int dm_cache_metadata_all_clean(struct dm_cache_metadata *cmd, bool *result);

/*----------------------------------------------------------------*/

#endif /* DM_CACHE_METADATA_H */
7 changes: 6 additions & 1 deletion drivers/md/dm-cache-policy-internal.h
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,12 @@ static inline int policy_writeback_work(struct dm_cache_policy *p,

static inline void policy_remove_mapping(struct dm_cache_policy *p, dm_oblock_t oblock)
{
return p->remove_mapping(p, oblock);
p->remove_mapping(p, oblock);
}

static inline int policy_remove_cblock(struct dm_cache_policy *p, dm_cblock_t cblock)
{
return p->remove_cblock(p, cblock);
}

static inline void policy_force_mapping(struct dm_cache_policy *p,
Expand Down
Loading

0 comments on commit 7f2dc5c

Please sign in to comment.