-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
dm mpath: add service time load balancer
This patch adds a service time oriented dynamic load balancer, dm-service-time, which selects the path with the shortest estimated service time for the incoming I/O. The service time is estimated by dividing the in-flight I/O size by a performance value of each path. The performance value can be given as a table argument at the table loading time. If no performance value is given, all paths are considered equal. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
- Loading branch information
Kiyoshi Ueda
authored and
Alasdair G Kergon
committed
Jun 22, 2009
1 parent
fd5e033
commit f392ba8
Showing
4 changed files
with
441 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
dm-service-time | ||
=============== | ||
|
||
dm-service-time is a path selector module for device-mapper targets, | ||
which selects a path with the shortest estimated service time for | ||
the incoming I/O. | ||
|
||
The service time for each path is estimated by dividing the total size | ||
of in-flight I/Os on a path with the performance value of the path. | ||
The performance value is a relative throughput value among all paths | ||
in a path-group, and it can be specified as a table argument. | ||
|
||
The path selector name is 'service-time'. | ||
|
||
Table parameters for each path: [<repeat_count> [<relative_throughput>]] | ||
<repeat_count>: The number of I/Os to dispatch using the selected | ||
path before switching to the next path. | ||
If not given, internal default is used. To check | ||
the default value, see the activated table. | ||
<relative_throughput>: The relative throughput value of the path | ||
among all paths in the path-group. | ||
The valid range is 0-100. | ||
If not given, minimum value '1' is used. | ||
If '0' is given, the path isn't selected while | ||
other paths having a positive value are available. | ||
|
||
Status for each path: <status> <fail-count> <in-flight-size> \ | ||
<relative_throughput> | ||
<status>: 'A' if the path is active, 'F' if the path is failed. | ||
<fail-count>: The number of path failures. | ||
<in-flight-size>: The size of in-flight I/Os on the path. | ||
<relative_throughput>: The relative throughput value of the path | ||
among all paths in the path-group. | ||
|
||
|
||
Algorithm | ||
========= | ||
|
||
dm-service-time adds the I/O size to 'in-flight-size' when the I/O is | ||
dispatched and substracts when completed. | ||
Basically, dm-service-time selects a path having minimum service time | ||
which is calculated by: | ||
|
||
('in-flight-size' + 'size-of-incoming-io') / 'relative_throughput' | ||
|
||
However, some optimizations below are used to reduce the calculation | ||
as much as possible. | ||
|
||
1. If the paths have the same 'relative_throughput', skip | ||
the division and just compare the 'in-flight-size'. | ||
|
||
2. If the paths have the same 'in-flight-size', skip the division | ||
and just compare the 'relative_throughput'. | ||
|
||
3. If some paths have non-zero 'relative_throughput' and others | ||
have zero 'relative_throughput', ignore those paths with zero | ||
'relative_throughput'. | ||
|
||
If such optimizations can't be applied, calculate service time, and | ||
compare service time. | ||
If calculated service time is equal, the path having maximum | ||
'relative_throughput' may be better. So compare 'relative_throughput' | ||
then. | ||
|
||
|
||
Examples | ||
======== | ||
In case that 2 paths (sda and sdb) are used with repeat_count == 128 | ||
and sda has an average throughput 1GB/s and sdb has 4GB/s, | ||
'relative_throughput' value may be '1' for sda and '4' for sdb. | ||
|
||
# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4" \ | ||
dmsetup create test | ||
# | ||
# dmsetup table | ||
test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4 | ||
# | ||
# dmsetup status | ||
test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 1 8:16 A 0 0 4 | ||
|
||
|
||
Or '2' for sda and '8' for sdb would be also true. | ||
|
||
# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8" \ | ||
dmsetup create test | ||
# | ||
# dmsetup table | ||
test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8 | ||
# | ||
# dmsetup status | ||
test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 2 8:16 A 0 0 8 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.