- Oct 09, 2022
-
-
余快 authored
mainline inclusion from md-next commit ddc489e066cd267b383c0eed4f576f6bdb154588 category: performance bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5PRMO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=ddc489e066cd267b383c0eed4f576f6bdb154588 --------------------- Currently, wait_barrier() will hold 'resync_lock' to read 'conf->barrier', and io can't be dispatched until 'barrier' is dropped. Since holding the 'barrier' is not common, convert 'resync_lock' to use seqlock so that holding lock can be avoided in fast path. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-and-tested-by:
Logan Gunthorpe <logang@deltatee.com> Signed-off-by:
Song Liu <song@kernel.org> Reviewed-by:
Jason Yan <yanaijie@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
- May 31, 2018
-
-
Kent Overstreet authored
Convert md to embedded bio sets. Signed-off-by:
Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
- Feb 19, 2018
-
-
NeilBrown authored
The rdev pointer kept in the local 'config' for each for raid1, raid10, raid4/5/6 has non-obvious lifetime rules. Sometimes RCU is needed, sometimes a lock, something nothing. Add documentation to explain this. Signed-off-by:
NeilBrown <neilb@suse.com> Signed-off-by:
Shaohua Li <sh.li@alibaba-inc.com>
-
- Nov 02, 2017
-
-
Greg Kroah-Hartman authored
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by:
Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by:
Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Guoqing Jiang authored
Suspending the entire device for resync could take too long. Resync in small chunks. cluster's resync window is maintained in r10conf as cluster_sync_low and cluster_sync_high, and processed in raid10's sync_request(). If the current resync is outside the cluster resync window: 1. Set the cluster_sync_low to curr_resync_completed. 2. Set cluster_sync_high to cluster_sync_low + stripe size. 3. Send a message to all nodes so they may add it in their suspension list. Note: We only support "near" raid10 so far, resync a far or offset raid10 array could have trouble. So raid10_run checks the layout of clustered raid10, it will refuse to run if the layout is not correct. With the "near" layout we process one stripe at a time progressing monotonically through the address space. So we can have a sliding window of whole-stripes which moves through the array suspending IO on other nodes, and both resync which uses array addresses and recovery which uses device addresses can stay within this window. Signed-off-by:
Guoqing Jiang <gqjiang@suse.com> Signed-off-by:
Shaohua Li <shli@fb.com>
-
- Apr 12, 2017
-
-
NeilBrown authored
raid10 splits requests in two different ways for two different reasons. First, bio_split() is used to ensure the bio fits with a chunk. Second, multiple r10bio structures are allocated to represent the different sections that need to go to different devices, to avoid known bad blocks. This can be simplified to just use bio_split() once, and not to use multiple r10bios. We delay the split until we know a maximum bio size that can be handled with a single r10bio, and then split the bio and queue the remainder for later handling. As with raid1, we allocate a new bio_set to help with the splitting. It is not correct to use fs_bio_set in a device driver. Signed-off-by:
NeilBrown <neilb@suse.com> Signed-off-by:
Shaohua Li <shli@fb.com>
-
- Nov 23, 2016
-
-
NeilBrown authored
If a device is marked FailFast, and it is not the only device we can read from, we mark the bio as MD_FAILFAST. If this does fail-fast, we don't try read repair but just allow failure. If it was the last device, it doesn't get marked Faulty so the retry happens on the same device - this time without FAILFAST. A subsequent failure will not retry but will just pass up the error. During resync we may use FAILFAST requests, and on a failure we will simply use the other device(s). During recovery we will only use FAILFAST in the unusual case were there are multiple places to read from - i.e. if there are > 2 devices. If we get a failure we will fail the device and complete the resync/recovery with remaining devices. Signed-off-by:
NeilBrown <neilb@suse.com> Signed-off-by:
Shaohua Li <shli@fb.com>
-
- Jul 20, 2016
-
-
Tomasz Majchrzak authored
RAID10 random read performance is lower than expected due to excessive spinlock utilisation which is required mostly for rebuild/resync. Simplify allow_barrier as it's in IO path and encounters a lot of unnecessary congestion. As lower_barrier just takes a lock in order to decrement a counter, convert counter (nr_pending) into atomic variable and remove the spin lock. There is also a congestion for wake_up (it uses lock internally) so call it only when it's really needed. As wake_up is not called constantly anymore, ensure process waiting to raise a barrier is notified when there are no more waiting IOs. Signed-off-by:
Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by:
Shaohua Li <shli@fb.com>
-
- Sep 01, 2015
-
-
NeilBrown authored
When a write to one of the legs of a RAID10 fails, the failure is recorded in the metadata of the other legs so that after a restart the data on the failed drive wont be trusted even if that drive seems to be working again (maybe a cable was unplugged). Currently there is no interlock between the write request completing and the metadata update. So it is possible that the write will complete, the app will confirm success in some way, and then the machine will crash before the metadata update completes. This is an extremely small hole for a racy to fit in, but it is theoretically possible and so should be closed. So: - set MD_CHANGE_PENDING when requesting a metadata update for a failed device, so we can know with certainty when it completes - queue requests that experienced an error on a new queue which is only processed after the metadata update completes - call raid_end_bio_io() on bios in that queue when the time comes. Signed-off-by:
NeilBrown <neilb@suse.com>
-
- Feb 04, 2015
-
-
NeilBrown authored
There is currently no locking around calls to the 'congested' bdi function. If called at an awkward time while an array is being converted from one level (or personality) to another, there is a tiny chance of running code in an unreferenced module etc. So add a 'congested' function to the md_personality operations structure, and call it with appropriate locking from a central 'mddev_congested'. When the array personality is changing the array will be 'suspended' so no IO is processed. If mddev_congested detects this, it simply reports that the array is congested, which is a safe guess. As mddev_suspend calls synchronize_rcu(), mddev_congested can avoid races by included the whole call inside an rcu_read_lock() region. This require that the congested functions for all subordinate devices can be run under rcu_lock. Fortunately this is the case. Signed-off-by:
NeilBrown <neilb@suse.de>
-
- Feb 26, 2013
-
-
Jonathan Brassow authored
The MD RAID10 'far' and 'offset' algorithms make copies of entire stripe widths - copying them to a different location on the same devices after shifting the stripe. An example layout of each follows below: "far" algorithm dev1 dev2 dev3 dev4 dev5 dev6 ==== ==== ==== ==== ==== ==== A B C D E F G H I J K L ... F A B C D E --> Copy of stripe0, but shifted by 1 L G H I J K ... "offset" algorithm dev1 dev2 dev3 dev4 dev5 dev6 ==== ==== ==== ==== ==== ==== A B C D E F F A B C D E --> Copy of stripe0, but shifted by 1 G H I J K L L G H I J K ... Redundancy for these algorithms is gained by shifting the copied stripes one device to the right. This patch proposes that array be divided into sets of adjacent devices and when the stripe copies are shifted, they wrap on set boundaries rather than the array size boundary. That is, for the purposes of shifting, the copies are confined to their sets within the array. The sets are 'near_copies * far_copies' in size. The above "far" algorithm example would change to: "far" algorithm dev1 dev2 dev3 dev4 dev5 dev6 ==== ==== ==== ==== ==== ==== A B C D E F G H I J K L ... B A D C F E --> Copy of stripe0, shifted 1, 2-dev sets H G J I L K Dev sets are 1-2, 3-4, 5-6 ... This has the affect of improving the redundancy of the array. We can always sustain at least one failure, but sometimes more than one can be handled. In the first examples, the pairs of devices that CANNOT fail together are: (1,2) (2,3) (3,4) (4,5) (5,6) (1, 6) [40% of possible pairs] In the example where the copies are confined to sets, the pairs of devices that cannot fail together are: (1,2) (3,4) (5,6) [20% of possible pairs] We cannot simply replace the old algorithms, so the 17th bit of the 'layout' variable is used to indicate whether we use the old or new method of computing the shift. (This is similar to the way the 16th bit indicates whether the "far" algorithm or the "offset" algorithm is being used.) This patch only handles the cases where the number of total raid disks is a multiple of 'far_copies'. A follow-on patch addresses the condition where this is not true. Signed-off-by:
Jonathan Brassow <jbrassow@redhat.com> Signed-off-by:
NeilBrown <neilb@suse.de>
-
- Aug 18, 2012
-
-
NeilBrown authored
A 'struct r10bio' has an array of per-copy information at the end. This array is declared with size [0] and r10bio_pool_alloc allocates enough extra space to store the per-copy information depending on the number of copies needed. So declaring a 'struct r10bio on the stack isn't going to work. It won't allocate enough space, and memory corruption will ensue. So in the two places where this is done, declare a sufficiently large structure and use that instead. The two call-sites of this bug were introduced in 3.4 and 3.5 so this is suitable for both those kernels. The patch will have to be modified for 3.4 as it only has one bug. Cc: stable@vger.kernel.org Reported-by:
Ivan Vasilyev <ivan.vasilyev@gmail.com> Tested-by:
Ivan Vasilyev <ivan.vasilyev@gmail.com> Signed-off-by:
NeilBrown <neilb@suse.de>
-
- Jul 31, 2012
-
-
Jonathan Brassow authored
md/raid10: Export is_congested test. In similar fashion to commits 11d8a6e3 1ed7242e we export the RAID10 congestion checking function so that dm-raid.c can make use of it and make use of the personality. The 'queue' and 'gendisk' structures will not be available to the MD code when device-mapper sets up the device, so we conditionalize access to these fields also. Signed-off-by:
Jonathan Brassow <jbrassow@redhat.com> Signed-off-by:
NeilBrown <neilb@suse.de>
-
Jonathan Brassow authored
MD RAID1/RAID10: Move some macros from .h file to .c file There are three macros (IO_BLOCKED,IO_MADE_GOOD,BIO_SPECIAL) which are defined in both raid1.h and raid10.h. They are only used in there respective .c files. However, if we wish to make RAID10 accessible to the device-mapper RAID target (dm-raid.c), then we need to move these macros into the .c files where they are used so that they do not conflict with each other. The macros from the two files are identical and could be moved into md.h, but I chose to leave the duplication and have them remain in the personality files. Signed-off-by:
Jonathan Brassow <jbrassow@redhat.com> Signed-off-by:
NeilBrown <neilb@suse.de>
-
Jonathan Brassow authored
MD RAID10: Rename the structure 'mirror_info' to 'raid10_info' The same structure name ('mirror_info') is used by raid1. Each of these structures are defined in there respective header files. If dm-raid is to support both RAID1 and RAID10, the header files will be included and the structure names must not collide. Signed-off-by:
Jonathan Brassow <jbrassow@redhat.com> Signed-off-by:
NeilBrown <neilb@suse.de>
-
- May 22, 2012
-
-
NeilBrown authored
A 'near' or 'offset' lay RAID10 array can be reshaped to a different 'near' or 'offset' layout, a different chunk size, and a different number of devices. However the number of copies cannot change. Unlike RAID5/6, we do not support having user-space backup data that is being relocated during a 'critical section'. Rather, the data_offset of each device must change so that when writing any block to a new location, it will not over-write any data that is still 'live'. This means that RAID10 reshape is not supportable on v0.90 metadata. The different between the old data_offset and the new_offset must be at least the larger of the chunksize multiplied by offset copies of each of the old and new layout. (for 'near' mode, offset_copies == 1). A larger difference of around 64M seems useful for in-place reshapes as more data can be moved between metadata updates. Very large differences (e.g. 512M) seem to slow the process down due to lots of long seeks (on oldish consumer graded devices at least). Metadata needs to be updated whenever the place we are about to write to is considered - by the current metadata - to still contain data in the old layout. [unbalanced locking fix from Dan Carpenter <dan.carpenter@oracle.com>] Signed-off-by:
NeilBrown <neilb@suse.de>
-
- May 21, 2012
-
-
NeilBrown authored
When RAID10 supports reshape it will need a 'previous' and a 'current' geometry, so introduce that here. Use the 'prev' geometry when before the reshape_position, and the current 'geo' when beyond it. At other times, use both as appropriate. For now, both are identical (And reshape_position is never set). When we use the 'prev' geometry, we must use the old data_offset. When we use the current (And a reshape is happening) we must use the new_data_offset. Signed-off-by:
NeilBrown <neilb@suse.de>
-
NeilBrown authored
We will shortly be adding reshape support for RAID10 which will require it having 2 concurrent geometries (before and after). To make that easier, collect most geometry fields into 'struct geom' and access them from there. Then we will more easily be able to add a second set of fields. Note that 'copies' is not in this struct and so cannot be changed. There is little need to change this number and doing so is a lot more difficult as it requires reallocating more things. So leave it out for now. Signed-off-by:
NeilBrown <neilb@suse.de>
-
- Dec 23, 2011
-
-
NeilBrown authored
Allow each slot in the RAID10 to have 2 devices, the want_replacement and the replacement. Also an r10bio to have 2 bios, and for resync/recovery allocate the second bio if there are any replacement devices. Signed-off-by:
NeilBrown <neilb@suse.de>
-
- Oct 11, 2011
-
-
NeilBrown authored
RAID1 and RAID10 handle write requests by queuing them for handling by a separate thread. This is because when a write-intent-bitmap is active we might need to update the bitmap first, so it is good to queue a lot of writes, then do one big bitmap update for them all. However writeback request devices to appear to be congested after a while so it can make some guesstimate of throughput. The infinite queue defeats that (note that RAID5 has already has a finite queue so it doesn't suffer from this problem). So impose a limit on the number of pending write requests. By default it is 1024 which seems to be generally suitable. Make it configurable via module option just in case someone finds a regression. Signed-off-by:
NeilBrown <neilb@suse.de>
-
NeilBrown authored
Signed-off-by:
NeilBrown <neilb@suse.de>
-
NeilBrown authored
Signed-off-by:
NeilBrown <neilb@suse.de>
-
NeilBrown authored
Signed-off-by:
NeilBrown <neilb@suse.de>
-
NeilBrown authored
Signed-off-by:
NeilBrown <neilb@suse.de>
-
NeilBrown authored
Having mddev_t and 'struct mddev_s' is ugly and not preferred Signed-off-by:
NeilBrown <neilb@suse.de>
-
NeilBrown authored
The typedefs are just annoying. 'mdk' probably refers to 'md_k.h' which used to be an include file that defined this thing. Signed-off-by:
NeilBrown <neilb@suse.de>
-
- Jul 28, 2011
-
-
NeilBrown authored
When we get a write error (in the data area, not in metadata), update the badblock log rather than failing the whole device. As the write may well be many blocks, we trying writing each block individually and only log the ones which fail. Signed-off-by:
NeilBrown <neilb@suse.de>
-
NeilBrown authored
If we succeed in writing to a block that was recorded as being bad, we clear the bad-block record. This requires some delayed handling as the bad-block-list update has to happen in process-context. Signed-off-by:
NeilBrown <neilb@suse.de>
-
NeilBrown authored
This patch just covers the basic read path: 1/ read_balance needs to check for badblocks, and return not only the chosen slot, but also how many good blocks are available there. 2/ read submission must be ready to issue multiple reads to different devices as different bad blocks on different devices could mean that a single large read cannot be served by any one device, but can still be served by the array. This requires keeping count of the number of outstanding requests per bio. This count is stored in 'bi_phys_segments' On read error we currently just fail the request if another target cannot handle the whole request. Next patch refines that a bit. Signed-off-by:
NeilBrown <neilb@suse.de>
-
- Jul 27, 2011
-
-
NeilBrown authored
When we get a read error during recovery, RAID10 previously arranged for the recovering device to appear to fail so that the recovery stops and doesn't restart. This is misleading and wrong. Instead, make use of the new recovery_disabled handling and mark the target device and having recovery disabled. Add appropriate checks in add_disk and remove_disk so that devices are removed and not re-added when recovery is disabled. Signed-off-by:
NeilBrown <neilb@suse.de>
-
- Mar 31, 2011
-
-
Lucas De Marchi authored
Fixes generated by 'codespell' and manually reviewed. Signed-off-by:
Lucas De Marchi <lucas.demarchi@profusion.mobi>
-
- Jun 24, 2010
-
-
NeilBrown authored
Most array level changes leave the list of devices largely unchanged, possibly causing one at the end to become redundant. However conversions between RAID0 and RAID10 need to renumber all devices (except 0). This renumbering is currently being done in the ->run method when the new personality takes over. However this is too late as the common code in md.c might already have invalidated some of the devices if they had a ->raid_disk number that appeared to high. Moving it into the ->takeover method is too early as the array is still active at that time and wrong ->raid_disk numbers could cause confusion. So add a ->new_raid_disk field to mdk_rdev_s and use it to communicate the new raid_disk number. Now the common code knows exactly which devices need to be renumbered, and which can be invalidated, and can do it all at a convenient time when the array is suspend. It can also update some symlinks in sysfs which previously were not be updated correctly. Reported-by:
Maciej Trela <maciej.trela@intel.com> Signed-off-by:
NeilBrown <neilb@suse.de>
-
- May 18, 2010
-
-
Trela, Maciej authored
Signed-off-by:
Maciej Trela <maciej.trela@intel.com> Signed-off-by:
NeilBrown <neilb@suse.de>
-
- Jun 16, 2009
-
-
NeilBrown authored
Having a macro just to cast a void* isn't really helpful. I would must rather see that we are simply de-referencing ->private, than have to know what the macro does. So open code the macro everywhere and remove the pointless cast. Signed-off-by:
NeilBrown <neilb@suse.de>
-
- Mar 31, 2009
-
-
NeilBrown authored
This makes the includes more explicit, and is preparation for moving md_k.h to drivers/md/md.h Remove include/raid/md.h as its only remaining use was to #include other files. Signed-off-by:
NeilBrown <neilb@suse.de>
-
Christoph Hellwig authored
Move the headers with the local structures for the disciplines and bitmap.h into drivers/md/ so that they are more easily grepable for hacking and not far away. md.h is left where it is for now as there are some uses from the outside. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
NeilBrown <neilb@suse.de>
-
- Oct 03, 2006
-
-
NeilBrown authored
It isn't needed as mddev->degraded contains equivalent info. Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- Jun 27, 2006
-
-
NeilBrown authored
The "industry standard" DDF format allows for a stripe/offset layout where data is duplicated on different stripes. e.g. A B C D D A B C E F G H H E F G (columns are drives, rows are stripes, LETTERS are chunks of data). This is similar to raid10's 'far' mode, but not quite the same. So enhance 'far' mode with a 'far/offset' option which follows the layout of DDFs stripe/offset. Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- Jan 07, 2006
-
-
NeilBrown authored
Add in correct read-error handling for resync and read-only situations. When read-only, we don't over-write, so we need to mark the failed drive in the r10_bio so we don't re-try it. During resync, we always read all blocks, so if there is a read error, we simply over-write it with the good block that we found (assuming we found one). Note that the recovery case still isn't handled in an interesting way. There is nothing useful to do for the 2-copies case. If there are 3 or more copies, then we could try reading from one of the non-missing copies, but this is a bit complicated and very rarely would be used, so I'm leaving it for now. Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
NeilBrown authored
Largely just a cross-port from raid1. Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-