linux-pinenote

Author	SHA1	Message	Date
Johannes Berg	2543a0c4c0	ar9170: interpret firmware debug commands This adds new commands that the original firmware will not send but we can use them to debug firmware. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-10 13:27:54 -04:00
matthieu castet	dacb6f1d8f	mac80211 : fix unaligned rx skb mac80211 is checking if the skb is aligned on a 32-bit boundary. But it is checking against the ethernet header, whereas Linux expects the IP header to be aligned. And the ethernet header size is 6*2+2=14, so aligning the ethernet header makes the IP header unaligned. Signed-off-by: Matthieu CASTET <castet.matthieu@free.fr> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-10 13:27:53 -04:00
Matthieu CASTET	b52a033c2c	b43: Fix possible unaligned u32 access Fix possible unaligned u32 access in b43_generate_plcp_hdr(). Unaligned data is read/write with a u32 pointer instead of using the packed structure. Some versions of gcc ignore the "packed" attribute, if the structure element is accessed through a local pointer. Signed-off-by: Matthieu CASTET <castet.matthieu@free.fr> Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-10 13:27:53 -04:00
Bob Copeland	5ee58d7e6a	mac80211: fix minstrel single-rate memory corruption The minstrel rate controller periodically looks up rate indexes in a sampling table. When accessing a specific row and column, minstrel correctly does a bounds check which, on the surface, appears to handle the case where mi->n_rates < 2. However, mi->sample_idx is actually defined as an unsigned, so the right hand side is taken to be a huge positive number when negative, and the check will always fail. Consequently, the RC will overrun the array and cause random memory corruption when communicating with a peer that has only a single rate. The max value of mi->sample_idx is around 25 so casting to int should have no ill effects. Without the change, uptime is a few minutes under load with an AP that has a single hard-coded rate, and both the AP and STA could potentially crash. With the change, both lasted 12 hours with a steady load. Thanks to Ognjen Maric for providing the single-rate clue so I could reproduce this. This fixes http://bugzilla.kernel.org/show_bug.cgi?id=12490 on the regression list (also http://bugzilla.kernel.org/show_bug.cgi?id=13000). Cc: stable@kernel.org Reported-by: Sergey S. Kostyliov <rathamahata@gmail.com> Reported-by: Ognjen Maric <ognjen.maric@gmail.com> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-10 13:27:51 -04:00
Sebastian Andrzej Siewior	4d1d49858c	net/libertas: remove GPIO-CS handling in SPI interface code This removes the dependency on GPIO framework and lets the SPI host driver handle the chip select. The SPI host driver is required to keep the CS active for the entire message unless cs_change says otherwise. This patch collects the two/three single SPI transfers into a message. Also the delay in read path in case use_dummy_writes are not used is moved into the SPI host driver. Tested-by: Mike Rapoport <mike@compulab.co.il> Tested-by: Andrey Yurovsky <andrey@cozybit.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-10 13:27:50 -04:00
Jussi Kivilinna	582241a084	rndis_wlan: cleanup: rename all rndis_wext* objects to rndis_wlan* Driver used to be named rndis_wext before inclusion to upstream. Since rndis_wlan is being converted to cfg80211, use of rndis_wext* names can be confusing. So rename all rndis_wext to rndis_wlan (as should have been when driver was renamed). Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-10 13:27:50 -04:00
Jussi Kivilinna	aa18294a28	rndis_wlan: cleanup: capitalize enum labels Capitalize enum labels as told in Documentation/CodingStyle. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-10 13:27:49 -04:00
Johannes Berg	a60e77e5a4	iwlwifi: port to cfg80211 rfkill This ports the iwlwifi rfkill code to the new API offered by cfg80211 and thus removes a lot of useless stuff. The soft- rfkill is completely removed since that is now handled by setting the interfaces down. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-06-10 13:27:49 -04:00
Mike Frysinger	bc5c6c043d	ftrace/documentation: fix typo in function grapher name The function graph tracer is called just "function_graph" (no trailing "_tracer" needed). Signed-off-by: Mike Frysinger <vapier@gentoo.org> LKML-Reference: <1244623722-6325-1-git-send-email-vapier@gentoo.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2009-06-10 13:06:25 -04:00
Jeff Layton	58f7f68f22	cifs: add addr= mount option alias for ip= When you look in /proc/mounts, the address of the server gets displayed as "addr=". That's really a better option to use anyway since it's more generic. What if we eventually want to support non-IP transports? It also makes CIFS option consistent with the NFS option of the same name. Begin the migration to that option name by adding an alias for ip= called addr=. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-06-10 15:39:14 +00:00
Al Viro	7df336ec12	Fix btrfs when ACLs are configured out ... otherwise generic_permission() will allow anything for all files you don't own and that have some group permissions. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:36:43 -04:00
Hisashi Hifumi	524724ed1f	Btrfs: fdatasync should skip metadata writeout In btrfs, fdatasync and fsync are identical, but fdatasync should skip committing transaction when inode->i_state is set just I_DIRTY_SYNC and this indicates only atime or/and mtime updates. Following patch improves fdatasync throughput. --file-block-size=4K --file-total-size=16G --file-test-mode=rndwr --file-fsync-mode=fdatasync run Results: -2.6.30-rc8 Test execution summary: total time: 1980.6540s total number of events: 10001 total time taken by event execution: 1192.9804 per-request statistics: min: 0.0000s avg: 0.1193s max: 15.3720s approx. 95 percentile: 0.7257s Threads fairness: events (avg/stddev): 625.0625/151.32 execution time (avg/stddev): 74.5613/9.46 -2.6.30-rc8-patched Test execution summary: total time: 1695.9118s total number of events: 10000 total time taken by event execution: 871.3214 per-request statistics: min: 0.0000s avg: 0.0871s max: 10.4644s approx. 95 percentile: 0.4787s Threads fairness: events (avg/stddev): 625.0000/131.86 execution time (avg/stddev): 54.4576/8.98 Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:53 -04:00
David Woodhouse	163e783e6a	Btrfs: remove crc32c.h and use libcrc32c directly. There's no need to preserve this abstraction; it used to let us use hardware crc32c support directly, but libcrc32c is already doing that for us through the crypto API -- so we're already using the Intel crc32c acceleration where appropriate. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:53 -04:00
Christoph Hellwig	6cbff00f46	Btrfs: implement FS_IOC_GETFLAGS/SETFLAGS/GETVERSION Add support for the standard attributes set via chattr and read via lsattr. Currently we store the attributes in the flags value in the btrfs inode, but I wonder whether we should split it into two so that we don't have to keep converting between the two formats. Remove the btrfs_clear_flag/btrfs_set_flag/btrfs_test_flag macros as they were confusing the existing code and got in the way of the new additions. Also add the FS_IOC_GETVERSION ioctl for getting i_generation as it's trivial. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:52 -04:00
Chris Mason	c289811cc0	Btrfs: autodetect SSD devices During mount, btrfs will check the queue nonrot flag for all the devices found in the FS. If they are all non-rotating, SSD mode is enabled by default. If the FS was mounted with -o nossd, the non-rotating flag is ignored. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:52 -04:00
Chris Mason	451d7585a8	Btrfs: add mount -o ssd_spread to spread allocations out Some SSDs perform best when reusing block numbers often, while others perform much better when clustering strictly allocates big chunks of unused space. The default mount -o ssd will find rough groupings of blocks where there are a bunch of free blocks that might have some allocated blocks mixed in. mount -o ssd_spread will make sure there are no allocated blocks mixed in. It should perform better on lower end SSDs. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:52 -04:00
Chris Mason	c604480171	Btrfs: avoid allocation clusters that are too spread out In SSD mode for data, and all the time for metadata the allocator will try to find a cluster of nearby blocks for allocations. This commit adds extra checks to make sure that each free block in the cluster is close to the last one. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:51 -04:00
Chris Mason	3b30c22f64	Btrfs: Add mount -o nossd This allows you to turn off the ssd mode via remount. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:50 -04:00
Chris Mason	d644d8a1e3	Btrfs: avoid IO stalls behind congested devices in a multi-device FS The btrfs IO submission threads try to service a bunch of devices with a small number of threads. They do a congestion check to try and avoid waiting on requests for a busy device. The checks make sure we've sent a few requests down to a given device just so that we aren't bouncing between busy devices without actually sending down any IO. The counter used to decide if we can switch to the next device is somewhat overloaded. It is also being used to decide if we've done a good batch of requests between the WRITE_SYNC or regular priority lists. It may get reset to zero often, leaving us hammering on a busy device instead of moving on to another disk. This commit adds a new counter for the number of bios sent while servicing a device. It doesn't get reset or fiddled with. On multi-device filesystems, this fixes IO stalls in streaming write workloads. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:49 -04:00
Chris Mason	d84275c938	Btrfs: don't allow WRITE_SYNC bios to starve out regular writes Btrfs uses dedicated threads to submit bios when checksumming is on, which allows us to make sure the threads dedicated to checksumming don't get stuck waiting for requests. For each btrfs device, there are two lists of bios. One list is for WRITE_SYNC bios and the other is for regular priority bios. The IO submission threads used to process all of the WRITE_SYNC bios first and then switch to the regular bios. This commit makes sure we don't completely starve the regular bios by rotating between the two lists. WRITE_SYNC bios are still favored 2:1 over the regular bios, and this tries to run in batches to avoid seeking. Benchmarking shows this eliminates stalls during streaming buffered writes on both multi-device and single device filesystems. If the regular bios starve, the system can end up with a large amount of ram pinned down in writeback pages. If we are a little more fair between the two classes, we're able to keep throughput up and make progress on the bulk of our dirty ram. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:49 -04:00
Chris Mason	585ad2c379	Btrfs: fix metadata dirty throttling limits Once a metadata block has been written, it must be recowed, so the btrfs dirty balancing call has a check to make sure a fair amount of metadata was actually dirty before it started writing it back to disk. A previous commit had changed the dirty tracking for metadata without updating the btrfs dirty balancing checks. This commit switches it to use the correct counter. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:48 -04:00
Chris Mason	2c943de6ad	Btrfs: reduce mount -o ssd CPU usage The block allocator in SSD mode will try to find groups of free blocks that are close together. This commit makes it loop less on a given group size before bumping it. The end result is that we are less likely to fill small holes in the available free space, but we don't waste as much CPU building the large cluster used by ssd mode. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:48 -04:00
Chris Mason	cfbb930846	Btrfs: balance btree more often With the new back reference code, the cost of a balance has gone down in terms of the number of back reference updates done. This commit makes us more aggressively balance leaves and nodes as they become less full. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:47 -04:00
Chris Mason	b361242102	Btrfs: stop avoiding balancing at the end of the transaction. When the delayed reference code was added, some checks were added to avoid extra balancing while the delayed references were being flushed. This made for less efficient btrees, but it reduced the chances of loops where no forward progress was made because the balances made more delayed ref updates. With the new dead root removal code and the mixed back references, the extent allocation tree is no longer using precise back refs, and the delayed reference updates don't carry the risk of looping forever anymore. So, the balance avoidance is no longer required. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:47 -04:00
Yan Zheng	5d4f98a28c	Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE) This commit introduces a new kind of back reference for btrfs metadata. Once a filesystem has been mounted with this commit, IT WILL NO LONGER BE MOUNTABLE BY OLDER KERNELS. When a tree block in subvolume tree is cow'd, the reference counts of all extents it points to are increased by one. At transaction commit time, the old root of the subvolume is recorded in a "dead root" data structure, and the btree it points to is later walked, dropping reference counts and freeing any blocks where the reference count goes to 0. The increments done during cow and decrements done after commit cancel out, and the walk is a very expensive way to go about freeing the blocks that are no longer referenced by the new btree root. This commit reduces the transaction overhead by avoiding the need for dead root records. When a non-shared tree block is cow'd, we free the old block at once, and the new block inherits old block's references. When a tree block with reference count > 1 is cow'd, we increase the reference counts of all extents the new block points to by one, and decrease the old block's reference count by one. This dead tree avoidance code removes the need to modify the reference counts of lower level extents when a non-shared tree block is cow'd. But we still need to update back ref for all pointers in the block. This is because the location of the block is recorded in the back ref item. We can solve this by introducing a new type of back ref. The new back ref provides information about pointer's key, level and in which tree the pointer lives. This information allow us to find the pointer by searching the tree. The shortcoming of the new back ref is that it only works for pointers in tree blocks referenced by their owner trees. This is mostly a problem for snapshots, where resolving one of these fuzzy back references would be O(number_of_snapshots) and quite slow. 
The solution used here is to use the fuzzy back references in the common case where a given tree block is only referenced by one root, and use the full back references when multiple roots have a reference on a given block. This commit adds a per-subvolume red-black tree to keep track of cached inodes. The red-black tree helps the balancing code to find cached inodes whose inode numbers fall within a given range. This commit improves the balancing code by introducing several data structures to keep the state of balancing. The most important one is the back ref cache. It caches how the upper level tree blocks are referenced. This greatly reduces the overhead of checking back refs. The improved balancing code scales significantly better with a large number of snapshots. This is a very large commit and was written in a number of pieces. But, they depend heavily on the disk format change and were squashed together to make sure git bisect didn't end up in a bad state wrt space balancing or the format change. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:46 -04:00
Yan Zheng	5c939df56c	btrfs: Fix set/clear_extent_bit for 'end == (u64)-1' There are some 'start = state->end + 1;' like code in set_extent_bit and clear_extent_bit. They overflow when end == (u64)-1. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-06-10 11:29:46 -04:00
Christoph Hellwig	ef14f0c157	xfs: use generic Posix ACL code This patch rips out the XFS ACL handling code and uses the generic fs/posix_acl.c code instead. The ondisk format is of course left unchanged. This also introduces the same ACL caching all other Linux filesystems do by adding pointers to the acl and default acl in struct xfs_inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net>	2009-06-10 17:07:47 +02:00
Arjan van de Ven	517d3cc15b	[libata] ata_piix: Enable parallel scan This patch turns on parallel scanning for the ata_piix driver. This driver is used on most netbooks (no AHCI for cheap storage it seems). The scan is the dominating time factor in the kernel boot for these devices; with this flag it gets cut in half for the device I used for testing (eeepc). Alan took a look at the driver source and concluded that it ought to be safe to do for this driver. Alan has also checked with the hardware team. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-06-10 11:05:34 -04:00
Tejun Heo	7f4774b38e	sata_nv: use hardreset only for post-boot probing When I thought it was finally defeated, it came back with vengeance. The failure cases are ever more convoluted. Now there is a single combination which fails boot probing - MCP5x + Intel SSD and there are two hotplug failure reports on different flavors where softreset fails to bring up the device. Through the many bug reports after the switch to hardreset, the following patterns emerged. - Softreset during boot always works. - Hardreset during boot sometimes fails to bring up the link on certain combinations and device signature acquisition is unreliable. - Hardreset is often necessary after hotplug. It looks like the old behavior of preferring softreset was somehow pretty close to the working reset protocol although it could have lost a device during phy error handling by issuing hardreset. This patch implements nv_hardreset() which kicks in only for post-boot (!LOADING) device probing resets. This should be able to work around all known problem cases. This isn't perfect but given the various hardreset quirks on these controllers, I think this is as good as it can get. Tested on mcp5x (swncq), nf3 and ck804 for all both boot, warm and hot probing cases. Kudos to all the bug reporters and their painful hours with these damn controllers. ;-) Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Robert Hancock <hancockr@shaw.ca> Reported-by: David Lang <david@lang.hm> Reported-by: Samo Vodopivec <lament.email.si@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-06-10 11:05:26 -04:00
Shane Huang	58a09b38cf	[libata] ahci: Restore SB600 SATA controller 64 bit DMA Community reported one SB600 SATA issue(BZ #9412), which led to 64 bit DMA disablement for all SB600 revisions by driver maintainers with commits `c7a42156d9` and `4cde32fc4b`. But the root cause is ASUS M2A-VM system BIOS bug in old revisions like 0901, while forcing into 32bit DMA happens to work as workaround. Now it's time to withdraw `4cde32fc4b` so as to restore the SB600 SATA 64bit DMA capability. This patch is also adding the workaround for M2A-VM old BIOS revisions, but users are suggested to upgrade their system BIOS to the latest one if they meet this issue. Signed-off-by: Shane Huang <shane.huang@amd.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-06-10 11:05:00 -04:00
Peter Zijlstra	f7b7c26e01	perf_counter tools: Propagate signals properly Currently report and stat catch SIGINT (and others) without altering their exit state. This means that things like: while :; do perf stat ./foo ; done Loops become hard-to-interrupt, because bash never sees perf terminate due to interruption. Fix this. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-10 16:55:27 +02:00
Peter Zijlstra	4502d77c1d	perf_counter tools: Small frequency related fixes Create the counter in a disabled state and only enable it after we mmap() the buffer, this allows us to see the first few samples (and observe the frequency ramp). Furthermore, print the period in the verbose report. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-10 16:55:26 +02:00
Peter Zijlstra	bd2b5b1284	perf_counter: More aggressive frequency adjustment Also employ the overflow handler to adjust the frequency, this results in a stable frequency in about 40~50 samples, instead of that many ticks. This also means we can start sampling at a sample period of 1 without running head-first into the throttle. It relies on sched_clock() to accurately measure the time difference between the overflow NMIs. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-10 16:55:26 +02:00
Geert Uytterhoeven	2233123f27	ALSA: sound/ps3: Correct existing and add missing annotations probe functions should be __devinit Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2009-06-10 16:53:21 +02:00
Geert Uytterhoeven	cb6492e4a4	ALSA: sound/ps3: Restructure driver source Sort includes, and reorder code so we can kill the forward declarations No functional changes Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2009-06-10 16:53:09 +02:00
Geert Uytterhoeven	112ac808eb	ALSA: sound/ps3: Fix checkpatch issues Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2009-06-10 16:52:55 +02:00
Ryusuke Konishi	c3a7abf06c	nilfs2: support contiguous lookup of blocks Although get_block() callback function can return extent of contiguous blocks with bh->b_size, nilfs_get_block() function did not support this feature. This adds contiguous lookup feature to the block mapping codes of nilfs, and allows the nilfs_get_blocks() function to return the extent information by applying the feature. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:12 +09:00
Ryusuke Konishi	fa032744ad	nilfs2: add sync_page method to page caches of meta data This applies block_sync_page() function to the sync_page method of page caches for meta data files, gc page caches, and btree node buffers. This is a companion patch of ("nilfs2: enable sync_page method") which applied the function for data pages. This allows lock_page() for those meta data to unplug pending bio requests. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:12 +09:00
Ryusuke Konishi	a53b4751ae	nilfs2: use device's backing_dev_info for btree node caches Previously, default_backing_dev_info was used for the mapping of btree node caches. This uses device dependent backing_dev_info to allow detailed control of the device for the btree node pages. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:12 +09:00
Ryusuke Konishi	30c25be71f	nilfs2: return EBUSY against delete request on snapshot This helps userland programs like the rmcp command to distinguish error codes returned against a checkpoint removal request. Previously -EPERM was returned, and not discriminable from real permission errors. This also allows removal of the latest checkpoint because the deletion leads to create a new checkpoint, and thus it's harmless for the filesystem. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:12 +09:00
Ryusuke Konishi	fb6e7113ae	nilfs2: modify list of unsupported features in caveats This clarifies missing features of nilfs as a regular filesystem. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:11 +09:00
Ryusuke Konishi	e85dc1d529	nilfs2: enable sync_page method This adds a missing sync_page method which unplugs bio requests when waiting for page locks. This will improve read performance of nilfs. Here is a measurement result using dd command. Without this patch: # mount -t nilfs2 /dev/sde1 /test # dd if=/test/aaa of=/dev/null bs=512k 1024+0 records in 1024+0 records out 536870912 bytes (537 MB) copied, 6.00688 seconds, 89.4 MB/s With this patch: # mount -t nilfs2 /dev/sde1 /test # dd if=/test/aaa of=/dev/null bs=512k 1024+0 records in 1024+0 records out 536870912 bytes (537 MB) copied, 3.54998 seconds, 151 MB/s Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:11 +09:00
Ryusuke Konishi	30bda0b8ae	nilfs2: set bio unplug flag for the last bio in segment This sets BIO_RW_UNPLUG flag on the last bio of each segment during write. The last bio should be unplugged immediately because the caller waits for the completion after the submission. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:11 +09:00
Ryusuke Konishi	003ff182fd	nilfs2: allow future expansion of metadata read out via get info ioctl Nilfs has some ioctl commands to read out metadata from meta data files: - NILFS_IOCTL_GET_CPINFO for checkpoint file, - NILFS_IOCTL_GET_SUINFO for segment usage file, and - NILFS_IOCTL_GET_VINFO for Disk Address Translation (DAT) file, respectively. Every routine on these metadata files is implemented so that it allows future expansion of on-disk format. But, the above ioctl commands do not support expansion even though nilfs_argv structure can handle arbitrary size for data exchanged via ioctl. This allows future expansion of the following structures which give basic format of the "get information" ioctls: - struct nilfs_cpinfo - struct nilfs_suinfo - struct nilfs_vinfo So, this introduces forward compatibility of such ioctl commands. In this patch, a sanity check in nilfs_ioctl_get_info() function is changed to accept larger data structure [1], and metadata read routines are rewritten so that they become compatible for larger structures; the routines will just ignore the remaining fields which the current version of nilfs doesn't know. [1] The ioctl function already has another upper limit (PAGE_SIZE against a structure, which appears in nilfs_ioctl_wrap_copy function), and this will not cause security problem. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:11 +09:00
Hisashi Hifumi	258ef67e24	NILFS2: Pagecache usage optimization on NILFS2 Hi, I introduced "is_partially_uptodate" aops for NILFS2. A page can have multiple buffers and even if a page is not uptodate, some buffers can be uptodate on pagesize != blocksize environment. This aops checks that all buffers which correspond to a part of a file that we want to read are uptodate. If so, we do not have to issue actual read IO to HDD even if a page is not uptodate because the portion we want to read are uptodate. "block_is_partially_uptodate" function is already used by ext2/3/4. With the following patch random read/write mixed workloads or random read after random write workloads can be optimized and we can get performance improvement. I did a performance test using the sysbench. 1 --file-block-size=8K --file-total-size=2G --file-test-mode=rndrw --file-fsync-freq=0 --file-rw-ratio=1 run -2.6.30-rc5 Test execution summary: total time: 151.2907s total number of events: 200000 total time taken by event execution: 2409.8387 per-request statistics: min: 0.0000s avg: 0.0120s max: 0.9306s approx. 95 percentile: 0.0439s Threads fairness: events (avg/stddev): 12500.0000/238.52 execution time (avg/stddev): 150.6149/0.01 -2.6.30-rc5-patched Test execution summary: total time: 140.8828s total number of events: 200000 total time taken by event execution: 2240.8577 per-request statistics: min: 0.0000s avg: 0.0112s max: 0.8750s approx. 95 percentile: 0.0418s Threads fairness: events (avg/stddev): 12500.0000/218.43 execution time (avg/stddev): 140.0536/0.01 arch: ia64 pagesize: 16k Thanks. Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:11 +09:00
Ryusuke Konishi	7cde31d7d6	nilfs2: remove nilfs_btree_operations from btree mapping will remove indirect function calls using nilfs_btree_operations table. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:11 +09:00
Ryusuke Konishi	355c6b6103	nilfs2: remove nilfs_direct_operations from direct mapping will remove indirect function calls using nilfs_direct_operations table. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:11 +09:00
Ryusuke Konishi	d4b961576d	nilfs2: remove bmap pointer operations Previously, the bmap codes of nilfs used three types of function tables. The abuse of indirect function calls decreased source readability and suffered many indirect jumps which would confuse branch prediction of processors. This eliminates one type of the function tables, nilfs_bmap_ptr_operations, which was used to dispatch low level pointer operations of the nilfs bmap. This adds a new integer variable "b_ptr_type" to nilfs_bmap struct, and uses the value to select the pointer operations. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:10 +09:00
Ryusuke Konishi	3033342a0b	nilfs2: remove useless b_low and b_high fields from nilfs_bmap struct This will cut off 16 bytes from the nilfs_bmap struct which is embedded in the on-memory inode of nilfs. The b_high field was never used, and the b_low field stores a constant value which can be determined by whether the inode uses btree for block mapping or not. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:10 +09:00
Ryusuke Konishi	e473c1f265	nilfs2: remove pointless NULL check of bpop_commit_alloc_ptr function This indirect function is set to NULL only for gc cache inodes, but the gc cache inodes never call this function. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-06-10 23:41:10 +09:00
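
The alignment arithmetic in dacb6f1d8f (mac80211: fix unaligned rx skb) can be checked with a few lines of C. This is only a sketch of the arithmetic, not the mac80211 code: `ETH_HEADER_LEN`, `ip_header_is_aligned()` and `check_alignment_examples()` are names made up for illustration, and addresses are modelled as plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* 6-byte destination + 6-byte source + 2-byte type, as in the commit message. */
#define ETH_HEADER_LEN 14u

/*
 * Hypothetical helper: nonzero when the IP header that follows an
 * Ethernet header starting at eth_addr lands on a 4-byte boundary.
 */
static int ip_header_is_aligned(uintptr_t eth_addr)
{
	return ((eth_addr + ETH_HEADER_LEN) & 3u) == 0;
}

/* Illustrative checks with made-up addresses. */
void check_alignment_examples(void)
{
	/* Ethernet header 4-byte aligned: IP header ends up 2 bytes off. */
	assert(!ip_header_is_aligned(0x1000));
	/* Ethernet header offset by 2: IP header is 4-byte aligned. */
	assert(ip_header_is_aligned(0x1002));
}
```

Because 14 is congruent to 2 modulo 4, a 4-byte-aligned Ethernet header always leaves the IP header misaligned by 2 bytes, which is why the check has to be made against the IP header position.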
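
The signedness problem described in 5ee58d7e6a (mac80211: fix minstrel single-rate memory corruption) is easy to reproduce in isolation. The sketch below is not the minstrel source: `past_end_buggy()`, `past_end_fixed()` and `check_minstrel_examples()` are hypothetical stand-ins, and the explicit cast in the buggy version only spells out the conversion that C would otherwise apply implicitly when an `int` is compared with an `unsigned int`.

```c
#include <assert.h>

/*
 * Buggy form (hypothetical stand-in): with n_rates == 1 the expression
 * n_rates - 2 is -1, which becomes UINT_MAX once converted to unsigned,
 * so the comparison can never be true.
 */
static int past_end_buggy(unsigned int sample_idx, int n_rates)
{
	return sample_idx > (unsigned int)(n_rates - 2);
}

/*
 * Fixed form: cast the index to int, as the commit message describes,
 * so the comparison is done on signed values.
 */
static int past_end_fixed(unsigned int sample_idx, int n_rates)
{
	return (int)sample_idx > n_rates - 2;
}

/* Illustrative checks for the single-rate and multi-rate cases. */
void check_minstrel_examples(void)
{
	assert(past_end_buggy(0, 1) == 0);  /* check silently skipped */
	assert(past_end_fixed(0, 1) == 1);  /* check fires as intended */
	assert(past_end_buggy(5, 4) == past_end_fixed(5, 4));
}
```

The cast is safe under the assumption stated in the commit message, namely that the index stays small (around 25 at most), so it never exceeds the range of `int`.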
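
The wraparound mentioned in 5c939df56c (btrfs: Fix set/clear_extent_bit for 'end == (u64)-1') can be illustrated the same way. This is a minimal sketch of guarding a `start = state->end + 1` style update, assuming the caller simply stops when the range already ends at the largest 64-bit value; `next_start()` and `check_next_start_examples()` are hypothetical and do not come from the btrfs source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the offset that follows a range ending
 * at state_end. Returns 1 and stores the result in *start, or returns 0
 * when state_end is already the maximum u64, where state_end + 1 would
 * wrap around to 0.
 */
static int next_start(uint64_t state_end, uint64_t *start)
{
	if (state_end == UINT64_MAX)
		return 0;
	*start = state_end + 1;
	return 1;
}

/* Illustrative checks for the normal and the wraparound case. */
void check_next_start_examples(void)
{
	uint64_t start = 0;

	assert(next_start(4095, &start) == 1 && start == 4096);
	assert(next_start(UINT64_MAX, &start) == 0);
	assert(start == 4096);  /* left untouched on the overflow path */
}
```

Without the guard, the unsigned addition wraps to 0 and a loop that advances `start` this way would begin again from the front of the range instead of terminating.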


150997 commits