linux-pinenote

Author	SHA1	Message	Date
Randy Dunlap	d187abb9a8	USB: gadget: fix composite kernel-doc warnings Warning(include/linux/usb/composite.h:284): No description found for parameter 'disconnect' Warning(drivers/usb/gadget/composite.c:744): No description found for parameter 'c' Warning(drivers/usb/gadget/composite.c:744): Excess function parameter 'cdev' description in 'usb_string_ids_n' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:16 -07:00
Bill Pemberton	6b8f1ca558	USB: ssu100: set tty_flags in ssu100_process_packet flag was never set in ssu100_process_packet. Add logic to set it before calling tty_insert_flip_* Signed-off-by: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:16 -07:00
Bill Pemberton	85dee135b8	USB: ssu100: add disconnect function for ssu100 Add a disconnect function to the functions of this device. The disconnect is a call to usb_serial_generic_disconnect() so it requires that symbol to be exported from generic.c. Signed-off-by: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:16 -07:00
Bill Pemberton	5c7efeb76e	USB: serial: export symbol usb_serial_generic_disconnect This is needed by the ssu100 driver to use this function. Signed-off-by: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:16 -07:00
Bill Pemberton	f81c83db56	USB: ssu100: rework logic for TIOCMIWAIT Rework the logic for TIOCMIWAIT to use wait_event_interruptible. This also adds support for TIOCGICOUNT. Signed-off-by: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:16 -07:00
Bill Pemberton	556f1a0e9c	USB: ssu100: add register parameter to ssu100_setregister The function ssu100_setregister was hard coded to only set the MCR register. Add a register parameter so that other registers can be set. Signed-off-by: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:15 -07:00
Bill Pemberton	79f203a26a	USB: ssu100: remove duplicate #defines in ssu100 The ssu100 uses a TI16C550C UART so the SERIAL_ defines in this code are duplicates of those found in serial_reg.h. Remove the defines in ssu100.c and use the ones in the header file. Signed-off-by: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:15 -07:00
Bill Pemberton	9b2cef31f2	USB: ssu100: refine process_packet in ssu100 The status information does not appear at the start of each incoming packet so the check for len < 4 at the start of ssu100_process_packet is wrong. Remove it. Signed-off-by: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:15 -07:00
Bill Pemberton	175230587b	USB: ssu100: add locking for port private data in ssu100 Signed-off-by: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:15 -07:00
Axel Lin	96f2a34d2c	USB: r8a66597-udc: return -ENOMEM if kzalloc() fails Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:15 -07:00
Greg Kroah-Hartman	0827a9ff2b	USB: io_ti: check firmware version before updating If we can't read the firmware for a device from the disk, and yet the device already has a valid firmware image in it, we don't want to replace the firmware with something invalid. So check the version number to be less than the current one to verify this is the correct thing to do. Reported-by: Chris Beauchamp <chris@chillibean.tv> Tested-by: Chris Beauchamp <chris@chillibean.tv> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:15 -07:00
Michael Wileczka	d1ab903d25	USB: ftdi_sio: fix endianess of max packet size The USB max packet size (always little-endian) was not being byte swapped on big-endian systems. Applicable since [USB: ftdi_sio: fix hi-speed device packet size calculation] approx 2.6.31 Signed-off-by: Michael Wileczka <mikewileczka@yahoo.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:15 -07:00
Craig Shelley	72916791cb	USB: CP210x Fix Break On/Off The definitions for BREAK_ON and BREAK_OFF are inverted, causing break requests to fail. This patch sets BREAK_ON and BREAK_OFF to the correct values. Signed-off-by: Craig Shelley <craig@microtron.org.uk> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:15 -07:00
Jef Driesen	f36ecd5de9	USB: pl2303: New vendor and product id Add support for the Zeagle N2iTiON3 dive computer interface. Since Zeagle devices are actually manufactured by Seiko, this patch will support other Seiko based models as well. Signed-off-by: Jef Driesen <jefdriesen@telenet.be> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:15 -07:00
Ming Lei	d92a3ca689	USB: serial: fix leak of usb serial module refrence count The patch with title below makes reference count of usb serial module always more than one after driver is bound. USB-BKL: Remove BKL use for usb serial driver probing In fact, the patch above only replaces lock_kernel() with try_module_get() , and does not use module_put() to do what unlock_kernel() did, so casue leak of reference count of usb serial module and the module can not be unloaded after serial driver is bound with device. This patch fixes the issue, also simplifies such things: -only call try_module_get() once in the entry of usb_serial_probe() -only call module_put() once in the exit of usb_serial_probe Signed-off-by: Ming Lei <tom.leiming@gmail.com> Cc: Johan Hovold <jhovold@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:15 -07:00
Ross Burton	0eee6a2b2a	USB: add device IDs for igotu to navman I recently bought a i-gotU USB GPS, and whilst hunting around for linux support discovered this post by you back in 2009: `5148644` >Try the navman driver instead. You can either add the device id to the > driver and rebuild it, or do this before you plug the device in: > modprobe navman > echo -n "0x0df7 0x0900" > /sys/bus/usb-serial/drivers/navman/new_id > > and then plug your device in and see if that works. I can confirm that the navman driver works with the right device IDs on my i-gotU GT-600, which has the same device IDs. Attached is a patch adding the IDs. From: Ross Burton <ross@linux.intel.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:15 -07:00
Michael Hennerich	ebb8a4e487	USB: isp1760: use a write barrier to ensure proper ndelay timing The ISP1760 has some timing requirements where it has to delay a short period after a write to a register has started. However, this delay is from the time the write hits the USB chip (the ISP1760), not from the time where the processor started processing the write. So on a quick enough processor, it is sometimes possible for the write to not hit the device before we start delaying, and we then violate the part's timing requirements, so things stop working. To avoid all this, insert a write barrier after the register write and before the timing delay/register read so we can guarantee we only start counting time after the write has hit the device. Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:15 -07:00
Michael Tokarev	76078dc4fc	USB: option: add Celot CT-650 Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:14 -07:00
Dan Carpenter	9a887162be	USB: uvc_v4l2: cleanup test for end of loop We're trying to test for the the end of the loop here. "format" is never NULL. We don't know what "format->fcc" is because we're past the end of the loop and I think "fmt->fmt.pix.pixelformat" comes from the user so we don't know what that is either. It works, but it's cleaner to just test to see if (i == ARRAY_SIZE(uvc_formats). Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-08-23 20:50:14 -07:00
Simon Horman	1726442e11	net: increase the size of priv_flags and add IFF_OVS_DATAPATH IFF_OVS_DATAPATH is a place-holder for the Open vSwitch datapath which I am preparing to submit for merging. As all 16 bits of priv_flags are already assigned flags, also increase the size of priv_flags to 32 bits. Unfortunately, by my calculations this increases the size of struct net_device by 4 bytes on 32bit architectures and 8 bytes on 64 bit architectures. I couldn't see an obvious way to avoid that. Cc: Jesse Gross <jesse@nicira.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:43:17 -07:00
stephen hemminger	0fdc100bdc	ethtool: allow non-netadmin to query settings The SNMP daemon uses ethtool to determine the speed of network interfaces. This fails on Debian (and probably elsewhere) because for security SNMP daemon runs as non-root user (snmp). Note: A similar patch was rejected previously because of a concern about the possibility that on some hardware querying the ethtool settings requires access to the PHY and could slow the machine down. But the security risk of requiring SNMP daemon (and related services) to run as root far out weighs the risk of denial-of-service. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:43:16 -07:00
Eric Dumazet	afdcba371f	net: copy_rtnl_link_stats64() simplification No need to use a temporary struct rtnl_link_stats64 variable, just copy the source to skb buffer. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:43:16 -07:00
Changli Gao	0eec32ff35	net_sched: act_csum: coding style cleanup Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:43:15 -07:00
David S. Miller	7abac68602	pkt_sched: Make act_csum depend upon INET. It uses ip_send_check() and stuff like that. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:42:11 -07:00
Dimitris Michailidis	ccea790ef0	cxgb4: update PCI ids Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:38:15 -07:00
Dimitris Michailidis	1707aec9ac	cxgb4: fix setting of the function number in transmit descriptors Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:38:14 -07:00
Dimitris Michailidis	1478b3ee93	cxgb4: support eeprom read/write on functions other than 0 Extend the address translation for eeprom read/write (code used by ethtool -[eE]) to functions other than 0. Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:38:14 -07:00
Dimitris Michailidis	e46dab4d4b	cxgb4: handle Rx/Tx queue ranges not starting at 0 Currently the driver assumes that queue IDs start at 0 but that's true only for function 0. To support operation on other functions get the start of the queue ranges from FW and offset accordingly. Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:38:13 -07:00
David S. Miller	f04b4dd2b1	bna: Delete get_flags and set_flags ethtool methods. This driver doesn't support LRO, NTUPLE, or the RXHASH features. So it should not set these ethtool operations. This also fixes the warning: drivers/net/bna/bnad_ethtool.c:1272: warning: initialization from incompatible pointer type Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:34:51 -07:00
Yinglin Luan	bf82791ed6	qlcnic: fix poll implementation Function qlcnic_intr has pointer to qlcnic_host_sds_ring as second parameter not pointer to qlcnic_adapter. Signed-off-by: Yinglin Luan <synmyth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:28:55 -07:00
Yinglin Luan	7b589a35a1	netxen: fix poll implementation Function netxen_intr has pointer to nx_host_sds_ring as second parameter not pointer to netxen_adapter. Signed-off-by: Yinglin Luan <synmyth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:28:55 -07:00
Rasesh Mody	8b230ed8ec	bna: Brocade 10Gb Ethernet device driver This is patch 1/6 which contains linux driver source for Brocade's BR1010/BR1020 10Gb CEE capable ethernet adapter. Signed-off-by: Debashis Dutt <ddutt@brocade.com> Signed-off-by: Rasesh Mody <rmody@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:24:12 -07:00
Changli Gao	4c3a76abd3	bridge: netfilter: fix a memory leak nf_bridge_alloc() always reset the skb->nf_bridge, so we should always put the old one. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Bart De Schuymer <bdschuym@pandora.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:14:36 -07:00
Gerrit Renker	231cc2aaf1	dccp ccid-2: Replace broken RTT estimator with better algorithm The current CCID-2 RTT estimator code is in parts broken and lags behind the suggestions in RFC2988 of using scaled variants for SRTT/RTTVAR. That code is replaced by the present patch, which reuses the Linux TCP RTT estimator code. Further details: ---------------- 1. The minimum RTO of previously one second has been replaced with TCP's, since RFC4341, sec. 5 says that the minimum of 1 sec. (suggested in RFC2988, 2.4) is not necessary. Instead, the TCP_RTO_MIN is used, which agrees with DCCP's concept of a default RTT (RFC 4340, 3.4). 2. The maximum RTO has been set to DCCP_RTO_MAX (64 sec), which agrees with RFC2988, (2.5). 3. De-inlined the function ccid2_new_ack(). 4. Added a FIXME: the RTT is sampled several times per Ack Vector, which will give the wrong estimate. It should be replaced with one sample per Ack. However, at the moment this can not be resolved easily, since - it depends on TX history code (which also needs some work), - the cleanest solution is not to use the `sent' time at all (saves 4 bytes per entry) and use DCCP timestamps / elapsed time to estimated the RTT, which however is non-trivial to get right (but needs to be done). Reasons for reusing the Linux TCP estimator algorithm: ------------------------------------------------------ Some time was spent to find a better alternative, using basic RFC2988 as a first step. Further analysis and experimentation showed that the Linux TCP RTO estimator is superior to a basic RFC2988 implementation. A summary is on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/ccid2/rto_estimator/ In addition, this estimator fared well in a recent empirical evaluation: Rewaskar, Sushant, Jasleen Kaur and F. Donelson Smith. A Performance Study of Loss Detection/Recovery in Real-world TCP Implementations. Proceedings of 15th IEEE International Conference on Network Protocols (ICNP-07), 2007. Thus there is significant benefit in reusing the existing TCP code. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:13:31 -07:00
Gerrit Renker	c38c92a84a	dccp ccid-2: Simplify dec_pipe and rearming of RTO timer This removes the dec_pipe function and improves the way the RTO timer is rearmed when a new acknowledgment comes in. Details and justification for removal: -------------------------------------- 1) The BUG_ON in dec_pipe is never triggered: pipe is only decremented for TX history entries between tail and head, for which it had previously been incremented in tx_packet_sent; and it is not decremented twice for the same entry, since it is - either decremented when a corresponding Ack Vector cell in state 0 or 1 was received (and then ccid2s_acked==1), - or it is decremented when ccid2s_acked==0, as part of the loss detection in tx_packet_recv (and hence it can not have been decremented earlier). 2) Restarting the RTO timer happens for every single entry in each Ack Vector parsed by tx_packet_recv (according to RFC 4340, 11.4 this can happen up to 16192 times per Ack Vector). 3) The RTO timer should not be restarted when all outstanding data has been acknowledged. This is currently done similar to (2), in dec_pipe, when pipe has reached 0. The patch onsolidates the code which rearms the RTO timer, combining the segments from new_ack and dec_pipe. As a result, the code becomes clearer (compare with tcp_rearm_rto()). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:13:31 -07:00
Gerrit Renker	30564e3555	dccp ccid-2: Remove redundant sanity tests This removes the ccid2_hc_tx_check_sanity function: it is redundant. Details: The tx_check_sanity function performs three tests: 1) it checks that the circular TX list is sorted - in ascending order of sequence number (ccid2s_seq) - and time (ccid2s_sent), - in the direction from `tail' (hctx_seqt) to `head' (hctx_seqh); 2) it ensures that the entire list has the length seqbufc * CCID2_SEQBUF_LEN; 3) it ensures that pipe equals the number of packets that were not marked `acked' (ccid2s_acked) between `tail' and `head'. The following argues that each of these tests is redundant, this can be verified by going through the code. (1) is not necessary, since both time and GSS increase from one packet to the next, so that subsequent insertions in tx_packet_sent (which advance the `head' pointer) will be in ascending order of time and sequence number. In (2), the length of the list is always equal to seqbufc times CCID2_SEQBUF_LEN (set to 1024) unless allocation caused an earlier failure, because: * at initialisation (tx_init), there is one chunk of size 1024 and seqbufc=1; * subsequent calls to tx_alloc_seq take place whenever head->next == tail in tx_packet_sent; then a new chunk of size 1024 is inserted between head and tail, and seqbufc is incremented by one. To show that (3) is redundant requires looking at two cases. The `pipe' variable of the TX socket is incremented only in tx_packet_sent, and decremented in tx_packet_recv. When head == tail (TX history empty) then pipe should be 0, which is the case directly after initialisation and after a retransmission timeout has occurred (ccid2_hc_tx_rto_expire). The first case involves parsing Ack Vectors for packets recorded in the live portion of the buffer, between tail and head. For each packet marked by the receiver as received (state 0) or ECN-marked (state 1), pipe is decremented by one, so for all such packets the BUG_ON in tx_check_sanity will not trigger. The second case is the loss detection in the second half of tx_packet_recv, below the comment "Check for NUMDUPACK". The first while-loop here ensures that the sequence number of `seqp' is either above or equal to `high_ack', or otherwise equal to the highest sequence number sent so far (of the entry head->prev, as head points to the next unsent entry). The next while-loop ("while (1)") counts the number of acked packets starting from that position of seqp, going backwards in the direction from head->prev to tail. If NUMDUPACK=3 such packets were counted within this loop, `seqp' points to the last acknowledged packet of these, and the "if (done == NUMDUPACK)" block is entered next. The while-loop contained within that block in turn traverses the list backwards, from head to tail; the position of `seqp' is saved in the variable `last_acked'. For each packet not marked as `acked', a congestion event is triggered within the loop, and pipe is decremented. The loop terminates when `seqp' has reached `tail', whereupon tail is set to the position previously stored in `last_acked'. Thus, between `last_acked' and the previous position of `tail', - pipe has been decremented earlier if the packet was marked as state 0 or 1; - pipe was decremented if the packet was not marked as acked. That is, pipe has been decremented by the number of packets between `last_acked' and the previous position of `tail'. As a consequence, pipe now again reflects the number of packets which have not (yet) been acked between the new position of tail (at `last_acked') and head->prev, or 0 if head==tail. The result is that the BUG_ON condition in check_sanity will also not be triggered, hence the test (3) is also redundant. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:13:30 -07:00
Gerrit Renker	51c22bb510	dccp ccid-3: No more CCID control blocks in LISTEN state The CCIDs are activated as last of the features, at the end of the handshake, were the LISTEN state of the master socket is inherited into the server state of the child socket. Thus, the only states visible to CCIDs now are OPEN/PARTOPEN, and the closing states. This allows to remove tests which were previously necessary to protect against referencing a socket in the listening state (in CCID-3), but which now have become redundant. As a further byproduct of enabling the CCIDs only after the connection has been fully established, several typecast-initialisations of ccid3_hc_{rx,tx}_sock can now be eliminated: * the CCID is loaded, so it is not necessary to test if it is NULL, * if it is possible to load a CCID and leave the private area NULL, then this is a bug, which should crash loudly - and earlier, * the test for state==OPEN \|\| state==PARTOPEN now reduces only to the closing phase (e.g. when the node has received an unexpected Reset). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:13:30 -07:00
Gerrit Renker	67b67e365f	ccid: ccid-2/3 code cosmetics This patch collects cosmetics-only changes to separate these from code changes: * update with regard to CodingStyle and whitespace changes, * documentation: - adding/revising comments, - remove CCID-3 RX socket documentation which is either duplicate or refers to fields that no longer exist, * expand embedded tfrc_tx_info struct inline for consistency, removing indirections via #define. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-23 20:13:29 -07:00
Christoph Hellwig	b5420f2359	xfs: do not discard page cache data on EAGAIN If xfs_map_blocks returns EAGAIN because of lock contention we must redirty the page and not disard the pagecache content and return an error from writepage. We used to do this correctly, but the logic got lost during the recent reshuffle of the writepage code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Mike Gao <ygao.linux@gmail.com> Tested-by: Mike Gao <ygao.linux@gmail.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com>	2010-08-24 11:47:51 +10:00
Dave Chinner	3b93c7aaef	xfs: don't do memory allocation under the CIL context lock Formatting items requires memory allocation when using delayed logging. Currently that memory allocation is done while holding the CIL context lock in read mode. This means that if memory allocation takes some time (e.g. enters reclaim), we cannot push on the CIL until the allocation(s) required by formatting complete. This can stall CIL pushes for some time, and once a push is stalled so are all new transaction commits. Fix this splitting the item formatting into two steps. The first step which does the allocation and memcpy() into the allocated buffer is now done outside the CIL context lock, and only the CIL insert is done inside the CIL context lock. This avoids the stall issue. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2010-08-24 11:45:53 +10:00
Dave Chinner	a44f13edf0	xfs: Reduce log force overhead for delayed logging Delayed logging adds some serialisation to the log force process to ensure that it does not deference a bad commit context structure when determining if a CIL push is necessary or not. It does this by grabing the CIL context lock exclusively, then dropping it before pushing the CIL if necessary. This causes serialisation of all log forces and pushes regardless of whether a force is necessary or not. As a result fsync heavy workloads (like dbench) can be significantly slower with delayed logging than without. To avoid this penalty, copy the current sequence from the context to the CIL structure when they are swapped. This allows us to do unlocked checks on the current sequence without having to worry about dereferencing context structures that may have already been freed. Hence we can remove the CIL context locking in the forcing code and only call into the push code if the current context matches the sequence we need to force. By passing the sequence into the push code, we can check the sequence again once we have the CIL lock held exclusive and abort if the sequence has already been pushed. This avoids a lock round-trip and unnecessary CIL pushes when we have racing push calls. The result is that the regression in dbench performance goes away - this change improves dbench performance on a ramdisk from ~2100MB/s to ~2500MB/s. This compares favourably to not using delayed logging which retuns ~2500MB/s for the same workload. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2010-08-24 11:40:03 +10:00
Dave Chinner	1a387d3be2	xfs: dummy transactions should not dirty VFS state When we need to cover the log, we issue dummy transactions to ensure the current log tail is on disk. Unfortunately we currently use the root inode in the dummy transaction, and the act of committing the transaction dirties the inode at the VFS level. As a result, the VFS writeback of the dirty inode will prevent the filesystem from idling long enough for the log covering state machine to complete. The state machine gets stuck in a loop issuing new dummy transactions to cover the log and never makes progress. To avoid this problem, the dummy transactions should not cause externally visible state changes. To ensure this occurs, make sure that dummy transactions log an unchanging field in the superblock as it's state is never propagated outside the filesystem. This allows the log covering state machine to complete successfully and the filesystem now correctly enters a fully idle state about 90s after the last modification was made. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2010-08-24 11:46:31 +10:00
Stuart Brodsky	2fe33661fc	xfs: ensure f_ffree returned by statfs() is non-negative Because of delayed updates to sb_icount field in the super block, it is possible to allocate over maxicount number of inodes. This causes the arithmetic to calculate a negative number of free inodes in user commands like df or stat -f. Since maxicount is a somewhat arbitrary number, a slight over allocation is not critical but user commands should be displayed as 0 or greater and never go negative. To do this the value in the stats buffer f_ffree is capped to never go negative. [ Modified to use max_t as per Christoph's comment. ] Signed-off-by: Stu Brodsky <sbrodsky@sgi.com> Signed-off-by: Dave Chinner <dchinner@redhat.com>	2010-08-24 11:46:05 +10:00
Dave Chinner	efceab1d56	xfs: handle negative wbc->nr_to_write during sync writeback During data integrity (WB_SYNC_ALL) writeback, wbc->nr_to_write will go negative on inodes with more than 1024 dirty pages due to implementation details of write_cache_pages(). Currently XFS will abort page clustering in writeback once nr_to_write drops below zero, and so for data integrity writeback we will do very inefficient page at a time allocation and IO submission for inodes with large numbers of dirty pages. Fix this by only aborting the page clustering code when wbc->nr_to_write is negative and the sync mode is WB_SYNC_NONE. Cc: <stable@kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2010-08-24 11:44:56 +10:00
Dave Chinner	546a192422	writeback: write_cache_pages doesn't terminate at nr_to_write <= 0 I noticed XFS writeback in 2.6.36-rc1 was much slower than it should have been. Enabling writeback tracing showed: flush-253:16-8516 [007] 1342952.351608: wbc_writepage: bdi 253:16: towrt=1024 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0 flush-253:16-8516 [007] 1342952.351654: wbc_writepage: bdi 253:16: towrt=1023 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0 flush-253:16-8516 [000] 1342952.369520: wbc_writepage: bdi 253:16: towrt=0 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0 flush-253:16-8516 [000] 1342952.369542: wbc_writepage: bdi 253:16: towrt=-1 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0 flush-253:16-8516 [000] 1342952.369549: wbc_writepage: bdi 253:16: towrt=-2 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0 Writeback is not terminating in background writeback if ->writepage is returning with wbc->nr_to_write == 0, resulting in sub-optimal single page writeback on XFS. Fix the write_cache_pages loop to terminate correctly when this situation occurs and so prevent this sub-optimal background writeback pattern. This improves sustained sequential buffered write performance from around 250MB/s to 750MB/s for a 100GB file on an XFS filesystem on my 8p test VM. Cc:<stable@kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Wu Fengguang <fengguang.wu@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2010-08-24 11:44:34 +10:00
Dave Chinner	4536f2ad8b	xfs: fix untrusted inode number lookup Commit `7124fe0a5b` ("xfs: validate untrusted inode numbers during lookup") changes the inode lookup code to do btree lookups for untrusted inode numbers. This change made an invalid assumption about the alignment of inodes and hence incorrectly calculated the first inode in the cluster. As a result, some inode numbers were being incorrectly considered invalid when they were actually valid. The issue was not picked up by the xfstests suite because it always runs fsr and dump (the two utilities that utilise the bulkstat interface) on cache hot inodes and hence the lookup code in the cold cache path was not sufficiently exercised to uncover this intermittent problem. Fix the issue by relaxing the btree lookup criteria and then checking if the record returned contains the inode number we are lookup for. If it we get an incorrect record, then the inode number is invalid. Cc: <stable@kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2010-08-24 11:42:30 +10:00
Dave Chinner	5b3eed756c	xfs: ensure we mark all inodes in a freed cluster XFS_ISTALE Under heavy load parallel metadata loads (e.g. dbench), we can fail to mark all the inodes in a cluster being freed as XFS_ISTALE as we skip inodes we cannot get the XFS_ILOCK_EXCL or the flush lock on. When this happens and the inode cluster buffer has already been marked stale and freed, inode reclaim can try to write the inode out as it is dirty and not marked stale. This can result in writing th metadata to an freed extent, or in the case it has already been overwritten trigger a magic number check failure and return an EUCLEAN error such as: Filesystem "ram0": inode 0x442ba1 background reclaim flush failed with 117 Fix this by ensuring that we hoover up all in memory inodes in the cluster and mark them XFS_ISTALE when freeing the cluster. Cc: <stable@kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2010-08-24 11:42:41 +10:00
Dave Chinner	d17c701ce6	xfs: unlock items before allowing the CIL to commit When we commit a transaction using delayed logging, we need to unlock the items in the transaciton before we unlock the CIL context and allow it to be checkpointed. If we unlock them after we release the CIl context lock, the CIL can checkpoint and complete before we free the log items. This breaks stale buffer item unlock and unpin processing as there is an implicit assumption that the unlock will occur before the unpin. Also, some log items need to store the LSN of the transaction commit in the item (inodes and EFIs) and so can race with other transaction completions if we don't prevent the CIL from checkpointing before the unlock occurs. Cc: <stable@kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2010-08-24 11:42:52 +10:00
Linus Torvalds	d1b113bb02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits) netfilter: fix CONFIG_COMPAT support isdn/avm: fix build when PCMCIA is not enabled header: fix broken headers for user space e1000e: don't check for alternate MAC addr on parts that don't support it e1000e: disable ASPM L1 on 82573 ll_temac: Fix poll implementation netxen: fix a race in netxen_nic_get_stats() qlnic: fix a race in qlcnic_get_stats() irda: fix a race in irlan_eth_xmit() net: sh_eth: remove unused variable netxen: update version 4.0.74 netxen: fix inconsistent lock state vlan: Match underlying dev carrier on vlan add ibmveth: Fix opps during MTU change on an active device ehea: Fix synchronization between HW and SW send queue bnx2x: Update bnx2x version to 1.52.53-4 bnx2x: Fix PHY locking problem rds: fix a leak of kernel memory netlink: fix compat recvmsg netfilter: fix userspace header warning ...	2010-08-23 18:30:30 -07:00
Linus Torvalds	9c5ea3675d	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86: hp-wmi: Fix query interface ACPI_TOSHIBA needs LEDS support	2010-08-23 18:29:34 -07:00

... 18 19 20 21 22 ...

211154 commits