Currently we have a few issues with the way the workqueue code is used to
implement AIL pushing:
- it accidentally uses the same workqueue as the syncer action, and thus
can be prevented from running if there are enough sync actions active
in the system.
- it doesn't use the HIGHPRI flag to queue at the head of the queue of
work items
At this point I'm not confident enough in getting all the workqueue flags and
tweaks right to provide a perfectly reliable execution context for AIL
pushing, which is the most important piece in XFS to make forward progress
when the log fills.
Revert back to use a kthread per filesystem which fixes all the above issues
at the cost of having a task struct and stack around for each mounted
filesystem. In addition this also gives us much better ways to diagnose
any issues involving hung AIL pushing and removes a small amount of code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
We need to check for pinned buffers even in .iop_pushbuf given that inode
items flush into the same buffers that may be pinned directly due operations
on the unlinked inode list operating directly on buffers. To do this add a
return value to .iop_pushbuf that tells the AIL push about this and use
the existing log force mechanisms to unpin it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
If an item was locked we should not update xa_last_pushed_lsn and thus skip
it when restarting the AIL scan as we need to be able to lock and write it
out as soon as possible. Otherwise heavy lock contention might starve AIL
pushing too easily, especially given the larger backoff once we moved
xa_last_pushed_lsn all the way to the target lsn.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
The btrfs file defrag code will loop through the extents and
force COW on them. But there is a concurrent truncate in the middle of
the defrag, it might end up defragging the same range over and over
again.
The problem is that writepage won't go through and do anything on pages
past i_size, so the cow won't happen, so the file will appear to still
be fragmented. defrag will end up hitting the same extents again and
again.
In the worst case, the truncate can actually live lock with the defrag
because the defrag keeps creating new ordered extents which the truncate
code keeps waiting on.
The fix here is to make defrag check for i_size inside the main loop,
instead of just once before the looping starts.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This UML breakage:
linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790
linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790
Is caused by commit 3ae36655 ("x86-64: Rework vsyscall emulation and add
vsyscall= parameter") - the vsyscall emulation code is not fully cooked
yet as UML relies on some rather fragile SIGSEGV semantics.
Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default
to vsyscall=native for now, this patch implements that.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Andrew Lutomirski <luto@mit.edu>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/20111005214047.GE14406@localhost.pp.htv.fi
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Follow those steps:
# mount -o autodefrag /dev/sda7 /mnt
# dd if=/dev/urandom of=/mnt/tmp bs=200K count=1
# sync
# dd if=/dev/urandom of=/mnt/tmp bs=8K count=1 conv=notrunc
and then it'll go into a loop: writeback -> defrag -> writeback ...
It's because writeback writes [8K, 200K] and then writes [0, 8K].
I tried to make writeback know if the pages are dirtied by defrag,
but the patch was a bit intrusive. Here I simply set writeback_index
when we defrag a file.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Breno Leitao <leitao@linux.vnet.ibm.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to the 16 bit access to mscan registers there's too much data copied to
the zero initialized CAN frame when having an odd number of bytes to copy.
This patch ensures that only the requested bytes are copied by using an
8 bit access for the remaining byte.
Reported-by: Andre Naujoks <nautsch@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6_gro_receive() doesn't update the protocol ops after pulling
the ext headers. It looks like a typo.
Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are some consolidations of NPAR configuration
when FCoE and iSCSI L2 clients will get the same id,
in this case FCoE ring will be non-functional.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The doorbell register was being unconditionally swapped. In x86, that
meant it was being swapped to BE and written to the descriptor and to
memory, depending on the case of blue frame support or writing to
doorbell register. On PPC, this meant it was being swapped to LE and
then swapped back to BE while writing to the register. But in the blue
frame case, it was being written as LE to the descriptor.
The fix is not to swap doorbell unconditionally, write it to the
register as BE and convert it to BE when writing it to the descriptor.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Reported-by: Richard Hendrickson <richhend@us.ibm.com>
Cc: Eli Cohen <eli@dev.mellanox.co.il>
Cc: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If skb is NULL, then stack trace is thrown anyway on dereference.
Therefore, the stack trace triggered by BUG_ON is duplicate.
Signed-off-by: Daniel Borkmann <danborkmann@googlemail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe reported to me that right after a bring up of a r6040 interface
the ethtool output had no consistent output with respect to link duplex
and speed. Fix this by adding a missing phy_start call in r6040_up and
conversely a phy_stop call in r6040_down to properly initialize phy states.
Reported-by: Joe Chou <Joe.Chou@rdc.com.tw>
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Query port will now identify a 40G Ethernet speed.
Signed-off-by: Alexander Guller <alexg@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netdevice was being freed without being unregistered first if
mlx4_SET_PORT_general or mlx4_INIT_PORT failed.
Signed-off-by: Alexander Guller <alexg@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Number of bits taken from mac table index in QP
calculation should be based on log_num_mac parameter.
Signed-off-by: Alexander Guller <alexg@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed a memory leak caused by missing iounmap when device
is being released.
Signed-off-by: Alexander Guller <alexg@mellanox.co.il>
Signed-off-by: Sharon Cohen <sharonc@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Moderation is now done per ring and coalescing is enabled
by set_ring_param in ethtool.
Signed-off-by: Alexander Guller <alexg@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed a bug where ring size change caused insufficient memory
upon driver restart due to unreleased EQs.
Signed-off-by: Alexander Guller <alexg@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now only RX rings used irq per ring
and TX used only one per port.
>From now on, both of them will use the
irq per ring while RX & TX ring[i] will
use the same irq.
Signed-off-by: Alexander Guller <alexg@mellanox.co.il>
Signed-off-by: Sharon Cohen <sharonc@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'fixes' of git://git.linaro.org/people/arnd/arm-soc:
ARM: mach-ux500: enable fix for ARM errata 754322
ARM: OMAP: musb: Remove a redundant omap4430_phy_init call in usb_musb_init
ARM: OMAP: Fix i2c init for twl4030
ARM: OMAP4: MMC: fix power and audio issue, decouple USBC1 from MMC1
This fixes a compilation error in cpu-tegra.c which was introduced in
dc8d966bcc ("ARM: convert PCI defines to variables") which removed the
now obsolete mach/hardware.h from the mach-tegra subtree.
Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon/kms: use hardcoded dig encoder to transmitter mapping for DCE4.1
drm/radeon/kms: fix dp_detect handling for DP bridge chips
drm/radeon/kms: retry aux transactions if there are status flags
A couple of changes to the Tegra maintainership setup:
I'm very glad to bring on Stephen Warren on board as a maintainer. The
work he has done so far is excellent, and the fact that he works for
Nvidia means he has long-term interest in the platform.
Erik Gilling did an astounding amount of work on getting things up and
running but has been a silent partner on the maintainership side for a
while, and is stepping down. Thanks for your contributions so far, Erik.
Finally, update the git URL since I'll take over running the main repo
for a while.
Overall maintainership model isn't changing much at this time: We'll all
three review patches as appropriate, and one of us will collect the main
repo (me at this time).
Signed-off-by: Olof Johansson <olof@lixom.net>
Cc: Erik Gilling <konkers@android.com>
Acked-by: Colin Cross <ccross@android.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (29 commits)
MIPS: Call oops_enter, oops_exit in die
staging/octeon: Software should check the checksum of no tcp/udp packets
MIPS: Octeon: Enable C0_UserLocal probing.
MIPS: No branches in delay slots for huge pages in handle_tlbl
MIPS: Don't clobber CP0_STATUS value for CONFIG_MIPS_MT_SMTC
MIPS: Octeon: Select CONFIG_HOLES_IN_ZONE
MIPS: PM: Use struct syscore_ops instead of sysdevs for PM (v2)
MIPS: Compat: Use 32-bit wrapper for compat_sys_futex.
MIPS: Do not use EXTRA_CFLAGS
MIPS: Alchemy: DB1200: Disable cascade IRQ in handler
SERIAL: Lantiq: Set timeout in uart_port
MIPS: Lantiq: Fix setting the PCI bus speed on AR9
MIPS: Lantiq: Fix external interrupt sources
MIPS: tlbex: Fix build error in R3000 code.
MIPS: Alchemy: Include Au1100 in PM code.
MIPS: Alchemy: Fix typo in MAC0 registration
MIPS: MSP71xx: Fix build error.
MIPS: Handle __put_user() sleeping.
MIPS: Allow forced irq threading
MIPS: i8259: Mark cascade interrupt non-threaded
...
This patch adds support for Rx hashing.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change moves the Tx hang check into the ring flags.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch cleans up several issues with VLANs on igb after the recent
changes that were meant to leave the VLANs enabled/disable via the
netdev->features flags.
Specifically the Rx VLAN settings were being dropped after reset due to the
fact that they were not being restored correctly. In addition I removed
the IRQ disable/enable since those were in place to protect the setting of
vlgrp.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of doing a byte swap on the staterr bits in the Rx descriptor we can
save ourselves a bit of space and some CPU time by instead just testing for
the various bits out of the Rx descriptor directly.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since the netdev now has its' own checksum flag to indicate if Rx checksum
is enabled we might as well use that instead of using the ring flag.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is meant to cleanup some of the IVAR register configuration.
igb_assign_vector had become pretty large with multiple copies of the same
general code for setting the IVAR. This change consolidates most of that
code by adding the igb_write_ivar function which allows us just to compute
the index and offset and then use that information to setup the IVAR.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change moves information related to interrupt throttle rate
configuration into a separate q_vector sub-structure called a work
container. A similar change has already been made for ixgbe and this work
is based off of that.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change moves all of the ring flags into a single value. The advantage
to this is that there is one central area for all of these flags and they
can all make use of the set/test bit operations.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There are a number of places where we have values that are stored as u16
but are being converted to int unnecessarily. In order to avoid that we
should convert all variables that deal with the next_to_clean, next_to_use,
and count to u16 values.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is meant to update the ring and vector allocations so that they
are per node instead of allocating everything on the node that
ifconfig/modprobe is called on. By doing this we can cut down
significantly on cross node traffic.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of storing most of the data for the TX hot path in the stack until
we are ready to write the descriptor we can save ourselves some time and
effort by pushing the SKB, tx_flags, gso_size, bytecount, and protocol into
the first igb_tx_buffer since that is where we will end up putting it
anyway.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Microsoft has a bug with ntlmv2 that requires use of ntlmssp, but
we didn't get the required information on when/how to use ntlmssp to
old (but once very popular) legacy servers (various NT4 fixpacks
for example) until too late to merge for 3.1. Will upgrade
to NTLMv2 in NTLMSSP in 3.2
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Per comments from Ben Hutchings on a previous patch, sweep the floors
a little removing unnecessary assignments of zero to fields of struct
ethtool_ringparam in driver code supporting ethtool -g.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for reporting ring sizes via ethtool -g to the 8139cp driver.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The LEON MMU Model (SRMMU) does not implement MMu Table probing
in hardware, instead it is implemented in software. However the
software implementation does not return the PTE as it should which
always results in INVALID entires and the PROM mappings are not
inherited as they should during startup. The following patch
removes the masking of the PTE.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's no point in open-coding sock_valbool_flag().
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This applies ARM errata fix 754322 for all ux500 platforms.
Cc: stable@kernel.org
Signed-off-by: srinidhi kasagar <srinidhi.kasagar@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
This change will combine the writes of tx_buffer_info and the Tx data
descriptors into a single function. The advantage of this is that we can
avoid needless memory reads from the buffer info struct and speed things up
by keeping the accesses to the local registers.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>