Multiple virtual counters share one physical counter. The reservation
of virtual counters fails due to duplicate allocation of the same
counter. The counters are already reserved. Thus, virtual counter
reservation may removed at all. This also makes the code easier.
Cc: stable@kernel.org
Signed-off-by: Robert Richter <robert.richter@amd.com>
Currently, oprofile fails silently on platforms where a non-OS entity
such as the system firmware "enables" and uses a performance
counter. There is a warning in the code for this case.
The warning indicates an already running counter. If oprofile doesn't
collect data, then try using a different performance counter on your
platform to monitor the desired event. Delete the counter from the
desired event by editing the
/usr/share/oprofile/<cpu_type>/<cpu>/events
file. If the event cannot be monitored by any other counter, contact
your hardware or BIOS vendor.
Cc: Shashi Belur <shashi-kiran.belur@hp.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch generates a warning if a counter is already active.
Implemented for AMD and P6 models. P4 is not supported.
Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Cc: Shashi Belur <shashi-kiran.belur@hp.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Robert Richter <robert.richter@amd.com>
IBS selects an op (execution operation) for sampling by counting
either cycles or dispatched ops. Better statistical samples can be
produced by adding a software generated random offset to the periodic
op counter value with each sample.
This patch adds software randomization to the IBS periodic op
counter. The lower 12 bits of the 20 bit counter are
randomized. IbsOpCurCnt is initialized with a 12 bit random value.
There is a work around if the hw can not write to IbsOpCurCnt. Then
the lower 8 bits of the 16 bit IbsOpMaxCnt [15:0] value are randomized
in the range of -128 to +127 by adding/subtracting an offset to the
maximum count (IbsOpMaxCnt).
The linear feedback shift register (LFSR) algorithm is used for
pseudo-random number generation to have low impact to the memory
system.
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch implements a linear feedback shift register (LFSR) for
pseudo-random number generation for IBS.
For IBS measurements it would be good to minimize memory traffic in
the interrupt handler since every access pollutes the data
caches. Computing a maximal period LFSR just needs shifts and ORs.
The LFSR method is good enough to randomize the ops at low
overhead. 16 pseudo-random bits are enough for the implementation and
it doesn't matter that the pattern repeats with a fairly short
cycle. It only needs to break up (hard) periodic sampling behavior.
The logic was designed by Paul Drongowski.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch adds IBS feature detection using cpuid flags. An IBS
capability mask is introduced to test for certain IBS features. The
bit mask is the same as for IBS cpuid feature flags (Fn8000_001B_EAX),
but bit 0 is used to indicate the existence of IBS.
The patch also changes the handling of the IbsOpCntCtl bit (periodic
op counter count control). The oprofilefs file for this feature
(ibs_op/dispatched_ops) will be only exposed if the feature is
available, also the default for the bit is set to count clock cycles.
In general, the userland can detect the availability of a feature by
checking for the corresponding file in oprofilefs. If it exists, the
feature also exists. This may lead to a dynamic file layout depending
on the cpu type with that the userland has to deal with. Current
opcontrol is compatible.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Standard AMD systems have the same number of nodes as there are
northbridge devices. However, there may kernel configurations
(especially for 32 bit) or system setups exist, where the node number
is different or it can not be detected properly. Thus the check is not
reliable and may fail though IBS setup was fine. For this reason it is
better to remove the check.
Cc: stable <stable@kernel.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
OProfile support for IBS is now for several versions in the
kernel. The feature is stable now and the code can be activated
permanently.
As a side effect IBS now works also on nosmp configs.
Signed-off-by: Robert Richter <robert.richter@amd.com>
The commit
1155de4 ring-buffer: Make it generally available
already made ring-buffer available without the TRACING option
enabled. This patch removes the TRACING dependency from oprofile.
Fixes also oprofile configuration on ia64.
The patch also applies to the 2.6.32-stable kernel.
Reported-by: Tony Jones <tonyj@suse.de>
Cc: stable@kernel.org
Signed-off-by: Robert Richter <robert.richter@amd.com>
CAN module on AM3517 requires programming of IO expander as part
of init sequence - to enable CAN PHY. Added platform specific
callback to handle phy control(switch on /off).
Signed-off-by: Sriramakrishnan <srk@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit c79c5ffdce.
As Jeff points out we can't break the user visible interface
like this, we need to add this into the reserved[] thing.
Signed-off-by: David S. Miller <davem@davemloft.net>
On 64 bit builds when CONFIG_BLK_CGROUP=n (the default) this removes 8
bytes of padding from structure io_context and drops its size from 72 to
64 bytes, so needing one fewer cachelines and allowing more objects per
slab in it's kmem_cache.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
----
patch against 2.6.33
compiled & test on x86_64 AMDX2
regards
Richard
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Except for SCSI no device drivers distinguish between physical and
hardware segment limits. Consolidate the two into a single segment
limit.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The block layer calling convention is blk_queue_<limit name>.
blk_queue_max_sectors predates this practice, leading to some confusion.
Rename the function to appropriately reflect that its intended use is to
set max_hw_sectors.
Also introduce a temporary wrapper for backwards compability. This can
be removed after the merge window is closed.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Add a BLK_ prefix to block layer constants.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
blk_queue_max_hw_sectors is no longer called by any subsystem and can be
removed.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Clarify blk_queue_max_sectors and update documentation.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Intergraph bought 3D Labs and some XVR-500 chips have Intergraph's
vendor id.
Reported-by: Jurij Smakov <jurij@wooyd.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Clients will set their MTU to 1280 if they receive a
ICMPV6_PKT_TOOBIG message with an MTU less than 1280.
To allow encapsulating of packets over a 1280 link
we should always accept packets with a size of 1280
for forwarding even if the path has a lower MTU and
fragment the encapsulated packets afterwards.
In case a forwarded packet is not going to be encapsulated
a ICMPV6_PKT_TOOBIG msg will still be send by ip6_fragment()
with the correct MTU.
Signed-off-by: Ulrich Weber <uweber@astaro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sparc's scatterlist structure is identical to the generic one.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to the loop complexicity in nes_nic.c, I'm using char* to copy mc addresses
to it.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The drvinfo struct should include the number of strings that
get_rx_ntuple will return. It will be variable if an underlying
driver implements its own get_rx_ntuple routine, so userspace
needs to know how much data is coming.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There doesn't seem to be any reason to explicitly return
NETDEV_TX_OK as err is set to NETDEV_TX_OK in all cases that
reach this point.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use list_first_entry macro; no longer any need to use
'next' directly in list to find first entry.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch consists of a few minor cleanups to the SR-IOV
configurion code in rtnetlink.
- Remove unneccesary lock
- Remove unneccesary casts
- Return correct error code for no driver support
These changes are based on comments from Patrick McHardy
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no point of accepting an address of smaller length than dev->addr_len
here. Therefore change this for stonger check.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Small optimization to the code which checks to see if we'd cross
a 4K boundary when stocking RX ring.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Guillaume Morin <guillaume@morinfr.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFC 4291 section 2.4 states that all uncategorized addresses
should be considered as Global Unicast.
This will remove IPV6_ADDR_RESERVED completely
and return IPV6_ADDR_UNICAST in ipv6_addr_type() instead.
Signed-off-by: Ulrich Weber <uweber@astaro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We only need to assign the status block address once and it also saves
space in the structure.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Eddie Wai <waie@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With a separate IP address for iSCSI, connections should proceed
whether or not we can get a route to the target from the network stack.
It is possible that the network IP address may not reach the iSCSI target.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Eddie Wai <waie@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some data structures are freed when the device is down and it will
crash if an ISCSI netlink message is received. Add RCU protection
to prevent this. In the shutdown path, ulp_ops[CNIC_ULP_L4] is
assigned NULL and rcu_synchronized before freeing the data
structures.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For bnx2 devices, always send notification to bnx2i to let it initiate
the cleanup when RST is received.
For bnx2x devices, add unsolicited RST_COMP handling to start the cleanup.
Signed-off-by: Eddie Wai <waie@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initialize IP ID and handle some additional connection errors.
Signed-off-by: Eddie Wai <waie@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of allocating 128 struct netdev_queue per device, use the
minimum value between 128 and the number of possible txq's, to
reduce ram usage and "tc -s -d class shod dev .." output.
This patch fixes Eric Dumazet's patch to set the TX queues to
the correct minimum.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Disabling TSO can cause the dev_watchdog timer to be triggered because
when TSO is disabled netif_tx_stop_all_queues is called. If the watchdog
timer fires while the queues are stopped and traffic has not recently been
sent on a paticular queue this is falsly identified as a hang and
ndo_tx_timeout() is called. This is ocossionally seen during testing.
This removes the netif_tx_stop_all_queues() it is not needed. The scheduler
submits skb's with dev_hard_start_xmit(), this checks if netif_needs_gso and
if so it calls dev_gso_segment. Disabling TSO will cause dev_hard_start_xmit()
to do the gso processing. However ixgbe does not use the features flags to
determine if it needs to use tso or not instead it uses skb->gso_size so
ixgbe will process these frames correctly regardless of the netdev features
flag.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Work around 82599 HW issue when HWRSC is enabled on IOMMU enabled
kernels. 82599 HW is updating the header information after setting the
descriptor to done, resulting DMA mapping/unmapping issues on IOMMU
enabled systems. To work around the issue delay unmapping of first packet
that carries the header information until end of packet is reached.
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Isolate spinlock for tx and rx to resolve double-locking.
This is potential bug while this controller does not exist on any
SMP platforms, but lockdep or rt-preempt reveals this bug.
Reported-by: Ralf Roesch <ralf.roesch@rw-gmbh.de>
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netif_wake_queue() is called correctly (i.e. only on !txfull
condition) from net_tx(). So Unconditional call to the
netif_wake_queue() here is wrong. This might cause calling of
start_xmit routine on txfull state and trigger tx-ring overflow.
This fix is ported from commit 662a96bd6f
("tc35815: Remove a wrong netif_wake_queue() call which triggers BUG_ON").
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hmm so actually my original patch including this bit was correct,
"list = list->next;" confused me :) - will send patch correcting that in a few.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
removed some needless checks and also corrected bug in lp486e (dmi was passed
instead of dmi->dmi_addr)
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We re-program the event control register every time we reset the count,
this appears to be superflous, hence remove it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arjan van de Ven <arjan@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since the cpu argument to hw_perf_group_sched_in() is always
smp_processor_id(), simplify the code a little by removing this argument
and using the current cpu where needed.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1265890918.5396.3.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds correct AMD NorthBridge event scheduling.
NB events are events measuring L3 cache, Hypertransport traffic. They are
identified by an event code >= 0xe0. They measure events on the
Northbride which is shared by all cores on a package. NB events are
counted on a shared set of counters. When a NB event is programmed in a
counter, the data actually comes from a shared counter. Thus, access to
those counters needs to be synchronized.
We implement the synchronization such that no two cores can be measuring
NB events using the same counters. Thus, we maintain a per-NB allocation
table. The available slot is propagated using the event_constraint
structure.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4b703957.0702d00a.6bf2.7b7d@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In certain situations, the kernel may need to stop and start the same
event rapidly. The current PMU callbacks do not distinguish between stop
and release (i.e., stop + free the resource). Thus, a counter may be
released, then it will be immediately re-acquired. Event scheduling will
again take place with no guarantee to assign the same counter. On some
processors, this may event yield to failure to assign the event back due
to competion between cores.
This patch is adding a new pair of callback to stop and restart a counter
without actually release the underlying counter resource. On stop, the
counter is stopped, its values saved and that's it. On start, the value
is reloaded and counter is restarted (on x86, actual restart is delayed
until perf_enable()).
Signed-off-by: Stephane Eranian <eranian@google.com>
[ added fallback to ->enable/->disable for all other PMUs
fixed x86_pmu_start() to call x86_pmu.enable()
merged __x86_pmu_disable into x86_pmu_stop() ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4b703875.0a04d00a.7896.ffffb824@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>