perf/x86: Use rdpmc() rather than rdmsr() when possible in the kernel
The rdpmc instruction is faster than the equivalent rdmsr call,
so use it when possible in the kernel.
The perfctr kernel patches did this, after extensive testing showed
rdpmc to always be faster (see etc/costs in the perfctr-2.6
package for a historical list of the overhead).
I ran some tests on a 3.2 kernel; the kernel module I used
was included in the first posting of this patch:
                   rdmsr           rdpmc
 Core2 T9900:      203.9 cycles     30.9 cycles
 AMD fam0fh:        56.2 cycles      9.8 cycles
 Atom 6/28/2:      129.7 cycles     50.6 cycles
The speedup from using rdpmc is large: roughly 2.6x (Atom) to
6.6x (Core2) on the machines above.
[ It's probably possible (and desirable) to do this without
  requiring a new field in the hw_perf_event structure, but
  the fixed events make this tricky. ]
Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1203011724030.26934@cl320.eecs.utk.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
	
	
This commit is contained in:
parent 1ff4d58a19
commit c48b60538c
2 changed files with 4 additions and 1 deletion
@@ -677,6 +677,7 @@ struct hw_perf_event {
 			u64		last_tag;
 			unsigned long	config_base;
 			unsigned long	event_base;
+			int		event_base_rdpmc;
 			int		idx;
 			int		last_cpu;
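The commit message notes that the fixed events are what make a separate event_base_rdpmc field hard to avoid. The reason is that rdpmc does not take an MSR address: it takes a counter selector in ECX, and on Intel hardware a fixed-function counter is selected by setting bit 30 of that selector. The following is a minimal user-space sketch of that mapping, not the code from this patch (the hunk that fills in the field is not shown here). The helper name rdpmc_index and the constant PMC_IDX_FIXED (taken to be 32) are assumptions made for illustration, and the sketch ignores CPUs where general-purpose counters are not numbered contiguously.

```c
#include <assert.h>

/* Assumed: index at which fixed-function counters start in the perf numbering. */
#define PMC_IDX_FIXED   32
/* ECX bit 30 tells rdpmc to read a fixed-function counter (Intel). */
#define RDPMC_FIXED_BIT (1u << 30)

/*
 * Hypothetical helper: translate a perf counter index into the selector
 * that rdpmc expects in ECX. General-purpose counters use their index
 * directly; fixed-function counters use their offset with bit 30 set.
 */
static unsigned int rdpmc_index(int idx)
{
	assert(idx >= 0);

	if (idx >= PMC_IDX_FIXED)
		return (unsigned int)(idx - PMC_IDX_FIXED) | RDPMC_FIXED_BIT;

	return (unsigned int)idx;
}
```

With these assumptions, rdpmc_index(2) yields 2 and rdpmc_index(PMC_IDX_FIXED + 1) yields 0x40000001. Caching a value like this per event is one way to avoid recomputing the selector on every counter read, which is presumably what the new field is for.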