		HOWTO for multiqueue network device support
		===========================================

Intro: Kernel support for multiqueue devices
--------------------------------------------

Kernel support for multiqueue devices is always present.

Section 1: Base driver requirements for implementing multiqueue support
-----------------------------------------------------------------------

Base drivers are required to use the new alloc_etherdev_mq() or
alloc_netdev_mq() functions to allocate the subqueues for the device.  The
underlying kernel API will take care of the allocation and deallocation of
the subqueue memory, as well as netdev configuration of where the queues
exist in memory.

The base driver will also need to manage the queues as it does the global
netdev->queue_lock today.  Therefore base drivers should use the
netif_{start|stop|wake}_subqueue() functions to manage each queue while the
device is still operational.  netdev->queue_lock is still used when the device
comes online or when it's completely shut down (unregister_netdev(), etc.).

Section 2: Qdisc support for multiqueue devices
-----------------------------------------------

Currently two qdiscs are optimized for multiqueue devices.  The first is the
default pfifo_fast qdisc.  This qdisc supports one qdisc per hardware queue.
A new round-robin qdisc, sch_multiq, also supports multiple hardware queues.
The qdisc is responsible for classifying the skbs and then directing them to
bands and queues based on the value in skb->queue_mapping.  Use this field in
the base driver to determine which queue to send the skb to.

sch_multiq has been added for hardware that wishes to avoid head-of-line
blocking.  It will cycle through the bands and verify that the hardware queue
associated with the band is not stopped prior to dequeuing a packet.

On qdisc load, the number of bands is based on the number of queues on the
hardware.  Once the association is made, any skb with skb->queue_mapping set
will be queued to the band associated with the hardware queue.

Section 3: Brief howto using MULTIQ for multiqueue devices
----------------------------------------------------------

The userspace command 'tc', part of the iproute2 package, is used to configure
qdiscs.  To add the MULTIQ qdisc to your network device, assuming the device
is called eth0, run the following command:

# tc qdisc add dev eth0 root handle 1: multiq

The qdisc will allocate a number of bands equal to the number of queues that
the device reports, and bring the qdisc online.  Assuming eth0 has 4 Tx
queues, the band mapping would look like:

band 0 => queue 0
band 1 => queue 1
band 2 => queue 2
band 3 => queue 3

Traffic will begin flowing through each queue based on either the
simple_tx_hash function or on netdev->select_queue() if you have it defined.

The behavior of tc filters remains the same.  However, a new tc action,
skbedit, has been added.  Assuming you wanted to route all traffic to a
specific host, for example 192.168.0.3, through a specific queue, you could
use this action and establish a filter such as:

tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
	match ip dst 192.168.0.3 \
	action skbedit queue_mapping 3

Author: Alexander Duyck <alexander.h.duyck@intel.com>
Original Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>