The WRITE SAME command supported on some SCSI devices allows the same block to be efficiently replicated throughout a block range. Only a single logical block is transferred from the host and the storage device writes the same data to all blocks described by the I/O. This patch implements support for WRITE SAME in the block layer. The blkdev_issue_write_same() function can be used by filesystems and block drivers to replicate a buffer across a block range. This can be used to efficiently initialize software RAID devices, etc. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
		
			
				
	
	
		
			222 lines
		
	
	
	
		
			8.4 KiB
			
		
	
	
	
		
			Text
		
	
	
	
	
	
			
		
		
	
	
			222 lines
		
	
	
	
		
			8.4 KiB
			
		
	
	
	
		
			Text
		
	
	
	
	
	
What:		/sys/block/<disk>/stat
 | 
						|
Date:		February 2008
 | 
						|
Contact:	Jerome Marchand <jmarchan@redhat.com>
 | 
						|
Description:
 | 
						|
		The /sys/block/<disk>/stat files displays the I/O
 | 
						|
		statistics of disk <disk>. They contain 11 fields:
 | 
						|
		 1 - reads completed successfully
 | 
						|
		 2 - reads merged
 | 
						|
		 3 - sectors read
 | 
						|
		 4 - time spent reading (ms)
 | 
						|
		 5 - writes completed
 | 
						|
		 6 - writes merged
 | 
						|
		 7 - sectors written
 | 
						|
		 8 - time spent writing (ms)
 | 
						|
		 9 - I/Os currently in progress
 | 
						|
		10 - time spent doing I/Os (ms)
 | 
						|
		11 - weighted time spent doing I/Os (ms)
 | 
						|
		For more details refer Documentation/iostats.txt
 | 
						|
 | 
						|
 | 
						|
What:		/sys/block/<disk>/<part>/stat
 | 
						|
Date:		February 2008
 | 
						|
Contact:	Jerome Marchand <jmarchan@redhat.com>
 | 
						|
Description:
 | 
						|
		The /sys/block/<disk>/<part>/stat files display the
 | 
						|
		I/O statistics of partition <part>. The format is the
 | 
						|
		same as the above-written /sys/block/<disk>/stat
 | 
						|
		format.
 | 
						|
 | 
						|
 | 
						|
What:		/sys/block/<disk>/integrity/format
 | 
						|
Date:		June 2008
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Metadata format for integrity capable block device.
 | 
						|
		E.g. T10-DIF-TYPE1-CRC.
 | 
						|
 | 
						|
 | 
						|
What:		/sys/block/<disk>/integrity/read_verify
 | 
						|
Date:		June 2008
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Indicates whether the block layer should verify the
 | 
						|
		integrity of read requests serviced by devices that
 | 
						|
		support sending integrity metadata.
 | 
						|
 | 
						|
 | 
						|
What:		/sys/block/<disk>/integrity/tag_size
 | 
						|
Date:		June 2008
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Number of bytes of integrity tag space available per
 | 
						|
		512 bytes of data.
 | 
						|
 | 
						|
 | 
						|
What:		/sys/block/<disk>/integrity/write_generate
 | 
						|
Date:		June 2008
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Indicates whether the block layer should automatically
 | 
						|
		generate checksums for write requests bound for
 | 
						|
		devices that support receiving integrity metadata.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/alignment_offset
 | 
						|
Date:		April 2009
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Storage devices may report a physical block size that is
 | 
						|
		bigger than the logical block size (for instance a drive
 | 
						|
		with 4KB physical sectors exposing 512-byte logical
 | 
						|
		blocks to the operating system).  This parameter
 | 
						|
		indicates how many bytes the beginning of the device is
 | 
						|
		offset from the disk's natural alignment.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/<partition>/alignment_offset
 | 
						|
Date:		April 2009
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Storage devices may report a physical block size that is
 | 
						|
		bigger than the logical block size (for instance a drive
 | 
						|
		with 4KB physical sectors exposing 512-byte logical
 | 
						|
		blocks to the operating system).  This parameter
 | 
						|
		indicates how many bytes the beginning of the partition
 | 
						|
		is offset from the disk's natural alignment.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/queue/logical_block_size
 | 
						|
Date:		May 2009
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		This is the smallest unit the storage device can
 | 
						|
		address.  It is typically 512 bytes.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/queue/physical_block_size
 | 
						|
Date:		May 2009
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		This is the smallest unit a physical storage device can
 | 
						|
		write atomically.  It is usually the same as the logical
 | 
						|
		block size but may be bigger.  One example is SATA
 | 
						|
		drives with 4KB sectors that expose a 512-byte logical
 | 
						|
		block size to the operating system.  For stacked block
 | 
						|
		devices the physical_block_size variable contains the
 | 
						|
		maximum physical_block_size of the component devices.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/queue/minimum_io_size
 | 
						|
Date:		April 2009
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Storage devices may report a granularity or preferred
 | 
						|
		minimum I/O size which is the smallest request the
 | 
						|
		device can perform without incurring a performance
 | 
						|
		penalty.  For disk drives this is often the physical
 | 
						|
		block size.  For RAID arrays it is often the stripe
 | 
						|
		chunk size.  A properly aligned multiple of
 | 
						|
		minimum_io_size is the preferred request size for
 | 
						|
		workloads where a high number of I/O operations is
 | 
						|
		desired.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/queue/optimal_io_size
 | 
						|
Date:		April 2009
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Storage devices may report an optimal I/O size, which is
 | 
						|
		the device's preferred unit for sustained I/O.  This is
 | 
						|
		rarely reported for disk drives.  For RAID arrays it is
 | 
						|
		usually the stripe width or the internal track size.  A
 | 
						|
		properly aligned multiple of optimal_io_size is the
 | 
						|
		preferred request size for workloads where sustained
 | 
						|
		throughput is desired.  If no optimal I/O size is
 | 
						|
		reported this file contains 0.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/queue/nomerges
 | 
						|
Date:		January 2010
 | 
						|
Contact:
 | 
						|
Description:
 | 
						|
		Standard I/O elevator operations include attempts to
 | 
						|
		merge contiguous I/Os. For known random I/O loads these
 | 
						|
		attempts will always fail and result in extra cycles
 | 
						|
		being spent in the kernel. This allows one to turn off
 | 
						|
		this behavior on one of two ways: When set to 1, complex
 | 
						|
		merge checks are disabled, but the simple one-shot merges
 | 
						|
		with the previous I/O request are enabled. When set to 2,
 | 
						|
		all merge tries are disabled. The default value is 0 -
 | 
						|
		which enables all types of merge tries.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/discard_alignment
 | 
						|
Date:		May 2011
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Devices that support discard functionality may
 | 
						|
		internally allocate space in units that are bigger than
 | 
						|
		the exported logical block size. The discard_alignment
 | 
						|
		parameter indicates how many bytes the beginning of the
 | 
						|
		device is offset from the internal allocation unit's
 | 
						|
		natural alignment.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/<partition>/discard_alignment
 | 
						|
Date:		May 2011
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Devices that support discard functionality may
 | 
						|
		internally allocate space in units that are bigger than
 | 
						|
		the exported logical block size. The discard_alignment
 | 
						|
		parameter indicates how many bytes the beginning of the
 | 
						|
		partition is offset from the internal allocation unit's
 | 
						|
		natural alignment.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/queue/discard_granularity
 | 
						|
Date:		May 2011
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Devices that support discard functionality may
 | 
						|
		internally allocate space using units that are bigger
 | 
						|
		than the logical block size. The discard_granularity
 | 
						|
		parameter indicates the size of the internal allocation
 | 
						|
		unit in bytes if reported by the device. Otherwise the
 | 
						|
		discard_granularity will be set to match the device's
 | 
						|
		physical block size. A discard_granularity of 0 means
 | 
						|
		that the device does not support discard functionality.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/queue/discard_max_bytes
 | 
						|
Date:		May 2011
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Devices that support discard functionality may have
 | 
						|
		internal limits on the number of bytes that can be
 | 
						|
		trimmed or unmapped in a single operation. Some storage
 | 
						|
		protocols also have inherent limits on the number of
 | 
						|
		blocks that can be described in a single command. The
 | 
						|
		discard_max_bytes parameter is set by the device driver
 | 
						|
		to the maximum number of bytes that can be discarded in
 | 
						|
		a single operation. Discard requests issued to the
 | 
						|
		device must not exceed this limit. A discard_max_bytes
 | 
						|
		value of 0 means that the device does not support
 | 
						|
		discard functionality.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/queue/discard_zeroes_data
 | 
						|
Date:		May 2011
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Devices that support discard functionality may return
 | 
						|
		stale or random data when a previously discarded block
 | 
						|
		is read back. This can cause problems if the filesystem
 | 
						|
		expects discarded blocks to be explicitly cleared. If a
 | 
						|
		device reports that it deterministically returns zeroes
 | 
						|
		when a discarded area is read the discard_zeroes_data
 | 
						|
		parameter will be set to one. Otherwise it will be 0 and
 | 
						|
		the result of reading a discarded area is undefined.
 | 
						|
 | 
						|
What:		/sys/block/<disk>/queue/write_same_max_bytes
 | 
						|
Date:		January 2012
 | 
						|
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 | 
						|
Description:
 | 
						|
		Some devices support a write same operation in which a
 | 
						|
		single data block can be written to a range of several
 | 
						|
		contiguous blocks on storage. This can be used to wipe
 | 
						|
		areas on disk or to initialize drives in a RAID
 | 
						|
		configuration. write_same_max_bytes indicates how many
 | 
						|
		bytes can be written in a single write same command. If
 | 
						|
		write_same_max_bytes is 0, write same is not supported
 | 
						|
		by the device.
 | 
						|
 |