 c9f3f2d8b3
			
		
	
	
	c9f3f2d8b3
	
	
	
		
			
			Correct typo (double words) in documentations. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
		
			
				
	
	
		
			109 lines
		
	
	
	
		
			4.3 KiB
			
		
	
	
	
		
			Text
		
	
	
	
	
	
			
		
		
	
	
			109 lines
		
	
	
	
		
			4.3 KiB
			
		
	
	
	
		
			Text
		
	
	
	
	
	
| Reference counting in pnfs:
 | |
| ==========================
 | |
| 
 | |
| The are several inter-related caches.  We have layouts which can
 | |
| reference multiple devices, each of which can reference multiple data servers.
 | |
| Each data server can be referenced by multiple devices.  Each device
 | |
| can be referenced by multiple layouts.  To keep all of this straight,
 | |
| we need to reference count.
 | |
| 
 | |
| 
 | |
| struct pnfs_layout_hdr
 | |
| ----------------------
 | |
| The on-the-wire command LAYOUTGET corresponds to struct
 | |
| pnfs_layout_segment, usually referred to by the variable name lseg.
 | |
| Each nfs_inode may hold a pointer to a cache of these layout
 | |
| segments in nfsi->layout, of type struct pnfs_layout_hdr.
 | |
| 
 | |
| We reference the header for the inode pointing to it, across each
 | |
| outstanding RPC call that references it (LAYOUTGET, LAYOUTRETURN,
 | |
| LAYOUTCOMMIT), and for each lseg held within.
 | |
| 
 | |
| Each header is also (when non-empty) put on a list associated with
 | |
| struct nfs_client (cl_layouts).  Being put on this list does not bump
 | |
| the reference count, as the layout is kept around by the lseg that
 | |
| keeps it in the list.
 | |
| 
 | |
| deviceid_cache
 | |
| --------------
 | |
| lsegs reference device ids, which are resolved per nfs_client and
 | |
| layout driver type.  The device ids are held in a RCU cache (struct
 | |
| nfs4_deviceid_cache).  The cache itself is referenced across each
 | |
| mount.  The entries (struct nfs4_deviceid) themselves are held across
 | |
| the lifetime of each lseg referencing them.
 | |
| 
 | |
| RCU is used because the deviceid is basically a write once, read many
 | |
| data structure.  The hlist size of 32 buckets needs better
 | |
| justification, but seems reasonable given that we can have multiple
 | |
| deviceid's per filesystem, and multiple filesystems per nfs_client.
 | |
| 
 | |
| The hash code is copied from the nfsd code base.  A discussion of
 | |
| hashing and variations of this algorithm can be found at:
 | |
| http://groups.google.com/group/comp.lang.c/browse_thread/thread/9522965e2b8d3809
 | |
| 
 | |
| data server cache
 | |
| -----------------
 | |
| file driver devices refer to data servers, which are kept in a module
 | |
| level cache.  Its reference is held over the lifetime of the deviceid
 | |
| pointing to it.
 | |
| 
 | |
| lseg
 | |
| ----
 | |
| lseg maintains an extra reference corresponding to the NFS_LSEG_VALID
 | |
| bit which holds it in the pnfs_layout_hdr's list.  When the final lseg
 | |
| is removed from the pnfs_layout_hdr's list, the NFS_LAYOUT_DESTROYED
 | |
| bit is set, preventing any new lsegs from being added.
 | |
| 
 | |
| layout drivers
 | |
| --------------
 | |
| 
 | |
| PNFS utilizes what is called layout drivers. The STD defines 3 basic
 | |
| layout types: "files" "objects" and "blocks". For each of these types
 | |
| there is a layout-driver with a common function-vectors table which
 | |
| are called by the nfs-client pnfs-core to implement the different layout
 | |
| types.
 | |
| 
 | |
| Files-layout-driver code is in: fs/nfs/nfs4filelayout.c && nfs4filelayoutdev.c
 | |
| Objects-layout-deriver code is in: fs/nfs/objlayout/.. directory
 | |
| Blocks-layout-deriver code is in: fs/nfs/blocklayout/.. directory
 | |
| 
 | |
| objects-layout setup
 | |
| --------------------
 | |
| 
 | |
| As part of the full STD implementation the objlayoutdriver.ko needs, at times,
 | |
| to automatically login to yet undiscovered iscsi/osd devices. For this the
 | |
| driver makes up-calles to a user-mode script called *osd_login*
 | |
| 
 | |
| The path_name of the script to use is by default:
 | |
| 	/sbin/osd_login.
 | |
| This name can be overridden by the Kernel module parameter:
 | |
| 	objlayoutdriver.osd_login_prog
 | |
| 
 | |
| If Kernel does not find the osd_login_prog path it will zero it out
 | |
| and will not attempt farther logins. An admin can then write new value
 | |
| to the objlayoutdriver.osd_login_prog Kernel parameter to re-enable it.
 | |
| 
 | |
| The /sbin/osd_login is part of the nfs-utils package, and should usually
 | |
| be installed on distributions that support this Kernel version.
 | |
| 
 | |
| The API to the login script is as follows:
 | |
| 	Usage: $0 -u <URI> -o <OSDNAME> -s <SYSTEMID>
 | |
| 	Options:
 | |
| 		-u		target uri e.g. iscsi://<ip>:<port>
 | |
| 				(allways exists)
 | |
| 				(More protocols can be defined in the future.
 | |
| 				 The client does not interpret this string it is
 | |
| 				 passed unchanged as received from the Server)
 | |
| 		-o		osdname of the requested target OSD
 | |
| 				(Might be empty)
 | |
| 				(A string which denotes the OSD name, there is a
 | |
| 				 limit of 64 chars on this string)
 | |
| 		-s 		systemid of the requested target OSD
 | |
| 				(Might be empty)
 | |
| 				(This string, if not empty is always an hex
 | |
| 				 representation of the 20 bytes osd_system_id)
 | |
| 
 | |
| blocks-layout setup
 | |
| -------------------
 | |
| 
 | |
| TODO: Document the setup needs of the blocks layout driver
 |