29 lines
		
	
	
	
		
			1.5 KiB
			
		
	
	
	
		
			Text
		
	
	
	
	
	
		
		
			
		
	
	
			29 lines
		
	
	
	
		
			1.5 KiB
			
		
	
	
	
		
			Text
		
	
	
	
	
	
| 
								 | 
							
								The remap_file_pages() system call is used to create a nonlinear mapping,
							 | 
						||
| 
								 | 
							
								that is, a mapping in which the pages of the file are mapped into a
							 | 
						||
| 
								 | 
							
								nonsequential order in memory. The advantage of using remap_file_pages()
							 | 
						||
| 
								 | 
							
								over using repeated calls to mmap(2) is that the former approach does not
							 | 
						||
| 
								 | 
							
								require the kernel to create additional VMA (Virtual Memory Area) data
							 | 
						||
| 
								 | 
							
								structures.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Supporting of nonlinear mapping requires significant amount of non-trivial
							 | 
						||
| 
								 | 
							
								code in kernel virtual memory subsystem including hot paths. Also to get
							 | 
						||
| 
								 | 
							
								nonlinear mapping work kernel need a way to distinguish normal page table
							 | 
						||
| 
								 | 
							
								entries from entries with file offset (pte_file). Kernel reserves flag in
							 | 
						||
| 
								 | 
							
								PTE for this purpose. PTE flags are scarce resource especially on some CPU
							 | 
						||
| 
								 | 
							
								architectures. It would be nice to free up the flag for other usage.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Fortunately, there are not many users of remap_file_pages() in the wild.
							 | 
						||
| 
								 | 
							
								It's only known that one enterprise RDBMS implementation uses the syscall
							 | 
						||
| 
								 | 
							
								on 32-bit systems to map files bigger than can linearly fit into 32-bit
							 | 
						||
| 
								 | 
							
								virtual address space. This use-case is not critical anymore since 64-bit
							 | 
						||
| 
								 | 
							
								systems are widely available.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The plan is to deprecate the syscall and replace it with an emulation.
							 | 
						||
| 
								 | 
							
								The emulation will create new VMAs instead of nonlinear mappings. It's
							 | 
						||
| 
								 | 
							
								going to work slower for rare users of remap_file_pages() but ABI is
							 | 
						||
| 
								 | 
							
								preserved.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								One side effect of emulation (apart from performance) is that user can hit
							 | 
						||
| 
								 | 
							
								vm.max_map_count limit more easily due to additional VMAs. See comment for
							 | 
						||
| 
								 | 
							
								DEFAULT_MAX_MAP_COUNT for more details on the limit.
							 |