In Part 2 of Security Through Process Isolation, I started to dive into the file system isolation aspects of Process Isolation. In this post, I’ll dig a little deeper covering some of the details in implementing items such as
- How to process Copy on Write (CoW) for files and directories
- How to merge directory views of the Base Name Space (BNS) and the Shadow Name Space (SNS)
- IO Processing
First I’d like to cover some more general details to lead us into the CoW processing of files and directories. In the last post, I discussed handling the open processing of files (note that I will use the term file to denote a file or directory unless I need more precise specification) and how the filter will have 3 possible file objects at the end of the open processing. These are
- FO(top) – The file object passed into the filter driver from the IO Manager
- FO(BNS) – The file object representing the open instance within the Base Name Space, if it exists
- FO(SNS) – The file object representing the open instance within the Shadow Name Space, if it exists
At the completion of the open processing, our context structure–which we store in the FO(top)->FsContext and FO(top)->FsContext2 pointers–describes the current state of the file, whether it is a shadow file or not, as well as the possible set of file objects listed above. After the caller has successfully opened a given file, they will usually perform operations on the opened file handle, such as retrieving or setting file information, reading or writing file data, or deleting the file, to list just a few of the possible operations. Let’s consider the set of operations that the caller can perform on a given file that do not alter the file’s data or metadata. These can include
- File metadata queries such as the basic, standard, internal or name information
- Non-cached, file data read requests (more on this later)
- Volume information such as attributes or size of the underlying volume
- Security, Extended Attribute, Quota or a variety of Device and File System IOCtl requests
- Directory enumeration for directory objects
All of the listed operations, as well as a handful of requests not listed, leave the target object unmodified. Thus, depending on whether the FO(top) context structure marks the file as a shadow file, we can simply update the target of the request to either the FO(SNS) or the FO(BNS) and pass the request along to the underlying file system.
A sidebar here … remember that we have defined our shadow store location in the previous post as C:\SNS. Let’s assume the underlying file system for the shadow store is NTFS. If we are processing a request for a file on a non-NTFS partition–let’s say the file is located on a thumb drive and the file system is FAT–then we need to maintain this information in our context structure. Why? Because some requests, such as Extended Attributes, are not supported on FAT. Thus if we issue the request to the file located in the SNS, the NTFS partition, then we would get back a successful status. Whereas if we issued the request to the FAT partition, we would get back a failure status. In general this is not a huge concern, but it should be noted so that we can maintain consistent isolation, regardless of where the file is located.
OK, back to the original thread …
Again, for query operations that do not modify the underlying file, we can simply update the target information and pass the request to the underlying file system for processing. The exceptions here are cached read requests (more on this when I cover IO Processing below), and directory enumerations which may require merging (more on this below as well).
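The retargeting step for these read-only requests can be modeled in a few lines. This is only a user-mode sketch of the decision, not driver code: the `FileContext` class and its field names are hypothetical stand-ins for the context structure hung off FO(top), and strings stand in for the real file objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileContext:
    """Hypothetical model of the per-open context stored off FO(top)."""
    is_shadow: bool
    fo_bns: Optional[str] = None  # stand-in for FO(BNS), if it exists
    fo_sns: Optional[str] = None  # stand-in for FO(SNS), if it exists


def select_target(ctx: FileContext) -> str:
    """Pick the lower file object that a non-modifying request is sent to."""
    target = ctx.fo_sns if ctx.is_shadow else ctx.fo_bns
    if target is None:
        # The open processing should have populated the matching file object.
        raise ValueError("context has no file object for the selected name space")
    return target
```

Cached reads and merged directory enumerations are deliberately left out of this sketch, since they need more than a simple retarget.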
Now for the modification requests that result in the underlying file being altered in some way. Remember that the end goal of Process Isolation is to keep the underlying BNS unaltered. This means that any request which could possibly alter the BNS should be redirected to the SNS. The operations which fall into this category include
- Set file information such as attributes, file sizes or time stamps
- Rename or delete processing
- Non-cached write operations (more on this below)
- New file create, overwrite, and supersede operations
- File system lock acquisition callbacks
These operations, along with a handful of device and file system IOCtl requests, will require us to ensure the file is located in the SNS before passing the request to the underlying file system. If the file is already marked as a shadow file, as determined in the open processing, then the decision is easy: simply update the target information of the request and pass it to the underlying file system, namely the SNS. As noted before, for operations such as setting an Extended Attribute on a file, we must ensure that the original location of the file, within the BNS, actually supports EAs. If it does not, then we need to fail the request instead of passing it down.
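To make the decision for modifying requests concrete, here is a small user-mode sketch. The `Op` names, the `plan_request` function and its string results are all hypothetical, chosen only to mirror the categories listed above; a real filter would be dispatching on IRP major and minor function codes instead.

```python
from enum import Enum, auto


class Op(Enum):
    """Hypothetical operation categories, following the lists in the text."""
    QUERY_INFO = auto()
    QUERY_EA = auto()
    SET_INFO = auto()
    SET_EA = auto()
    RENAME = auto()
    DELETE = auto()
    WRITE_NONCACHED = auto()
    LOCK_CALLBACK = auto()


MODIFYING = {Op.SET_INFO, Op.SET_EA, Op.RENAME, Op.DELETE,
             Op.WRITE_NONCACHED, Op.LOCK_CALLBACK}
EA_OPS = {Op.QUERY_EA, Op.SET_EA}


def plan_request(op: Op, is_shadow: bool, origin_supports_ea: bool) -> str:
    """Decide how a request is handled before it is passed down."""
    # Stay consistent with the original volume: if the BNS file system
    # (FAT in the example) has no EA support, fail rather than let NTFS succeed.
    if op in EA_OPS and not origin_supports_ea:
        return "fail"
    # A modifying request on a non-shadow file triggers the CoW migration.
    if op in MODIFYING and not is_shadow:
        return "migrate_then_sns"
    return "sns" if is_shadow else "bns"
```

For example, `plan_request(Op.SET_INFO, is_shadow=False, origin_supports_ea=True)` yields `"migrate_then_sns"`, while the same request on an already-shadowed file simply yields `"sns"`.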
Let’s now handle the case where the file is not a shadow file and requires migration. We’ll first handle the case where the object we are migrating is a directory and, for example, a new file is being created within the directory. Let’s say the directory name is E:\Directory\Foo and the file we are creating in this directory is Bar.txt. As established in the first post of this series, we are going to maintain the directory name space in the SNS, and thus we would need to create the following directory within the SNS:
Note that if the design opted to have a flat name space, mapping files to some GUID, the processing would be slightly different.
Going with our example, once the directory has been created in the SNS, we can create the file Bar.txt within the SNS directory, marking it as a shadow file. The newly migrated directory now exists in both the BNS and the SNS, and to avoid migrating ALL the existing files in this directory, as well as all sub-directories, into the SNS, we will maintain both the FO(BNS) and FO(SNS). We need to do this to handle requests such as directory enumerations, which we’ll cover below.
Now onto the case where the object being modified is a file: we need to create the directory branch where the file exists within the SNS, if it does not already exist, and we need to migrate the file itself from the BNS to the SNS. To start we simply create the directory branch along with an empty file that has the same name as the file being migrated. Next we would read the file content from the file in the BNS, writing it to the file in the SNS. Finally we would query all the metadata of the file in the BNS, such as file attributes, sizes, time stamps, EAs, etc., and set this on the newly created file in the SNS. Additional items such as alternate data streams also need to be queried and migrated with the file. Once this migration processing is complete, we have the shadow file located in the SNS, which is an exact copy of the file in the BNS. Now the original request which triggered this migration can be passed down to the FO(SNS) for final processing.
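The migration steps can be illustrated with an ordinary user-mode copy. This is a rough model only: the directory roots, the `shadow_path` and `migrate_file` helpers, and the file name `Existing.txt` are hypothetical, and `shutil.copystat` covers just the basic metadata (a real filter would also need to carry over EAs and alternate data streams, as noted above).

```python
import shutil
import tempfile
from pathlib import Path


def shadow_path(bns_root: Path, sns_root: Path, bns_file: Path) -> Path:
    """Mirror the file's relative path under the shadow root (assumed layout)."""
    return sns_root / bns_file.relative_to(bns_root)


def migrate_file(bns_root: Path, sns_root: Path, bns_file: Path) -> Path:
    """Copy a file from the base name space into the shadow name space."""
    target = shadow_path(bns_root, sns_root, bns_file)
    # 1. Create the directory branch in the SNS if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # 2. Read the content from the BNS file and write it to the SNS file.
    with open(bns_file, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # 3. Carry over basic metadata (mode bits and time stamps).
    shutil.copystat(bns_file, target)
    return target


# Usage: migrate a file, modify the shadow copy, and check the base copy.
with tempfile.TemporaryDirectory() as tmp:
    bns_root, sns_root = Path(tmp) / "bns", Path(tmp) / "sns"
    original = bns_root / "Directory" / "Foo" / "Existing.txt"
    original.parent.mkdir(parents=True)
    original.write_bytes(b"base content")

    shadow = migrate_file(bns_root, sns_root, original)
    shadow_rel = shadow.relative_to(sns_root).as_posix()
    copied = shadow.read_bytes()

    shadow.write_bytes(b"modified in isolation")  # the triggering write
    base_after = original.read_bytes()            # BNS stays untouched
```

The point of the usage section is the last two lines: the write that triggered the migration lands in the shadow copy, and the base file keeps its original content.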
There are a few points to note in this migration processing of files. First and foremost we need to maintain some sort of lock on the file during this migration processing, so that any request which is received while the file is being migrated is blocked until the migration is complete. This could be additional opens on the file, or other operations on instances of the file that had been previously opened but were marked as non-shadow file opens. As well, once the migration is complete we need to ensure that any currently open instances of the file are marked as shadow files and an FO(SNS) is initialized for them. Next, we mentioned above a category of operations, “File system lock acquisition callbacks,” that will result in the file being migrated. These callbacks are invoked by system modules, such as the Cache or Memory Manager, to pre-acquire file locks before issuing paging IO requests. Being a little preemptive in this grouping, we can consider them all to trigger a CoW event on the file even though some of them may result in only read operations or no operations before the lock is dropped … your design can be more selective in how these are handled. And finally, in the above description I mention that all file data, for all streams, will be copied over to the SNS. This could have a performance impact for large multi-GB files, causing the operation that triggered the CoW event to block until all the data is copied. An alternative design could copy only the regions of the file that are altered, maintaining a bitmap of these regions in blocks, perhaps on 4KB boundaries. It would lessen the impact of copying the entire content of the file, but increase the complexity of the design … your choice there.
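As an illustration of the alternative, block-level design, here is a minimal sketch of such a bitmap. The `RegionMap` class and its method names are hypothetical; it only tracks which 4KB blocks have been written to the shadow copy and says nothing about how the blocks themselves are stored.

```python
BLOCK_SIZE = 4096  # 4KB tracking granularity, as suggested in the text


class RegionMap:
    """One bit per block: set means the block's data lives in the SNS."""

    def __init__(self, file_size: int) -> None:
        block_count = (file_size + BLOCK_SIZE - 1) // BLOCK_SIZE
        self.bits = bytearray((block_count + 7) // 8)

    def mark_written(self, offset: int, length: int) -> None:
        """Record that the byte range [offset, offset + length) was written."""
        if length <= 0:
            return
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        for block in range(first, last + 1):
            # Grow the bitmap when a write extends the file.
            while block // 8 >= len(self.bits):
                self.bits.append(0)
            self.bits[block // 8] |= 1 << (block % 8)

    def in_shadow(self, offset: int) -> bool:
        """True if the block containing this offset should be read from the SNS."""
        block = offset // BLOCK_SIZE
        if block // 8 >= len(self.bits):
            return False
        return bool(self.bits[block // 8] & (1 << (block % 8)))
```

One cost of this approach, beyond the bookkeeping: a write that covers only part of a block would first need the rest of that block copied from the BNS, so that a later read of the block can be served entirely from the SNS.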
Let’s touch on a few final points here, the first being merging of directory content. When we discussed the migration of a directory from the BNS to SNS, we noted that we do not copy all the content of the directory. Thus if a directory enumeration request is processed on a shadow directory that has been migrated, then we need to merge the content from the SNS and the BNS. We’ll assume for this discussion that the directory was migrated, thus it exists in both the BNS and the SNS. Handling the directory enumeration request will require us to first query the content of the directory in the BNS–this will be our base listing that we return, possibly modified, to the caller. Once we have this listing, we query the content of the directory within the SNS. Going through each entry in the listing from the BNS, we check to see if either of the following is true:
- The same entry exists in the SNS listing. This would result in the update of the BNS entry to contain the attributes, time stamps, file sizes and other information from the SNS entry
- A deleted node exists in the SNS listing. This would result in the removal of the entry in the BNS listing, which is returned to the caller
Once all entries within the BNS listing have been processed, we would add any entries from the SNS listing not yet processed to the BNS listing. These would be files or directories that were created within the SNS that did not exist in the BNS. The final, updated listing would then be returned to the caller. As a side note here, we have done quite a bit of hand waving in the processing of these listings. For example, the directory enumeration is generally not handled in a single request, thus we would need to maintain pointers within our listing to know where to pick up the processing within the SNS. As well, the type of information requested within the directory enumeration can vary, so we must ensure the information queried from the SNS is the same as that which is queried within the BNS. Lastly, we may not want to query the SNS on each sub-query of the directory enumeration, but instead query it when the enumeration request is first handled, and then maintain this information in a memory buffer for handling subsequent requests.
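The merge itself can be sketched as follows, ignoring the hand waving just mentioned (restartable enumeration, varying information classes, buffering). The `Entry` type is hypothetical, and so is the `deleted` flag: how a deleted node is actually recorded in the SNS is a design decision this sketch does not settle. Names are compared case-insensitively here, which is also an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    """Hypothetical directory entry; a real one carries much more metadata."""
    name: str
    size: int
    deleted: bool = False  # assumed marker for a "deleted node" in the SNS


def merge_listings(bns: List[Entry], sns: List[Entry]) -> List[Entry]:
    """Merge the base listing with the shadow listing for one directory."""
    shadow = {e.name.lower(): e for e in sns}
    processed = set()
    merged = []
    for base in bns:
        key = base.name.lower()
        match = shadow.get(key)
        if match is None:
            merged.append(base)  # only in the BNS: returned as-is
            continue
        processed.add(key)
        if match.deleted:
            continue             # deleted in the SNS: hidden from the caller
        # Same entry in both: the SNS metadata overrides the BNS metadata.
        merged.append(Entry(base.name, match.size))
    # Anything left in the SNS was created there and never existed in the BNS.
    for entry in sns:
        if entry.name.lower() not in processed and not entry.deleted:
            merged.append(entry)
    return merged


bns_listing = [Entry("a.txt", 10), Entry("b.txt", 20), Entry("c.txt", 30)]
sns_listing = [Entry("b.txt", 25), Entry("c.txt", 0, deleted=True), Entry("d.txt", 5)]
result = merge_listings(bns_listing, sns_listing)
```

With these inputs, `a.txt` passes through unchanged, `b.txt` picks up the shadow size, `c.txt` is hidden, and `d.txt` is appended at the end.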
The final point I would like to touch on is file IO processing. As we mentioned previously, we have taken ownership of the FO(top) file object. This means we are responsible for handling the caching interface for this file while the underlying file in the BNS or SNS contains the actual on-disk data. What this entails is that for any IO marked as cached IO, meaning that neither the non-cached nor the paging bit is set, we must call the Cache Manager API set for initializing the cache map and handling the IO request. In addition, depending on your design goals, the data that is written to the SNS can be encrypted, or at least obfuscated in some way. This adds a layer of assurance to the isolation: a process outside of the isolated process group cannot read the data, which in our case has been downloaded from the web. So if a user downloads a malicious file from the web, the possibility that it could be accessed by a process outside of the isolated process group is eliminated, or at least minimized.
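The cached versus non-cached test described above comes down to a flag check, modeled here in user mode. The constant names follow the WDK's `IRP_NOCACHE` and `IRP_PAGING_IO` flags; treat the numeric values as illustrative and take the real definitions from the WDK headers in an actual driver. The function name is hypothetical.

```python
# Illustrative values; a real driver uses the definitions from the WDK headers.
IRP_NOCACHE = 0x00000001
IRP_PAGING_IO = 0x00000002


def is_cached_io(irp_flags: int) -> bool:
    """Cached IO is a request with neither the non-cached nor the paging bit set."""
    return (irp_flags & (IRP_NOCACHE | IRP_PAGING_IO)) == 0
```

Requests for which this returns true are the ones the filter must service through the cache it owns for FO(top); the rest are retargeted to the FO(BNS) or FO(SNS).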