In the last entry of Security Through Process Isolation – Part 3, I covered more details of Copy-on-Write processing of files and directories, as well as some aspects of handling file modifications such as delete and rename processing. In this entry, I’ll dive into the details of:
- Cache and Memory Manager Integration
- Lock Processing
- Direct Volume Access
Before I begin discussing the integration with the Cache and Memory Managers, CM and MM respectively, I want to do a little review from the past few entries. As discussed previously, the three file objects in play are:
- FO(top) – The file object passed into the filter driver from the IO Manager
- FO(BNS) – The file object representing the open instance within the Base Name Space, if it exists
- FO(SNS) – The file object representing the open instance within the Shadow Name Space, if it exists
The FsContext and FsContext2 structures are initialized and set during the handling of the IRP_MJ_CREATE request, or the PreCreate callback handler in Mini-Filter lingo. By the completion of this processing, the FO(top)->FsContext and FO(top)->FsContext2 pointers are established. These pointers allow us to retrieve the context structures during any later request, since the FO(top) file object is passed in with the given request information. Note that the FsContext field is not entirely private to our usage. The kernel will access a common structure at the beginning of the context: specifically, the FO(top)->FsContext structure must begin with a system-defined header, the FSRTL_COMMON_FCB_HEADER. In more recent operating systems this is embedded within the FSRTL_ADVANCED_FCB_HEADER, but in either case the common portion of the header contains information which is directly accessible by the system. This includes a set of file sizes along with two ERESOURCEs which are used to maintain consistent access to the underlying file.
In addition to establishing the FO(top)->FsContext and FO(top)->FsContext2 pointers, the FO(top)->SectionObjectPointer field must be set if CM and MM integration is required. This field within FO(top) points to a structure containing 3 pointers reserved exclusively for system use. The memory for this structure is owned by the file system and is maintained on a per-file basis, but the internal pointers, and what they point to, are owned and maintained by the system components responsible for the system cache, namely the CM and MM modules. When the file system establishes this set of pointers, in our case within the FO(top) file object, we are telling the system that cache integration is desired and that the file may later use the system cache for data access.
Why would you want to support CM and MM integration? The reason is simple: if you would like to support higher-performing cached access to data, as well as memory mapped access to files, then integration with CM and MM is required. Memory mapped access includes standard mapped views of files within user mode as well as file execution. So if you want to support these interfaces, it is a must.
Once the SectionObjectPointer structure is set within the FO(top) file object, the file system must inform the CM to initiate the cache map and must route read and write requests involving the cache map through the CM. To do this, the file system first initializes the cache map within the CM. This is handled on the first IO request to the file by calling the system API CcInitializeCacheMap(). This call requires that the caller specify the size of the cache map to establish, the file object which should be associated with the cache map, and a set of callbacks invoked by the CM for locking the file; more on this below. Once the cache map has been established within the CM, further IO requests, such as top level reads and writes which are not marked as non-cached or paging IO, can be sent to the CM through the system APIs CcCopyRead() and CcCopyWrite(). These APIs will then read data from or write data to the cache map established within the system cache.
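In kernel terms, the cached read path looks roughly like the following. This is a hedged sketch only: error handling, the Wait==FALSE posting path, and fast IO are omitted, and `Fcb` and `MyCacheManagerCallbacks` are hypothetical names for our per-file context and callback set:

```c
/* Sketch of a cached read path inside a read handler. */
if (FileObject->PrivateCacheMap == NULL) {
    CC_FILE_SIZES sizes;
    sizes.AllocationSize  = Fcb->Header.AllocationSize;
    sizes.FileSize        = Fcb->Header.FileSize;
    sizes.ValidDataLength = Fcb->Header.ValidDataLength;

    /* First cached IO on this file object: create the cache map and
     * hand the CM our lock pre-acquisition callbacks. */
    CcInitializeCacheMap(FileObject, &sizes, FALSE,
                         &MyCacheManagerCallbacks, Fcb);
}

/* Copy data out of (or fault it into) the system cache. */
if (!CcCopyRead(FileObject, &ByteOffset, Length,
                CanWait, Buffer, &Irp->IoStatus)) {
    /* Could not be satisfied without blocking; post the request
     * to a worker thread and retry with CanWait == TRUE. */
}
```

The write side is symmetric, with CcCopyWrite() copying the caller’s buffer into the cache and the lazy writer flushing it later.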
Now for one of the more complicated aspects of handling IO on the Windows operating system – reentrancy. Reentrancy is a much misunderstood aspect of Windows, yet at its root it is fairly straightforward. Let’s take, for example, the case where we establish the cache map for a file C:\foo.txt as described above. Let’s also assume that this is handled during the first read to the file. So when we call CcCopyRead() to actually handle the read request, the CM has no data in the system cache to satisfy the read. In this case, the CM will generate a read request which is marked as non-cached, paging IO and will send this IO request back into the file system for handling. Note this is where lock hierarchy is important.
A side note here … remember that FSRTL_COMMON_FCB_HEADER? It contains two locks that are referred to as the file resource and the paging resource. The assumption is that these locks will always be acquired in order: the file resource first, then the paging resource.
OK, back to the main thread.
During the original cached read request, we would normally acquire the file resource SHARED while performing the cache map initialization and the call into the CcCopyRead() API. Of course there are cases where we may need to first acquire the file resource EXCLUSIVE, such as when flushing the cache during a non-cached read, then drop the lock to SHARED, but this is something we can discuss later. So while handling the original read request, we’ve acquired the file resource SHARED and performed a call to CcCopyRead(). For our example, let’s assume this is the first read on the file and the CM generated a non-cached, paging read IO to in-page the content from disk to the cache map. When the file system receives this paging read request, it acquires the paging resource SHARED. Now we have successfully acquired both locks in order, and in such a way that while handling this request no modifications, such as a write, can be performed on the file.
While the above discussion is general and not specific to an Isolation File System driver, it still applies. The only difference is that for the non-cached, paging read request generated by the CM, the file system would update the target file object to be either FO(BNS) or FO(SNS), depending on whether the file had previously been migrated. Once this IO to the underlying file system completes, we would complete it back to the CM, which would in turn populate the system cache with the data, allowing the original call to CcCopyRead() to complete with the buffer correctly populated.
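In a Mini-Filter, retargeting the paging read amounts to swapping the target file object in the callback data. A sketch, where `Ctx` is our hypothetical stream context holding the FO(BNS)/FO(SNS) references and the migration state:

```c
/* PreRead callback, paging path only (sketch). */
if (FlagOn(Data->Iopb->IrpFlags, IRP_PAGING_IO)) {
    /* Send the in-page to the shadow file if it was migrated,
     * otherwise to the base file. */
    Data->Iopb->TargetFileObject =
        Ctx->Migrated ? Ctx->ShadowFileObject : Ctx->BaseFileObject;
    FltSetCallbackDataDirty(Data);
    return FLT_PREOP_SUCCESS_WITH_CALLBACK;
}
```

Marking the callback data dirty is what tells Filter Manager that the Iopb was modified and must be honored when the request continues down the stack.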
Write processing is similar to the above description, with a few exceptions. The first is lock acquisition: generally the locks are taken exclusively to ensure consistency when accessing the metadata for the file. This is because writes can change the size of a file by extending it, while other pathways, such as setting the file size through a SetFileInformation call, can extend or truncate the file. To ensure consistent access to the associated metadata, the file system would generally take the file resource EXCLUSIVE. One note on file size changes: when one occurs and the file is currently cached, the CM must be notified of the change so the cache map size can be updated.
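That notification is a single call into the CM. A sketch, assuming the file resource is already held EXCLUSIVE and `Fcb` is our hypothetical per-file context:

```c
/* After extending the file during a write (sketch). */
Fcb->Header.FileSize.QuadPart = NewFileSize;

if (CcIsFileCached(FileObject)) {
    CC_FILE_SIZES sizes;
    sizes.AllocationSize  = Fcb->Header.AllocationSize;
    sizes.FileSize        = Fcb->Header.FileSize;
    sizes.ValidDataLength = Fcb->Header.ValidDataLength;

    /* Tell the CM the cache map must grow (or shrink). */
    CcSetFileSizes(FileObject, &sizes);
}
```

The same call covers truncation through SetFileInformation; either way the cache map and the on-disk sizes must not be allowed to drift apart.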
One final comment about CM and MM integration. As mentioned before, when the cache map is initialized, a set of callbacks is provided to the CM so that it can pre-acquire locks prior to issuing paging requests. These are required for subsystems within the CM which initiate IO not because of top level, cached IO requests, but rather due to read ahead or lazy flushing of data. To maintain the lock hierarchy, the CM calls into the file system to acquire the file resource. The same holds true for MM: it must be able to pre-acquire locks prior to flushing dirty pages.
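The callback set handed to CcInitializeCacheMap() is a CACHE_MANAGER_CALLBACKS structure. A sketch of the lazy-write pair follows; the read-ahead pair is analogous, and the My* names and PMY_FCB type are hypothetical:

```c
/* Pre-acquire the file resource on behalf of the lazy writer so the
 * lock hierarchy is respected before it issues paging writes. */
BOOLEAN MyAcquireForLazyWrite(PVOID Context, BOOLEAN Wait)
{
    PMY_FCB Fcb = (PMY_FCB)Context;  /* LazyWriteContext from init */
    return ExAcquireResourceSharedLite(Fcb->Header.Resource, Wait);
}

VOID MyReleaseFromLazyWrite(PVOID Context)
{
    PMY_FCB Fcb = (PMY_FCB)Context;
    ExReleaseResourceLite(Fcb->Header.Resource);
}

CACHE_MANAGER_CALLBACKS MyCacheManagerCallbacks = {
    .AcquireForLazyWrite  = MyAcquireForLazyWrite,
    .ReleaseFromLazyWrite = MyReleaseFromLazyWrite,
    .AcquireForReadAhead  = MyAcquireForReadAhead,   /* analogous pair */
    .ReleaseFromReadAhead = MyReleaseFromReadAhead,
};
```

With the file resource pre-acquired here, the paging writes the lazy writer generates will find the locks in the expected order when they reenter the file system.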
The final point of discussion for this entry is Direct Volume Access. This is where an application opens the volume directly to perform operations, such as IO, IOCtls or other requests, against the volume itself rather than through file access. In the current discussion, we deny any volume open which requests write access. The main reason is to prevent exploits such as BadUSB from issuing IOCtls directly to the volume, bypassing the file system layer. It can also prevent direct volume writes such as those used in the Blue Pill exploit, though on more recent Windows platforms the operating system already prevents direct writes to volumes from user mode.
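In a Mini-Filter PreCreate callback, a direct volume open can be recognized and denied roughly like this. A sketch only; the exact access-mask policy is up to the filter:

```c
/* A direct volume open arrives as a create with an empty file name
 * and no related file object (sketch). */
if (FltObjects->FileObject->FileName.Length == 0 &&
    FltObjects->FileObject->RelatedFileObject == NULL) {

    ACCESS_MASK desired =
        Data->Iopb->Parameters.Create.SecurityContext->DesiredAccess;

    /* Deny any direct volume open requesting write access. */
    if (desired & (FILE_WRITE_DATA | FILE_APPEND_DATA | GENERIC_WRITE)) {
        Data->IoStatus.Status      = STATUS_ACCESS_DENIED;
        Data->IoStatus.Information = 0;
        return FLT_PREOP_COMPLETE;
    }
}
return FLT_PREOP_SUCCESS_NO_CALLBACK;
```

Read-only volume opens still pass through, so tools such as defragmenters and backup utilities that only query the volume continue to work.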