Hacker News

Nope. This is an implementation of one of several things that people often imagine Microsoft's DirectStorage to be, but the real DirectStorage is a lot more mundane.


I have no clue, so I'll ask: what's the difference?


DirectStorage is mostly an API for CPU code to asynchronously issue high-level storage requests such as asking for a file to be read from storage and the contents placed in a particular GPU buffer. Behind the scenes, the file contents could in theory be transferred from an SSD to the GPU using P2P DMA, because the OS now has enough of a big-picture view of what's going on to set up that kind of transfer when it's possible. But everything about parsing the filesystem data structures to locate the requested file data and issue commands to the SSD is still done on the CPU by the OS, and the application originating those high-level requests is a process running on the CPU and making system calls.
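The request model described above can be sketched in miniature. This is a hypothetical Python stand-in, not the real API (DirectStorage is a C++ COM interface on Windows): the application enqueues high-level "read this file range into that buffer" requests and keeps running, while a background worker plays the role of the OS servicing the queue. The class and field names are invented for illustration.

```python
import queue
import threading

class StorageRequest:
    """One high-level request: read a byte range of a file into a
    destination buffer (which stands in for a GPU buffer here)."""
    def __init__(self, path, offset, size, dest, dest_offset=0):
        self.path = path              # which file to read
        self.offset = offset          # byte offset within the file
        self.size = size              # number of bytes to read
        self.dest = dest              # destination buffer (bytearray)
        self.dest_offset = dest_offset
        self.done = threading.Event() # signaled on completion

class StorageQueue:
    """Services requests on a background thread, the way the OS handles
    DirectStorage queues on the application's behalf."""
    def __init__(self):
        self._q = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def enqueue(self, req):
        # Non-blocking: the caller issues the request and moves on.
        self._q.put(req)

    def _worker(self):
        while True:
            req = self._q.get()
            with open(req.path, "rb") as f:
                f.seek(req.offset)
                data = f.read(req.size)
            # In the real system this copy is where a P2P DMA transfer
            # straight into GPU memory could happen.
            req.dest[req.dest_offset:req.dest_offset + len(data)] = data
            req.done.set()
```

The point of the shape is that the application only ever states *what* it wants where; the filesystem walking and device commands stay out of its hands.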

Making the requests asynchronous and issuing lots of requests in parallel is what makes it possible to get good performance out of flash-based storage; P2P DMA would be a relatively minor optimization on top of that. DirectStorage isn't the only way to asynchronously issue batches of storage requests; Windows has long had IOCP and more recently cloned io_uring from Linux.
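Why batching matters can be shown with a small sketch. A thread pool stands in here for a real async interface like IOCP or io_uring; the mechanism differs, but the principle is the same: flash only delivers its full throughput with a deep queue of outstanding requests, not one synchronous read at a time. The function and parameter names are illustrative, not from any particular API.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def read_chunk(fd, offset, size):
    """One positioned read; os.pread is safe to call concurrently
    on a shared file descriptor (POSIX only)."""
    return os.pread(fd, size, offset)

def batched_read(path, chunk_size=1 << 20, workers=32):
    """Split a file into chunks and keep many reads in flight at once,
    instead of issuing one blocking read after another."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offsets = range(0, size, chunk_size)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pool.map preserves order, so the chunks join back correctly.
            chunks = pool.map(lambda off: read_chunk(fd, off, chunk_size),
                              offsets)
            return b"".join(chunks)
    finally:
        os.close(fd)
```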

DirectStorage 1.1 introduced an optional feature for GPU decompression, so that data which is stored on disk in a supported compressed format can be streamed to the GPU and decompressed there instead of needing a round-trip through the CPU and its RAM for decompression. This could help make the P2P DMA option more widely usable by reducing the cases which need to fall back to the CPU, but decompressing on the GPU is nothing that applications couldn't already implement for themselves; DirectStorage just provides a convenient standardized API for this so that GPU vendors can provide a well-optimized decompression implementation. When P2P DMA isn't available, you can still get some computation offloaded from the CPU to the GPU after the compressed data makes a trip through the CPU's RAM.
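The data flow is simple to illustrate. In this sketch zlib stands in for the real format (GDeflate in DirectStorage 1.1), and an ordinary function stands in for the vendor's decompression compute shader; the names are invented. The win is that only the smaller compressed payload crosses the bus, and the CPU never has to touch the decompressed bytes.

```python
import zlib

def pack_asset(raw: bytes) -> bytes:
    """Done offline at build time: store the asset compressed on disk.
    (zlib here is a stand-in for the GDeflate bitstream.)"""
    return zlib.compress(raw, level=9)

def gpu_side_decompress(compressed: bytes) -> bytes:
    """Stands in for the GPU decompression stage; in the real system this
    is a vendor-optimized shader fed directly by DirectStorage."""
    return zlib.decompress(compressed)
```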

(Note: official docs about DirectStorage don't really say anything about P2P DMA, but it's clearly being designed to allow for it in the future.)

The GPU4FS described here is a project to implement the filesystem entirely on the GPU: the code to e.g. walk the directory hierarchy and locate which address actually holds the file contents runs not on the CPU but on the GPU. This approach means the application running on the GPU needs exclusive ownership of the device holding the filesystem. For now, they're using persistent memory as the backing store, but in the future they could implement an NVMe driver and have storage requests originate from the GPU and be delivered directly to the SSD with no CPU or OS involvement.
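To make the distinction concrete, here is a toy sketch of resolving a path purely from flat on-device tables, the way a GPU-side filesystem must (no OS, no syscalls). The layout below is invented for illustration and is not GPU4FS's actual on-disk format; flat index-based tables rather than pointer-chasing structures are used because that style maps naturally onto GPU code.

```python
# Directory entry table: (parent_dir_id, name, is_dir, target), where
# target is a child directory id for directories, or an index into the
# extent table for files. Directory id 0 is the root.
DIRENTS = [
    (0, "assets",   True,  1),   # /assets -> directory id 1
    (1, "level1",   True,  2),   # /assets/level1 -> directory id 2
    (2, "mesh.bin", False, 0),   # /assets/level1/mesh.bin -> extent 0
]

# Extent table: (device_address, length) of each file's contents.
EXTENTS = [(0x4000, 4096)]

def resolve(path):
    """Walk the directory hierarchy with table lookups only, returning
    the (address, length) of the file's contents on the device."""
    dir_id = 0                            # start at the root directory
    parts = path.strip("/").split("/")
    for i, name in enumerate(parts):
        for parent, entry_name, is_dir, target in DIRENTS:
            if parent == dir_id and entry_name == name:
                if is_dir:
                    dir_id = target       # descend one level
                else:
                    assert i == len(parts) - 1, "file mid-path"
                    return EXTENTS[target]
                break
        else:
            raise FileNotFoundError(path)
    raise IsADirectoryError(path)
```

The returned (address, length) pair is exactly what the GPU would need to then issue a raw read, via persistent memory today or via NVMe commands in the future, without the CPU ever seeing the request.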


Thanks!




