Microsoft's new DirectStorage tech will unlock maximum SSD I/O throughput for dramatically faster data management, empowering developers to do impressive things.

Today Microsoft revealed more info on its new DirectStorage tech, which is part of its convergent DirectX 12 Ultimate platform. DirectStorage will use its own unique GPU-based compression/decompression algorithm to significantly optimize data pipeline flow, and Microsoft is working closely with chip-makers on this tech. The company is tweaking Windows to make the OS seamlessly optimize the API, too.
- DirectStorage - Leads to raw speed boost using D3D, specially-designed decompression/compression algorithm, GPU bandwidth
- Sampler Feedback Streaming - More efficient and intelligent delivery of data
DirectStorage's main function is to use GPU power and its higher VRAM bandwidth to decompress data at a rate that matches the I/O speeds of an SSD. The idea is to remove CPU-bound constraints and improve data flow by moving decompression off the CPU. The result is virtually loading-free video game experiences thanks to a more efficient pipeline of data management and processing.
Here's how it works:
Typical data flow

Typically, asset decompression is handled by Direct3D and the CPU itself (in the Series X's case, a special on-chip decompression block), and data assets are pulled from storage drives via the Windows storage stack. Block-compressed and deflate-compressed geometry and texture data is read from the SSD, decompressed at run-time by D3D and the CPU, passed along to system memory, then copied to the GPU for scene rendering.
DirectStorage, however, will use the GPU to decompress the assets. This means decompression won't be CPU-bound. DirectStorage will help unleash maximum I/O rates for significantly shorter load sequences in video games.
DirectStorage flow

The data will be read by a new optimized Windows storage stack (which includes Sampler Feedback Streaming and a new compression/decompression algorithm) and shot from the NVMe SSD right over to RAM, where it is copied, still compressed, directly to the GPU's VRAM. The GPU then decompresses the assets and renders them.
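One practical difference between the two flows is how much data has to travel from system RAM to VRAM. The sketch below is a simplified model, not anything from Microsoft: the function name, the `decompress_on` switch and the 2:1 compression ratio are all assumptions made for illustration.

```python
def bytes_copied_to_gpu(compressed_bytes: int, compression_ratio: float,
                        decompress_on: str) -> int:
    """Bytes that travel from system RAM to VRAM for one asset (toy model)."""
    if decompress_on == "cpu":
        # Typical flow: the CPU inflates the asset first, so full-size data is copied.
        return int(compressed_bytes * compression_ratio)
    if decompress_on == "gpu":
        # DirectStorage flow: the asset stays compressed until it reaches VRAM.
        return compressed_bytes
    raise ValueError("decompress_on must be 'cpu' or 'gpu'")


ASSET_COMPRESSED = 100 * 1024 * 1024  # hypothetical 100 MiB compressed asset
RATIO = 2.0                           # hypothetical 2:1 compression ratio

typical_copy = bytes_copied_to_gpu(ASSET_COMPRESSED, RATIO, "cpu")
directstorage_copy = bytes_copied_to_gpu(ASSET_COMPRESSED, RATIO, "gpu")
```

Under these assumed numbers the typical flow copies twice as many bytes to the GPU as the DirectStorage flow, on top of the CPU time spent decompressing.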
DirectStorage integrates directly into Direct3D 12, "essentially bridging the gap between storage and GPU technology," Microsoft Senior Program Manager Lead Andrew Yeung says.
Workloads are getting more granular--devs can now target down to the mip level--and efficiency is incredibly important to reduce overhead and deliver only the data that's needed. The DirectStorage API also introduces a new batch-based calling pattern to keep the SSD and GPU properly fed at optimum data rates while also ensuring requests are handled in precise groups. Right now data requests are handled individually, which is inefficient for heavy workloads.
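To picture the difference between individual and batched requests, here is a minimal toy model. It is not the DirectStorage API: `ReadRequest`, `BatchQueue` and the `textures.pak` file name are hypothetical, and the only thing it measures is how many times the app has to cross into the storage layer.

```python
from dataclasses import dataclass, field


@dataclass
class ReadRequest:
    """One asset read: file, byte offset and size (all hypothetical fields)."""
    path: str
    offset: int
    size: int


@dataclass
class BatchQueue:
    """Toy queue that collects requests and hands them over in one submission."""
    pending: list = field(default_factory=list)
    submissions: int = 0  # number of trips into the (imaginary) storage layer

    def enqueue(self, request: ReadRequest) -> None:
        self.pending.append(request)  # cheap: nothing is issued yet

    def submit(self) -> list:
        batch, self.pending = self.pending, []
        if batch:
            self.submissions += 1
        return batch


def read_individually(requests) -> int:
    queue = BatchQueue()
    for request in requests:
        queue.enqueue(request)
        queue.submit()  # one submission per request
    return queue.submissions


def read_batched(requests) -> int:
    queue = BatchQueue()
    for request in requests:
        queue.enqueue(request)
    queue.submit()  # one submission for the whole group
    return queue.submissions


# 64 hypothetical 64 KiB tile reads from a single archive.
requests = [ReadRequest("textures.pak", i * 65536, 65536) for i in range(64)]
individual_calls = read_individually(requests)
batched_calls = read_batched(requests)
```

In this model the individual path makes 64 submissions while the batched path makes one, which is the kind of per-request overhead a batch-based calling pattern is meant to cut.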
This is made possible by new advanced compression/decompression tech.
Currently, decompression is handled by special CPU-based algorithms, but with PCIe 4.0 speeds and massive asset sizes, CPU overhead is a big problem. This is one of the main reasons why today's games don't run at max data rates advertised on high-end NVMe SSDs.
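A rough way to see why CPU decompression caps load speed is to treat loading as a two-stage pipeline that can only run as fast as its slowest stage. The figures below are placeholders chosen for illustration, not benchmarks, and the function name is an assumption; both rates are taken to be measured on compressed input.

```python
def effective_load_rate(ssd_gb_per_s: float, decompress_gb_per_s: float) -> float:
    """A two-stage pipeline moves data only as fast as its slowest stage."""
    return min(ssd_gb_per_s, decompress_gb_per_s)


# Placeholder figures for illustration only, not measured results.
SSD_RATE = 7.0         # hypothetical PCIe 4.0 NVMe sequential read, GB/s
CPU_DECOMPRESS = 2.0   # hypothetical CPU decompression throughput, GB/s
GPU_DECOMPRESS = 10.0  # hypothetical GPU decompression throughput, GB/s

cpu_bound = effective_load_rate(SSD_RATE, CPU_DECOMPRESS)
gpu_path = effective_load_rate(SSD_RATE, GPU_DECOMPRESS)
```

With these assumed numbers the CPU path tops out at the decompressor's rate, while the GPU path is limited only by the drive itself.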
To solve this problem Microsoft is working closely with chip-makers like Intel, NVIDIA, and AMD to make GPU-based compression tech for existing CPUs and GPUs.
Microsoft has developed an early DirectCompute decompressor that fully utilizes high-end NVMe bandwidth, as well as a CPU-based decompressor for data destined to RAM, and a new special compressor to pack everything down.
On the Windows OS level, Microsoft is further tweaking its Windows storage stack for high-bandwidth/high-IOPS workloads. The stack pulls the data from the SSD and is a kind of gateway that allows DirectStorage to unleash its full potential.
Microsoft also reveals how Sampler Feedback Streaming will work in tandem with DirectStorage.
Sampler Feedback Streaming helps offset the brute demands of games that need more VRAM than the system has available. As we outlined in our in-depth Xbox Series X SSD video, Sampler Feedback Streaming allows the system to work smarter, not harder. SFS enables memory-multiplier techniques through selective asset management--the game will only request specific data batches versus single-file chunks.
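The memory-multiplier idea reduces to simple arithmetic: if only a fraction of the asset data has to be resident, the same VRAM covers proportionally more content. The helper below is a sketch under that assumption; the function name and the 10GB, one-half example are illustrative.

```python
def effective_vram_gb(physical_gb: float, resident_fraction: float) -> float:
    """Content a GPU can effectively cover if only a fraction must be resident."""
    if not 0 < resident_fraction <= 1:
        raise ValueError("resident_fraction must be in (0, 1]")
    return physical_gb / resident_fraction


# Keeping only half of the data resident on a 10GB card.
effective = effective_vram_gb(10.0, 0.5)
multiplier = effective / 10.0
```

Here a 10GB card that only needs half of the data resident behaves like a 20GB card, a 2x multiplier.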
In short, SFS is the brains whereas DirectStorage is the brawn.
"If you only bring in half as much memory as before, let's say you have 5MBs instead of 10, you're being very selective and you say 'I only need those five,' what it really looks like is instead of having a 10GB GPU you have a 20GB GPU because you're twice as efficient," Yeung said in the presentation.
DirectStorage will begin previewing to developers in Summer 2021, but there's no info on a final build launch. Check below for more info:
- Microsoft says it's looking to implement DMA (direct memory access) between SSD and GPU using DirectStorage: "We are looking into supporting this in the future, but there's some more work we need to do in order for this functionality to be robust enough to include in Windows."
- May eventually support external storage
- PCIe 3.0 drives supported
- DirectStorage uses its own compression algorithm, a new class of compression tech
- Does not require GDK on Windows
- DirectCompute - Allows DirectStorage to run on existing GPUs