How to Estimate Windows Drive Compression
Windows drive compression can reduce disk usage on NTFS volumes, but the result depends heavily on file types, CPU speed, disk speed, and current system load. A drive full of text logs, CSV files, JSON, source code, and uncompressed datasets may shrink a lot. A drive full of ZIP files, videos, model weights, images, installers, and Parquet files may barely shrink.
If you are running LLM applications, local eval jobs, agent traces, logs, prompt datasets, or development environments on Windows, estimate compression before you apply it broadly. Do not treat compression as a backup strategy. Do not assume it is safe to run on production machines without testing. Compression changes how reads and writes behave, and it can add CPU overhead at the wrong time.
What Windows drive compression does
Windows NTFS compression stores files in a compressed format on disk. When an application reads a compressed file, Windows decompresses it. When an application writes to a compressed file, Windows may compress it again.
This can help when your bottleneck is disk space or disk I/O. It can hurt when your bottleneck is CPU or when your workload writes heavily to compressed files.
For AI engineering teams, this matters on machines that store:
- LLM request and response logs
- Prompt test datasets
- Trace exports
- Evaluation outputs
- Local vector database files
- Source repositories and build artifacts
- Temporary files created by agents or coding tools
This is file-system compression, not prompt compression. Disk compression changes how files are stored on Windows. Prompt compression changes what you send to a model.
Before you estimate: check the risk
Do a small test before touching a full drive. This is especially important if the machine runs production services, scheduled evals, CI jobs, inference workers, databases, or local agent workflows.
Use these rules:
- Back up important data first. Compression is not a substitute for backups.
- Do not start with the whole system drive. Test a representative folder first.
- Avoid active database folders unless you have tested them. SQLite, Postgres data directories, vector DB files, and similar workloads may behave differently under compression.
- Avoid heavy write paths first. Examples include cache folders, build output folders, and directories that receive high-volume logs.
- Run the test when the machine is quiet. Compression competes for CPU and disk resources.
Method 1: Estimate with the Windows drive Properties checkbox
The easiest way to inspect drive compression is through File Explorer. This also shows the setting most Windows users recognize.
- Open File Explorer.
- Right-click the drive, such as C: or D:.
- Select Properties.
- Look for Compress this drive to save disk space.
- Do not check it yet if you are only estimating.

The checkbox can apply compression broadly across a drive. That is convenient, but it is not the best first step for estimating. Start with a smaller folder that represents your actual data.
Method 2: Estimate with compact.exe on a sample folder
Windows includes compact.exe, a command-line tool for NTFS compression. You can use it to compress a test folder and measure the result.
Pick a folder that resembles the data you care about. For example, if you want to estimate compression for an eval artifact drive, choose a folder that contains real eval outputs, traces, JSON logs, CSV files, and cached responses.
Check current compression status
Open Command Prompt or PowerShell, then run:
compact /q "D:\ai-data\sample"
This gives you a quick status check for the folder.
Compress the sample folder
To compress the folder and all files below it, run:
compact.exe /c /s:"D:\ai-data\sample"
Run this on a representative folder, not the whole drive, when you are estimating. The command recursively compresses files in the target folder. The /c option compresses files. The /s option applies it to files in the specified directory and its subdirectories.
Undo the test if needed
To decompress the same folder after the test, run:
compact.exe /u /s:"D:\ai-data\sample"
The /u option uncompresses files.
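If you plan to repeat the test on several folders, the three commands above can be wrapped in a small script. The following is a minimal Python sketch, not an official tool: the helper names build_compact_args and run_timed are hypothetical, and it only uses the /q, /c, /u, and /s options already shown in this section. It assumes the folder path contains no spaces; paths with spaces may need different quoting, which this sketch does not handle.

```python
import subprocess
import time


def build_compact_args(action, folder):
    """Build the argument list for compact.exe (hypothetical helper).

    action: "status", "compress", or "uncompress".
    folder: sample folder path, assumed to contain no spaces.
    """
    flags = {"status": "/q", "compress": "/c", "uncompress": "/u"}
    if action not in flags:
        raise ValueError(f"unknown action: {action}")
    if action == "status":
        # Mirrors: compact /q "D:\ai-data\sample"
        return ["compact.exe", flags[action], folder]
    # Mirrors: compact.exe /c /s:"D:\ai-data\sample" (or /u to undo)
    return ["compact.exe", flags[action], f"/s:{folder}"]


def run_timed(args):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    result = subprocess.run(args, check=False)
    return result.returncode, time.perf_counter() - start


# Example use on a Windows machine (left commented out on purpose):
# code, seconds = run_timed(build_compact_args("compress", r"D:\ai-data\sample"))
# print(f"exit code {code}, took {seconds / 60:.1f} minutes")
```

Timing the run here also gives you the number you need later for the compression time estimate.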
How to measure compression savings
You need two numbers:
- Size before compression
- Size on disk after compression
In File Explorer:
- Right-click the sample folder.
- Select Properties.
- Record Size.
- Record Size on disk.
- Run the compression test.
- Open Properties again and record the new Size on disk.
Size is the logical file size. Size on disk is the actual disk space used. For compression estimates, use Size on disk.
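If you would rather script the measurement than read it from Properties, here is a minimal Python sketch. It assumes the Win32 function GetCompressedFileSizeW, called through ctypes, is an acceptable stand-in for Size on disk; its totals may not match File Explorer exactly, because Explorer accounts for cluster allocation. The names size_on_disk and folder_sizes are hypothetical, and on non-Windows systems the sketch simply falls back to the logical size.

```python
import ctypes
import os
import sys

INVALID_FILE_SIZE = 0xFFFFFFFF


def size_on_disk(path):
    """Return the bytes a file occupies on disk (logical size off Windows)."""
    if sys.platform != "win32":
        return os.path.getsize(path)  # fallback: no NTFS compression here
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    get_size = kernel32.GetCompressedFileSizeW
    get_size.argtypes = [ctypes.c_wchar_p, ctypes.POINTER(ctypes.c_ulong)]
    get_size.restype = ctypes.c_ulong
    high = ctypes.c_ulong(0)
    ctypes.set_last_error(0)
    low = get_size(path, ctypes.byref(high))
    if low == INVALID_FILE_SIZE and ctypes.get_last_error() != 0:
        raise ctypes.WinError(ctypes.get_last_error())
    # Combine the high and low 32-bit halves into one 64-bit size.
    return (high.value << 32) | low


def folder_sizes(root):
    """Return (logical_bytes, on_disk_bytes) for all files under root."""
    logical = on_disk = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                file_logical = os.path.getsize(path)
                file_on_disk = size_on_disk(path)
            except OSError:
                continue  # skip locked or vanished files
            logical += file_logical
            on_disk += file_on_disk
    return logical, on_disk
```

Run folder_sizes on the sample folder before and after the compression test and keep both on-disk totals.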
Sample calculation: estimate a 500 GB drive from a 10 GB test
Suppose you test a 10 GB folder that represents your drive contents.
- Sample folder size before compression: 10 GB
- Sample folder size on disk after compression: 6.5 GB
- Space saved: 3.5 GB
- Compression savings rate: 35%
The formula is:
Savings rate = (Before - After) / Before
Using the sample:
(10 GB - 6.5 GB) / 10 GB = 0.35 = 35%
If the full drive has 500 GB of similar data, a rough estimate is:
Estimated compressed size = 500 GB × 0.65 = 325 GB
Estimated space saved = 500 GB - 325 GB = 175 GB
This is only an estimate. It works best when the sample folder closely matches the full drive. If your 10 GB sample is mostly logs and your 500 GB drive is mostly model files, the estimate will be wrong.
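The same arithmetic as a short Python sketch, in case you want to rerun it with your own numbers. The function names savings_rate and project_drive are hypothetical, and the inputs are the example figures from this section.

```python
def savings_rate(before_gb, after_gb):
    """Fraction of space saved: (Before - After) / Before."""
    if before_gb <= 0:
        raise ValueError("before_gb must be positive")
    return (before_gb - after_gb) / before_gb


def project_drive(drive_gb, rate):
    """Scale a sample savings rate to a full drive of similar data."""
    compressed_gb = drive_gb * (1 - rate)
    return compressed_gb, drive_gb - compressed_gb


rate = savings_rate(10, 6.5)
compressed_gb, saved_gb = project_drive(500, rate)
print(f"Savings rate: {rate:.0%}")                         # Savings rate: 35%
print(f"Estimated compressed size: {compressed_gb:.0f} GB")  # 325 GB
print(f"Estimated space saved: {saved_gb:.0f} GB")           # 175 GB
```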
How to estimate compression time
You can also time the 10 GB test and scale it cautiously.
Example:
- Sample size: 10 GB
- Compression time: 4 minutes
- Observed rate: 2.5 GB per minute
The formula is:
Compression rate = Sample size / Time
Using the sample:
10 GB / 4 minutes = 2.5 GB per minute
For a 500 GB drive:
500 GB / 2.5 GB per minute = 200 minutes
That gives a rough estimate of about 3 hours and 20 minutes. Do not treat this as a promise. Full-drive compression may run slower because of small files, locked files, antivirus scanning, thermal throttling, background jobs, disk fragmentation, or different file types.
A practical estimate range might be 3 to 6 hours for that example, depending on the machine. If the drive has millions of small files, it may take much longer.
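Here is a minimal Python sketch of the time scaling. The slowdown factor is an assumption added for illustration: 1.0 reproduces the straight-line 200 minutes, and 1.8 is simply the value that lands on the 6-hour end of the example range. It is not a measured number. The function names are hypothetical.

```python
def estimate_minutes(sample_gb, sample_minutes, drive_gb, slowdown=1.0):
    """Scale the sample's observed rate to the full drive.

    slowdown is a hypothetical safety multiplier for small files,
    antivirus scanning, background jobs, and similar effects.
    """
    rate_gb_per_min = sample_gb / sample_minutes
    return drive_gb / rate_gb_per_min * slowdown


def format_duration(minutes):
    """Format a minute count as hours and minutes."""
    hours, mins = divmod(round(minutes), 60)
    return f"{hours} h {mins:02d} min"


straight_line = estimate_minutes(10, 4, 500)
cautious = estimate_minutes(10, 4, 500, slowdown=1.8)
print(format_duration(straight_line))  # 3 h 20 min
print(format_duration(cautious))       # 6 h 00 min
```

Pick the slowdown factor from your own repeated tests rather than reusing the one shown here.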
Monitor CPU, disk, and memory while testing
During the sample compression, open Task Manager and watch the system.
- Press Ctrl + Shift + Esc.
- Open the Performance tab.
- Watch CPU, Memory, and Disk.
- Open the Processes tab if you need to see which process is consuming resources.

Watch for these signals:
- CPU near 100%: compression may slow running apps, local agents, eval jobs, or development tools.
- Disk active time near 100%: the machine may become sluggish, especially on older drives.
- High memory pressure: other processes may start paging, which makes timing estimates less reliable.
- Antivirus activity: scanning can slow the test and distort estimates.
If the test disrupts normal work, pause and reschedule it. For production or shared machines, test on a staging box or clone first.
File types that usually compress well
These files often produce useful savings:
- Plain text logs
- JSON and JSONL traces
- CSV files
- XML files
- Source code
- Markdown files
- Some uncompressed database exports
- Some raw text datasets
For example, a folder of JSONL LLM traces may shrink by 40% to 80% if the records contain repeated keys, repeated metadata, and verbose responses.
File types that usually compress poorly
These files often save little space because they are already compressed or dense:
- ZIP, 7z, RAR, and gzip archives
- MP4, MOV, MP3, and other media files
- JPEG, PNG, and WebP images
- Parquet and ORC files with compression enabled
- Installer packages
- Model weights and some binary tensors
- Encrypted files
If most of your drive is already compressed data, NTFS compression may add overhead without giving you meaningful savings.
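Before compressing anything, you can get a rough sense of the mix by tallying bytes per file extension. The sketch below is a heuristic only: the two extension sets are illustrative picks based on the lists above, not a complete classification, and an extension says nothing certain about the contents (a Parquet file, for example, may or may not have compression enabled). The name bytes_by_category is hypothetical.

```python
import os
from collections import Counter

# Illustrative extension sets drawn from the lists above; extend as needed.
LIKELY_COMPRESSIBLE = {".log", ".txt", ".json", ".jsonl", ".csv", ".xml", ".md", ".py"}
ALREADY_COMPRESSED = {
    ".zip", ".7z", ".rar", ".gz", ".mp4", ".mov", ".mp3",
    ".jpg", ".jpeg", ".png", ".webp", ".parquet", ".orc",
}


def bytes_by_category(root):
    """Sum logical file sizes under root into three rough buckets."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # skip files that cannot be read
            ext = os.path.splitext(name)[1].lower()
            if ext in LIKELY_COMPRESSIBLE:
                totals["likely_compressible"] += size
            elif ext in ALREADY_COMPRESSED:
                totals["already_compressed"] += size
            else:
                totals["unknown"] += size
    return totals
```

A folder dominated by the already_compressed bucket is a weak candidate, but only an actual compression test on a sample confirms that.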
Build a better estimate with multiple samples
One sample is better than guessing. Three to five samples are better if the drive contains mixed data.
For an AI development workstation, you might test:
- Logs and traces: D:\ai-data\traces-sample
- Prompt datasets: D:\ai-data\datasets-sample
- Source repositories: D:\repos\sample
- Vector DB files: D:\vector-store\sample
- Model or cache files: D:\models\sample
Record each result in a table:
| Sample | Before | After | Savings | Time |
|---|---|---|---|---|
| JSONL traces | 10 GB | 4 GB | 60% | 5 min |
| Source repos | 10 GB | 7 GB | 30% | 4 min |
| Model cache | 10 GB | 9.7 GB | 3% | 3 min |
Then weight the estimate by how much of the drive each category uses. If 300 GB of a 500 GB drive is model cache, do not apply the JSONL trace savings rate to the whole drive.
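A minimal Python sketch of the weighting, using the savings rates from the table. The drive composition is hypothetical: the 300 GB of model cache comes from the example above, while the 120 GB of traces and 80 GB of repositories are made-up figures chosen to total 500 GB.

```python
def weighted_savings(categories):
    """Combine per-category savings into a drive-wide estimate.

    categories maps a label to (size_gb, savings_rate).
    Returns (saved_gb, overall_rate).
    """
    total_gb = sum(size for size, _rate in categories.values())
    if total_gb <= 0:
        raise ValueError("total size must be positive")
    saved_gb = sum(size * rate for size, rate in categories.values())
    return saved_gb, saved_gb / total_gb


# Hypothetical 500 GB drive; rates taken from the sample table.
drive = {
    "JSONL traces": (120, 0.60),
    "Source repos": (80, 0.30),
    "Model cache": (300, 0.03),
}
saved_gb, overall = weighted_savings(drive)
print(f"Estimated space saved: {saved_gb:.0f} GB ({overall:.0%} of the drive)")
# Estimated space saved: 105 GB (21% of the drive)
```

Applying the 60% trace rate to the whole drive would have predicted 300 GB saved, almost three times the weighted figure.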
Production cautions for AI teams
Be careful with machines that run active services. NTFS compression can be fine for some workloads, but you should validate it against your own read and write patterns.
Use extra caution with:
- Database directories
- Vector database storage
- High-throughput logging paths
- CI build directories
- Inference servers
- Agent workspaces that create and modify many files
- Folders watched by sync tools
If you decide to compress, use a staged rollout:
- Compress a representative folder.
- Run normal workloads against it.
- Measure latency, CPU, disk usage, and error rates.
- Expand to a larger non-critical folder.
- Only consider broader compression after you have evidence.
Quick estimation checklist
- Back up important files first.
- Choose a representative 10 GB to 50 GB sample.
- Record the folder’s original Size on disk.
- Run compact.exe /c /s:"path".
- Record the compressed Size on disk.
- Calculate the savings percentage.
- Time the test and estimate a cautious range for the full drive.
- Monitor Task Manager during the test.
- Do not assume the result applies to different file types.
- Test performance before using compression on important workloads.
Bottom line
The best way to estimate Windows drive compression is to compress a representative sample, measure Size on disk before and after, time the operation, and monitor the machine while it runs. A 10 GB test can give you a useful first estimate for a 500 GB drive, but only if the sample matches the drive’s real contents.
For AI teams, the safest wins usually come from text-heavy artifacts such as logs, traces, JSONL files, and prompt datasets. Be more cautious with databases, model files, caches, and production machines.
PromptLayer helps AI teams manage prompts, evaluations, datasets, and observability for LLM applications. If you are building production AI workflows and want better visibility into prompt behavior, eval results, and model runs, create a PromptLayer account at https://dashboard.promptlayer.com/create-account.