NTFS directory has 100K entries. How much performance boost if spread over 100 subdirectories?

Context: We have a homegrown filesystem-backed caching library. We currently have performance problems with one installation due to a large number of entries (up to 100,000). The problem: we store all cache entries in a single "cache directory", and very large directories perform poorly.

We're looking at spreading those entries over subdirectories, as git does: e.g. 100 subdirectories with ~1,000 entries each.
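
For concreteness, here is a minimal sketch of the bucketing we have in mind (the class and method names are placeholders, not our actual library): hash the key and take it modulo 100 to pick the subdirectory.

```java
import java.io.File;

public class CacheBuckets {

    private static final int BUCKETS = 100;

    // Map a cache key to one of 100 subdirectories ("00".."99") under the
    // cache root, so each bucket ends up with roughly 1,000 of the
    // 100,000 entries. Assumes the key itself is a safe file name.
    static File fileForKey(File cacheRoot, String key) {
        int bucket = (key.hashCode() & 0x7fffffff) % BUCKETS;
        File dir = new File(cacheRoot, String.format("%02d", bucket));
        dir.mkdirs(); // create the bucket directory on first use
        return new File(dir, key);
    }
}
```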

The question

I understand that smaller directory sizes will help with filesystem access.

But will "spreading into subdirectories" speed up traversing all entries, e.g. enumerating/reading all 100,000 entries? I.e. When we initialize/warm the cache from the FS store, we need to traversing all 100,000 entries (and deleting old entries) can take 10+ minutes.

Will "spreading the data" decrease this "traversal time". Additionally this "traversal" actually can/does delete stale entries (e.g older then N days) Will "spreading the data" improve delete times?

Additional context:

- NTFS
- Windows family OS (Server 2003, 2008)
- Java J2EE application

I/we would appreciate any schooling on filesystem scalability issues.

Thanks in advance.

will

p.s. I should comment that I have the tools and ability to test this myself, but figured I'd pick the hive mind for the theory and experience first.
