NTFS directory has 100K entries. How much performance boost if spread over 100 subdirectories?

Context: We have a homegrown filesystem-backed caching library. We currently have performance problems with one installation due to a large number of entries (up to 100,000). The problem: we store all cache entries in a single "cache directory", and very large directories perform poorly.

We're looking at spreading those entries over subdirectories, as git does: e.g. 100 subdirectories with ~1,000 entries each.
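
For concreteness, here is a minimal sketch of the bucketing we have in mind (the class and method names are placeholders, not our actual library): hash the key and take it modulo 100 to pick the subdirectory.

```java
import java.io.File;

public class CacheBuckets {

    private static final int BUCKETS = 100;

    // Map a cache key to one of 100 subdirectories ("00".."99") under the
    // cache root, so each bucket ends up with roughly 1,000 of the
    // 100,000 entries. Assumes the key itself is a safe file name.
    static File fileForKey(File cacheRoot, String key) {
        int bucket = (key.hashCode() & 0x7fffffff) % BUCKETS;
        File dir = new File(cacheRoot, String.format("%02d", bucket));
        dir.mkdirs(); // create the bucket directory on first use
        return new File(dir, key);
    }
}
```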

The question

I understand that smaller directory sizes will help with filesystem access.

But will "spreading into subdirectories" speed up traversing all entries, e.g. enumerating/reading all 100,000 entries? I.e. When we initialize/warm the cache from the FS store, we need to traversing all 100,000 entries (and deleting old entries) can take 10+ minutes.

Will "spreading the data" decrease this "traversal time". Additionally this "traversal" actually can/does delete stale entries (e.g older then N days) Will "spreading the data" improve delete times?

Additional context:

- NTFS
- Windows family OS (Server 2003, 2008)
- Java J2EE application

I/we would appreciate any schooling on filesystem scalability issues.

Thanks in advance.

will

p.s. I should comment that I have the tools and ability to test this myself, but figured I'd pick the hive mind for the theory and experience first.
