Yes, I certainly understand that. Is there perhaps some SuperFetch (SF) event viewer that shows which parts of an application load came from the cache and which from disk? And maybe some log or "insight report" showing what input the algorithm is getting when deciding what to load?
Caveat: Most of my experience with 7 is from the RC. So far, in the ~1 week of using 7 RTM, it seems similar to the RC, but who knows, maybe SF hasn't detected any patterns yet.
Based on my one week of 7 RTM use so far: I've started VS 2010 Beta 2 after nearly every boot, though the point in the session at which I ran it has been very irregular (I usually only run IE after logging in, then maybe a few hours in start some other apps). I have all my HDDs external to the computer, so I can easily hear what is going on, and not once has 2010 started from memory. Even long after reaching the desktop, with the disk idle (~1 hour from boot as of writing this), the Cached figure in Task Manager still says ~500 MB.
I have pretty much nothing but 2010 installed yet, so it cannot be a question of something else pushing it out. SuperFetch may be working, but the Win7 definition of "working" seems to be the complete opposite of what it was in Vista. I understand that some of the target systems for Win7, with either low memory or SSDs, do not really need Vista-like SuperFetch, but my system needs it because I have lots of RAM and plain old HDDs.
As to the algorithm (in relation to how SF worked in Vista): without going into too much detail (j/k), I believe a suitable algorithm would prefer to cache stuff that is under the executing process's own folder (if not under "Program Files"), and prefer small files plus the first and last 100 KB of large files that are loaded within 5 min of app startup, sorted by when they were loaded. The programs I use like to read thousands of small files when they start up, and in Vista this made a 10x speed difference: the files were loaded straight after boot, so when I later actually wanted to use the app, it avoided the thousands of HDD seeks where the seek latency really piles up. The programs also scan headers/indexes of large audio/media files, which is why the initial (some threshold) KB of large files should be cached.
After all the small files and headers under the application folders are cached, if there is still memory available, move on to bigger files and rank them by the number of times they were loaded. And *obviously* don't load entire large files that are streamed, or skipped around in and streamed, the way one might skip through and stream a video or music from HDD. This rule should keep most videos out of the cache, even if large files were cached from all locations, unless the app happened to read through the entire video during that first 5 min of use.
I believe ETW should already provide most, if not all, of the data necessary to implement the rules above.
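To make the rules above a bit more concrete, here is a rough Python sketch of the ranking. This is only my wish-list written down as code, not how SuperFetch actually works. The `Access` record, the `plan_prefetch` function, the 1 MB cut-off between "small" and "large" and the example paths are all assumptions of mine; the trace itself would have to come from something like ETW.

```python
from dataclasses import dataclass
from pathlib import PureWindowsPath

STARTUP_WINDOW_S = 5 * 60        # "within 5 min of app startup"
EDGE_BYTES = 100 * 1024          # "first and last 100 KB of large files"
SMALL_FILE_LIMIT = 1024 * 1024   # assumed cut-off between small and large files


@dataclass
class Access:
    """One file an app touched, as a hypothetical trace might record it."""
    path: str
    size: int             # file size in bytes
    first_read_s: float   # seconds after app start of the first read
    load_count: int       # number of sessions in which the file was read
    streamed: bool        # read in a streaming / skip-around pattern


def plan_prefetch(accesses, app_dir, budget):
    """Return {path: "whole" | "edges"} that fits within `budget` bytes."""
    app_dir = PureWindowsPath(app_dir)
    reserved = {}   # path -> bytes of cache already set aside for it
    plan = {}
    used = 0

    # Pass 1: files under the app folder read during startup, earliest first.
    # Small files are cached whole, large ones only as head + tail.
    startup = [a for a in accesses
               if a.first_read_s <= STARTUP_WINDOW_S
               and app_dir in PureWindowsPath(a.path).parents]
    startup.sort(key=lambda a: a.first_read_s)
    for a in startup:
        small = a.size <= SMALL_FILE_LIMIT
        cost = a.size if small else 2 * EDGE_BYTES
        if used + cost <= budget:
            plan[a.path] = "whole" if small else "edges"
            reserved[a.path] = cost
            used += cost

    # Pass 2: with memory left over, whole large files from any location,
    # most frequently loaded first, skipping anything that was streamed.
    large = [a for a in accesses
             if a.size > SMALL_FILE_LIMIT and not a.streamed]
    large.sort(key=lambda a: a.load_count, reverse=True)
    for a in large:
        extra = a.size - reserved.get(a.path, 0)
        if used + extra <= budget:
            plan[a.path] = "whole"
            reserved[a.path] = a.size
            used += extra
    return plan
```

With a small budget, a big index file under the app folder only gets its head and tail cached. With plenty of RAM it gets promoted to a full load, while a streamed video never qualifies.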
My main criticisms of the Vista SuperFetch were
1) that it did not back off from reading the disk when there was HDD contention from other programs. I monitored this a lot while using Vista. And of course when there is high contention, the physical HDD cache is split among the contenders. So all that "low priority IO" means nothing if SuperFetch is hitting the disk at a high rate at the same time, trashing the disk cache. A better implementation would notice an increase in non-SF access and then slow down or pause the SF-related caching until the HDD is free again. To determine the appropriate thresholds (disk hit rate, bandwidth of non-SuperFetch access) at which to pause and resume caching, the drive's performance must be measured, particularly multi-threaded sequential and random read/write and combinations of these.
and
2) Like I hinted above, the Vista SF went overboard in caching stuff. Download some video and play it once in WMP, and the next day it's loaded into memory on boot! This is why I'd like SF to be aware of the filesystem location of what it is caching, preferring the process's folder and sub-folders and Program Files.
and (this one is about 7 with the default HDD sleep setting; in Vista I had the HDDs always on after I got annoyed by the freezes when they spun up frequently)
3) It doesn't seem to cache frequently accessed areas of the disk before the HDD goes to sleep. I just started 2010b2, then opened a console app project I had made earlier. Just opened the project, nothing else. And what do I hear? Two HDDs that were sleeping started up! There's *no* reason why those sleeping HDDs should have been woken by that. VS2010 may have only done some quick enumeration of free space on the sleeping volumes and maybe read some stuff from the root (MFT, NTFS log, etc.), but since the drives were sleeping and never disconnected, those should not have changed since the drives went to sleep, so they should still be in cache!
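For criticism 1), the back-off I have in mind could look something like the Python sketch below. It is purely illustrative: the `Throttle` class, the IOPS-only measurement and the 20% / 5% fractions are made-up assumptions, and real thresholds would have to come from actually benchmarking the drive. The two separate thresholds give some hysteresis, so prefetching doesn't flip on and off with every small change in load.

```python
from dataclasses import dataclass


@dataclass
class Throttle:
    """Pause prefetching while other programs are busy with the disk."""
    pause_iops: float    # pause when non-prefetch IOPS rise to this level
    resume_iops: float   # resume only once they fall back to this level
    paused: bool = False

    @classmethod
    def from_benchmark(cls, measured_random_iops,
                       pause_frac=0.20, resume_frac=0.05):
        # The fractions are placeholders, not tuned values.
        return cls(pause_iops=measured_random_iops * pause_frac,
                   resume_iops=measured_random_iops * resume_frac)

    def update(self, foreground_iops):
        """Feed one sample; return True if prefetching may run next."""
        if self.paused:
            if foreground_iops <= self.resume_iops:
                self.paused = False
        elif foreground_iops >= self.pause_iops:
            self.paused = True
        return not self.paused
```

For a drive measured at 100 random IOPS this pauses once other programs reach 20 IOPS and stays paused until they drop to 5 or below.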
Now, as it is, I can't complain too much about SF in Win7, as it feels like MS left it to me to decide what to load into the cache. I just need to put something in Task Scheduler at boot that does a buffered read of the VS2010 folders; that should solve the problem and leave me in full control of what actually gets cached. Too bad most users won't realize what they're missing unless they used Vista.
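The scheduled task could be as simple as the Python sketch below, which reads every file under the given folders with ordinary buffered reads and throws the data away. The idea is that the reads leave the files in the system file cache. The `warm_cache` name and the 1 MB chunk size are my own choices, and I haven't measured how long Win7 actually keeps the data cached afterwards.

```python
import os
import sys

CHUNK = 1024 * 1024  # read size per call; an arbitrary choice


def warm_cache(root):
    """Read every file under `root` and discard the data.

    Returns (files_read, bytes_read). Files that cannot be opened or
    read (locked, no permission) are skipped.
    """
    files = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    while True:
                        chunk = f.read(CHUNK)
                        if not chunk:
                            break
                        total += len(chunk)
                files += 1
            except OSError:
                continue
    return files, total


if __name__ == "__main__":
    # Folders to warm are passed on the command line.
    for folder in sys.argv[1:]:
        count, size = warm_cache(folder)
        print(f"{folder}: read {count} files, {size} bytes")
```

Run it with the folders to warm as arguments; which folders are worth reading is up to whoever sets up the task.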