Bass said:
Sven Groot said: *snip*
But that just plainly isn't true. The kernel can figure out only what was the most common information in the past, and then hope that this will still be true in the future.
If you design your data structures correctly, it SHOULD be true in the future.
But there are many access patterns where this simply isn't true.
And they don't belong in databases.
The kernel cannot know if an application is going to have an access pattern for which this strategy doesn't work. Only the application can know that. Hence, the application can do better.
I don't think that's actually true. For many data structures, you cannot reliably predict where an element will be located after a restructure. And if you can, you can ask the kernel to cache that part of the file for you anyway; see the readahead syscall. Which is exactly how readahead daemons like preload work! They don't go around implementing their own file caching, that would be stupid. They let the kernel do it for them, as any good DB implementation would.
And CPU caching isn't relevant to this discussion for several reasons: 1. The timing gap between memory and disk is greater than that between CPU cache and memory. 2. CPU cache is a few MB at best, file cache is typically multiple GB. 3. The kernel does not manage the CPU cache, the CPU does that.
1. It may be, but having too many direct memory accesses will affect system performance in a very bad way.
2. CPU cache is a few MB at best, so let's ignore it? That sounds like a plan, chief.
3. The fact that the CPU manages the CPU cache is exactly why it's important. You cannot program a CPU cache, because that is a right reserved only for the immortals who design hardware, and you are but a mortal software developer. The best you can do is structure your data in a way that is conducive to CPU caching. If you are going to do this anyway, why not also structure your data in a way that is conducive to kernel caching? Hmm? The kernel is like the CPU's prophet to software. Like most prophets, he is mortal, software like all other software, but the immortals have bestowed certain miracles upon him. You shouldn't ignore his powers, for he speaketh directly to the hardware and knows its demands. You do not.
Look, I agree that in 99.9% of the cases, you do want to leave this to the kernel. I just believe that very high performance databases are one of the cases where you don't.
Of course you want to delegate as much as possible, but experience has shown that you can in fact get better performance, in many cases much better, by doing it yourself. And it's not just caching. High-performance applications tend to do their own scheduling and memory management as well. Yes, it's an enormous amount of work, so the only reason people do this is because it makes a clear, measurable difference.
They let the kernel do it for them, as any good DB implementation would.
So you contend that SQL Server, Oracle, IBM DB2, and the other databases that dominate the TPC-C and TPC-H benchmarks are all in fact not any good, because they do their own caching?