Coffeehouse Thread

85 posts

It is time... Move the filesystem off of disks

  • Minh

    Bass said:
    Sven Groot said:
    *snip*

    The same arguments have been made in the past regarding cooperative vs preemptive multitasking. I assume you know who won in the end. The thing about any engineering project is that there are limited resources. If you choose to optimize one part of your DB (file I/O), you are probably missing out somewhere else (e.g. CPU cache). As I said before, the CPU cache is not programmable. So if you don't design your data structures with the CPU cache in mind, you are losing a ridiculously important optimization.

     

    Another thing you conveniently leave out is that kernel developers and database developers often work for the same company. E.g., Oracle and IBM are both Linux kernel developers (both, interestingly, have contributed file systems to the Linux kernel). There are contributions to the Linux kernel that were specifically designed around making databases faster. Sometimes this means tweaking the characteristics of the I/O scheduler and filesystem to improve their databases' performance, and not the other way around.

     

    Anyway, what is interesting about DB performance is that depending on which DB vendor you ask, their DB is the fastest. So you can argue about DB performance all day and about which approach is better, but MySQL/Oracle/DB2/SQLServer/SQLite are all the fastest DB in existence anyway. Smiley

    The same arguments have been made in the past regarding cooperative vs preemptive multitasking.

     

    Not really... If your claim is:

     

    OS file handling -> Generic (good)

    DBMS "file" handling -> Specific (unnecessary)

     

    then Preemptive vs. Cooperative doesn't fall on the same scale. Preemptive vs. Cooperative falls more on hardware vs. software. And I believe you know which is better.

     

    Look, DBMSes DO have their own internal "file" / page / disk memory management... That's a fact. Why? Because that code is more specific to their domain. A DBMS loves contiguous blocks of memory... whereas HDs have 1 to 2 dimensions (spiral platters) to their storage.
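
    A minimal sketch of the contiguous-page idea above, in C. The names and sizes are invented for illustration and are not taken from any real DBMS; the point is simply that pages numbered close together also sit next to each other in memory.

    /* Hypothetical DBMS-style page arena: one contiguous allocation,
     * fixed-size pages addressed by page number. */
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    #define PAGE_SIZE  8192      /* assumed page size */
    #define PAGE_COUNT 1024      /* assumed pool size: 8 MiB */

    typedef struct {
        uint8_t *arena;          /* one contiguous block of PAGE_COUNT pages */
    } page_pool;

    static int pool_init(page_pool *p) {
        p->arena = calloc(PAGE_COUNT, PAGE_SIZE);
        return p->arena ? 0 : -1;
    }

    /* Pages with adjacent numbers are adjacent in memory, which is
     * exactly the kind of locality the paragraph above describes. */
    static uint8_t *pool_page(page_pool *p, size_t page_no) {
        return (page_no < PAGE_COUNT) ? p->arena + page_no * PAGE_SIZE : NULL;
    }

    int main(void) {
        page_pool p;
        if (pool_init(&p) != 0) return 1;
        memset(pool_page(&p, 0), 0xAB, PAGE_SIZE);  /* touch page 0 */
        free(p.arena);
        return 0;
    }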

     

    PS. C9, fix the GD posting errors

  • Dexter

    Bass said:
    Sven Groot said:
    *snip*

    The same arguments have been made in the past regarding cooperative vs preemptive multitasking. I assume you know who won in the end. The thing about any engineering project is that there are limited resources. If you choose to optimize one part of your DB (file I/O), you are probably missing out somewhere else (e.g. CPU cache). As I said before, the CPU cache is not programmable. So if you don't design your data structures with the CPU cache in mind, you are losing a ridiculously important optimization.

     

    Another thing you conveniently leave out is that kernel developers and database developers often work for the same company. E.g., Oracle and IBM are both Linux kernel developers (both, interestingly, have contributed file systems to the Linux kernel). There are contributions to the Linux kernel that were specifically designed around making databases faster. Sometimes this means tweaking the characteristics of the I/O scheduler and filesystem to improve their databases' performance, and not the other way around.

     

    Anyway, what is interesting about DB performance is that depending on which DB vendor you ask, their DB is the fastest. So you can argue about DB performance all day and about which approach is better, but MySQL/Oracle/DB2/SQLServer/SQLite are all the fastest DB in existence anyway. Smiley

    Hrm, I think I lost count of how many times you switched the subject in this thread. Not to mention that you bring arguments for things that nobody argued about.

     

    It's perfectly fine for a database to leave the caching to the filesystem if you want a lightweight codebase or if you don't have the resources to do it. The problem starts when you claim that the kernel can do caching better. I mentioned in short why it can't and Sven provided more details. Instead of answering that, you went on a rant about CPU cache, schedulers, algorithms and whatnot.

  • Bass

    Dexter said:
    Bass said:
    *snip*

    Hrm, I think I lost count of how many times you switched the subject in this thread. Not to mention that you bring arguments for things that nobody argued about.

     

    It's perfectly fine for a database to leave the caching to the filesystem if you want a lightweight codebase or if you don't have the resources to do it. The problem starts when you claim that the kernel can do caching better. I mentioned in short why it can't and Sven provided more details. Instead of answering that, you went on a rant about CPU cache, schedulers, algorithms and whatnot.

    Hrm, I think I lost count of how many times you switched the subject in this thread. Not to mention that you bring arguments for things that nobody argued about.

     

    It's all part of the same argument. You'll just have to keep up. Smiley

     

    It's perfectly fine for a database to leave the caching to the filesystem if you want a lightweight codebase or if you don't have the resources to do it. The problem starts when you claim that the kernel can do caching better. I mentioned in short why it can't and Sven provided more details. Instead of answering that, you went on a rant about CPU cache, schedulers, algorithms and whatnot.

     

    The kernel can do caching better. You can claim that the software knows more about the characteristics of the data structures it uses, but the kernel knows more about the characteristics of the hardware it is running on.
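
    Even when a database does leave caching to the filesystem, it can still describe its access pattern to the kernel's page cache with hints. A minimal C sketch using posix_fadvise(); the file path is made up for illustration.

    /* Sketch: telling the kernel page cache about our access pattern
     * with posix_fadvise(). The file path is hypothetical. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("/tmp/example.db", O_RDONLY);   /* hypothetical file */
        if (fd < 0) { perror("open"); return 1; }

        /* We expect to read this file sequentially: let the kernel
         * read ahead aggressively and drop pages behind us. */
        posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

        /* ... read the file here ... */

        close(fd);
        return 0;
    }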

  • CreamFilling512

    Dexter said:
    Bass said:
    *snip*

    Hrm, I think I lost count of how many times you switched the subject in this thread. Not to mention that you bring arguments for things that nobody argued about.

     

    It's perfectly fine for a database to leave the caching to the filesystem if you want a lightweight codebase or if you don't have the resources to do it. The problem starts when you claim that the kernel can do caching better. I mentioned in short why it can't and Sven provided more details. Instead of answering that, you went on a rant about CPU cache, schedulers, algorithms and whatnot.

    I also said the same thing, and he started talking about CPU caches for some reason. Anyway, this is basic engineering sense; it doesn't even really have anything to do with databases: you can almost always get better performance if you "roll it yourself". Whether or not it is feasible to do this, or worth the effort, is what you need to decide as an engineer. But when major commercial databases like Microsoft SQL Server, which have massive budgets and huge teams of engineers, made the decision to roll their own caching schemes, maybe that's a good indication you're wrong about this?

  • Bass

    Minh said:
    Bass said:
    *snip*

    The same arguments have been made in the past regarding cooperative vs preemptive multitasking.

     

    Not really... If your claim is:

     

    OS file handling -> Generic (good)

    DBMS "file" handling -> Specific (unnecessary)

     

    then Preemptive vs. Cooperative doesn't fall on the same scale. Preemptive vs. Cooperative falls more on hardware vs. software. And I believe you know which is better.

     

    Look, DBMSes DO have their own internal "file" / page / disk memory management... That's a fact. Why? Because that code is more specific to their domain. A DBMS loves contiguous blocks of memory... whereas HDs have 1 to 2 dimensions (spiral platters) to their storage.

     

    PS. C9, fix the GD posting errors

    I don't think preemptive vs. cooperative is a different analogy at all. Cooperative multitasking allows a process to decide how much CPU time it needs. In preemptive multitasking, this is decided for the process (although in some systems like POSIX/Windows, a process can suggest to the kernel how it should be handled).
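
    A small C sketch of that parenthetical: on POSIX systems a process can only suggest its priority (here via nice()); under preemptive multitasking the kernel still decides when the process actually runs.

    /* Sketch: a process suggests its scheduling priority to the kernel
     * with nice(), but the kernel can still preempt it at any time. */
    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        errno = 0;
        int prio = nice(10);   /* ask to be scheduled less aggressively */
        if (prio == -1 && errno != 0) {
            perror("nice");
            return 1;
        }
        printf("new nice value: %d\n", prio);
        return 0;
    }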

  • CreamFilling512

    Bass said:
    Minh said:
    *snip*

    I don't think preemptive vs. cooperative is a different analogy at all. Cooperative multitasking allows a process to decide how much CPU time it needs. In preemptive multitasking, this is decided for the process (although in some systems like POSIX/Windows, a process can suggest to the kernel how it should be handled).

    Cooperative multitasking would probably make more sense on a database server. Normally the entire machine is dedicated to running the database server, so there is really only one process at a time. You would lose the overhead of preemptive multitasking and of getting context-switched at a bad time.

  • Bass

    CreamFilling512 said:
    Dexter said:
    *snip*

    I also said the same thing, and he started talking about CPU caches for some reason. Anyway, this is basic engineering sense; it doesn't even really have anything to do with databases: you can almost always get better performance if you "roll it yourself". Whether or not it is feasible to do this, or worth the effort, is what you need to decide as an engineer. But when major commercial databases like Microsoft SQL Server, which have massive budgets and huge teams of engineers, made the decision to roll their own caching schemes, maybe that's a good indication you're wrong about this?

    The CPU cache thing was in regard to Dexter's assertion that a DB shouldn't have to localize its important data. Quite frankly, to run efficiently on an x86 processor, it has no other choice. I don't know how many of you ever did assembly programming. Although you can create a memory caching algorithm, you can't create your own processor caching algorithm on x86. That is hard-coded in the CPU by Intel/AMD. Performance degrades considerably on x86 if there are many cache misses, which is the cache equivalent of a page fault. So your data structures must take this into account in order to run efficiently on the architecture.
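
    A toy C illustration of that locality point: summing the same matrix in row-major versus column-major order does identical arithmetic, but the column-major traversal misses the CPU cache far more often, and the program has no direct control over that cache.

    /* Toy illustration: identical work, very different cache behaviour.
     * Row-major traversal walks memory sequentially; column-major
     * traversal jumps N*sizeof(int) bytes per step and misses more. */
    #include <stdio.h>
    #include <stdlib.h>

    #define N 4096

    long sum_row_major(int *m)   /* cache-friendly */
    {
        long s = 0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += m[i * N + j];
        return s;
    }

    long sum_col_major(int *m)   /* cache-hostile */
    {
        long s = 0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += m[i * N + j];
        return s;
    }

    int main(void) {
        int *m = calloc((size_t)N * N, sizeof *m);
        if (!m) return 1;
        printf("%ld %ld\n", sum_row_major(m), sum_col_major(m));
        free(m);
        return 0;
    }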

  • Bass

    CreamFilling512 said:
    Bass said:
    *snip*

    Cooperative multitasking would probably make more sense on a database server. Normally the entire machine is dedicated to running the database server, so there is really only one process at a time. You would lose the overhead of preemptive multitasking and of getting context-switched at a bad time.

    Well, with that argument, a database server should BE a kernel, right? Tongue Out Since it's going to do everything a kernel does! That is clearly the way to get optimal performance.

  • Dexter

    Bass said:
    CreamFilling512 said:
    *snip*

    The CPU cache thing was in regard to Dexter's assertion that a DB shouldn't have to localize its important data. Quite frankly, to run efficiently on an x86 processor, it has no other choice. I don't know how many of you ever did assembly programming. Although you can create a memory caching algorithm, you can't create your own processor caching algorithm on x86. That is hard-coded in the CPU by Intel/AMD. Performance degrades considerably on x86 if there are many cache misses, which is the cache equivalent of a page fault. So your data structures must take this into account in order to run efficiently on the architecture.

    The CPU cache thing was in regard to Dexter's assertion that a DB shouldn't have to localize its important data.

     

    Are you insane? I've said that I don't think the database should move data around to keep the kernel happy. That's not the same thing as "localizing" important data so it fits the CPU cache (or the hard drive).

     

    The kernel can do caching better. You can claim that the software knows more about the characteristics of the data structures it uses, but the kernel knows more about the characteristics of the hardware it is running on.

     

    Except the characteristics of the hardware are likely to be simpler than the characteristics of the data structures and access patterns. Which set of characteristics do you think will be easier to communicate to the other party?

  • Bass

    Dexter said:
    Bass said:
    *snip*

     

    Except the characteristics of the hardware are likely to be simpler than the characteristics of the data structures and access patterns. Which set of characteristics do you think will be easier to communicate to the other party?

    Are you insane? I've said that I don't think the database should move data around to keep the kernel happy. That's not the same thing as "localizing" important data so it fits the CPU cache (or the hard drive).

     

    And why not? The kind of caching algorithm the Linux kernel uses is not all that different from what Intel microcode uses. You kill two birds with one stone. PS: Ad hominem is evidence of a losing argument.

     

    Except the characteristics of the hardware are likely to be simpler than the characteristics of the data structures and access patterns. Which set of characteristics do you think will be easier to communicate to the other party?

     

    Well, a kernel can safely assume that important/commonly used data in memory will be near each other, because that's how optimizing compilers and high-performance databases tend to structure their data. True story.

     

  • Dexter

    Bass said:
    Dexter said:
    *snip*

    Are you insane? I've said that I don't think the database should move data around to keep the kernel happy. That's not the same thing as "localizing" important data so it fits the CPU cache (or the hard drive).

     

    And why not? The kind of caching algorithm the Linux kernel uses is not all that different from what Intel microcode uses. You kill two birds with one stone. PS: Ad hominem is evidence of a losing argument.

     

    Except the characteristics of the hardware are likely to be simpler than the characteristics of the data structures and access patterns. Which set of characteristics do you think will be easier to communicate to the other party?

     

    Well, a kernel can safely assume that important/commonly used data in memory will be near each other, because that's how optimizing compilers and high-performance databases tend to structure their data. True story.

     

    Ad hominem is evidence of a losing argument.

     

    Ha ha, look who's talking. You're twisting my words yet you claim "losing argument".

     

    Well, a kernel can safely assume that important/commonly used data in memory will be near each other, because that's how optimizing compilers and high-performance databases tend to structure their data. True story.

     

    And how exactly does this relate to what I said? Or more generally, how exactly does this relate to caching?

     

  • Bass

    Dexter said:
    Bass said:
    *snip*

     

    And how exactly does this relate to what I said? Or more generally, how exactly does this relate to caching?

     

    Ha ha, look who's talking. You're twisting my words yet you claim "losing argument".

     

    Well I see you want to be on an equal footing with me on this. So I'll call you an insane person also. Insane person.

     

    And how exactly does this relate to what I said? Or more generally, how exactly does this relate to caching?

     

    This is the original point I am addressing:

    Seriously, do you really want/expect a database system to move gigabytes or terabytes of data around just to keep the kernel happy?

     

    Okay I'm going to say this again. Please no more "NUH UHs" on this. I'm quite obviously right.

     

    You cannot write any explicit caching code to efficiently cache things on an x86 processor. This algorithm is hardcoded into the control unit of the actual processor. The only way to get important data into the cache (which is, of course, what you want) is to "suggest" it to the processor through your data structures. This means working with the quirks of the branch predictor, and also localizing commonly accessed data.

     

    Of course this is to keep the CPU happy, and you are probably going to be like "well, I didn't say CPU, so I am somehow correct, and you are wrong". Well, buddy, you have to keep the CPU happy. An unhappy CPU is a cache-missing CPU, and you MUST avoid this to have any reasonable performance.

     

    And interestingly, by keeping the CPU happy, you also tend to make the kernel happy. Because the kernel isn't using some magical caching algorithm that Intel doesn't know about.

     

    Kapeesh?
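
    One narrow addition to the paragraph above about "suggesting" data into the cache: compilers such as GCC and Clang do expose an explicit prefetch hint (__builtin_prefetch), but it is still only a suggestion; the cache replacement policy itself stays fixed in the CPU. A sketch:

    /* Sketch (GCC/Clang): __builtin_prefetch is only a hint; the
     * hardware cache replacement policy is still fixed by the CPU. */
    #include <stdio.h>
    #include <stdlib.h>

    #define N (1 << 20)
    #define PREFETCH_DISTANCE 64

    int main(void) {
        int *data = malloc(N * sizeof *data);
        if (!data) return 1;
        for (int i = 0; i < N; i++) data[i] = i;

        long sum = 0;
        for (int i = 0; i < N; i++) {
            if (i + PREFETCH_DISTANCE < N)
                __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 1);
            sum += data[i];   /* likely already in cache by the time we get here */
        }
        printf("%ld\n", sum);
        free(data);
        return 0;
    }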

  • Dexter

    Bass said:
    Dexter said:
    *snip*

    Ha ha, look who's talking. You're twisting my words yet you claim "losing argument".

     

    Well I see you want to be on an equal footing with me on this. So I'll call you an insane person also. Insane person.

     

    And how exactly does this relate to what I said? Or more generally, how exactly does this relate to caching?

     

    This is the original point I am addressing:

    Seriously, do you really want/expect a database system to move gigabytes or terabytes of data around just to keep the kernel happy?

     

    Okay I'm going to say this again. Please no more "NUH UHs" on this. I'm quite obviously right.

     

    You cannot write any explicit caching code to efficiently cache things on an x86 processor. This algorithm is hardcoded into the control unit of the actual processor. The only way to get important data into the cache (which is, of course, what you want) is to "suggest" it to the processor through your data structures. This means working with the quirks of the branch predictor, and also localizing commonly accessed data.

     

    Of course this is to keep the CPU happy, and you are probably going to be like "well, I didn't say CPU, so I am somehow correct, and you are wrong". Well, buddy, you have to keep the CPU happy. An unhappy CPU is a cache-missing CPU, and you MUST avoid this to have any reasonable performance.

     

    And interestingly, by keeping the CPU happy, you also tend to make the kernel happy. Because the kernel isn't using some magical caching algorithm that Intel doesn't know about.

     

    Kapeesh?

    I'm quite obviously right

     

    Yeah, you're right... about a completely different and unrelated problem. What problem? Ah, it was branch prediction. Ah no, sorry, that was about CPU cache. Ah no, wrong again, it was about optimizing compilers. Oops, I've missed again. No, I bet it was about 42.

  • Bass

    Dexter said:
    Bass said:
    *snip*

     

    Yeah, you're right... about a completely different and unrelated problem. What problem? Ah, it was branch prediction. Ah no, sorry, that was about CPU cache. Ah no, wrong again, it was about optimizing compilers. Oops, I've missed again. No, I bet it was about 42.

    A system memory cache is just a much slower and much bigger CPU cache.

     

    In fact this is how the CPU sees the world:

     

    Page fault: Geographical time

    Cache miss: Snail time

    In cache: Ferrari Enzo

     

    You want your CPU to be riding that Ferrari as much as possible. Smiley That means NO DIRECT MEMORY ACCESS IF POSSIBLE. A big optimization (at least as big as autovectorization) is to try to keep your working data in the cache, and never make the CPU explicitly access a piece of memory. Does that sound familiar? Isn't that exactly what you want to do with memory caching in general?

     

    Oh wait, I can even make a fill in the blank:

    The point of X caching is to keep as much working data in a fast memory store (in this case: Y) as possible.

     

    I'm going to guess the reply: "irrelevant". LOLWUT?
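
    To make the fill-in-the-blank concrete, here is a minimal fixed-size LRU cache in C, invented for illustration. The same keep-the-hot-data-in-the-fast-store idea applies whether the fast store is a CPU cache line, RAM in front of a disk, or the kernel's page cache.

    /* Minimal LRU cache sketch: keep recently used keys in a small fast
     * store and evict the least recently used entry when it fills up. */
    #include <stdio.h>

    #define CAP 4

    typedef struct { int key; long stamp; int used; } slot;

    static slot cache[CAP];
    static long clock_now = 0;

    /* Returns 1 on hit, 0 on miss (inserting the key, evicting the LRU slot). */
    static int cache_access(int key) {
        int lru = 0;
        for (int i = 0; i < CAP; i++) {
            if (cache[i].used && cache[i].key == key) {
                cache[i].stamp = ++clock_now;       /* refresh recency */
                return 1;
            }
            if (!cache[i].used || cache[i].stamp < cache[lru].stamp)
                lru = i;                            /* track eviction candidate */
        }
        cache[lru] = (slot){ key, ++clock_now, 1 }; /* evict and insert */
        return 0;
    }

    int main(void) {
        int refs[] = { 1, 2, 3, 1, 4, 5, 1, 2 };
        for (int i = 0; i < 8; i++)
            printf("key %d: %s\n", refs[i], cache_access(refs[i]) ? "hit" : "miss");
        return 0;
    }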

  • AndyC

    Bass said:
    Dexter said:
    *snip*

    A system memory cache is just a much slower and much bigger CPU cache.

     

    In fact this is how the CPU sees the world:

     

    Page fault: Geographical time

    Cache miss: Snail time

    In cache: Ferrari Enzo

     

    You want your CPU to be riding that Ferrari as much as possible. Smiley That means NO DIRECT MEMORY ACCESS IF POSSIBLE. A big optimization (at least as big as autovectorization) is to try to keep your working data in the cache, and never make the CPU explicitly access a piece of memory. Does that sound familiar? Isn't that exactly what you want to do with memory caching in general?

     

    Oh wait, I can even make a fill in the blank:

    The point of X caching is to keep as much working data in a fast memory store (in this case: Y) as possible.

     

    I'm going to guess the reply: "irrelevant". LOLWUT?

    You have to write your database caching algorithms with the kernel and CPU/hardware behaviour in mind. You certainly don't just "leave it to the kernel" unless you're expecting performance not to be important. That's true of any bit of coding.
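
    For example, a database that rolls its own caching on Linux typically opens its files with O_DIRECT to bypass the kernel's page cache, and in exchange it must honour the hardware's alignment rules itself. A minimal sketch; the file path and block size are assumptions.

    /* Sketch (Linux-specific): O_DIRECT bypasses the kernel page cache,
     * so the database does its own caching and must supply buffers,
     * offsets and lengths aligned to the device block size. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define BLOCK 4096   /* assumed device/filesystem block size */

    int main(void) {
        int fd = open("/tmp/example.db", O_RDONLY | O_DIRECT);  /* hypothetical file */
        if (fd < 0) { perror("open"); return 1; }

        void *buf;
        if (posix_memalign(&buf, BLOCK, BLOCK) != 0) return 1;   /* aligned buffer */

        ssize_t n = read(fd, buf, BLOCK);
        if (n < 0) perror("read");
        else printf("read %zd bytes\n", n);

        free(buf);
        close(fd);
        return 0;
    }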

  • Bass

    AndyC said:
    Bass said:
    *snip*

    You have to write your database caching algorithms with the kernel and CPU/hardware behaviour in mind. You certainly don't just "leave it to the kernel" unless you're expecting performance not to be important. That's true of any bit of coding.

    You want to leave "as much as possible" to the kernel though. Notice that's exactly what I said. Heh heh heh.

     

    You guys are a riot.

  • Dexter

    Bass said:
    Dexter said:
    *snip*

    A system memory cache is just a much slower and much bigger CPU cache.

     

    In fact this is how the CPU sees the world:

     

    Page fault: Geographical time

    Cache miss: Snail time

    In cache: Ferrari Enzo

     

    You want your CPU to be riding that Ferrari as much as possible. Smiley That means NO DIRECT MEMORY ACCESS IF POSSIBLE. A big optimization (at least as big as autovectorization) is to try to keep your working data in the cache, and never make the CPU explicitly access a piece of memory. Does that sound familiar? Isn't that exactly what you want to do with memory caching in general?

     

    Oh wait, I can even make a fill in the blank:

    The point of X caching is to keep as much working data in a fast memory store (in this case: Y) as possible.

     

    I'm going to guess the reply: "irrelevant". LOLWUT?

    Feel free to continue rambling. I think I'll go drink my Pan Galactic Gargle Blaster now, thank you.

  • Bass

    Dexter said:
    Bass said:
    *snip*

    Feel free to continue rambling. I think I'll go drink my Pan Galactic Gargle Blaster now, thank you.

    You just don't like admitting you are wrong, huh? I noticed that in Tech Off too. You have to try to "correct" everyone. You can't correct what is already correct though, buddy. Smiley
