Wednesday, February 2, 2011

Cache-oblivious Algorithms

Hi!

This time it's about Parallel Computations, and the first real article in the section is titled Cache-oblivious Algorithms. It introduces the problem of memory bandwidth, the concept of cache hierarchy in modern computers; then describes the underlying idea of cache-oblivious algorithms, and proceeds with an example of how to design a cache-oblivious algorithm for a simple problem.Presented benchmark results show unconditional superiority of the approach.

5 comments:

  1. Thanks, but how that example show "how to design a cache-oblivious algorithm for a simple problem"

    I mean, given a machine that I know how many cores, cache and memory, how that affecting the design of the algorithm? Is that mean I need to keep all the data of a function fit in the L2 cache??

    ReplyDelete
  2. > I mean, given a machine that I know how many cores, cache and memory, how that affecting the design of the algorithm?

    The idea of cache-oblivious algorithms is that particular core/cache characteristics do not affect algorithm design. It has both positive and negative implications. The positive side is that one does not need to re-tune an algorithm for each machine (since one needs to accommodate for L1, L2, L3 and memory, it becomes quite tricky, and one machine may have L3 while other is not). The negative side is that performance suffers somewhat, however cache complexity (number of "memory" transfers) is usually optimal (within a constant factor of theoretical minimum).

    > Is that mean I need to keep all the data of a function fit in the L2 cache?

    I am not sure I get your question. But, well, if you can fit data into lower level cache, it is definitely a good thing for performance. Developers try to at least fit dataset into main memory, because disk accesses make things much more problematic. But as I shown fitting data into caches makes significant difference as well.

    ReplyDelete
  3. Hi Dmitry,
    According to this presentation on modern processors we already have 95% cache hit. Does this mean that we are optimizing our code to get the last 5%?

    Thank you.

    ReplyDelete
  4. Hi,

    According to what presentation?
    Anyway, in this particular algorithm (and some others) we have -> 0 cache hit rate if we do not do the optimization.

    ReplyDelete
  5. This is great! Thanks so much for both the blog posts and then the videos. I've had ideas like this floating around my head, but haven't had a good mental framework to fit them in. It's great that much of my ideas fit into this nice theory (and more importantly that others have done the legwork to actually study what's going on :p ). Great site!

    ReplyDelete