Thursday, December 13, 2007

Case Study: Linux Memory Management

In order to understand the need for advanced page replacement algorithms, it is necessary to understand something about the memory hierarchy, access patterns and mixed workloads.


Memory Hierarchy
There are two types of memory hierarchy. The first will be familiar to most people: the hierarchy from the CPU caches down to random access memory (RAM). Caches near the top (level 1, or L1) are faster but smaller, while caches near the bottom (levels 2 and 3, L2/L3) are larger but slower. This provides a cost effective way to access a small set of data very quickly over and over again, while simultaneously being able to cache lots of data. The assumption is that the amount of data a program accesses on very short time scales will fit in the L1 cache, and that this short term working set shifts only slowly to other data.
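
As a rough back-of-the-envelope illustration of why this layering is cost effective, the short Python sketch below computes an average memory access time from hypothetical latencies and hit rates; all the numbers are assumptions made up for the example, not measurements of any particular CPU.

```python
# Illustrative average memory access time (AMAT) for a three-level hierarchy.
# All latencies and hit rates below are assumed values for the example,
# not measurements of any real processor.

L1_NS, L2_NS, L3_NS, RAM_NS = 1, 4, 12, 80       # access latency per level
L1_HIT, L2_HIT, L3_HIT = 0.95, 0.80, 0.60        # hit rate at each level

def amat():
    """Average time per access, walking down the hierarchy on misses."""
    l3_penalty = L3_NS + (1 - L3_HIT) * RAM_NS
    l2_penalty = L2_NS + (1 - L2_HIT) * l3_penalty
    return L1_NS + (1 - L1_HIT) * l2_penalty

print(f"average access time: {amat():.2f} ns")   # stays close to L1 latency
```

Even with modest hit rates at the lower levels, the average access time stays close to the L1 latency, which is the whole point of the hierarchy.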

This cache hierarchy is inclusive: every piece of data that is in the L1 cache will also be in the L2 cache and in memory; a line of L2 cache cannot be replaced while the corresponding data is still present in the L1 cache. When the L1 working set shifts back to recently used data, chances are that data will still be in the L2 cache and can be quickly loaded back into L1.


The second type of cache hierarchy is exclusive: a piece of data can be evicted from the second level cache, while it is still present in the first level cache. This is common when dealing with storage servers, like NFS or Samba servers and other NAS/SAN equipment. Typically the top level in this cache hierarchy will be the RAM on local workstations or computational nodes; the bottom level is the cache on a storage server.

Since accesses (hits) on data in the first level (local) cache are not visible to the storage server, the only accesses the storage server sees are the misses of the first level cache. As a result, there is commonly a very large distance between successive references to the same data, so normal LRU will not be very effective and the storage server needs a smarter replacement algorithm.
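
A toy simulation can make this concrete. The sketch below is an assumption-laden illustration, not a model of any particular NFS or Samba server: it runs plain LRU at both levels and feeds only the first-level misses to the second level. The cache sizes, the 90%-hot access pattern, and the block counts are all made up for the example.

```python
from collections import OrderedDict
import random

class LRU:
    """A minimal LRU cache used only for this simulation."""
    def __init__(self, size):
        self.size, self.data = size, OrderedDict()
    def access(self, key):
        """Return True on a hit; insert/refresh the key either way."""
        hit = key in self.data
        if hit:
            self.data.move_to_end(key)
        else:
            if len(self.data) >= self.size:
                self.data.popitem(last=False)   # evict least recently used
            self.data[key] = True
        return hit

random.seed(0)
first, second = LRU(64), LRU(256)
hits1 = hits2 = misses1 = 0
for _ in range(200_000):
    # 90% of accesses go to a hot set of 64 blocks, the rest to 4096 blocks.
    key = random.randrange(64) if random.random() < 0.9 else random.randrange(4096)
    if first.access(key):
        hits1 += 1
    else:
        misses1 += 1
        if second.access(key):   # the second level only sees the misses
            hits2 += 1

print(f"first level hit rate : {hits1 / 200_000:.2%}")
print(f"second level hit rate: {hits2 / misses1:.2%}")
```

The second level ends up with a hit rate far below the first level's, even though it is four times larger, because the locality has already been absorbed upstream.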


More information on the problems LRU has with a second level cache can be found in this paper from the 2001 USENIX Annual Technical Conference by Zhou, Philbin and Li: The Multi-Queue Replacement Algorithm for Second Level Buffer Caches. Why this algorithm probably isn't very suitable as a first level cache replacement algorithm is left as an exercise to the reader.
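
For readers who want a feel for how MQ works, here is a compact sketch in Python of the mechanism the paper describes: several LRU queues indexed by the log2 of a block's reference count, lifetime-based demotion, and a ghost queue (Qout) that remembers the reference counts of recently evicted blocks. The class name, parameter defaults and the small demo at the end are illustrative choices, not the tuning the paper recommends.

```python
from collections import OrderedDict

class MQCache:
    """Sketch of the Multi-Queue (MQ) replacement algorithm
    (Zhou, Philbin and Li, USENIX ATC 2001).  Parameter defaults
    are illustrative assumptions, not the paper's tuning."""

    def __init__(self, capacity, num_queues=8, life_time=100):
        self.capacity = capacity
        self.queues = [OrderedDict() for _ in range(num_queues)]  # LRU per queue
        self.freq = {}      # resident block -> reference count
        self.level = {}     # resident block -> queue it currently sits in
        self.expire = {}    # resident block -> expiry time for demotion
        self.out = OrderedDict()        # ghost queue: evicted block -> count
        self.out_size = 4 * capacity    # assumed history size
        self.life_time = life_time
        self.time = 0

    def _queue_num(self, f):
        # Queue i holds blocks referenced roughly 2**i times.
        return min(f.bit_length() - 1, len(self.queues) - 1)

    def _adjust(self):
        # Demote the LRU block of any queue whose head has expired.
        for k in range(1, len(self.queues)):
            if self.queues[k]:
                head = next(iter(self.queues[k]))
                if self.expire[head] <= self.time:
                    del self.queues[k][head]
                    self.queues[k - 1][head] = True
                    self.level[head] = k - 1
                    self.expire[head] = self.time + self.life_time

    def _evict(self):
        # Victim is the LRU block of the lowest non-empty queue; its
        # reference count is remembered in the ghost queue.
        for q in self.queues:
            if q:
                victim, _ = q.popitem(last=False)
                self.out[victim] = self.freq.pop(victim)
                self.level.pop(victim)
                self.expire.pop(victim)
                if len(self.out) > self.out_size:
                    self.out.popitem(last=False)
                return

    def access(self, block):
        """Reference a block; returns True on a hit."""
        self.time += 1
        hit = block in self.freq
        if hit:
            del self.queues[self.level[block]][block]
            self.freq[block] += 1
        else:
            if len(self.freq) >= self.capacity:
                self._evict()
            # A block remembered in the ghost queue keeps its old count.
            self.freq[block] = self.out.pop(block, 0) + 1
        lvl = self._queue_num(self.freq[block])
        self.queues[lvl][block] = True      # insert at MRU end of its queue
        self.level[block] = lvl
        self.expire[block] = self.time + self.life_time
        self._adjust()
        return hit

if __name__ == "__main__":
    cache = MQCache(capacity=4)
    trace = [1, 2, 3, 1, 2, 4, 5, 1, 2, 6]
    print("hits:", sum(cache.access(b) for b in trace))
```

Because a block's queue depends on its long-term reference count rather than only on recency, a block that is re-referenced after a long gap, which is typical in a second level cache, can still outrank blocks that were touched once and never again.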

Thursday, November 22, 2007

2.) Two reasons why a regional bank might decide to buy six server computers instead of one supercomputer:

  • It keeps backups of the bank's important files available, so data survives when one of the other computers fails or is damaged.
  • It keeps at least three or four computers running while two or three of them are down for maintenance.

Operating System News

New Windows could solve age-old format puzzle--at a price

To achieve the long-elusive goal of easily finding information hidden in computer files, Microsoft is returning to a decade-old idea.
The company is building new file organization software that will begin to form the underpinnings of the next major version of its Windows operating system. The complex data software is meant to address a conundrum as old as the computer industry itself: how to quickly find and work with a piece of information, no matter what its format, from any location.

For those using Windows, this will mean easier, faster and more reliable searches for information. Replacing its antiquated file system with modern database technology should also mean a more reliable Windows that's less likely to break and easier to fix when it does, said analysts and software developers familiar with the company's plans.
In the process, the plan could boost Microsoft's high-profile .Net Web services plan and pave the way to enter new markets for document management and portal software, while simultaneously dealing a blow to competitors.
But success won't come overnight. Building a new data store is a massive undertaking, one that will touch virtually every piece of software Microsoft sells. The company plans to include the first pieces of the new data store in the next release of Windows, code-named Longhorn, which is scheduled to debut in test form next year.
"We're going to have to redo the Windows shell; we're going to have to redo Office, and Outlook particularly, to take advantage" of the new data store, Microsoft CEO Steve Ballmer said in a recent interview with CNET News.com. "We're working hard on it. It's tough stuff."
Tough indeed. The development of the new file system technology is so difficult that Microsoft may have to market two distinctly different product lines while it completes the work--a move Ballmer concedes would be a huge step backward in the company's long-sought plan to unify its operating systems with Windows XP and Windows .Net Server, which has been delayed until year's end.
For years, Microsoft has sold two operating systems: a consumer version based on the 20-year-old technology DOS, and a corporate version based on the company's newer, built-from-scratch Windows NT kernel. The dual-OS track has frustrated software developers, who needed to support two different operating systems, and has confused customers, who often didn't understand the difference between them.
"Will we have two parallel tracks in the market at once? Not desirable. There are a lot of reasons why that was really a pain in the neck for everybody, and I hope we can avoid that here," Ballmer said. "But it's conceivable that we will wind up with something that will be put on a dual track."

Source: http://www.news.com/