Hints for Computer System Design

Last modified: 2025-03-05 Category: System Design

Hints for Computer System Design

Key Insights

Caching

Store $[f, x, f(x)]$ tuples in a cache.

If $f$ isn't a pure function, invalidate with the following:

$$ f(x + \Delta) = g(x, \Delta, f(x)) $$

For example, $x$ is an int[], $\Delta$ is a write $(i, v)$, and $f$ is a function int sum(int[] x). Then $g(x, \Delta, f(x))$ is f(x) + v - x[i].

Caches should ideally have adaptive sizes.

A classic example is the caching in hardware that uses $[Fetch, \text{address}, \text{content of address}]$ tuples. Similarly, virtual memory uses $[Page, \text{address}, \text{content of address}]$ tuples.

However, more complicated applications of caching exist. In real-time systems, you're often trying to cache the state of a system given small changes corresponding to events. The key here is to try and invalidate as few entries as possible in response to events.

Lecture Review Notes

Why is system design hard?

external interface isn't well defined
- requirements aren't clear
- Things are often not well-designed
The measure of success is not very clear
- Many different ways to interact with a system
- Many systems, even in production, have bugs

Throw one away

Always be prepared to discard your prototype
Throw ideas at the wall and go with what sticks

Interface Design

Conflicting requirements: - Simple - Complete - Efficient

In a way it's a lot like PL design; exposing new abstractions, objects and operations, manipulating them, etc.

KISS; Do one thing at a time and do it well.

Don't over-promise
Get it right, but beware the dangers of abstractions (especially performance)
Make it fast rather than general and complete. You should keep scope small so that it's easy to optimize, and also to compose with other systems/components
Procedure args let you keep it general but extendable
- C function pointers, C++ functions
- LD_PRELOAD trick: override calls by providing a wrapper that calls the original function, but with some extra functionality
Leave it to the client (check Exokernel paper)
- Unix pipes
Keep interfaces stable
- Counterexample LLVM
Keep a place to stand
- Virtualization!!!

Implementation

Plan to throw one away - learn from prototyping

Keep secrets - impl details hidden from clients. Can be tradeoff for performance optimizations

Handle normal and worst cases separately

Might be OK to crash a few processes if it means the system can recover
Caches in processors are optimized for common case (principle of locality)
Paging in virtual memory is optimized for common case (principle of locality)

Efficiency

Split resources
- Faster to allocate a new resource than to wait for one to be freed
- Heterogeneous systems
  - Specialized hardware like FPGA or GPU to run specialized tasks
  - E.G. Google's TPU, Microsoft Azure FPGAs
Use static analysis

Reliability

Log updates
- Can recover
- Append only is efficient
- Can be used for replication
Atomic transactions
- E.G. ACID

Takeaways

Most successful systems are built with particular themes, many of which are discussed in this paper
When reading papers, look for what you can apply, and ignore irrelevant details.
Hints can be added, e.g. approximation vs precision

Hints for Computer System Design

Hints for Computer System Design

Key Insights

Caching

Lecture Review Notes

Why is system design hard?

Throw one away

Interface Design

Implementation

Efficiency

Reliability

Takeaways

Further Reading