Intro

  • A distributed system is a collection of autonomous computing elements (nodes) that appears to its users as a single coherent system.

  • Middleware is to a distributed system what an operating system is to a single computer. The main difference is that middleware runs across a network of nodes.

Design goals:

  1. Resource sharing
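  2. Transparent: the distribution is invisible to the end user. Types of transparency (ISO, 1995):
    • access (differences in data representation and in how an object is accessed)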
    • location
    • relocation
    • migration
    • replication
    • concurrency
    • failure
    • Full transparency is a good goal to strive for, but in some situations it is better to offer less transparency and expose a certain level of detail.
  3. Open
    • Interoperability
    • Composability
    • Extensibility
    • Separation of policy from mechanism (and to what degree)
  4. Scalable
    • Size scalability
      • Compute: CPU
      • Storage: disk capacity, I/O transfer rate
      • Network: bandwidth
    • Geographical scalability
    • Administrative scalability
    • Modes of scaling:
      • Scaling up (vertical): upgrade the existing machine
      • Scaling out (horizontal): add more machines
    • Methods:
      • hide communication latencies (helps geographical scalability)
      • distribution of work
      • replication
        • Caching is a special form of replication
    • Practice shows that combining distribution, replication, and caching with different forms of consistency generally leads to acceptable solutions.
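
As a rough illustration of the scaling methods listed above, the following minimal sketch combines distribution of work (keys are spread over nodes by hash) with caching (a client-side copy, which is a simple form of replication). The names Node and PartitionedStore are hypothetical and exist only for this example. Real systems also need replica placement, cache invalidation across clients, and a chosen consistency model, none of which are shown here.

```python
import hashlib


class Node:
    """A hypothetical storage node that holds one partition of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.reads = 0  # how many times this node was contacted for a read

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def put(self, key, value):
        self.data[key] = value


class PartitionedStore:
    """Spreads keys over nodes by hash and caches reads on the client."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.cache = {}  # client-side cache: a local replica of remote values

    def _node_for(self, key):
        # Stable hash, so the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._node_for(key).put(key, value)
        # Drop this client's cached copy so it does not serve a stale value.
        self.cache.pop(key, None)

    def get(self, key):
        if key in self.cache:
            return self.cache[key]  # served locally, no network round trip
        value = self._node_for(key).get(key)
        self.cache[key] = value
        return value


nodes = [Node(f"node-{i}") for i in range(3)]
store = PartitionedStore(nodes)
store.put("user:1", "alice")
store.get("user:1")  # contacts the owning node
store.get("user:1")  # answered from the cache
```

Note that the cache is only invalidated for the client that performs the write. Other clients holding a cached copy could still read an old value, which is exactly where the choice of consistency model comes in.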

Pitfalls (false assumptions that developers often make):

  • The network is reliable
  • The network is secure
  • The network is homogeneous
  • The topology does not change
  • Latency is zero
  • Bandwidth is infinite
  • Transport cost is zero
  • There is no administrator
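
To make the first few pitfalls concrete, here is a minimal sketch of a client that does not assume the network is reliable or that latency is zero: it retries a failed call with exponential backoff and jitter. The function name call_with_retries and its parameters are assumptions made for this example, not part of any particular library. The operation passed in is expected to enforce its own timeout and raise TimeoutError or ConnectionError on failure.

```python
import random
import time


def call_with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(); retry on network-style errors with exponential backoff.

    The sleep function is injectable so the behaviour can be tested
    without actually waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            delay = base_delay * (2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay))


calls = {"count": 0}


def flaky_request():
    # Simulated remote call that fails twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated dropped connection")
    return "ok"


result = call_with_retries(flaky_request, attempts=5, base_delay=0.01)
```

Retrying is only safe when the operation is idempotent, since a request may have been executed on the remote side even though the reply was lost.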

Types of distributed systems:

  • High-performance computing (HPC)

Architectures

Processes

Communication

Naming

Coordination

Consistency and replication

Fault tolerance

Security