CIS 506 Course Notes

2002 spring
Thulin


You may wish to review an article about the Linux OS.  See http://www.linuxresources.com/ and  http://www2.linuxjournal.com/main.html

Suggestions for papers to review:

CACM Vol. 33, No. 10 (Oct. 1990) special issue on
simulation

NEST: a network simulation and prototyping testbed
Alexander Dupuy, Jed Schwartz, Yechiam Yemini and
David Bacon
Commun. ACM 33 10 (Oct. 1990), Pages 63-74

Parallel discrete event simulation
Richard M. Fujimoto
Commun. ACM 33 10 (Oct. 1990), Pages 30-53


ACM:
http://www.acm.org/

Student Membership:
http://www.acm.org/membership/student/benefits.html
 



The Client-Server Model
 

Addressing See p.57 fig. 2.10.
  1. Hard code machine.process
  2. Random process addresses located by broadcasting
  3. Server names looked up by name server
  4. Special hardware; random process addresses stored in network interface chip
 
 


Primitives
  new thread may be started to disguise interrupt

single thread choices:

  1. Blocking send, receive (normally preferred)
  2. Nonblocking send with copy, receive with wait or test
  3. Nonblocking send, receive with interrupt
   

Remote Procedure Call (RPC)

Basic RPC Operation

Parameters Dynamic Binding RPC and Failures  

Implementation
 

RPC protocols Acknowledgments Critical path (pp. 89-91): sequence of instructions executed for every RPC Copying Timer management Problems  


Group Communication
  Design issues ISIS

Ch. 3:  Synchronization

semaphores, monitors not suited to distributed systems
 


Clock synchronization

distributed algorithms


Logical clocks


Physical clocks

 

Subject:
        Master Clock
  Date:
        Fri, 25 Sep 1998 06:30:01 UT
  From:
        The Learning Kingdom <fact@LearningKingdom.com>
    To:
        Cool Fact List <fact@tlk-lists.com>
 

----------------------------------------------------------------------
  The Learning Kingdom's Cool Fact of the Day for September 25, 1998
----------------------------------------------------------------------

             What clock keeps time for the whole planet?

----------------------------------------------------------------------
 

The source clock for the planet is the US Naval Observatory's Master
Clock.  It's a network of dozens of extremely accurate clocks located
all over the planet!

These clocks communicate by electronic messages every 100 seconds.
Together, they make a very stable time standard.  Their time signal is
used for many purposes, including the satellite-based Global
Positioning System and many kinds of electronic navigation systems.

Do you have a self-setting clock?  Self-setting clocks receive a radio
signal that originally comes from the USNO Master Clock, and reset
themselves as needed!

USNO's Master Clock web site:
http://tycho.usno.navy.mil/time.html

Pollastri's Time of the Internet:
http://www.cstv.to.cnr.it/toi/uk/toi.html

American Watchmakers-Clockmakers Institute:
http://www.awi-net.org
 

----------------------------------------------------------------------
             Cool Fact of the Day list membership: 33,513
----------------------------------------------------------------------
          To subscribe, visit http://www.tlk-lists.com/join/
        To unsubscribe, visit http://www.tlk-lists.com/change/
     To become a sponsor, visit http://www.tlk-lists.com/sponsor/
----------------------------------------------------------------------
            Copyright (c) 1998, The Learning Kingdom, Inc.
                    http://www.LearningKingdom.com
----------------------------------------------------------------------
 

http://tycho.usno.navy.mil/time.html , http://www.cstv.to.cnr.it/toi/uk/toi.html

Clock synchronization algorithms

Use of synchronized Clocks


3.2. Mutual Exclusion


3.3.  Election algorithms

 
 


3.4.  Atomic transactions

Introduction

The transaction model

Stable storage (p.146)
 

Transaction primitives

Properties of transactions (ACID)

Nested transactions (only top level durable)
 

Implementation

Concurrency control algorithms

Transactions complex hence low performance
 
 


3.5.  Deadlocks in distributed systems

four methods for handling (p. 159)

Detection

Prevention


Ch. 4:  Processes and processors in distributed systems

 

4.1.  Threads

Introduction

lightweight processes (pp. 170-171)

Thread usage

Design issues for threads packages

Implementing a threads package

Threads and RPC

 


4.2.  System models

The workstation  model

diskless workstations

diskful (disky) workstations (pp. 187-189)

Using idle workstations
 

The processor pool model (p.194)

queuing system (p.195)

A hybrid model

 
 


4.3.  Processor allocation

 

Allocation  models

Design issues for processor allocation algorithms

 
 

Implementation issues

Example algorithms


4.4.  Scheduling in distributed systems

p.210 co-scheduling


4.5.  Fault tolerance

Component faults

fault = malfunction

 ¥
S kp(1-p)k-1 = mean time to failure = 1/p
k=1
 

System failures

processor or process faults Synchronous vs. asynchronous  

Use of redundancy

information, time, physical
 

Fault tolerance using active replication

active replication (state machine approach)
TMR (p.216)

k fault tolerance

atomic broadcast problem

 

Fault tolerance using primary backup

p.218
 

Agreement in faulty systems

 

two-army problem

byzantine generals problem (p.221&222)

 


4.6.  Real-time distributed systems

fly-by-wire
periodic, aperiodic, sporadic stimuli
pp. 224, 225
soft, hard; 3 myths

Design issues

Real-time communication

TDMA
 
 
 


Ch. 5:  Distributed file systems


file service - file server

5.1.  Distributed file system design

The file service interface

file:
 sequence of bytes (usu. in distributed system)
 sequence of keyed records
attributes: owner, size, date, access (rwx)
immutable files: only create, read
capability: user ticket for object access
access control list: who may access and how
upload/download model (p.247)
remote access model

The directory service interface

type: file extension or explicit attribute
hierarchical file system: trees, graphs (p.249)
clients have a) same b) different view of file system (p.250)
global root directory? /server/path

Naming transparency
 location transparency, location independence
1. machine+path naming
2. mount remote file system on local hierarchy
3. single name space same on all machines
1, 2 easy; 3 difficult but needed for single-system image

Two-level naming
symbolic names - binary names
binary name may be server+path, or use symbolic link
capabilities as binary names
symbolic name --> several binary names

Semantics of file sharing

UNIX semantics (see p.253)
session semantics
immutable files
transactions (see p.256)


5.2.  Distributed file system implementation

File usage (Satyanarayanan 1981; university setting): System structure Caching (pp. 262, 264) Cache consistency (p.267)

Replication (p.269)

Update protocols Sun's NFS NFS layer structure (p.276) Lessons (p.279)
 


5.3.  Trends

 


Ch. 6: Distributed shared memory


6.1. Introduction

DSM (Li 1986, Li & Hudak 1989)

6.2. What is shared memory?

Examples from multiprocessors:  

6.3.  Consistency models

summary of consistency models (p. 332)
 

6.4.  Paged DSM

NORMA (NO Remote Memory Access)  

6.5.  Shared variable DSM

 

6.6.  Object-based DSM