The suggested projects are supposed to be concrete enough that there
is a ``place to start,'' yet open-ended enough that there are plenty
of interesting possibilities for further exploration. These project
ideas are intended to form the basis of projects rather than
to try to give you an itenerary for the rest of the term. Note that
most of these problems aren't entirely new, but that there should be
evidence of originality in your project! Thus, you will have to
expand on these ideas. You will be graded on your:
- Specification.
- Design.
- (where necessary) Proof.
- (where necessary) Implementation.
Projects will be term-long, and comprise 70% of your grade (the other
30% will be miscellaneous quizzes and programs).
-
Eve
- Animating Algorithms.
(example projects: algorithm comparison, visualization, performance)
Choose a favorite algorithm or algorithms by browsing through
138 notes, Knuth, CLR's "Intro to Algorithms", Dijkstra's publications,
a journal you've always wanted to read, etc. Animate the algorithm's
behavior and compare and contrast performance under different conditions.
Link your results into cs138 web pages on archetypes for posterity. Assess
the benefits of a sequential, distributed, and/or parallel version of
the algorithm(s).
- Coordination Algorithms.
(example projects: multi-agent coordination and control, distributed
agreement, consistency, synchronization)
Design and implement one of the distributed algorithms that you might
encounter some time in your life outside of cs138. In particular, model
one of the coordination and consensus algorithms that help make
the following systems function properly: airline or ticketmaster
reservation systems, distributed multiplayer games, an on-line
registrar's office, distributed voting in an election or at the UN,
a network-based auction house, a distributed appointment calendar (as
discussed in class), the stock market.
- Internet Protocols.
(example projecs: algorithm analysis, proof of correctness,
implementation, evolution)
Examine the design of a seminal or emerging Internet algorithm; e.g.,
Transport Control Protocol/Internet Protocol (TCP/IP), Simple
Mail Transfer Protocol (SMTP), Protocol Independent Multicast (PIM),
Resource Reservation Protocol (RSVP), Mobile IP, or any number of
routing algorithms that are in use. Create a prototype implementation,
assess best/worst performance, prove correctness, explore the impact
of scale and network unreliability on the algorithm. Discuss the
algorithms' evolution and improvements over time.
- Global Snapshots.
After having had to prove several termination detection algorithms
for cs138b, you should all be experts on global snapshots.
Implement a global snapshot algorithm among distributed entities.
- Network of Workstations.
Design a system that would allow you to know about idle CPUs in a
local-area network. As a result, you could use clusters of these
idle machines for multiprocessor computations. Specify how machines
are dynamically registered as available and unavailable. Prototype
a centralized architecture for workstation resource management, then
a distributed alternative. Speculate what it would entail to
migrate something like this into the wide-area, or to track many
thousands of machines. How might this system be refined to handle
load balancing among processors during a computation ?
- Diffusing Computations.
Design and implement a diffusing computation algorithm among
distributed entities.
- Web Performance.
Analyze HTTP, the main protocol used by the Web to transfer data
between Web clients and Web servers. Assess HTTP's usage of the idea
of protocol abstraction, i.e., its use of underlying protocols such
as TCP an in turn IP for data transmission. How could it be optimized for
better performance when ther are multiple transactions between the same
client-server pair? How might the Web's approach to data caching be
improved for better load balancing?
- Synchronous Game Archetype (distributed tic tac toe, checkers,
othello, connect 4, chess).
- Workflow (office automata) - an acyclic graph with a token
for each transaction, spawning new work processes where necessary.
- Implement "Sessions."
-
Paul
- Comparison of Termination Detection Algorithms
(discrete event simulation, math, research)
Several algorithms for termination detection were described in class.
The literature contains a multitude of others, including algorithms
which involve acknowledging every message, but are message optimal.
In addition, there are periodic snapshot algorithms which can be
message supra optimal, but suffer a delay between the occurrence
of termination and its detection. The problem is to analyze the
conditions & parameters under which various algorithms outperform
the others.
- Blocking Theorem for Matrices
(maybe discrete math/graph theory?, research)
Given a 2D (or nD) matrix, and the associated data dependencies
between cell entries, what is the optimal subblocking of this
matrix to minimize calculation time, given k processors? If a
closed form solution is not possible in the general case, can
we characterize the blocking structure for some of the simplest
data dependencies (eg cells depend on neighbours only)? These
cases could be solved by exhaustive analysis and simulation.
- Game of Spit
(distributed systems, asynchronous interaction)
Implement the game of spit across processes (or machines)
using java & sockets. Consider the effects of network
assymetries on the relative advantages of the players.
Can the game be designed such that neither player has any
advantage due to proximity effects? If time permits, consider
session management issues such as player invitation, game
construction, computer opponents, etc.
- Futures Market
(distributed systems, asynchronous interaction)
Implement a futures market server, patterned after the Iowa web
site. Buyers and sellers trade futures over this market.
Analyze issues of fairness with regards to network assymetries.
If time permits, examine trading programs which can be linked
to the market to buy and sell automatically to test various
market models.
- Diplomacy
(distributed systems, synchronous interaction, modes of interaction)
Implement the game of Diplomacy. Players must be able to communicate
with each other between moves to negotiate. Communication sessions
can be 2 way or multiway! Moves occur in single synchronous act.
With multiway sessions, it is important that every player see the
same "history" of what was said. That way they can have a consistent
discussion.
-
Mika
- Network security on the Internet
A network daemon provides access to clients that request some
service, potentially from all over the Internet. The low-level
protocols in use on the Internet do not include secure verification
of the sender of those packets, nor do they include encryption to
ensure that only the intended recipient of the data can understand
it. A good CS138c project would be to explore the possibilities for
ensuring security in such an environment. Passwords, public-key
encryption, public key signatures would be possible ways to implement
the secure layer of the protocol. The protocol could be applied to a
specific application (e.g., a Java-based server of some kind), or it
could be made more general. This project would probably involve
studying literature, implementation of secure network protocols, and
formal verification of those protocols.
- Fault tolerance in a specific distributed application
Imagine implementing ``the'' calendar application as a distributed
set of processes. We can imagine a solution with a centralized
secretary and distributed professors, or a different solution where
one of the professors plans a meeting. Now consider adding the very
important real-world element of failures of processes and/or
links. Even for a specific application such as this one, coming up
with solutions that tolerate faults is a very open-ended problem. The
project might involve literature searches, implementation, proofs,
and simulation. This project has a lot of potential for deep formal
arguments.
- Networks of workstations
In the CS department, we have approximately a hundred workstations
that are probably idle more than 90 % of the time. How would you
write a runtime system that would allow you to use this network of
computers as a single ``virtual multicomputer''? How would you find
out which machines are idle and available for use by your
application, and how would you manage the available resources?
Implement a centralized architecture for workstation resource
management, then a distributed alternative. (Use our systems, maybe
you can do something useful with them!) Speculate what it would
entail to migrate something like this into the wide-area, or to track
many thousands of machines. How might this system be refined to
handle load balancing among processors during a computation? This
project is very open-ended and useful. You are not expected to
implement a commercially viable system; we just want to see some
insight and analysis of the issues involved. This is probably the
most ``interesting'' and open-ended of these problems.
- The dating server
At Caltech, it can be very difficult to find a date (depending on
gender and other factors). In order to make most efficient use of the
carce resources, we have decided to use our engineering skills to
build a system to track dating opportunities for us. Here is a
suggested mechanism for finding a date for the weekend:
- Each user X enters his preferences in the form of an objective
function to maximize over the set of possible alternative dates,
and includes his email address in the request which is sent to the
dating server. User X may include the day (e.g., Friday,
Saturday, or don't care) he is available.
- The requests are queued up on the server, which can respond
immediately or later (perhaps by Wednesday evening) by either
sending a time of the date and the identity of the matching party
or by sending a message that no suitable matches were available
this week. Note that the requests arrive asynchronously.
There are endless variations on this problem. Explore distributed
dating services, different types of objective functions, different
types of mappings from personal to global objective functions (i.e.,
maximizing each person's happiness may not maximize the group's
happiness.. how do you define the happiness of a group of people
anyhow?), etc. An interesting idea is to make one of the parameters
of the objective function the time at which the decision is made,
such that a person's ``user agent'' will get less and less picky as
the weekend approaches.
- Projects in VLSI(?)
My own research interests are mainly in VLSI, so if you cannot find
a good project from the lists of suggestions, I would be more than
willing to suggest some very interesting distributed programming
problems taken from that field. But I'm not limited to that...
Here are a few examples:
- Discrete-level circuit simulation. The differential equations
governing the behavior of electronic circuits are well studied and
well understood. However, the interdependence of the state variables
of a given part of the circuit on every other part of the circuit
make this problem very intractable as a parallel or distributed
simulation. This need not be an insurmountable obstacle, especially
in the case of digital circuits, since we know that in those highly
nonlinear circuits, the state variables in one location are only
affected by {\em large\/} swings of other nodes. By taking advantage
of this, it is possible to decouple parts of the system from each
other and achieve considerable parallelism. Can you implement a
parallel discrete-level circuit simulator that works for digital
circuits? There has been some research in this area, so a literature
search is (as usual) the first place to start.
- Layout generation. You are given to generate a particular
circuit topology by drawing polygons connecting the various
nodes. You are free to move things around the two-dimensional surface
as you wish, and you may have multiple levels of interconnect. A very
difficult problem, and since your solution is likely to be ``slow,''
you decide to investigate a distributed algorithm for layout
generation. How do you partition the original circuit to achieve
efficient end results? What heuristics can you apply? Does
branch-and-bound work? This kind of project would require a great
deal of planning to make it tractable yet interesting, but if you
have a good idea for a solution, it may be possible to make progress
by the end of the term.
-
Adam
- Ticketmaster Reservation System.
Coordinating multiple distributed asynchronous
accesses to a finite resource, without priorities.
There is a concert, and lots of
people want to buy tickets from a number of distributed ticket
dispatchers. The total number of tickets available for the concert is
finite, and we assume that more people will want to buy tickets than are
available. The dispatchers must coordinate with each other to service
the ticket requests without overselling the number of tickets.
- On-Line Registrar's Office.
Coordinating multiple distributed asynchronous
accesses to a finite resource, with priorities.
Students have distributed processes
that asynchronously want to sign up for limited-enrollment classes;
an undergraduate registrar process services undergraduate requests, and
a graduate registrar process services graduate requests. Student
processes make requests with certain priority, based on class standing
and requirement of the class for graduating. The registrars must
coordinate with each other to enroll students in classes without
overenrolling them.
- Easter Eggs.
Coordinating multiple distributed asynchronous
accesses to a dynamic resource pool.
Whenever a World Wide Web surfer first discovers a
Web page with an "Easter Egg" on it, s/he is given two things: some
electronic "money" for himself or herself, and some electronic "money"
for him or her to give (as s/he sees fit) to any of the web pages s/he
has found that contain Easter Eggs, based on his or her perceived
quality of the visited page. A project can work on any one of several
facets of this problem: maintaining a registery of Easter Eggs,
maintaining a "bank" of player accounts and moneys collected, or
coordinating the dispatch of new funds when a new Easter Egg is discovered.
- Distributed Auction.
Coordinating multiple distributed asynchronous
bids to control set of common resources, with bids being unbounded.
There are 10 distinct items, and each
player has a "values sheet" of how much single items or combinations of
items are worth to that person. Each value sheet contains different
values for items and combinations of items. When the auction begins,
players bid on items they'd like to buy. Item prices can only go up.
At a designated time, the auction closes, and people "buy" the items
they want at the prices they bid, and people "collect" the total value
of the items they bought, according to their value sheets. This can be
played as a game with a specified number of rounds, but the challenge is
in coordinating the asynchronous bids in a well-designed manner.
- Distributed Voting.
Coordinating multiple distributed asynchronous
bids to control set of common resources, with bids being bounded.
Consider a slate with 5 ballots for 5
candidates. Each player has a "value sheet" of how much combinations
of candidates winning the election is worth to him or her personally;
each person's value sheet is different. When the election begins, each
player casts a vote for the candidate s/he wants in each ballot;
each player can then gauge as the election is going on whether to keep
his or her votes, or change them. However, unlike the auction, each
person can only place a single vote on each ballot. At a designated
time, the polls close, and people "collect" the total value of the
candidates electred, as per their value sheets. This can be
played as a game with a specified number of rounds, but the challenge is
in coordinating the asynchronous ballot castings and recastings in a
well-designed manner.
-
Unassigned so far
- Cellular Routing, as described by Jacob.
- Sealed Auction, like the auction described previously,
but bids are sealed, and you have to make sure information is not
leaked, as described by Jacob.
Last updated April 10, 1996 by
cs138@cs.caltech.edu.
If you have any questions, contact the TAs
Adam,
Eve,
Paul, or
Mika.
Copyright © 1994-96, K. Mani Chandy. All rights reserved.
Reproduction of all or part of this work is permitted for educational
or research use provided that this copyright notice is included in any
copy. Disclaimer: this collection of notes is experimental, and does
not serve as-is as a substitute for attendance in the actual class.