Algorithm Design & Analysis
High performance multicore dictionaries
Recently, theorists have designed two dictionary data structures: the skip list and the cuckoo hash table. Skip lists are comparable
to binary search trees, but have two advantages: they can be
traversed without a stack, and building a skip list out of n items can be parallelized
trivially. Cuckoo hash tables are comparable to chained hash
tables, but offer a guarantee
of O(1) worst case lookup
rather than an average O(1) time lookup, and
cuckoo tables are simpler to implement. A cuckoo hash table
operates by using two or more independent open hash tables; this could
be adapted to a multicore environment by assigning each table to a core.
Project:
- Implement a parallel skip list and/or cuckoo hash table in a
practical language such as C++ or Java
- Compare your data structures' performance against the equivalent
structure in the standard library
- Measure the performance relationship as a function of utilized
CPU cores
Prize Collecting Chinese Postman Problem
The Chinese postman problem (CPP)
is an NP-complete problem
related to the traveling salesman problem (TSP).
The desired output of TSP is the shortest cycle that visits every vertex exactly once.
By contrast, the desired output of CPP is the shortest cycle that
visits every edge at least
once. The TSP corresponds to the problem of a visiting
every city (vertex) only once; the CPP corresponds to visiting every
city road (edge) at least once. By extension the CPP corresponds
to the problem of traversing and photographing city roads to collect
data for a street view service such as Google
Street View or Microsoft Bing
Maps.
In the prize collecting
variant, every edge is assigned a numerical reward, and the object is
the collect at least X points
of reward. This corresponds to the (realistic) scenario in which
some streets are more important than others, and the algorithm can
choose to skip unimportant streets. The prize collecting CPP is NP-complete, but perhaps it could
be approximated efficiently.
Project: design a polynomial
time approximation scheme (PTAS) for the prize collecting Chinese
postman problem.
Data structures to minimize flash wear
The durability of flash memory is typically described as follows:
reading data does no damage, but each byte of memory can only be
written a set number of times before it wears out. This is a
reasonable abstraction of how flash works, but it isn't quite
accurate. In actuality, a flash cell is a container for electrons
that can be in one of q
states. The current state number can be increased without causing
wear. However, the only way to decrease the state is to reset the cell to state 0, which
does wear out the cell.
So we could extend the lifetime of flash memory by using codes and data
structures that make greater use of the increase operation and work to
avoid using the reset operation.
A technical
report
by Finucane and Mitzenmacher establishes codes for rewriting
arbitrary bit strings while minimizing resets. An obvious next
step would be to develop fundamental data structures, such as lists,
binary search trees, and so on, that support provably-good flash wear
patterns.
Michael Mitzenmacher described
this
problem in a blog post.
Project: Design a variant of a fundamental data structure that
minimizes flash wear. Prove that the number of resets is good in
the worst-case, average-case or amortized sense.
Compress Wikipedia
Wikipedia is an online
encyclopedia that anyone can edit. It would be nice to be able to
carry around an offline copy of Wikipedia for many reasons, such as
reading on mobile devices when network service is unavailable, or as an
archive in case of a prolonged Internet outage. Wikipedia stores
a lot of data, so compressing it would be helpful. There is even
a contest for solving this
problem well.
General purpose text compression algorithms such as LZW (GZip) and BWT
(bzip2) do a decent job, but it is
possible to do even better. Standard text compression algorithms
limit themselves to polynomial time algorithms for obvious
reasons. However, we might be able to do better by finding
approximate solutions to an NP-complete
problem that corresponds to
compressing hypertext such as Wikipedia.
Project:
- Reduce hypertext compression to a known approximable NP-complete problem
- Implement the approximation algorithm
- Compare the compression performance to competitors
Monte Carlo Artificial Intelligence for Go
Go is a
classic strategy game. It has very simple rules, with only two
kinds of pieces, yet requires a depth of strategy similar to
Chess. The computational problem of making optimal Go movies is
computationally difficult. However, there is a simple randomized
algorithm using Monte Carlo
methods that seems to work well in practice:
- Enumerate all the liberties
(legal moves)
- For each liberty, play out a random game where players take turns
making completely random, but legal moves. Try k random trials, and keep track of
the number of times w that
the computer player wins.
- Choose the liberty with the highest win ratio w/k
This is a nice algorithm because it is simple, the player strength and
running time can be tuned by adjusting the k parameter, it is embarrasingly
parallel, and it seems to work well against human opponents.
The basic algorithm has already been designed, implemented, and
documented, but there is room for more work in this area.
Projects:
- Implement a parallel Monte-Carlo Go player in the multicore, GPU,
or MapReduce model(s)
- Modify the algorithm to run fewer trials on obviously-excellent
and obviously-terrible liberties, and run more trials on borderline
liberties
- Adapt this approach to work on small devices, such as mobile
phones, where resources are limited
Branch and bound musical chords frettings
A musician playing a fretted instrument such as a guitar forms chords by
arranging up to four fingers on the instrument's fretboard.
The positions of the fingers dictate the musical pitch that is
generated by each string. Strings can also be muted so that they
do not produce any sound. For every common musical chord, there
are several known frettings,
or ways of playing that chord, and these frettings are available in
databases such as chordbook.com.
However, it would be interesting and useful to generate these frettings
algorithmically.
This is a difficult computational problem. It amounts to choosing
a subset of finger positions, and the number of subsets grows
exponentially in the number of finger positions, of which there are
hundreds. However, there are also many constraints on which fingerings are
legal. A fingering must only generate musical pitches that are
actually in the desired chord, and in the Western musical temperament
there are only 12 pitches to worry about. Also, many conceivable
fingerings are impossible to play due to anatomical limitations of the
human hand.
The problem of generating guitar chords algorithmically can be posed as
a branch-and-bound problem, where
each step involves adding one finger to a candidate fingering.
This formulation limits the depth of the search to the number of
fingers, which is typically four. The branching factor is the
number of legal places to position each next finger, which, naively, is
large. However musical and anatomical constraints make the
effective branching factor much smaller.
Project:
- implement a branch and bound algorithm for generating guitar
chords
- experiment with different branching constraints, based on musical
or anatomical requirements, to make the algorithm faster
- develop an algorithm or heuristic for ranking the legal solutions
Cache oblivious algorithms
A cache
oblivious
algorithm makes effective use of cache memory, without
any awareness of the size or relative speed of the cache.
Project: Design and/or implement a cache oblivious algorithm for any of
the notable problems, below.
MapReduce algorithms
MapReduce is a software framework for distributed computing on huge
data sets. It was introduced by Google and is used by Google,
Yahoo, and other organizations.
Apache Hadoop and Holumbus are two open
source implementations of MapReduce, for Java and Haskell,
respectively. Other implementations are available as well.
Project: Design and/or implement a MapReduce algorithm for any of the notable problems, below.
GPU Algorithms
A Graphics
Processing
Unit
(GPU) is a kind of computer processor originally
designed to render 3D graphics for computer games. It turns out
that the kinds of computations that GPUs do can be applied to many
other general purpose computing problems, and in some cases GPU
algorithms can be vastly faster than traditional CPU algorithms.
The leading manufacturer of GPUs is NVidia.
OpenCL and CUDA are
two popular programming environments for writing GPU code.
Project: Design and/or implement a GPU algorithm for any of the notable problems, below.
Notable problems
The following is a list of computational problems that might be solved
in the cache-oblivious, MapReduce, or GPU milieu. There are
certainly other problems worth considering, too.
- Matrix multiplication using Strassen's
algorithm
- Matrix
chain
multiplication
- Range trees
- Binary
space partition
- Cuckoo
hashing
- Minimum
spanning
trees, especially with variants on Boruvka's
algorithm
- 2D
arrangements
- 2D convex hulls
- 3D
convex
hulls
- Voronoi
diagrams
- Factoring
using the Quadratic
Sieve or General
Number Field Sieve
- Graph
separators
- Linear
programming
- Cuckoo
hashing
- Skip lists
Copyright 2010, Kevin Wortman

This work is licensed under a Creative Commons
Attribution 3.0 United States License.
Last updated 9/10/2010