UNIVERSITY of VIRGINIA
Computer Science

How the Oracle of Bacon Works

Every couple of weeks, the Oracle downloads several database files from one of the Internet Movie Database's FTP sites. Together, the files contain around 1.4 million actors and actresses, around 1.1 million movies and TV shows, and around 200,000 nicknames. From these files, the Oracle builds a big map of actors and movies and stores it in a 154 MB database file.
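
As a rough illustration of what a "map of actors and movies" could look like, here is a minimal Python sketch. The Oracle's actual file format and data structures are not described on this page, so everything here is an assumption: the function names build_map and costars are hypothetical, and the credits are invented placeholders rather than real IMDb data.

```python
from collections import defaultdict

def build_map(credits):
    """Build a two-way map from (actor, movie) credit pairs."""
    movies_of = defaultdict(set)   # actor -> movies they appeared in
    cast_of = defaultdict(set)     # movie -> actors who appeared in it
    for actor, movie in credits:
        movies_of[actor].add(movie)
        cast_of[movie].add(actor)
    return movies_of, cast_of

def costars(actor, movies_of, cast_of):
    """Return everyone who shares at least one movie with `actor`."""
    found = set()
    for movie in movies_of.get(actor, ()):
        found |= cast_of[movie]
    found.discard(actor)           # an actor is not their own co-star
    return found

# Toy credits, invented for illustration only.
credits = [
    ("Actor A", "Movie 1"),
    ("Actor B", "Movie 1"),
    ("Actor B", "Movie 2"),
    ("Actor C", "Movie 2"),
]
movies_of, cast_of = build_map(credits)
print(sorted(costars("Actor B", movies_of, cast_of)))  # ['Actor A', 'Actor C']
```

Keeping both directions of the map (actor to movies, movie to cast) means a co-star lookup never has to scan the whole credit list.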

There is a database server running at all times that stores the database file in memory. The server handles three different types of requests:

Several CGI programs -- one for each of the above types of queries -- run on the UVA Computer Science department web server. Each of them connects to the database server over TCP.

The database server uses a breadth-first search to find the shortest path between pairs of actors. If you want to dig further into how shortest-path algorithms work, I recommend Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein as an excellent place to start. If that textbook isn't available, other algorithms textbooks are likely to cover the subject as well. You may also look at materials that I wrote to explain graph algorithms (including breadth-first search) to Duke undergraduate CS students here.
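
For readers who want to see the idea in code, here is a minimal Python sketch of a breadth-first search for a shortest path. It is not the Oracle's server code. The graph layout (a dictionary in which actors and movies are both nodes), the function name shortest_path, and the toy data are all assumptions made for illustration.

```python
from collections import deque

def shortest_path(neighbors, start, goal):
    """Return a shortest path from start to goal as a list, or None."""
    if start == goal:
        return [start]
    parent = {start: None}      # doubles as the set of visited nodes
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt in parent:
                continue        # already reached by a path at least as short
            parent[nxt] = node
            if nxt == goal:
                # Walk the parent pointers back to the start.
                path = [nxt]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None                 # goal is not reachable from start

# Toy graph: actors and movies alternate along every path.
graph = {
    "Actor A": ["Movie 1"],
    "Movie 1": ["Actor A", "Actor B"],
    "Actor B": ["Movie 1", "Movie 2"],
    "Movie 2": ["Actor B", "Actor C"],
    "Actor C": ["Movie 2"],
    "Actor D": [],
}
print(shortest_path(graph, "Actor A", "Actor C"))
# ['Actor A', 'Movie 1', 'Actor B', 'Movie 2', 'Actor C']
```

Because a breadth-first search visits nodes in order of distance from the start, the first time it reaches the goal it has found a path with the fewest possible steps.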

Whenever the Oracle answers a query, the results are cached so that future requests to link to the same actor can be answered more quickly. About 80% of all queries can be served instantly from the results cache. The current contents of the cache (i.e., which actors can be linked quickly) can be found here.
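
One way such a cache could work is sketched below in Python. This page does not say how the Oracle's cache is organized, so the design here is an assumption: the hypothetical LinkCache class stores one breadth-first search tree per "center" actor, which lets any later request to link to that same actor skip the search entirely. The graph is invented toy data.

```python
from collections import deque

class LinkCache:
    """Cache one BFS parent tree per center actor (a hypothetical design)."""

    def __init__(self, neighbors):
        self.neighbors = neighbors
        self.trees = {}    # center -> {node: next node on a shortest path to center}
        self.hits = 0
        self.misses = 0

    def _tree(self, center):
        if center in self.trees:
            self.hits += 1
            return self.trees[center]
        self.misses += 1
        # Full breadth-first search outward from the center actor.
        parent = {center: None}
        queue = deque([center])
        while queue:
            node = queue.popleft()
            for nxt in self.neighbors.get(node, ()):
                if nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        self.trees[center] = parent
        return parent

    def link(self, actor, center):
        """Return a shortest path from actor to center, or None."""
        parent = self._tree(center)
        if actor not in parent:
            return None
        path = [actor]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        return path

graph = {
    "Actor A": ["Movie 1"],
    "Movie 1": ["Actor A", "Actor B"],
    "Actor B": ["Movie 1", "Movie 2"],
    "Movie 2": ["Actor B", "Actor C"],
    "Actor C": ["Movie 2"],
}
cache = LinkCache(graph)
first = cache.link("Actor C", "Actor A")    # miss: runs the search
second = cache.link("Actor B", "Actor A")   # hit: reuses the stored tree
print(cache.hits, cache.misses)             # 1 1
```

The trade-off in this sketch is memory for speed: each stored tree holds one entry per reachable node, which is consistent in spirit with a cache that takes up a large share of the server's RAM.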

The database server runs under Linux on a 1.6 GHz Opteron, consuming about 330 MB of RAM, about half of which is used for the results cache.



Comments: http://oracleofbacon.org/comments.html
© Created by Patrick Reynolds and the CS Web Team