Joan Smith

Welcome

I am a candidate for the PhD degree in computer science at Old Dominion University, where I am a Graduate Research Assistant. My advisor, Dr. Michael Nelson, has been named a Digital Preservation Pioneer by the Library of Congress. I am also the main system administrator for several of our research computers (mostly Fedora Core or Debian boxes), and webmaster for the modoai.org site. Prior to this, I spent over 10 years working in the software development field, and several years as a teacher. I was the eighth recipient of the CLIR Zipf Fellowship, and in January 2006 attended the first Google Workshop for Women Engineers. My current research focuses on web architecture, particularly the accessibility of digital content and the preservation of web resources.

I have also lived, worked, and studied in many countries around the world, including Belgium, Panama, and Korea, as well as many states in the good old USA. My educational background includes degrees in philosophy (U of Leuven, Belgium), chemistry and natural science (SUNY), and computer education (Hampton U). I am currently pursuing a Ph.D. in computer science at Old Dominion University, where I work as a researcher and occasionally as an instructor; my advisor is Michael Nelson. I passed the candidacy exam in 2006 and expect to defend my dissertation in mid-spring 2008. Its title is "Integrating Preservation Functions into the Web Server." Back in 2005, I was a Featured Student of the ODU CS Department. Not exactly fame and fortune, but a nice compliment.

As a software professional, my experience lies primarily in the design and development of information systems. Currently, I am the main developer of an Apache 2 module, mod_oai. Earlier, in the early days of computer-based training, I wrote software for Army Education Center courses on Control Data Corporation mainframes, which had one of the first touch-screen interfaces for end users. My enthusiasm for computing in general, and networking in particular, led me to install a sophisticated smart home system in my own residence, with a master scheduler (to control lights, for example) and 64 wired Cat-5 ports. The system's protocol is showing its age, though, and I have begun to transition to the newer Insteon system. Some photos of the project and all the wiring can be found in the Scrapbook link under "projects".

Four new research test sites have been set up: CRATE and ODUCRATE on a commercial server, along with Blanche-00 and Blanche-02 at ODU. [Warning: Those links open in a new window] The sites are designed to test robot (web crawler) behavior using "real" content, i.e., quotations from noncopyrighted, classic English-language works, rather than the randomly generated text used in prior experiments. The links in this paragraph are part of the experiment: they're designed to help advertise the existence of the sites to search engines. A quick preview of Google's traversal of one of the sites is shown in the animated GIF below:

MARCH OF THE GOOGLEBOTS
The "March of the Googlebots" is an animated view of Google's robots crawling one of the experimental sites. Each blue X represents a "GET" request. Each red X represents a conditional GET request, i.e., Google is asking whether the page has changed since its last visit. The spread of gray in the background shows the links that have already been visited, so the background becomes fully gray once every link has been retrieved by Google at least once. Notice that Google continues to revisit various links afterward. The animation covers late February 2007 through September 2007. We collected data for a year; more graphs and a discussion of our findings can be read in our recent D-Lib Magazine article, Site Design Impact on Robots. Information on prior experiments can be found on the Publications page.
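For readers unfamiliar with the difference between the blue and red marks, here is a minimal Python sketch of how a plain GET can be told apart from a conditional GET. It is only an illustration, not the code used in our experiments: the function names and the sample requests are hypothetical. The one fact it relies on is that an HTTP GET is conditional when it carries a validator header such as If-Modified-Since or If-None-Match.

```python
from collections import Counter

# Request headers that make a GET conditional under HTTP semantics.
CONDITIONAL_HEADERS = {"if-modified-since", "if-none-match"}


def classify_request(method, headers):
    """Return 'conditional GET', 'GET', or 'other' for a single request.

    `headers` is a dict of header name -> value; names are compared
    case-insensitively, as HTTP header names are not case-sensitive.
    """
    if method.upper() != "GET":
        return "other"
    names = {name.lower() for name in headers}
    if names & CONDITIONAL_HEADERS:
        return "conditional GET"
    return "GET"


def tally(requests):
    """Count each kind of request in an iterable of (method, headers) pairs."""
    return Counter(classify_request(method, headers) for method, headers in requests)


# Hypothetical sample data, made up for this illustration.
sample_requests = [
    ("GET", {"User-Agent": "Googlebot"}),
    ("GET", {"User-Agent": "Googlebot",
             "If-Modified-Since": "Sat, 24 Feb 2007 10:00:00 GMT"}),
    ("HEAD", {"User-Agent": "Googlebot"}),
]

counts = tally(sample_requests)
print(counts)
```

In terms of the animation, a request classified as "GET" would be drawn as a blue X and one classified as "conditional GET" as a red X.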