Joan Smith

Publications

Where there are no copyright restrictions, the links point to local PDF versions of the documents. Other links go to on-line journals, to arXiv.org, or to the digital library's official copy. Publications are grouped by general topic rather than chronologically. Click the BibTeX link to get the proper citation text if you use LaTeX.

CRATE

Research into producing self-archiving web resources. This is my dissertation focus area. CRATE is a complex object model in which a web server delivers both the resource itself and its preservation metadata in response to a single GET request; that is, the response is a self-describing resource.
  1. Creating Preservation-Ready Web Resources.
    J.A. Smith and M.L. Nelson. D-Lib Magazine. January/February 2008.
  2. CRATE: A Simple Model for Self-Describing Web Resources.
    J.A. Smith and M.L. Nelson. Proceedings of the 7th International Web Archiving Workshop IWAW'07. June 2007. [BibTeX].
  3. Generating Best Effort Preservation Metadata for Web Resources at Time of Dissemination.
    J.A. Smith and M.L. Nelson. Proceedings of JCDL 2007. June 2007. [BibTeX].
  4. Using OAI-PMH Resource Harvesting and MPEG-21 DIDL for Digital Preservation. (abstract)
    J.A. Smith and M.L. Nelson. 2nd International Conference on Open Repositories. January 2007.
  5. Efficient, Automatic Web Harvesting.
    M.L. Nelson, J.A. Smith, I. Garcia del Campo, H. Van de Sompel and X. Liu. Proceedings of ACM WIDM 2006. [BibTeX].
  6. Integrating Preservation Functions Into The Web Server.
    J.A. Smith (Advisor: M.L. Nelson). Research Proposal. Old Dominion University. November 2006.
  7. Integrating Preservation Functions Into The Apache Web Server.
    J.A. Smith. JCDL 2006: Doctoral Consortium. June 2006.

Web Crawlers & The Gray Web

Looking at the Web Infrastructure (that is, the "Gray Web") as a preservation medium. This work was done primarily with Michael Nelson and Frank McCown. In addition, a number of experiments examined the behavior of search engine robots. The impact of site depth, breadth, and internal links is covered in the D-Lib Magazine March/April 2008 article, which includes several animations showing how various search engine robots explored the sites.
  1. Site Design Impact on Robots: An Examination of Search Engine Crawler Behavior at Deep and Wide Websites.
    J.A. Smith and M.L. Nelson. D-Lib Magazine. March/April 2008.
  2. Using The Web Infrastructure To Preserve Web Pages.
    M.L. Nelson, F. McCown, J.A. Smith, and M. Klein. International Journal on Digital Libraries. (On-line version) [BibTeX].
  3. Lazy Preservation: Reconstructing Websites for the Lazy Webmaster.
    F. McCown, J.A. Smith, M.L. Nelson, and J. Bollen. Proceedings of ACM WIDM 2006. November 2006. [BibTeX].
  4. Reconstructing Websites for the Lazy Webmaster.
    F. McCown, J.A. Smith, M.L. Nelson, and J. Bollen. Technical Report. December 2005. [BibTeX].
  5. Observed Web Robot Behavior On Decaying Web Subsites.
    J.A. Smith, F. McCown, and M.L. Nelson. D-Lib Magazine. February 2006. [BibTeX].

Digital Archives

Using other parts of the Gray Web, particularly Usenet groups and email servers, as a preservation medium. This work was done primarily with Michael Nelson and Martin Klein.
  1. How Much Preservation Do I Get If I Do Absolutely Nothing?
    M. Klein, F. McCown, J.A. Smith, and M.L. Nelson. Proceedings of Media Production 2006. To appear (2007). [BibTeX].
  2. Repository Replication Using NNTP and SMTP.
    J.A. Smith, M. Klein, and M.L. Nelson. Proceedings of European Conference on Digital Libraries. September 2006. [BibTeX].
  3. Repository Replication Using SMTP and NNTP.
    M.L. Nelson, J.A. Smith, and M. Klein. Proceedings of the 2006 International Conference on Digital Government Research. May 2006. [BibTeX].
  4. Repository Replication Using NNTP and SMTP.
    J.A. Smith, M. Klein, and M.L. Nelson. Technical Report. June 2006. Updated November 2006. [BibTeX].

Impact Factors

Looking at the impact of the web on ranking research influence. For more information on impact factors, see my colleague Johan Bollen's web site.
  1. Toward alternative metrics of journal impact: A comparison of download and citation data.
    J. Bollen, H. Van de Sompel, J.A. Smith, and R. Luce. International Journal of Information Processing and Management. December 2005. [BibTeX].
    This paper received a very favorable review.
  2. Toward alternative metrics of journal impact: A comparison of download and citation data.
    J. Bollen, H. Van de Sompel, J.A. Smith, and R. Luce. Technical Report. A longer, more detailed discussion. March 2005. [BibTeX].