Hawaiian Ecosystems at Risk project (HEAR) Don Gardner Legacy database
a brief "white paper"

by Philip A. Thomas - 18 August 2005


The Don Gardner Legacy Database (DGLegacy) is a collection of images, publications, and pathogen-host relationships for pathogens of plants of Hawaii compiled based on the work of Don Gardner. For further information, contact pt@hear.org.


[ data structure ]     [ metadata ]     [ how to make DGLegacy a "living" database ]     [ how to implement future "legacy" projects ]

Data structure


Metadata

To re-create the "Pathogens of plants of Hawaii" website from the database (which began as, and is currently identical to, the Don Gardner Legacy Database): download the appropriate version of the database (Paradox-formatted versions only; the MS-Access versions do not include the code needed to re-create the website); run any 32-bit version of Paradox for Windows; create an alias ":Common:" pointing to the folder named "common" (which contains the file SUBROUTI.LSL); run the form _MAINMNU.FSL; and click the button entitled "How to regenerate the website" for further instructions.

Various versions of the data are provided online (see http://www.hear.org/pph/database/): the original DGLegacy database as originally received by HEAR (non-normalized/unedited; Access); the original data as normalized and edited (Paradox and Access); and the most current version of the database, used to create the current version of the "Pathogens of plants of Hawaii" website (Paradox). An XML version of the data, corresponding to the most recently posted version of the website, is also planned.
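As a rough illustration of what an XML version of the pathogen-host relationships might look like, here is a minimal Python sketch. The element names (pathogen_host_relationships, relationship, pathogen, host) and the function name are assumptions made for this example; no XML schema has been defined for DGLegacy.

```python
import xml.etree.ElementTree as ET


def relationships_to_xml(relationships):
    """Serialize (pathogen, host) pairs to an XML string.

    Element names are hypothetical; they are not a published DGLegacy schema.
    """
    root = ET.Element("pathogen_host_relationships")
    for pathogen, host in relationships:
        rel = ET.SubElement(root, "relationship")
        ET.SubElement(rel, "pathogen").text = pathogen
        ET.SubElement(rel, "host").text = host
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder names, not real records from the database.
    print(relationships_to_xml([("Pathogen A", "Host X")]))
```

Because ElementTree escapes special characters in text values, names containing "&" or "<" would still yield well-formed XML.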


How to make DGLegacy a "living" database

To make DGLegacy a "living" database, a data steward is needed. By "data steward," I mean someone whose responsibility it is to maintain the data set: someone who knows it's her/his responsibility to keep the database safe (backed up), accessible (to those who may need it), and "in sync" with the website and/or other products generated from the database. Ideally--for at least the first task--this would be someone who has expertise in pathology. In addition, a webmaster (and server, of course) is needed to host the corresponding website, and to maintain the HTML templates and other code. Both tasks could be performed by the same person (but it would be a rare individual who would have the interest, expertise, and time to do a good job at both tasks).

In fact, a data steward needs to be designated even if the database is not being updated, so it's the responsibility of someone in particular to maintain the "safety & well-being" of the live/current version.

At this point, the current contents of the database are available online. As (or if) data are added to or amended in the database, the site can be automatically regenerated and replaced to incorporate the updated information.
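The general idea of regenerating a site from the database can be sketched as follows. This is not the Paradox code that actually builds the website; it is a minimal Python illustration, and the template, file-naming rule, and function names are all assumptions made for the example.

```python
import html
import re
from pathlib import Path
from string import Template

# Hypothetical page template; the real site's HTML templates are not shown here.
PAGE = Template(
    "<html><head><title>$title</title></head>\n"
    "<body><h1>$title</h1>\n<ul>\n$items\n</ul></body></html>\n"
)


def slug(name):
    """Turn a name into a lowercase file-name stem (assumed naming rule)."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def render_pages(relationships):
    """Build one HTML page per pathogen from (pathogen, host) pairs."""
    hosts_by_pathogen = {}
    for pathogen, host in relationships:
        hosts_by_pathogen.setdefault(pathogen, set()).add(host)
    pages = {}
    for pathogen, hosts in hosts_by_pathogen.items():
        items = "\n".join(f"<li>{html.escape(h)}</li>" for h in sorted(hosts))
        pages[slug(pathogen) + ".html"] = PAGE.substitute(
            title=html.escape(pathogen), items=items
        )
    return pages


def write_site(pages, out_dir):
    """Write every rendered page into out_dir, replacing older copies."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for filename, content in pages.items():
        (out / filename).write_text(content, encoding="utf-8")
```

Rerunning render_pages and write_site after each data change keeps the pages "in sync" with the database, since every page is rebuilt from the current records rather than edited by hand.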


How to implement future "legacy" projects

My main advice for future "legacy" projects is to build a database, and START NOW.

A properly structured, normalized database should be created to store the information destined for a "legacy database" appropriately (which includes "relationally"). The data structure will vary depending on the information being stored, but thought should be given to compatibility with other properly formatted data sets, so that the information will be as useful and as comparable to other data sets as is practicable. It is important that someone who understands data structures be involved in the original development of the database; otherwise, extremely time-consuming and expensive data conversions will be inevitable. Much of the strife of this process can be avoided by collecting the original data in a properly structured, normalized database.
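To make "normalized" and "relational" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are hypothetical and are not the actual DGLegacy structure; the point is only that each pathogen, host, and publication is stored once, and the relationships among them are stored in a separate linking table.

```python
import sqlite3

# Hypothetical schema: names are assumptions, not the real DGLegacy tables.
SCHEMA = """
CREATE TABLE pathogen (
    pathogen_id INTEGER PRIMARY KEY,
    scientific_name TEXT NOT NULL UNIQUE
);
CREATE TABLE host (
    host_id INTEGER PRIMARY KEY,
    scientific_name TEXT NOT NULL UNIQUE
);
CREATE TABLE publication (
    publication_id INTEGER PRIMARY KEY,
    citation TEXT NOT NULL UNIQUE
);
CREATE TABLE pathogen_host (
    pathogen_id INTEGER NOT NULL REFERENCES pathogen(pathogen_id),
    host_id INTEGER NOT NULL REFERENCES host(host_id),
    publication_id INTEGER NOT NULL REFERENCES publication(publication_id),
    PRIMARY KEY (pathogen_id, host_id, publication_id)
);
"""


def build_demo():
    """Create an in-memory database holding placeholder (not real) records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO pathogen VALUES (?, ?)",
                     [(1, "Pathogen A"), (2, "Pathogen B")])
    conn.executemany("INSERT INTO host VALUES (?, ?)",
                     [(1, "Host X"), (2, "Host Y")])
    conn.execute("INSERT INTO publication VALUES (1, 'Placeholder citation')")
    conn.executemany("INSERT INTO pathogen_host VALUES (?, ?, ?)",
                     [(1, 1, 1), (1, 2, 1), (2, 2, 1)])
    return conn


def hosts_of(conn, pathogen_name):
    """Return the sorted host names recorded for one pathogen."""
    rows = conn.execute(
        """SELECT DISTINCT h.scientific_name
           FROM pathogen_host ph
           JOIN pathogen p ON p.pathogen_id = ph.pathogen_id
           JOIN host h ON h.host_id = ph.host_id
           WHERE p.scientific_name = ?
           ORDER BY h.scientific_name""",
        (pathogen_name,),
    )
    return [row[0] for row in rows]
```

With this layout, correcting the spelling of a host name means changing one row in the host table, rather than hunting for every record that mentions it.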

In my opinion, developing such a data structure sooner rather than later--during pre-"legacy" creation of the data set--is much preferred to waiting (e.g., until a faculty member whose data is to be captured has retired, or is busy in "wrap-up" mode at the end of a career). The reasons include the availability of the data creators, who will be able to help ensure that the data structures accurately reflect the relationships among the data elements, and the ability to adapt the data structure as evolving information resources become available.

29 August 2005 addendum: It would seem that each "legacy database" would need a customized structure, based on the requirements of the particular project; however, the databases should share certain field structures (e.g., species, publication references) in case it becomes desirable to combine them. At the very least, a scheme to "crosswalk" the structures and data values among "legacy databases" should be created (again, if it's possible that someone will want to combine them someday for any reason).
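A "crosswalk" can be as simple as a table that maps each database's own field names onto a shared set. The Python sketch below assumes two imaginary legacy databases; their names (legacy_a, legacy_b) and all field names are hypothetical and chosen only for illustration.

```python
# Hypothetical crosswalk: source field name -> shared field name.
CROSSWALK = {
    "legacy_a": {"SciName": "species", "Ref": "publication_reference"},
    "legacy_b": {"taxon": "species", "citation": "publication_reference"},
}


def to_shared(record, source):
    """Rename one record's fields to the shared names.

    Fields with no crosswalk entry are dropped in this sketch.
    """
    mapping = CROSSWALK[source]
    shared = {mapping[field]: value
              for field, value in record.items() if field in mapping}
    shared["source"] = source  # remember where the record came from
    return shared


def combine(records_by_source):
    """Merge records from several databases into one list of shared-form records."""
    combined = []
    for source, records in records_by_source.items():
        combined.extend(to_shared(record, source) for record in records)
    return combined
```

Keeping the source name on every combined record makes it possible to trace a value back to the database it came from.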




This page was created on 22 August 2005 by PT, and was last updated on 21 March 2006 by PT.