The purpose of this section of the draft document is to show how the Hawaiian Ecosystems at Risk (HEAR) project makes a qualitative assessment of the CONFIDENCE LEVEL associated with a spatial data set. As we use the term, "confidence level" refers to overall reliability, i.e., the amount of faith that we are willing to place both in the correct identification of the taxonomic entities AND in the level of "precision" and the "thematic accuracy" of the spatial information (see Appendix 1 for definitions of these and other related terms).
The solution that we propose is to: (a) use the existing literature and consultation with experts to screen each data set for plausibility of the taxonomic entities and their reported locations (see the section of this document labeled "Authorities" for more information), and (b) ask each contributor to explicitly state the actual spatial precision and thematic accuracy of their data (as far as it is known to them) rather than arbitrarily requiring conformance to some pre-existing national or international standard.
For Hawaii as a whole there are now several sources in the literature which give simple presence or absence data for alien species on an island-by-island basis. For some substantial fraction of these alien species, museum specimens have been collected over the years and the point coordinates (or at least the place names) of the collection localities recorded. Records of presence or absence (and sometimes density or abundance) also exist for some species in areas such as nature reserves or sites where environmental assessments have been performed.
Nevertheless, for most of the alien species (even the sedentary organisms like plants) it seems that comparatively little data exist in the form of mapped boundaries showing the areal extent of populations and sub-populations. Such data are needed to determine their status at the whole-island scale, or to track the changes in their population distributions over time. Because the repeated field surveys necessary to establish the whole-island distribution of alien species populations and monitor changes in their population boundaries are difficult and expensive to undertake in our rugged terrain, it may be quite some time before extensive data sets of this type are at our disposal for the majority of species.
In the absence of large-scale efforts to systematically collect accurate point location data and voucher specimens for each alien species on every island, the HEAR project has thus far had to rely for population distribution data on the "best guesses" of a small advisory group of experts in biology, resources management, and agriculture. We are now attempting to expand this data-gathering effort by appealing to the many other people with field expertise in these areas to contribute additional data on the alien species distributions on their island. At this initial stage in the HEAR project, even a rather small-scale, imprecise "sketch" map of estimated population boundaries can provide us with valuable data (which can then be field-checked for the highest-priority species).
Since we expect to receive mapped information from a variety of contributors, many of whom may be unknown to us, we have had to set up some minimal guidelines for maintaining spatial data quality. In any case, by converting our current (admittedly crude and incomplete) maps to digital format and publicizing them on this web site, we hope over time to obtain the feedback needed to correct the maps and make successively better approximations to the true population distributions.
Ideally, of course, alien species distribution data would always be contributed to the HEAR project as highly accurate and precise computer coordinate files, or else as standard USGS maps of at least 1:24,000-scale on a stable medium such as mylar, on which points or polygons drawn with great precision would represent the exact center of distribution or boundaries of populations. Unfortunately, in the real world this level of mapping accuracy is rarely available for populations of most wild organisms in Hawaii (or anywhere else, for that matter).
The species distribution maps contributed to the project to date have mostly been on paper, at considerably smaller scales than 1:24,000 (in the case of Hawaii, the largest island, as small as about 1:650,000). We review each such map to see what mapping methods were used, what level of precision was used in determining position, and to identify any potential problems. If found acceptable for our purposes, the map is registered to a standard base map derived from USGS 7.5 min quads, and the distribution data are digitized into an ArcView GIS file.
We are asking our collaborators to provide a certain minimum amount of information (metadata) with each map contributed to the project, so that we (and the end-users of our data) can assess how accurately features were located in the field and represented on the map. This information should include at least items 1a and 2a - 2d below (2e is optional, and we determine 3a ourselves unless maps are already in digital form):
1) Factors related to field observations
(a) methods and degree of precision in fixing and recording the position of features
2) Factors related to field map
(a) identity and scale of original source(s) for field map (e.g., a specific USGS 7.5 min quad)
(b) positional accuracy of original source(s) for field map
(c) actual scale and positional accuracy of field map (if different from 2a and 2b above)
(d) area of the minimum mapping unit used for polygon features
(e) thematic accuracy (%) of polygon features, as verified by a quantitative ground-truth survey
3) Factors related to the digitizing process
(a) accuracy of field map registration (see Appendix 2)
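As an illustration only, the metadata items listed above could be recorded and screened for completeness along the following lines. This is a minimal sketch, not a description of how the HEAR project actually stores its metadata: the class name, the field names, and the free-text representation of each item are all assumptions made for the example. Items 1a and 2a - 2d are treated as required, 2e as optional, and 3a as a value that we fill in ourselves during digitizing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MapMetadata:
    """Metadata for one contributed map (all field names are hypothetical)."""
    field_position_method: Optional[str] = None         # item 1a
    source_identity_and_scale: Optional[str] = None     # item 2a
    source_positional_accuracy: Optional[str] = None    # item 2b
    field_map_scale_and_accuracy: Optional[str] = None  # item 2c
    min_mapping_unit: Optional[str] = None              # item 2d
    thematic_accuracy_pct: Optional[float] = None       # item 2e (optional)
    registration_rmse: Optional[float] = None           # item 3a (set during digitizing)


# Items that a contributor is asked to supply with every map.
REQUIRED_ITEMS = [
    ("1a", "field_position_method"),
    ("2a", "source_identity_and_scale"),
    ("2b", "source_positional_accuracy"),
    ("2c", "field_map_scale_and_accuracy"),
    ("2d", "min_mapping_unit"),
]


def missing_required_items(meta: MapMetadata) -> List[str]:
    """Return the labels of required items that have not been supplied."""
    return [label for label, attr in REQUIRED_ITEMS if getattr(meta, attr) is None]


example = MapMetadata(
    field_position_method="estimated by eye from landmarks",
    source_identity_and_scale="USGS 7.5 min quad, 1:24,000",
)
print(missing_required_items(example))
```

A record that returns an empty list from missing_required_items carries the minimum information requested; it says nothing about whether the stated values are themselves plausible.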
If the quantitative factors listed in section II above are known for any given map, we can then characterize its overall reliability according to an ordinal scale having just three "confidence levels": STRONG, MODERATE, and WEAK. Note that this assessment of confidence level is intended only as a rough qualitative evaluation; that is, the level of confidence we are willing to place in a spatial data set will be RELATIVELY STRONGER given:
(1) higher levels of precision in determining and recording field locations
(2) larger-scale field maps, based on original source(s) which meet USGS Map Accuracy Standards
(3) smaller minimum mapping units (implying higher resolutions)
(4) higher levels of thematic accuracy, and
(5) a "sufficiently small" Root Mean Square Error (RMSE) when digitally registering the field map (see appendix 3 below)
Appendix 1. Definitions of common terms related to confidence level
"SCALE" is the ratio of the actual length of a feature as measured on the ground to the length of the same feature as measured on a map; this is often stated in feet per map inch (e.g., 1" = 500'), or else as the equivalent "representative fraction" (e.g., 1:6,000, meaning that one unit of measurement on the map equals 6,000 of the same units on the surface of the earth).
"ACCURACY" is the general concept that refers to the relationship between a measurement or series of measurements of some feature or relationship on a map, as compared with some standard or accepted value. Note that accuracy is a relative term: one has to have a specific meaning in mind when judging whether a map depicts "reality" with sufficient accuracy. For example, accuracy can be discussed in terms of either: (a) resolution, (b) spatial precision, or (c) thematic or attribute accuracy.
"RESOLUTION" refers to the smallest distance separating two points that can be discriminated on a vector-format line map; or the area of the smallest two-dimensional object that can be depicted on a raster-format map (e.g., a single pixel on a digital image); or the area of the MINIMUM MAPPING UNIT on a thematic map. Resolution is scale-dependent: this means that as the map scale decreases, resolution also necessarily diminishes because feature boundaries must be smoothed, simplified, or not shown at all.
"SPATIAL PRECISION" is the amount of variation among individual measurements of geographic coordinates. Precision is affected by the method used to determine location in the field, and by the number of significant digits used in recording the data and storing it (e.g., in a computer).
"THEMATIC OR ATTRIBUTE ACCURACY" is the percentage of success in assigning the polygons of a thematic map to their correct qualitative categories; it is determined by locating and classifying a sample of actual features (e.g., vegetation patches) on the ground and comparing them to the polygon types as depicted on the map.
Appendix 2. Determining ground error during digitizing
A Root Mean Square Error (RMSE) is calculated when registering a field map prior to digitizing. In this case the RMSE provides a measure of how well the control points on the field map match up with their true or accepted values. A good match is important because it will affect the coordinate accuracy of all features that are subsequently digitized; the actual error in ground units will depend on the scale of the map being registered.
A typical minimum distance between coordinates that can be captured by a good digitizing tablet is between 0.001 and 0.002 inches; the software calculates an RMSE in either digitizer units (e.g., inches or mm) or ground units (e.g., ft or m) after all control points have been entered. ArcView 3 does this calculation automatically and compares the RMSE with the value entered by the user in the Error Limit field; if the RMSE is greater than the Error Limit, ArcView will not allow the user to register the map.
As a point of reference, in ArcView 3 the default value for the Error Limit is set at 0.004 inches, which is recommended by the ArcView user's manual (ESRI 1996) as the maximum RMSE to "maintain highly accurate geographic data". For "less accurate data" the manual states that the Error Limit can be as high as 0.008 inches.
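For readers unfamiliar with the calculation, the following sketch shows the common definition of RMSE for a set of control points, together with a comparison against an error limit. It is not ArcView's own code, whose internals are not described here. It assumes that the digitized control points have already been transformed into the same coordinate system as their reference values, and the function names are our own.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def rmse(measured: Sequence[Point], reference: Sequence[Point]) -> float:
    """Root mean square of the distances between paired control points."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("need equal, non-empty lists of control points")
    squared = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


def registration_allowed(rmse_value: float, error_limit: float = 0.004) -> bool:
    """Registration is refused only when the RMSE exceeds the error limit."""
    return rmse_value <= error_limit


# One control point displaced by 3 units in x and 4 units in y.
print(rmse([(3.0, 4.0)], [(0.0, 0.0)]))  # 5.0
```

The default error_limit of 0.004 mirrors the default value in digitizer inches quoted above; it only makes sense when the RMSE is also expressed in digitizer inches.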
The smaller the scale of the input map being registered to the digitizer, the greater will be the amount of registration error in ground units. For example, if the map is a USGS 7.5 minute quad (1:24,000-scale), then 0.004 inches in digitizer units translates to a maximum RMSE of about 2.4 m (8 ft) on the ground, and 0.008 inches represents about 4.9 m (16 ft). But if the scale of the map is small (say, about 1:654,445, or small enough that the entire island of Hawaii could be printed on a single 8.5" x 11" page), then 0.004 digitizer inches would translate to an error of about 66 m on the ground, and 0.008 digitizer inches would be an error of about 133 m.
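The conversion from digitizer units to ground units is a single multiplication by the scale denominator, followed by a change of units. The sketch below (with a function name of our own choosing) reproduces the small-scale figures given above.

```python
METERS_PER_INCH = 0.0254


def ground_error_m(digitizer_error_in: float, scale_denominator: float) -> float:
    """Ground distance in meters represented by an error in digitizer inches."""
    return digitizer_error_in * scale_denominator * METERS_PER_INCH


# Whole-island map of Hawaii at about 1:654,445
print(round(ground_error_m(0.004, 654445)))  # 66
print(round(ground_error_m(0.008, 654445)))  # 133
```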
The National Park Service - USGS/Biological Resources Division Vegetation Mapping Program is an ongoing cooperative effort by NPS and USGS/BRD to develop and test national standards for vegetation mapping at a scale of 1:24,000 in the national parks. At the time of this writing several draft documents on the classification system, field method, and accuracy assessment procedures (NPS/NBS 1994) are publicly available on a web site (http://biology.usgs.gov/npsveg/protocol.htm), but the final versions have not as yet been posted.
The proposed accuracy requirements of the National Park Service - USGS/Biological Resources Division Vegetation Mapping Program can be summarized as follows (from NPS/NBS 1994):
Maps must meet National Map Accuracy Standards; AND map scale must be 1:24,000; AND minimum mapping unit must be 0.5 hectare (1.2 acres) or smaller; AND thematic accuracy must be at least 80%; AND positional accuracy must be within a maximum allowable Root Mean Square Error (RMSE) of 6.0 meters (20 ft.) at ground scale (for a Class 1 product) OR within a maximum allowable RMSE of 12.0 meters (39 ft.) at ground scale (for a Class 2 product).
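Because these requirements are joined by AND, with the two RMSE limits as alternatives, they can be expressed as a simple check. The sketch below is our own reading of the summary above and not part of the NPS/NBS documents; the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MapProduct:
    """Quantities needed for the check (field names are hypothetical)."""
    meets_nmas: bool
    scale_denominator: int
    min_mapping_unit_ha: float
    thematic_accuracy_pct: float
    positional_rmse_m: float


def nps_usgs_class(product: MapProduct) -> Optional[int]:
    """Return 1 or 2 for a Class 1 or Class 2 product, or None if neither."""
    basic_requirements_met = (
        product.meets_nmas
        and product.scale_denominator == 24000
        and product.min_mapping_unit_ha <= 0.5
        and product.thematic_accuracy_pct >= 80.0
    )
    if not basic_requirements_met:
        return None
    if product.positional_rmse_m <= 6.0:
        return 1
    if product.positional_rmse_m <= 12.0:
        return 2
    return None


print(nps_usgs_class(MapProduct(True, 24000, 0.5, 85.0, 9.0)))  # 2
```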
At map scales of 1:20,000 or smaller, the 1947 U.S. National Map Accuracy Standards (U.S. Bureau of the Budget 1947) require that not more than ten percent of all tested points should have a positional error of more than 1/50 inch (0.02 inch or 0.5080 mm) when measured at the publication scale. On a 1:24,000-scale map, this translates into an allowable error of 40 ft. (ca. 12 m) in ground distance.
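The 1947 rule can be sketched as follows for maps at 1:20,000 or smaller. This is an illustration under our own assumptions (the function names are hypothetical, and the tested errors are assumed to be already expressed as ground distances in meters); it does not cover the separate tolerance that applies to larger-scale maps.

```python
from typing import Sequence

METERS_PER_INCH = 0.0254
NMAS_MAP_TOLERANCE_IN = 1 / 50   # tolerance at publication scale, in inches
MAX_FRACTION_OVER = 0.10         # at most ten percent of tested points


def nmas_ground_tolerance_m(scale_denominator: float) -> float:
    """Ground distance in meters equal to 1/50 inch on the map."""
    return NMAS_MAP_TOLERANCE_IN * scale_denominator * METERS_PER_INCH


def passes_nmas(ground_errors_m: Sequence[float], scale_denominator: float) -> bool:
    """True if no more than ten percent of tested points exceed the tolerance."""
    if scale_denominator < 20000:
        raise ValueError("this sketch covers only scales of 1:20,000 or smaller")
    if not ground_errors_m:
        raise ValueError("at least one tested point is required")
    tolerance = nmas_ground_tolerance_m(scale_denominator)
    over = sum(1 for error in ground_errors_m if error > tolerance)
    return over / len(ground_errors_m) <= MAX_FRACTION_OVER


# At 1:24,000 the tolerance is 480 inches, i.e. 40 ft or about 12.2 m.
print(round(nmas_ground_tolerance_m(24000), 1))  # 12.2
```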
New National Cartographic Standards for Spatial Accuracy have recently [1994 -- ed.] been issued to replace the previous (1947) standards, but the new standards are not necessarily in conflict with 1947 standards. There are now two classes of accuracy, which is computed as a standard error (i.e., an RMSE) in the x- and y-coordinate directions (rather than as a circular error as implied in the 1947 standards).
The accuracy standards apply only to "well-defined" points, meaning control points such as road intersections or benchmarks, which are visible or otherwise recoverable on the ground. To check the positional accuracy of a 1:24,000-scale map, its horizontal control points must be located on the ground and their geographical coordinates ascertained within a radius of one meter. Note that the positional accuracy statements given above in terms of RMSE should be understood to indicate how well the map product is registered to its control, and not to indicate how well other features (e.g., vegetation polygon boundaries) reflect their true positions on the ground.
Most of the alien species distribution data sets so far contributed to the Ecosystems at Risk Project were mapped at scales much smaller than a 1:24,000 USGS quad map. In addition, none of our contributors have conducted the quantitative positional and thematic accuracy assessments called for in the draft NPS/NBS accuracy requirements. Therefore our present digital maps do NOT meet the proposed national vegetation mapping standards. During later stages of the project it may be feasible for us to collect sufficiently accurate GPS data and perform enough quantitative ground-truth sampling so that our maps can meet the national standards in specific geographic areas for a few selected species, but it is probably not realistic to expect that most areas in Hawaii will be mapped by us to this level of accuracy and precision.
However, the draft NPS/NBS procedures do offer some options to deal with circumstances in which accuracy cannot adequately be determined, or in which the requirements for some reason may not have been met. The option which appears to be most applicable to our project is the following: "State actual accuracy rather than conformance to an existing standard. Rather than report whether the product meets a given accuracy requirement, simply state what the accuracy estimate is...[i.e., in terms of each factor listed above]". This information then becomes one of the bases on which the HEAR project assigns a confidence level.
References cited
ESRI. 1996. Using ArcView GIS. Environmental Systems Research Institute, Inc., Redlands, CA.
Thompson, M. 1987. Maps for America: Cartographic products of the U.S. Geological Survey and others. USGS, Reston, VA. 265 pp.
U.S. Bureau of the Budget. 1947. United States National Map Accuracy Standards. U.S. Bureau of the Budget, June 17, 1947.
U.S. Dept. Interior, NPS/NBS. 1994: (a) Final Draft: Vegetation Classification System, (b) Final Draft: Field Methods, (c) Final Draft: Accuracy Assessment Procedures. NBS/NPS Vegetation Mapping Program, Nov. 1994.
This page was last updated 29 January 1997 by PT