Project Goals and Methods: Extending the Hymenoptera Ontology
From HAO Wiki
Strengthen the HAO through increased engagement of hymenopterists
The utility of the HAO is directly proportional to the amount of input and work provided by the hymenopterist community, i.e., the domain experts who best understand the intricacies and complex interplay of our descriptive terminology. No amount of automation or artificial intelligence can replace the expertise of a worker who has dedicated her life to the study of the anatomy of a given organism. Recognizing this, the most important improvements we can make to the HAO will be those that simplify and encourage the interaction of the hymenopterist (i.e., human) element. We will encourage the development of the HO by continuously simplifying the process by which feedback and input from the larger community can be integrated into the formalized data structure that is the HO. These improvements will include:
- Context-sensitive linkages between the HAO database and the HAO wiki. Our existing database (mx) provides context-sensitive help by dynamically generating links to help pages in an associated wiki. We will adopt this functionality by mapping the terms and relationships employed in the ontology 1:1 to wiki pages intended for discussion and debate, not only by the hymenopterist community but also by others, such as experts in semantics and computer science.
- Feedback mechanisms for managing and archiving suggestions for improvements and corrections to the existing utilities. While such a mechanism already exists for mx (SourceForge repository), we will create a similar ticket-based bug-tracking and feature-request system, likely using Trac or similar software, specifically for HAO-related software.
- Outreach via physical and virtual workshops. With this goal we seek to raise awareness of the ontology and highlight how the community as a whole will benefit from its existence and continued development. We will work in close conjunction with AmphibAnat to meet these goals. The project will also run a blog that highlights results, along with RSS feeds for individuals interested in tracking changes to the Hymenoptera Anatomy Ontology.
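The 1:1 mapping between ontology entries and wiki pages described in the first item above could be sketched roughly as follows. This is only an illustration, not the mx implementation: the base URL, the `Term:`/`Relationship:` title prefixes, and the function names are all assumptions introduced here.

```python
from urllib.parse import quote

# Hypothetical base URL; the real wiki address is not given in the text.
WIKI_BASE = "https://wiki.example.org/index.php/"


def wiki_page_title(kind: str, name: str) -> str:
    """Build a deterministic page title such as 'Term:metapleural_triangle'.

    The 'Term:' and 'Relationship:' prefixes are an assumed convention.
    """
    if kind not in ("Term", "Relationship"):
        raise ValueError(f"unknown kind: {kind!r}")
    # Wiki-style titles conventionally use underscores in place of spaces.
    return f"{kind}:{name.strip().replace(' ', '_')}"


def wiki_url(kind: str, name: str, base: str = WIKI_BASE) -> str:
    """Return the URL of the discussion page for one term or relationship."""
    # Percent-encode anything unusual, but keep ':' readable in the title.
    return base + quote(wiki_page_title(kind, name), safe=":_")


if __name__ == "__main__":
    print(wiki_url("Term", "metapleural triangle"))
    print(wiki_url("Relationship", "part_of"))
```

Because the title is derived purely from the entry's kind and name, the same entry always resolves to the same page, which is what makes the mapping 1:1.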
We recognize that an ontology of Hymenoptera anatomy is not complete without reference to other languages, and we will develop our tools with multi-language support built in. Our team already actively engages collaborators who speak other languages, so we should be able to test and use this functionality imminently.
Refining and formalizing the HO
Our initial efforts focused on capturing the most widely used terms and the easily derived relationships between those terms. Continuing work in this area focuses on two aspects: data capture and model formalization. Following standards established by the Plant Ontology Consortium (Ilic et al. 2007), the HAO group established the following requirements in this regard: 1) All terms must be referenced by a publication that defines and/or introduces usage of that term. References are stored in the ontology database and translated to dbxref format in the OBO output. 2) All concepts must be related (i.e., the HAO cannot contain orphaned terms). 3) Each term (or minimally each preferred term, as determined by the domain community, in the case of synonyms) must be typified by an instance in Morphbank (images and especially annotations on images). Linkages to Morphbank are presently used to enable the figuring of terms in the ontology.
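The requirements above lend themselves to an automated check. The following is a minimal sketch under stated assumptions: the `Term` record, its field names, and the placeholder identifiers (`EX:...`, `REF:...`, `MB:...`) are hypothetical and do not reflect the actual mx schema.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """Hypothetical in-memory term record (not the mx schema)."""
    id: str
    name: str
    references: list = field(default_factory=list)     # publication xrefs
    relationships: list = field(default_factory=list)  # (relation, target_id)
    morphbank_ids: list = field(default_factory=list)  # linked Morphbank records
    preferred: bool = True


def check_requirements(terms):
    """Return {term_id: [problems]} for terms violating a requirement."""
    # A term counts as related if it has an outgoing relationship
    # or is the target of another term's relationship.
    targets = {target for t in terms for _, target in t.relationships}
    problems = {}
    for t in terms:
        issues = []
        if not t.references:
            issues.append("no reference")
        if not t.relationships and t.id not in targets:
            issues.append("orphaned")
        if t.preferred and not t.morphbank_ids:
            issues.append("no Morphbank instance")
        if issues:
            problems[t.id] = issues
    return problems


if __name__ == "__main__":
    sample = [
        Term("EX:1", "part A", ["REF:1"], [("part_of", "EX:2")], ["MB:1"]),
        Term("EX:2", "part B", ["REF:2"], [], ["MB:2"]),
        Term("EX:3", "part C"),
    ]
    print(check_requirements(sample))
```

Running such a check before each release would flag terms that need curation without blocking data entry.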
Further refinement will address especially problematic character systems. This phase of the project requires a research associate (István Mikó) with expertise in comparative Hymenoptera anatomy, who will work in concert with PI Deans and graduate students to re-examine the following character systems (currently fraught with homonymy and synonymy): ridges and carinae of the apocritan functional thorax (mesosoma), glands, coriaceous patches on the body surface, male genitalia, female genitalia, ovipositor, sclerites at the wing base (including associated muscles, notal and pleural wing articulations), and apodemes. Specific attention will be paid to homonyms, for example speculum, pedicel, gaster, face, stigma, disc, metapleural triangle, and anellus, which arguably create more confusion than synonyms and make data extraction especially difficult.
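To make the distinction concrete: a homonym is one label attached to several concepts, while a synonym set is several labels attached to one concept. The sketch below shows how both could be listed from a table of label usages. The function names, the concept identifiers (`C1`, `C2`, `C3`), and the sample rows are placeholders for illustration, not HAO data.

```python
from collections import defaultdict


def find_homonyms(usages):
    """Return {label: [concept_ids]} for labels that name more than one concept.

    `usages` is an iterable of (label, concept_id) pairs.
    """
    by_label = defaultdict(set)
    for label, concept in usages:
        # Compare labels case-insensitively and ignore stray whitespace.
        by_label[label.strip().lower()].add(concept)
    return {lab: sorted(ids) for lab, ids in by_label.items() if len(ids) > 1}


def find_synonyms(usages):
    """Return {concept_id: [labels]} for concepts with more than one label."""
    by_concept = defaultdict(set)
    for label, concept in usages:
        by_concept[concept].add(label.strip().lower())
    return {c: sorted(labs) for c, labs in by_concept.items() if len(labs) > 1}


if __name__ == "__main__":
    # Placeholder rows: one label used for two concepts, and one concept
    # known under two labels.
    usages = [
        ("pedicel", "C1"),
        ("Pedicel", "C2"),
        ("label x", "C3"),
        ("label y", "C3"),
    ]
    print(find_homonyms(usages))
    print(find_synonyms(usages))
```

A homonym cannot be resolved from the label alone, which is why extraction from text needs the publication context to pick the intended concept.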
To ensure extensibility of the HAO we will further refine its structure with respect to the mathematical logic and models underlying the presently defined relationships. This process includes the integration of the HAO with the Common Anatomy Reference Ontology (CARO; Haendel et al. 2008) and, with help from a domain expert postdoc, a mapping of known (or demonstrable) homologies between hymenopterans and the two model organisms for which anatomical ontologies are extensively used: mosquito (in this case focused on Anopheles gambiae) and Drosophila melanogaster. The HAO will then be positioned for integration with a broad array of ontologies from other domains.
Integration of the HAO with the broader scientific community
The HAO will be integrated with the larger scientific community through the use of the widely used and broadly supported OBO standard and through automated translation into Web Ontology Language (OWL) format (e.g., OBO 2 OWL). The following critical steps in this process will be met.
Formalization of references to terms managed in other repositories. Mechanisms will be created for linking terms within the HO to terms governed by other ontologies through the globally unique identifiers (GUIDs) provided by those ontologies. Terms currently in the HAO that belong elsewhere (e.g., 'anteriad' HAO:0001738 should be in the spatial ontology, and 'rugose' HAO:0000749, as a phenotype descriptor, belongs in PATO) will be migrated to appropriate ontologies. PATO in its current form cannot be used to adequately describe Hymenoptera phenotypes. Aggregating terms for submission to other ontologies will be accomplished with the tagging functionality built into mx (Fig. 1). This allows tags, for example "PATO candidate", to be applied to sets of terms, which will ultimately be exported and submitted to their respective ontologies. Once a term is integrated into another ontology, that resource is back-matched in the HO DB, and the relevant dbxref is linked to the term(s) in question.
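The tag, export, and back-match cycle described above might look roughly like this. It is a sketch only: the dictionary layout and function names are assumptions rather than the mx data model, the "spatial candidate" tag is invented for the example, and `PATO:XXXXXXX` is a deliberate placeholder for whatever identifier the receiving ontology would assign.

```python
def export_tagged(terms, tag):
    """Return sorted (id, name) pairs for every term carrying `tag`."""
    return sorted(
        (term_id, t["name"]) for term_id, t in terms.items() if tag in t["tags"]
    )


def back_match(terms, local_id, external_id, tag):
    """Record the external identifier as a dbxref and clear the candidate tag."""
    term = terms[local_id]
    if external_id not in term["dbxrefs"]:
        term["dbxrefs"].append(external_id)
    term["tags"].discard(tag)


if __name__ == "__main__":
    # Term IDs and names are the two examples given in the text; the
    # record layout and the second tag are hypothetical.
    terms = {
        "HAO:0000749": {"name": "rugose", "tags": {"PATO candidate"}, "dbxrefs": []},
        "HAO:0001738": {"name": "anteriad", "tags": {"spatial candidate"}, "dbxrefs": []},
    }
    print(export_tagged(terms, "PATO candidate"))
    # After the receiving ontology accepts the term, link its identifier.
    # "PATO:XXXXXXX" is a placeholder, not a real PATO ID.
    back_match(terms, "HAO:0000749", "PATO:XXXXXXX", "PATO candidate")
    print(terms["HAO:0000749"])
```

Removing the candidate tag on back-match keeps later exports from resubmitting terms that have already been accepted elsewhere.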
Making the HAO available in a standard format is part of our strategy to integrate this ontology with the rest of science. Periodic deposits of explicitly versioned HOs into the OBO Foundry (see Table 3), the current standard repository for biological ontologies, will ensure maximum utility and opportunities for user feedback.
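One small, concrete piece of working in the OBO standard and translating to OWL is identifier mapping. Under the OBO Foundry identifier convention, an OBO-style ID such as HAO:0001738 corresponds to an IRI under the purl.obolibrary.org namespace. The sketch below illustrates that convention; the function names are introduced here, and whether the project's own translation pipeline applies exactly this mapping is an assumption.

```python
import re

# Base of the OBO Foundry permanent-URL namespace.
OBO_PURL = "http://purl.obolibrary.org/obo/"

# Prefix, colon, numeric local identifier (e.g. HAO:0001738).
_OBO_ID = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):(\d+)$")


def obo_id_to_iri(obo_id: str) -> str:
    """Map 'PREFIX:digits' to its OBO Foundry style IRI."""
    match = _OBO_ID.match(obo_id)
    if match is None:
        raise ValueError(f"not an OBO-style identifier: {obo_id!r}")
    prefix, local = match.groups()
    return f"{OBO_PURL}{prefix}_{local}"


def iri_to_obo_id(iri: str) -> str:
    """Inverse mapping, for IRIs that follow the same convention."""
    if not iri.startswith(OBO_PURL):
        raise ValueError(f"not an OBO Foundry IRI: {iri!r}")
    # Split on the last underscore so prefixes may themselves contain one.
    prefix, _, local = iri[len(OBO_PURL):].rpartition("_")
    if not prefix or not local.isdigit():
        raise ValueError(f"unexpected IRI shape: {iri!r}")
    return f"{prefix}:{local}"


if __name__ == "__main__":
    iri = obo_id_to_iri("HAO:0001738")
    print(iri)
    print(iri_to_obo_id(iri))
```

Keeping the mapping reversible means identifiers cited in either format resolve to the same term.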
Providing a RESTful web services API to the HAO DB. The API will allow anyone not only to access the HO but also to create their own web-accessible ontology using mx and the API.
Formalizing our relationship with other projects. We plan to regularly share ideas about what worked (and what didn't) with others working on ontology-related informatics projects. In particular, we have established a working relationship with the developers of Phenote and Phenex.
Thanks to the tight interplay between the empirical data and the interactive software that will be derived from this project, our end product will be transformational, benefiting both empiricists and theoreticians. Development of the ontology and related tools will use rapid prototyping, and we will immediately provide feedback to our users.