AEGIS aims to add value to the data stored in its platform by semantically enriching them with additional information, using well-established standards and technologies.
All data registered and/or stored in the AEGIS platform have to be properly described by metadata, so that users and tools can find, understand, and (re-)use them. Three levels of metadata are distinguished:
AEGIS Domain Vocabularies
Toward the creation of the AEGIS Domain Vocabularies and a repository for them, an initial identification of the requirements, in terms of semantic vocabularies and metadata, has been performed. Since the Public Safety and Personal Security domain is fairly wide and includes numerous diverse concepts and actors, the semantic vocabulary and metadata requirements are consequently also quite diverse: they have to take into account the different datasets that will be provided or used by the AEGIS pilots and therefore cover a large range of domains.
In this context, the following indicative datasets or measured quantities have been identified for each AEGIS pilot case:
Based on the above information, the various types of data were grouped into a set of high-level categories for which semantic vocabularies are required. For each of these categories, a set of semantic vocabularies has been identified and selected so as to provide a rich set of options when choosing the most suitable semantic annotation of the platform’s data, and to cover a wide range of possible dataset categories. The high-level categories and some identified semantic vocabularies are the following:
Health
- DICOM – Healthcare metadata – DICOM ontology (https://www.netestate.de/dicom/dicom.owl)
- Translational Medicine Ontology – TMO (https://code.google.com/archive/p/translationalmedicineontology/)
- The Disease Ontology (http://disease-ontology.org/)
Sensor
- Semantic Sensor Network – SSN (http://w3c.github.io/sdw/ssn/)
- Home Activity – ha (http://sensormeasurement.appspot.com/ont/home/homeActivity#)
- Sensor, Observation, Sample, and Actuator – SOSA (https://www.w3.org/ns/sosa/)
Traffic – Road Conditions
- Linked Datex II – Datex (http://vocab.datex.org/terms#)
Car Accidents
- Road Accident Ontology – RAO (https://www.w3.org/2012/06/rao.html)
Weather
- Smart Home Weather – SHW (http://paul.staroch.name/thesis/SmartHomeWeather.owl#)
- Home Weather – HW (https://www.auto.tuwien.ac.at/downloads/thinkhome/ontology/WeatherOntology.owl)
Map – Location
- LinkedGeoData ontology – LGDO (http://linkedgeodata.org/About/)
- Geo ontology (https://www.w3.org/2003/01/geo/)
Crime
- OntoFuhSen Ontology (https://github.com/LiDaKrA/Ontology)
- Italian Crime Ontology (https://www.researchgate.net/publication/228971566_A_domain_ontology_Italian_crime_ontology)
Security – Safety
- Security Ontology (http://securitytoolbox.appspot.com/securityMain)
- Acl (https://www.w3.org/ns/auth/acl)
Events – News
- Bbccore (https://www.bbc.co.uk/ontologies/coreconcepts)
- Simple News and Press Ontologies – SNaP (http://data.press.net/ontology/)
Automotive – Transportation
- Ontology of Transportation Networks (http://opensensingcity.emse.fr/scans/entity/vocabulary_8)
- Road Accident Ontology – RAO (https://www.w3.org/2012/06/rao.html)
General Purpose
- DBpedia (http://dbpedia.org/ontology/)
- DCMI Metadata Terms – dcterms (http://dublincore.org/documents/dcmi-terms/)
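As an illustration of what semantic annotation with one of the listed domain vocabularies might look like, the following minimal sketch describes a single sensor reading with SOSA terms and emits it as N-Triples. The base IRI `http://example.org/aegis/`, the function name and the identifiers passed to it are assumptions made for the example; they are not part of the AEGIS platform.

```python
from datetime import datetime, timezone

SOSA = "http://www.w3.org/ns/sosa/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical base IRI for platform resources (an assumption, not from AEGIS).
BASE = "http://example.org/aegis/"


def observation_triples(obs_id, sensor_id, prop_id, value, when):
    """Return N-Triples lines describing one sensor reading with SOSA terms."""
    obs = f"<{BASE}observation/{obs_id}>"
    return [
        f"{obs} <{RDF_TYPE}> <{SOSA}Observation> .",
        f"{obs} <{SOSA}madeBySensor> <{BASE}sensor/{sensor_id}> .",
        f"{obs} <{SOSA}observedProperty> <{BASE}property/{prop_id}> .",
        f'{obs} <{SOSA}hasSimpleResult> "{value}"^^<{XSD}double> .',
        f'{obs} <{SOSA}resultTime> "{when.isoformat()}"^^<{XSD}dateTime> .',
    ]


if __name__ == "__main__":
    reading_time = datetime(2018, 1, 15, 10, 30, tzinfo=timezone.utc)
    for line in observation_triples("42", "thermo-1", "temperature", 21.5, reading_time):
        print(line)
```

The same pattern would apply to the other categories by swapping in the corresponding vocabulary namespace and terms.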
Apart from using appropriate domain vocabularies to semantically enrich the data of the AEGIS platform, contextual metadata will be used in order to provide a more detailed description of the platform’s datasets and allow users to gain further insights into them. These metadata are information about each individual dataset, such as the title, the provider, the publication date and more. The following table presents a list of more general vocabularies that can be used by the AEGIS platform to describe individual datasets and their quality based on multiple attributes.
Name | Description | Link |
Data Catalog Vocabulary (DCAT) | DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogues published on the Web. | https://www.w3.org/TR/vocab-dcat/ |
Vocabulary Of A Friend (VOAF) | VOAF is a vocabulary specification providing elements allowing the description of vocabularies (RDFS vocabularies or OWL ontologies) used in the Linked Data Cloud. In particular, it provides properties expressing the different ways such vocabularies can rely on, extend, specify, annotate or otherwise link to each other. It relies itself on Dublin Core and voiD. | http://purl.org/vocommons/voaf |
Vocabulary for Annotations (VANN) | A vocabulary for annotating vocabulary descriptions. | http://purl.org/vocab/vann/ |
Vocabulary of Interlinked Datasets (VoID) | The Vocabulary of Interlinked Datasets (VoID) is an RDF Schema vocabulary for expressing metadata about RDF datasets. It is intended as a bridge between the publishers and users of RDF data, with applications ranging from data discovery to cataloguing and archiving of datasets. | http://rdfs.org/ns/void# |
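To make the contextual metadata described above more concrete, the sketch below serialises a dataset's title, provider and publication date as a DCAT description in Turtle. It is only an illustration under stated assumptions: the function names, the `example.org` IRIs and the sample values are hypothetical, and the choice of `dct:title`, `dct:publisher`, `dct:issued` and `dcat:keyword` is one plausible mapping rather than the mapping adopted by AEGIS.

```python
PREFIXES = (
    "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dct: <http://purl.org/dc/terms/> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
)


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def dataset_to_turtle(iri, title, publisher_iri, issued, keywords=()):
    """Describe one dataset as a dcat:Dataset (all argument values are caller-supplied)."""
    lines = [
        f"<{iri}> a dcat:Dataset ;",
        f'    dct:title "{escape_literal(title)}" ;',
        f"    dct:publisher <{publisher_iri}> ;",
        f'    dct:issued "{issued}"^^xsd:date',
    ]
    for keyword in keywords:
        lines[-1] += " ;"  # continue the predicate list of the same subject
        lines.append(f'    dcat:keyword "{escape_literal(keyword)}"')
    lines[-1] += " ."  # terminate the statement
    return PREFIXES + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical dataset and provider IRIs, for illustration only.
    print(dataset_to_turtle(
        "http://example.org/aegis/dataset/road-accidents",
        "Road accidents",
        "http://example.org/aegis/org/example-provider",
        "2018-01-15",
        ["traffic", "safety"],
    ))
```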
AEGIS Vocabularies and Metadata Repository
The AEGIS Vocabularies and Metadata repository is going to be the central repository for both the domain vocabularies and the metadata of the platform’s data. It aims to offer a number of different functionalities, such as querying, searching, managing and interlinking the vocabularies and the metadata. More specifically, through the repository, a user will be able to insert vocabularies, search for vocabularies or datasets based on different criteria and keywords, evaluate SPARQL queries, find related vocabularies, download data dumps and more.
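As a rough sketch of how a keyword search could be expressed as a SPARQL query against the repository, the example below builds a query for vocabularies whose title contains a keyword and encodes it as a GET request URL following the SPARQL 1.1 Protocol (`query` parameter). The endpoint URL is hypothetical, and the assumption that vocabularies are typed as `voaf:Vocabulary` and titled with `dct:title` is ours; the actual repository schema may differ.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical SPARQL endpoint; the real repository address is not given in the text.
ENDPOINT = "http://example.org/aegis/sparql"


def vocabulary_search_query(keyword, limit=10):
    """Build a SPARQL SELECT query matching vocabulary titles against a keyword."""
    # Escape backslashes and double quotes so the keyword stays inside the literal.
    safe = keyword.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX voaf: <http://purl.org/vocommons/voaf#>\n"
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "SELECT ?vocab ?title WHERE {\n"
        "  ?vocab a voaf:Vocabulary ;\n"
        "         dct:title ?title .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?title)), LCASE("{safe}")))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def query_url(endpoint, query):
    """Encode the query as a SPARQL 1.1 Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    query = vocabulary_search_query("weather", limit=5)
    print(query)
    print(query_url(ENDPOINT, query))
```

The sketch only constructs the request; sending it and parsing the result format are left out because they depend on the endpoint's configuration.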
The AEGIS Vocabulary Repository will be built on top of the LinDa Workbench infrastructure (http://linda-project.eu/tools/). LinDa is a generic vocabulary / ontology metadata repository that allows for registering, describing, and searching vocabularies. It also supports a variety of more advanced capabilities like transformation to RDF, analytics, visualizations, and more.

Currently, LinDa provides descriptions of more than 300 vocabularies used to describe data in the Linked Open Data cloud, which break down into thousands of classes and properties.

Because the vocabulary repository presents the end user with various ontologies, it will support the transformation of traditional data formats to Linked Data by suggesting classes and properties. The repository will be used through actions that can be grouped into the following categories:
- Navigation: Actions that let the user search for vocabularies and entities inside them, read vocabulary descriptions, download the vocabulary RDF documents in various formats and get access to vocabulary visualizations and best usage practices.
- Usage feedback: Evaluation of vocabularies, discussions and comments that expose the advantages and disadvantages of choosing a vocabulary’s terminology when creating transformation plans, and that guide the user base of an enterprise to the vocabularies best representing its structure, operations and needs.
- Repository enrichment: Authenticated users may create and upload new vocabularies containing ontologies that do not exist in the initial repository or are specific to the enterprise. Vocabulary owners may further update their vocabularies at any time. The repository automatically extracts metadata contained in the vocabulary RDF document, such as classes and properties, as well as their relations.
- Term suggestion: Web API methods pick the most prevalent vocabulary terms that describe real world objects and relationships.
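
The term suggestion action can be pictured with a small sketch: given a column name from a traditional data source, candidate vocabulary terms are ranked first by how many words they share with the column name and then by how widely the term is used. This is not the LinDa Web API; the function names, the in-memory index and the usage counts in it are invented purely for illustration.

```python
import re

# Hypothetical index of (term IRI, label, usage count). The counts are made-up
# illustrative numbers, not measurements from any repository.
TERM_INDEX = [
    ("http://www.w3.org/2003/01/geo/wgs84_pos#lat", "latitude", 950),
    ("http://www.w3.org/2003/01/geo/wgs84_pos#long", "longitude", 940),
    ("http://purl.org/dc/terms/title", "title", 1200),
    ("http://purl.org/dc/terms/date", "date", 800),
    ("http://www.w3.org/ns/sosa/resultTime", "result time", 300),
]


def tokens(text):
    """Split a column name or label into a set of lowercase word tokens."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)  # eventDate -> event Date
    return set(re.findall(r"[a-z0-9]+", spaced.lower()))


def suggest_terms(column_name, index=TERM_INDEX, top_n=3):
    """Rank terms by word overlap with the column name, then by usage count."""
    wanted = tokens(column_name)
    scored = []
    for iri, label, usage in index:
        overlap = len(wanted & tokens(label))
        if overlap:
            scored.append((overlap, usage, iri))
    scored.sort(reverse=True)  # highest overlap first, ties broken by usage
    return [iri for _, _, iri in scored[:top_n]]


if __name__ == "__main__":
    for column in ["result_time", "eventDate", "latitude"]:
        print(column, suggest_terms(column))
```

A production service would presumably draw prevalence statistics from the repository itself and use more robust matching (synonyms, stemming), which this sketch deliberately omits.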

Author: NTUA