The AEGIS Architecture & Technical Requirements: Methodology & Produced Results


The AEGIS platform promises a novel approach to data sharing and exchange: a safe environment in which stakeholders can sell and purchase datasets, data services, algorithms or intelligence reports. To reach this objective, AEGIS needs to understand the environment in which it operates and the needs of its stakeholders, so that it can offer a set of services that add value to this value chain.

Towards this objective, AEGIS analysed the methodologies available for requirements engineering in order to select the most relevant and efficient one in the context of the project. From this state-of-play analysis, the AEGIS consortium concluded that the Agile development methodology was the most suitable and efficient one to adopt. The Agile methodology provides opportunities to assess the direction of a project throughout the development lifecycle. It is people-focused and communications-oriented, on the premise that requirements cannot be fully collected at the beginning of the software development cycle. Instead, every aspect of development (requirements, design, etc.) is continually revisited, so that there is still time to steer the project in another direction. Agile planning is adaptive, focused on quick responses to change through continuous development and improvement.

In the Agile methodology, the whole product is broken into small increments that minimize the overall risk and allow the product to adapt to changes quickly. Iterations are short time frames (called timeboxes), each involving a cross-functional team that covers all functions: planning, analysis, design, coding, unit testing, and acceptance testing. The requirements and tasks to be completed in a timebox are defined at the beginning of the iteration and agreed upon by the team. Another key concept in Agile development is the close collaboration between the development team and the business stakeholders. Accordingly, the requirements were elicited in the form of user stories coming directly from the demonstrator partners.

Figure 1. Requirement Engineering

 

The Agile-compliant phases defined for the engineering of the AEGIS user requirements are graphically illustrated in Figure 1. Once the AEGIS stakeholders have been identified and their high-level goals, objectives and needs have been properly defined, the user stories need to be elicited. Two of the most popular methods for eliciting user stories are workshops and structured and/or semi-structured interviews. Within the context of AEGIS, the technical partners held such workshops and interviews with the pilot teams in order to obtain the relevant user stories.

User stories are a very high-level definition of requirements, each describing a feature from the perspective of the person who desires the new capability, usually a user or customer of the system. A user story is short, generally one sentence, but it contains enough information to describe the requirement, so that the developers can produce a reasonable estimate of the effort needed to implement it. A user story typically follows a simple template: As a <user-type>, I want to <user-requirement> so that <reason>. User stories are written throughout the Agile project. Usually a story-writing workshop is held close to the start of the project, and new stories can be written and added in every iteration.
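The template lends itself to a small data structure. The following is a minimal Python sketch, not part of the AEGIS tooling: the `UserStory` class, its field names and the sample story are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserStory:
    """One user story in the 'As a ..., I want to ... so that ...' form."""
    user_type: str
    requirement: str
    reason: str

    def render(self) -> str:
        # Fill the three slots of the user story template.
        return (f"As a {self.user_type}, I want to {self.requirement} "
                f"so that {self.reason}.")


# Hypothetical story, invented for this example only.
story = UserStory(
    user_type="data analyst",
    requirement="choose a service to work on my data",
    reason="I can run an analysis without writing code",
)
print(story.render())
```

Keeping the three slots as separate fields, rather than one free-text sentence, would make it easier to group stories by user type later on.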

The collected user stories were in turn analysed by the technical partners so that the user requirements could be extracted, aggregated and checked for consistency. To produce a proper listing, the requirements also need to be classified. Within the context of AEGIS, the requirements were split according to two main attributes: 1) whether they belong to the core platform or are demonstrator-specific, and 2) whether they constitute functional or non-functional requirements.

  • Functional requirements are one of the most well-known types of requirements. They define the required behaviour of the system to be built, as reported by a hypothetical observer envisioning the inputs that the future system will accept and the outputs it will produce in response to those inputs; i.e., they define the capabilities that a product must provide to its users. Functional requirements are based on system objectives and address the critical task of ensuring that the expected functionality is correctly implemented in the final software. One of the main tools to achieve this goal is system testing, i.e., a mechanism to verify that the system performs the behaviour expressed in its requirements.
  • Non-functional requirements specify additional properties of AEGIS, other than functionality. These requirements can be subcategorized into categories such as performance, design constraints (which can also be categorized under external interface requirements), logical database requirements, and “characteristics” (termed “attributes” in IEEE Std. 830) that don’t fit neatly into any of the other categories. Non-functional requirements can also describe quality attributes and design and implementation constraints that the product must satisfy; they are therefore more qualitative and may require a different approach for their elicitation. To identify the non-functional requirements, the model proposed by ISO/IEC 25010:2011 was adopted. According to that model, eight quality characteristics contribute to software product quality. The ISO/IEC 25010:2011 Software Product Quality model is illustrated in Figure 2.
  • Core requirements refer to all requirements associated with, and addressed by, the AEGIS platform. They are not bound to a specific demonstrator but cater for the current and future needs associated with the services AEGIS aims to offer to all stakeholders horizontally. All core processing tasks represented by core requirements will be available to all pilots and stakeholders.
  • Demonstrator requirements are the requirements of each specific demonstrator. Demonstrator requirements refer to actions performed by the users or to processes supported by the applications to be developed on top of AEGIS.
Figure 2. ISO/IEC 25010:2011 Software Product Quality model

 

A table was thus created to document the extracted requirements. An extract of it is presented in Table 1 below.

Table 1. Requirements Listing

Id | Requirement Type | Requirement Nature | Requirement Description
UBI1 | Core | Functional | AEGIS should be able to process sensor data, including environmental (indoor and outdoor), occupancy sensing and Air Quality monitoring, from installed physical devices.
KTH1 | Demonstrator | Functional | Users should be able to choose a service to work on their data
GFT2 | Core | Functional | AEGIS needs to have a data/knowledge base to handle traceability and submitted issues status
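The two classification attributes can be modelled directly as data. The sketch below is a hypothetical Python illustration: the `Scope`, `Nature` and `Requirement` names and the `select` helper are assumptions introduced here, and only the three rows come from Table 1 (with shortened descriptions).

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    CORE = "Core"
    DEMONSTRATOR = "Demonstrator"


class Nature(Enum):
    FUNCTIONAL = "Functional"
    NON_FUNCTIONAL = "Non-functional"


@dataclass(frozen=True)
class Requirement:
    req_id: str
    scope: Scope
    nature: Nature
    description: str


# The three rows of Table 1, with shortened descriptions.
requirements = [
    Requirement("UBI1", Scope.CORE, Nature.FUNCTIONAL,
                "Process sensor data from installed physical devices"),
    Requirement("KTH1", Scope.DEMONSTRATOR, Nature.FUNCTIONAL,
                "Users can choose a service to work on their data"),
    Requirement("GFT2", Scope.CORE, Nature.FUNCTIONAL,
                "Data/knowledge base for traceability and issue status"),
]


def select(reqs, scope, nature):
    """Return the ids of the requirements matching both attributes."""
    return [r.req_id for r in reqs if r.scope is scope and r.nature is nature]


core_functional = select(requirements, Scope.CORE, Nature.FUNCTIONAL)
print(core_functional)
```

Filtering on both attributes at once mirrors how an extract limited to core functional requirements could be derived from the full listing.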

 

The final list of user requirements was produced after validation in a dedicated workshop involving pilot and technical partners. An extract of these requirements is presented in Table 2 below. The extract covers only core functional requirements.

 

Table 2. Core Functional Requirements

Id | Description (Detailed description of the requirement)

Analytics
CFR22 | AEGIS should be able to process daily routines as self-reported by users or automatically extracted by wearables
CFR4 | AEGIS has to support many analysis types (e.g. estimation of correlations between variables, linear regression, predictive analysis, clustering algorithms, simulations)
CFR41 | AEGIS should be able to process sensor data (including environmental (temperature/humidity/luminance), occupancy sensing and Air Quality monitoring) from installed physical devices

Correlation
CFR10 | AEGIS has to correlate datasets to geospatial data with their description
CFR11 | AEGIS should work simultaneously with public and private (customers) data
CFR28 | AEGIS should be able to correlate positional information (and additional information from wearables) with Public Health Information data and announcements, also taking time into account

Security / Privacy
CFR6 | AEGIS has to implement security mechanisms as well as proper handling of privacy issues (e.g. in case of using private datasets)
CFR7 | AEGIS needs to display different levels of information depending on who is accessing the data
CFR8 | AEGIS should allow the creation of different users/groups and access rights for authorized system users

 

The extraction and analysis of the requirements led to the design of the AEGIS architecture, which went through a number of iterations before reaching its final design. The stable, final version 1 design of the AEGIS architecture is graphically illustrated in Figure 3. The functional and non-functional requirements were translated into technical requirements, and each of these was mapped to functionalities of components, so that the set of components comprising the holistic AEGIS architecture covers the complete set of functional, non-functional and technical requirements.

 

Figure 3. AEGIS conceptual architecture

 

The core of the AEGIS platform is its Big Data Processing Cluster. For this cluster, the project consortium opted to exploit the Hops platform[1]. Hops is a next-generation distribution of Apache Hadoop[2] that supports, among other features and services, Hadoop as a Service, project-based multi-tenancy, secure sharing of datasets across projects, extensible metadata with free-text search using Elasticsearch[3], and YARN[4] quotas for projects. Hopsworks is the user interface built around Hops, providing graphical access to integrated services such as Spark, YARN, Elasticsearch, Kafka and Apache Zeppelin. HopsYARN is responsible for the resource management of the cluster. It is a distributed stateless Resource Manager that enables Hops to have no down-time, while also providing efficient resource management with consistent operations, security and data governance tools.

The storage capabilities of the platform will be provided by the AEGIS Data Store. The AEGIS Data Store will be based on HopsFS, a new implementation of the Hadoop Distributed File System (HDFS) provided by Hops. HopsFS enables more scalable clusters, as it supports multiple stateless NameNodes with the metadata stored in an in-memory distributed database, which increases performance dramatically. It should be noted that the AEGIS Data Store will also be extended with additional storage solutions, such as solutions for storing linked metadata.

The AEGIS Data Harvester consists of all applications that enable the import of the data and metadata of the original dataset, along with any transformations required. Since the original datasets may be provided in multiple formats, it is essential that a wide spectrum of data formats is supported. The AEGIS Data Annotator component refines the output of the Data Harvester, its main purpose being the enrichment of the metadata or linked data using predefined ontologies and vocabularies. Through semantic annotation, the concepts included in the selected subset are related to well-defined semantics. Semantic annotations provide information ‘about’ the data, for example what the data means and which semantic relationships are available from the domain model in which the data is defined. The purpose of semantically annotating a dataset is to create a context, in terms of the content and functionality of the data, so that it can be easily interpreted, combined and reused by computers.

The AEGIS Brokerage Engine is responsible for applying the policies concerning read and execution permissions, as defined in the AEGIS Data Policy, to the artefacts of the AEGIS platform, such as datasets, services and algorithms. The Brokerage Engine is also responsible for maintaining records of any action performed on these artefacts.
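The two responsibilities described here, permission checks and record keeping, can be sketched in a few lines of Python. This is a hypothetical illustration only: the class layout, the policy format keyed by (user, artefact), the sample users and artefacts, and the choice to log denied attempts as well are all assumptions, not the actual AEGIS design.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class BrokerageEngine:
    # Hypothetical policy layout: (user, artefact) -> set of granted actions.
    policy: dict
    audit_log: list = field(default_factory=list)

    def is_allowed(self, user: str, artefact: str, action: str) -> bool:
        allowed = action in self.policy.get((user, artefact), set())
        # Assumption: every attempt is recorded, whether or not it is permitted.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "artefact": artefact,
            "action": action,
            "allowed": allowed,
        })
        return allowed


# Invented users and artefacts, for illustration only.
engine = BrokerageEngine(policy={
    ("alice", "dataset:air-quality"): {"read"},
    ("bob", "algorithm:clustering"): {"read", "execute"},
})
can_read = engine.is_allowed("alice", "dataset:air-quality", "read")
can_execute = engine.is_allowed("alice", "dataset:air-quality", "execute")
print(can_read, can_execute, len(engine.audit_log))
```

Routing every check through a single method is one simple way to make sure no action on an artefact escapes the record.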

The AEGIS Query Builder is a graphical tool for creating simple or complex queries in a user-friendly way. Its simple, easy-to-use interface facilitates query building even for complicated queries, allowing the user to choose from multiple data sources and apply filters with less effort. The results of the execution will serve as input to the AEGIS Algorithm Execution Container and/or to the AEGIS Visualizer.

The AEGIS Algorithm Execution Container is the component where selected or requested algorithms are executed. It consists of two processes, the Algorithm Parameteriser and the Algorithm Executioner. The Algorithm Parameteriser is a small process responsible only for passing the parameter values selected by the user, when applicable, to the main process, the Algorithm Executioner. The Algorithm Executioner is responsible for initializing and monitoring the execution of the selected algorithm, which includes communicating with the Big Data Processing Cluster to initiate the execution and waiting for the results, which will later be provided to the Visualizer.
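The division of labour between the two processes can be sketched as two small Python functions. This is an illustrative sketch under stated assumptions: the function names, the parameter names (`k`, `max_iter`), the result dictionary and the `fake_cluster_submit` stand-in for the Big Data Processing Cluster are all hypothetical.

```python
from typing import Any, Callable, Dict


def parameterise(defaults: Dict[str, Any], selected: Dict[str, Any]) -> Dict[str, Any]:
    """Parameteriser role: merge the user-selected values over the defaults."""
    unknown = set(selected) - set(defaults)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {**defaults, **selected}


def execute(submit: Callable[[str, Dict[str, Any]], Any],
            algorithm: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Executioner role: start the run, wait for it, and report the outcome."""
    try:
        result = submit(algorithm, params)  # blocks until the run finishes
    except Exception as exc:
        return {"algorithm": algorithm, "status": "failed", "error": str(exc)}
    return {"algorithm": algorithm, "status": "finished", "result": result}


def fake_cluster_submit(algorithm: str, params: Dict[str, Any]) -> str:
    # Hypothetical stand-in for the Big Data Processing Cluster.
    return f"{algorithm} ran with k={params['k']}"


params = parameterise({"k": 3, "max_iter": 100}, {"k": 5})
outcome = execute(fake_cluster_submit, "clustering", params)
print(outcome)
```

Passing the submit function in as an argument keeps the Executioner independent of how the cluster is actually reached, which also makes it easy to test with a stand-in.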

The AEGIS Visualizer provides visualization capabilities on top of the content provided by the Algorithm Execution Container or the results of the query composed and executed by the Query Builder. It provides a variety of bar, line and scatter plots, charts, tables, and maps. The AEGIS Visualizer will also allow the user to quickly create and share flexible, dynamic dashboards.

The AEGIS Orchestrator is responsible for interconnecting the various services of the AEGIS platform. It facilitates the flow of information between services and the execution of workflows that involve several distinct services, especially those that combine an integrated service of Hopsworks with the rest of the platform. The AEGIS Orchestrator will thus act as a mediator between services as needed, using the interfaces the services expose.

Finally, the AEGIS Front-End is the upper layer of the AEGIS platform providing an innovative User Interface for the AEGIS stakeholders. The AEGIS Front-End will provide a user-friendly interface, facilitating the navigation between the AEGIS platform functionalities in a flexible, easy-to-use and secure way.

Each AEGIS platform component was mapped to the set of functionalities it undertakes, either on its own or in combination with other components. Conversely, each technical requirement (stemming from the corresponding functional and/or non-functional requirements) is allocated to one platform component, or to several working in collaboration. This mapping between components and requirements was documented in the form of a table, an extract of which is provided in Table 3 below.

Table 3. Mapping between Requirements & Components

ID | Need | Priority | Component
TR7 | Process data from structured sources | High | Data Harvester, Data Annotator, Algorithm Execution Container, AEGIS Data Store
TR8 | Process data from semi-structured sources | High | Data Harvester, Data Annotator, Algorithm Execution Container, AEGIS Data Store
TR12 | Produce RDF Triples | High | Data Harvester, Data Annotator
TR14 | Handle big data scalability | High | AEGIS Data Store, Query Builder, Algorithm Execution Container, Big Data Processing Cluster
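Because the mapping runs in both directions, it can be useful to derive the component view from the requirement view. The Python sketch below uses only the four rows of Table 3; the dictionary layout and the `invert` helper are assumptions made for illustration.

```python
from collections import defaultdict

# Rows of Table 3: technical requirement -> components that cover it.
requirement_to_components = {
    "TR7": ["Data Harvester", "Data Annotator",
            "Algorithm Execution Container", "AEGIS Data Store"],
    "TR8": ["Data Harvester", "Data Annotator",
            "Algorithm Execution Container", "AEGIS Data Store"],
    "TR12": ["Data Harvester", "Data Annotator"],
    "TR14": ["AEGIS Data Store", "Query Builder",
             "Algorithm Execution Container", "Big Data Processing Cluster"],
}


def invert(mapping):
    """Build the reverse view: component -> requirements it contributes to."""
    by_component = defaultdict(list)
    for req_id, components in mapping.items():
        for component in components:
            by_component[component].append(req_id)
    return dict(by_component)


component_to_requirements = invert(requirement_to_components)
print(component_to_requirements["Data Harvester"])
```

A reverse view like this makes it straightforward to check that every component covers at least one requirement, and to see which components share the most.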

 

Blog post authors: UBITECH

 

[1] https://www.hops.io/

[2] http://hadoop.apache.org/

[3] https://www.elastic.co/products/elasticsearch

[4] https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html