Data Management
Data management is a central component of working methodically with the eMAEX-approach, which aims to make descriptive analyses and annotation data available and to interconnect them with the audio-visual images. A key task is archiving the various research data and the analyzed material and presenting them in a comparative perspective. Our goal is to make our research data accessible in line with the FAIR principles (Findable, Accessible, Interoperable, Reusable).
Database (CMS)
In order to present the database of the Hollywood war film in a clear and verifiable manner, a multimedia publication system based on a content management system was developed in cooperation with the Center for Digital Systems (Cedis) of the Freie Universität. Its data structure follows our established methodological framework. The data of each film, segmented into scenes, can thus be sorted and filtered along the different pathos categories. Comparative studies can therefore be conducted on various levels of the segmented audio-visual staging. As a multimedia form of presentation, the database combines video clips with text descriptions and diagrams.
Beyond that, the database was also implemented and further developed in a Semantic MediaWiki. The wiki, which requires a login for editing, is also used by other research projects (e.g. on Turkish-German cinema) in order to assign pathos categories, narrative topoi, or motifs to individual film scenes and to sort the data according to these concepts.
Annotation Infrastructure in the AdA-project
The AdA-project draws on various web-based services in and for interdisciplinary cooperation. The corpus study and the collaborative annotation practice of the AdA-project are based on a central server infrastructure. It is used to manage the video corpus, including standardized transcoding and linking with metadata, and to run computation-intensive automatic video analyses. It also stores the systematic annotation vocabulary of the machine-readable AdA Filmontology as well as the semantically structured Linked Open Data annotations created with it. The ontology’s data model is based on the idea that every concept and every annotation has a unique Uniform Resource Identifier (URI). Each individual entry can be accessed online by retrieving its URI. The annotation itself, however, is done locally on the annotator’s computer with the open-source annotation software Advene. Annotation data and software development can be synchronized via GitHub. The results are publicly accessible: https://github.com/ProjectAdA. The data set of the AdA-project was also published on Zenodo.
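The URI-based data model described above can be illustrated with a minimal sketch. It is not the AdA Filmontology itself: the namespace `https://example.org/ontology/`, the concept names, the predicate labels, and the helper functions are all hypothetical and only show how concepts and annotations that each carry their own URI can be stored as subject-predicate-object statements and looked up by URI.

```python
# Minimal sketch of a URI-based annotation model (all names are hypothetical).
BASE = "https://example.org/ontology/"  # placeholder namespace, not the real AdA one


def make_uri(kind: str, local_id: str) -> str:
    """Build a URI for a concept or an annotation."""
    return f"{BASE}{kind}/{local_id}"


# Two illustrative vocabulary concepts, each with its own URI.
CAMERA_MOVEMENT = make_uri("concept", "camera-movement")
VALUE_PAN = make_uri("concept", "pan")

# The store is a set of (subject, predicate, object) statements.
triples = set()


def add_annotation(local_id, concept_uri, value_uri, start_s, end_s):
    """Record one annotation as statements about its URI and return that URI."""
    ann = make_uri("annotation", local_id)
    triples.update({
        (ann, "hasType", concept_uri),
        (ann, "hasValue", value_uri),
        (ann, "startsAt", start_s),
        (ann, "endsAt", end_s),
    })
    return ann


def describe(subject_uri):
    """Return everything stated about one URI, similar to retrieving an entry by URI."""
    return {p: o for s, p, o in triples if s == subject_uri}


ann_uri = add_annotation("0001", CAMERA_MOVEMENT, VALUE_PAN, 12.0, 15.5)
print(ann_uri)
print(describe(ann_uri))
```

Because the annotation's type and value are themselves URIs, they can in turn be looked up in the same way, which is the property that makes such data linkable.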
Since the end of the project, commercial server providers and video hosting platforms have been used in order to keep the developed annotation infrastructure in use. The metadata are stored in a graph database (triplestore) in the backend. With the AdA Annotation Explorer, which was developed for presentation and exploration, the annotation data can be retrieved, filtered, and visualized in sync with the audio-visual images. The visualization builds on the open-source software FrameTrail by Joscha Jäger, which allows interactive videos and interlinked content to be viewed directly in the browser. The AdA Annotation Explorer was realized as a decoupled client-server architecture with a RESTful API, and it relies on open data exchange standards as well as on open-source components.
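What filtering annotations "in sync" with the video amounts to can be sketched as follows. This is only an illustration under stated assumptions, not the AdA Annotation Explorer's actual code or API: the `Annotation` class, its fields, the sample entries, and the `active_at` function are hypothetical. The idea is that each annotation covers a time span, so for a given playback position the client shows only the annotations whose span contains that position, optionally restricted to one annotation type.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record: URI, type, value, and time span in seconds."""
    uri: str
    annotation_type: str
    value: str
    start: float
    end: float


def active_at(annotations: List[Annotation], t: float,
              annotation_type: Optional[str] = None) -> List[Annotation]:
    """Return the annotations whose span [start, end) contains playback time t."""
    return [
        a for a in annotations
        if a.start <= t < a.end
        and (annotation_type is None or a.annotation_type == annotation_type)
    ]


# Illustrative sample data; URIs, types, and values are placeholders.
SAMPLE = [
    Annotation("https://example.org/annotation/1", "camera-movement", "pan", 0.0, 4.0),
    Annotation("https://example.org/annotation/2", "music", "strings", 2.0, 10.0),
    Annotation("https://example.org/annotation/3", "camera-movement", "static", 4.0, 9.0),
]

# At second 3 of playback, two annotations overlap; one remains after filtering by type.
print([a.value for a in active_at(SAMPLE, 3.0)])
print([a.value for a in active_at(SAMPLE, 3.0, "camera-movement")])
```

In a client-server setup of the kind described above, such a selection could run on either side; the sketch keeps everything in memory so that it stays self-contained.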
All software components of the AdA Annotation Explorer’s web application as well as its backend developed in the AdA-project are freely accessible on GitHub.
Further Reading: