Spatiotemporal Reasoning for Complex Video Event Recognition in Content-Based Video Retrieval

Ontology-based representation of video scenes and events is a promising direction in content-based video retrieval. However, the multimedia ontologies described in the literature often lack formal grounding, and none of them is suitable for representing complex video scenes. SWRL rules can partially address this limitation, but they can lead to undecidability. This paper presents a hybrid description logic-based architecture that employs general, spatial, temporal, and fuzzy axioms for video scene representation and for scene interpretation through automated reasoning, while achieving a favorable tradeoff between expressivity and reasoning complexity.