GRhOOT, the German RhetOrical OnTology, is a domain ontology of 110 rhetorical figures in the German language. The overall goal of building an ontology of rhetorical figures in German is not only the formal representation of different rhetorical figures, but also allowing for their easier detection, thus improving sentiment analysis, argument mining, detection of hate speech and fake news, machine translation, and many other tasks in which recognition of non-literal language plays an important role.
Ramona Kühn, Jelena Mitrović and Michael Granitzer
We present an effective way to create a dataset from relevant channels and groups of the messenger service Telegram, to detect clusters in this network, and to find influential actors. Our focus lies on the network of German COVID-19 sceptics that formed on Telegram along with growing restrictions meant to prevent the spreading of COVID-19.
Valentin Peter, Ramona Kühn, Jelena Mitrović, Michael Granitzer and Hannah Schmid-Petri
The main focus of the paper is the definitional revision and enrichment of offensive language typology, making reference to publicly available offensive language datasets and testing them on available pretrained lexical embedding systems. We review over 60 available corpora and compare tagging schemas applied there while making an attempt to explain semantic differences between particular concepts of the category OFFENSIVE in English.
Barbara Lewandowska-Tomaszczyk, Slavko Žitnik, Anna Bączkowska, Chaya Liebeskind, Jelena Mitrović and Giedre Valunaite Oleskeviciene
This paper examines several widespread assumptions about artificial intelligence, particularly machine learning, that are often taken as factual premises in discussions on the future of patent law in the wake of ‘artificial ingenuity’. The objective is to draw a more realistic and nuanced picture of the human-computer interaction in solving technical problems than where ‘intelligent’ systems autonomously yield inventions. A detailed technical perspective is presented for each assumption, followed by a discussion of pertinent uncertainties for patent law. Overall, it is argued that implications of machine learning for the patent system in its core tenets appear far less revolutionary than is often posited.
Daria Kim, Maximilian Alber, Man Wai Kwok, Jelena Mitrović, Cristian Ramirez-Atencia, Jesús Alberto Rodríguez Pérez, Heiner Zille
We introduce HateBERT, a re-trained BERT model for abusive language detection in English. The model was trained on RAL-E, a large-scale dataset of Reddit comments in English from communities banned for being offensive, abusive, or hateful that we have collected and made available to the public. Results and trained models are published on an OSF repository.
Tommaso Caselli, Valerio Basile, Jelena Mitrović and Michael Granitzer
In this contribution, we investigate a recent dataset for offensive language in English, namely OLID/OffensEval (Zampieri et al. 2019a; Zampieri et al. 2019b), in the light of two factors proposed by Waseem et al. 2017.
Tommaso Caselli, Valerio Basile, Jelena Mitrović, Inga Kartoziya, Michael Granitzer
We discuss ontological modeling of legal terminology in formal ontologies, such as SUMO (Pease, 2001), and the possibility of utilizing its close connection to the lexical-semantic network WordNet (Fellbaum, 1998) in the legal domain. Formal systems that allow for automated semantic interpretation of law supported by lexical resources can bring forth solutions to legal reasoning tasks.
Jelena Mitrović, Adam Pease, Michael Granitzer
This paper surveys ontological modeling of rhetorical concepts, developed for use in argument mining and other applications of computational rhetoric, projecting their future directions. We include ontological models of argument schemes applying Rhetorical Structure Theory (RST); the RhetFig proposal for modeling; the related RetFig Ontology of Rhetorical Figures for Serbian (developed by two of the authors); and the Lassoing Rhetoric project (developed by another of the authors).
Jelena Mitrović, Cliff O’Reilly, Miljana Mladenović and Siegfried Handschuh
The paper introduces the creation and analysis of a German legal citation network. The network consists of over 200,000 German court cases from all levels of appeal and jurisdiction and more than 50,000 laws. References to court decisions and laws are extracted from within the decision text of the court cases and added as links to the network. We apply network-based analysis techniques to support common legal information retrieval tasks such as identification of important court decisions and laws and case similarity searches. Furthermore, we demonstrate that the German case citation network displays scale-free behaviour, similar to that of the U.S. and Austrian Supreme Courts as shown by previous research.
Tobias Milz, Michael Granitzer, Jelena Mitrović
We present a prototypical yet robust and diverse data set for media bias research. It consists of 1,700 statements representing various media bias instances and contains labels for media bias identification on the word and sentence level. In contrast to existing research, our data incorporate background information on the participants’ demographics, political ideology, and their opinion about media in general.
Timo Spinde, Lada Rudnitckaia, Jelena Mitrović, Felix Hamborg, Michael Granitzer, Bela Gipp, Karsten Donnay