A dataset of citations extracted from monographs about the history of Venice, created in the context of the LinkedBooks project.
More information
- GitHub repository
- DOI: https://doi.org/10.5281/zenodo.377047
Contact: Giovanni Colavizza, Matteo Romanello
Related publications:
Annotated References in the Historiography on Venice: 19th–21st centuries
We publish a dataset containing more than 40’000 manually annotated references from a broad corpus of books and journal articles on the history of Venice. References were drawn from both reference lists and footnotes, and include primary and secondary sources, in full or abbreviated form. The dataset comprises references from publications from the 19th to the 21st century. References were collected from a newly digitized corpus and manually annotated in all their constituent parts. The dataset is stored in a GitHub repository, archived on Zenodo, and accompanied by code to train parsers that extract references from other publications. Two trained Conditional Random Fields models are provided along with their evaluation, to serve as a baseline for a reference parsing shared task. No comparable public dataset exists to support the task of reference parsing in the humanities. The dataset is of interest to anyone working on reference parsing and citation extraction in the humanities.
Journal of Open Humanities Data. 2017. Vol. 3, p. 2. DOI: 10.5334/johd.9.