This file hosts a contribution to the 10th International Conference on Ecological Informatics (ICEI 2018), held on 24-28 September 2018 in Jena, and specifically to Session S1.6, "Semantics for biodiversity and ecosystem research", which runs 10:30 - 14:15 on Thursday 27 September 2018 in Lecture Hall 4. The talk is scheduled for around 13:15.
Visualizing the research ecosystem of ecosystem research via Wikidata
- https://twitter.com/EvoMRI/status/1045241579894706176
- Featuring
- Scholia homepage: https://scholia.toolforge.org/
- topics:
- invasive species https://scholia.toolforge.org/topic/Q183368
- biodiversity https://scholia.toolforge.org/topic/Q47041
- ecosystem https://scholia.toolforge.org/topic/Q37813
- zoonosis https://scholia.toolforge.org/topic/Q182672
- Zika virus https://scholia.toolforge.org/topic/Q202864
- Scholia https://scholia.toolforge.org/topic/Q45340488
- FAIR data https://scholia.toolforge.org/topic/Q29032648
- open data https://scholia.toolforge.org/topic/Q309901
- Denmark and machine learning https://scholia.toolforge.org/country/Q35/topic/Q2539
- work: https://scholia.toolforge.org/work/Q27973671
- author: https://scholia.toolforge.org/author/Q55099872
- event: https://scholia.toolforge.org/event/Q50706744
- journal: https://scholia.toolforge.org/venue/Q15763359
- chemical class: https://scholia.toolforge.org/chemical-class/
- reuse: https://www.wikidata.org/wiki/User:Daniel_Mietchen/Wikidata_lists/Usage_of_Template_Scholia
If anything goes wrong with the demo, we will use this recent presentation as a backup.
Like research in general, biodiversity and ecosystem research takes place in a sociotechnical ecosystem that connects researchers, institutions, funders, databases, locations, publications, methodologies and related concepts with the objects of study and the world around them. Schemas for describing such concepts are growing in breadth and depth, number and popularity, as are mechanisms to persistently and uniquely identify the concepts, the schemas, their relationships or any of their components. In parallel, more and more data — and particularly metadata — are being made available under open licenses, which facilitates discoverability, reproducibility and reuse, as well as data integration.
Wikidata is a community-curated open knowledge base in which concepts covered in any Wikipedia — and beyond — can be described in a structured fashion that can be mapped to RDF and queried using SPARQL as well as various other means. Its community of close to 20,000 monthly contributors oversees a corpus that currently comprises nearly 50 million 'items', i.e. entries about concepts. These items are annotated and linked via almost 5000 'properties' that describe relationships between items or between items and external entities or that express specific values. The items and properties have persistent unique identifiers, to which labels and descriptions can be attached in about 300 natural languages. For instance, Q61457 represents the item for 'acetaldehyde' and Q183339 'Antilope cervicapra', while P3063 stands for the property of 'average gestation period', and P3117 for 'DSSTOX substance identifier'. Besides taxa, chemical compounds, toxicology, geomorphological features or ecological interactions, Wikidata also contains information about researchers and many components of their research ecosystems, including a growing body of publications and databases, particularly in the life sciences.
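As a rough illustration of the SPARQL access mentioned above, the following Python sketch builds a query for the values of one property on one item, using the identifiers from the text (Q183339 for 'Antilope cervicapra', P3063 for 'average gestation period'). It assumes the public Wikidata Query Service endpoint at `https://query.wikidata.org/sparql` and its predefined `wd:` and `wdt:` prefixes. The function names and the User-Agent string are illustrative choices, not part of the talk.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the public Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_value_query(item_id, property_id):
    """Return a SPARQL query for the values of one property on one item."""
    # wd: (items) and wdt: (direct property values) are prefixes that the
    # Wikidata Query Service predefines, so no PREFIX lines are needed.
    return f"SELECT ?value WHERE {{ wd:{item_id} wdt:{property_id} ?value . }}"


def build_request_url(query):
    """Encode the query as a GET URL that asks for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return f"{WDQS_ENDPOINT}?{params}"


def fetch_values(item_id, property_id):
    """Run the query over the network and return the raw value strings."""
    url = build_request_url(build_value_query(item_id, property_id))
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            # Placeholder User-Agent; replace with real contact details.
            "User-Agent": "icei2018-example/0.1 (placeholder contact)",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        results = json.load(response)
    # Each binding maps a variable name to a dict with a "value" key.
    return [row["value"]["value"] for row in results["results"]["bindings"]]
```

Calling `fetch_values("Q183339", "P3063")` would send the query to the live endpoint. What it returns depends on the current state of Wikidata, so no expected output is given here.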
A range of open-source tools is available to interact with Wikidata — to enter information, curate and query it. One of them is Scholia, a frontend to Wikidata's SPARQL endpoint. Available via https://scholia.toolforge.org/ , it can be used to explore research publications and how they relate to authors, institutions, funders and other parts of the research ecosystem, as well as to taxa, metabolic networks, or geolocations.
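The Scholia links listed in this file all follow the same pattern: an aspect name such as `topic`, `author` or `venue`, followed by a Wikidata item identifier. A minimal Python sketch of that pattern is shown below. The helper name `scholia_url` is hypothetical, and the URL scheme is inferred only from the links in this file, not from Scholia's documentation.

```python
import re

SCHOLIA_BASE = "https://scholia.toolforge.org"

# Wikidata item identifiers: "Q" followed by a positive integer.
QID_PATTERN = re.compile(r"^Q[1-9]\d*$")


def scholia_url(*pairs):
    """Build a Scholia URL from one or more (aspect, QID) pairs.

    A single pair gives e.g. /topic/Q183368; two pairs give a combined
    view such as /country/Q35/topic/Q2539.
    """
    if not pairs:
        raise ValueError("at least one (aspect, QID) pair is required")
    parts = [SCHOLIA_BASE]
    for aspect, qid in pairs:
        if not QID_PATTERN.match(qid):
            raise ValueError(f"not a Wikidata item identifier: {qid!r}")
        parts.extend([aspect, qid])
    return "/".join(parts)


# Examples taken from the link list in this file.
print(scholia_url(("topic", "Q183368")))
print(scholia_url(("country", "Q35"), ("topic", "Q2539")))
```

Validating the identifier before building the URL catches a common slip, which is passing the bare number without the leading `Q`.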
In this presentation — which will be given on the basis of https://github.com/Daniel-Mietchen/events/blob/master/ICEI2018-research-ecosystem.md — we will use Scholia as a starting point for exploring how information about biodiversity and ecosystem research is represented in Wikidata and how it can be explored, curated and reused.
Points to consider including
- link to relevant WikiProjects
- Taxonomy
- Invasive species
- Informatics
- WikiCite
- link to previous talks
- citizen science
- SDGs
- examples
- topics:
- invasive species https://scholia.toolforge.org/topic/Q183368
- biodiversity https://scholia.toolforge.org/topic/Q47041
- ecosystem https://scholia.toolforge.org/topic/Q37813
- zoonosis https://scholia.toolforge.org/topic/Q182672
- work: https://scholia.toolforge.org/work/Q27973671
- authors:
- https://scholia.toolforge.org/author/Q55099872
- some of the keynotes?
- institutions
- journal: https://scholia.toolforge.org/venue/Q15763359
- taxon aspect example — Caenorhabditis elegans?
- chemical/ pathway examples
- events: https://scholia.toolforge.org/event/Q50706744
- topics:
- basic Scholia stats
- link to biodiversity and ecosystem research
- open data
- data integration
- Wikidata
- multilingual
- taxon names vs. common names
- Wikibase
- games, e.g. file candidates
- Scholia
- taxa, habitats and nature of ecological interactions
- validation
- SDGs
- Each of these items has a persistent unique identifier, e.g. Q52105 for 'habitat'. The items are annotated and linked via about 5000 'properties' (likewise with persistent identifiers) that describe relationships between items — e.g. P171, 'parent taxon' — or between items and external entities — e.g. Mount Kilimanjaro (Q7296) has the Smithsonian volcano ID (P1886) of '222150' — or that express specific values, e.g. that Antilope cervicapra (Q183339) has an average gestation period (P3063) of 5-6 months.
- ICEI2018.md
- ICEI2018-citizen-science.md
- Wikimedia projects and citizen science
- EPA CompTox and Wikidata
- Global map of national parks, per Wikidata
- Map of geolocated topics co-occurring with taxa as the main subject of scholarly publications
- Ozymandias
The conference's call for proposals had a deadline of April 15, by which date I submitted the abstract; it received submission number 143. On May 15, I was notified of its acceptance.
- Daniel Mietchen
- Finn Årup Nielsen
- Egon Willighagen
- daniel.mietchen[at]virginia.edu*
- faan[at]dtu.dk
- egon.willighagen[at]maastrichtuniversity.nl
- Data Science Institute, University of Virginia, Charlottesville, VA, USA
- Cognitive Systems, DTU Compute, Technical University of Denmark, Copenhagen, Denmark
- Department of Bioinformatics - BiGCaT, Maastricht University, Maastricht, The Netherlands
- Wikidata
- SPARQL
- research system
- visualization
- bibliometrics