Introduction to querying
Last updated on 2023-04-24
Overview
Questions
- What is SPARQL?
- How to use SPARQL to query Wikidata?
- How to use Wikidata querying tools?
Objectives
- Know what a query language is, and how SPARQL differs from SQL.
- Be able to use SPARQL to query Wikidata.
- Potentially be able to use a tool like TABernacle to edit based on a query.
- Have a cursory knowledge of the plethora of Wikidata querying tools and how they can be used by librarians.
- Know the purpose and usefulness of maintenance queries for identifying missing information.
- Be able to create maintenance queries.
There are different ways to query information in Wikidata. The simplest is to search for an entry and look up all the information recorded for it, e.g. search for Richard Feynman. By default this search covers the Q-pages (items) as well as the P-pages (properties). However, we can restrict a search to properties by looking only in the P-pages, e.g. to check whether there is a property for the ISBN. Moreover, for any given entry we can list the other pages that link to it (e.g. because they use it as an object), e.g. all pages linking to Richard Feynman: https://www.wikidata.org/wiki/Special:WhatLinksHere/Q39246
That is not much different from other searches you may be familiar with. However, the real potential of Wikidata as a huge knowledge graph can be experienced through more advanced querying with the Wikidata Query Service, where queries have to be entered in SPARQL.
To discover Wikidata objects nearby, there is the nearby search: https://www.wikidata.org/wiki/Special:Nearby
5.1 What is SPARQL?
SPARQL is a query language for RDF data and has been a W3C Recommendation since 2008. The data is stored as triples (subject, predicate, object), where the object of one triple can be the subject of another. Thus one can think of the data as a huge knowledge graph whose nodes are connected to other nodes by predicates. For example, here we see all the information about the book “The Meaning of It All” from Wikidata as a graph:
Source: http://tinyurl.com/y267yz5q
However, this is only the graph spanned by one item and its directly connected entries, which in turn have further connections, e.g. we can expand some links from the author Richard Feynman:
(Click on that node in the result of the above query.)
To query this knowledge graph with SPARQL, we define graph patterns that we want to search for. The simplest form is a triple in which one of the components is replaced by a variable, indicated by a name starting with a question mark:
- Query for the publisher:
{ wd:Q7750812 wdt:P123 ?publisher . }
- Query for the connection:
{ wd:Q7750812 ?property wd:Q353060 . }
- Query for the publications from Addison-Wesley:
{ ?book wdt:P123 wd:Q353060 . }
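The idea behind these three patterns can be sketched outside the query service. The following is a minimal illustration in Python, not how the Wikidata Query Service is implemented: it matches a single triple pattern against a tiny hand-written list of triples. The list TRIPLES and the function match are names made up for this sketch; the identifiers are the ones used in the patterns above.

```python
# Toy "knowledge graph": a hand-written list of (subject, predicate, object)
# triples using the identifiers from the patterns above. This is an
# illustration only, not data fetched from Wikidata.
TRIPLES = [
    ("wd:Q7750812", "wdt:P123", "wd:Q353060"),  # book -> publisher
    ("wd:Q7750812", "wdt:P50", "wd:Q39246"),    # book -> author
]


def match(pattern, triples=TRIPLES):
    """Return one dict of variable bindings per triple that fits the pattern.

    A term starting with "?" is a variable and matches anything; every other
    term must be identical to the value in the triple.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice in one pattern must bind the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


print(match(("wd:Q7750812", "wdt:P123", "?publisher")))
print(match(("wd:Q7750812", "?property", "wd:Q353060")))
print(match(("?book", "wdt:P123", "wd:Q353060")))
```

Each call corresponds to one of the three bullet points: the position that holds the variable is the position that gets filled in by the answer.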
5.2 Wikidata Query Service
The Wikidata Query Service can be found at https://query.wikidata.org/. The main window on the right is where you formulate your query in SPARQL. On the left is the query helper, and the results show up at the bottom.
We will only cover SELECT statements here and start by typing:
SELECT * WHERE {
}
Hint: It is enough to start typing “SELECT” and then use the auto-completion with Ctrl+Space.
Inside the curly braces you can then place the statements describing the graph pattern you are looking for.
Exercise: Your first SPARQL query
Write your first SPARQL query for the publisher of the above-mentioned book by copying the publisher pattern from above into a SELECT statement.
SELECT * WHERE {
wd:Q7750812 wdt:P123 ?publisher .
}
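The same query can also be sent from a script instead of the web form. The sketch below is one possible way to do this in Python and makes two assumptions: that the service accepts GET requests at https://query.wikidata.org/sparql with the parameters query and format=json, and that the answer follows the standard SPARQL JSON results layout. The names build_url and bindings_to_rows are made up for this example, and SAMPLE is a hand-written document in that layout, not a recorded response.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint of the Wikidata Query Service for programmatic access.
ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT * WHERE {
  wd:Q7750812 wdt:P123 ?publisher .
}
"""


def build_url(query):
    """Return a GET URL that asks the endpoint for results as JSON."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def bindings_to_rows(response_text):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    data = json.loads(response_text)
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in data["results"]["bindings"]
    ]


# Hand-written sample in the SPARQL JSON results format (illustration only).
SAMPLE = json.dumps({
    "head": {"vars": ["publisher"]},
    "results": {"bindings": [
        {"publisher": {"type": "uri",
                       "value": "http://www.wikidata.org/entity/Q353060"}}
    ]},
})

print(build_url(QUERY))
print(bindings_to_rows(SAMPLE))
```

To actually run the query, the URL could be fetched with any HTTP client; that step is left out here so that the sketch works without network access.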
Namespaces and Prefixes
Prefixes are short abbreviations for namespaces in the Wikidata Query Service. Some prefixes in Wikidata are: wd, wdt, p, ps, bd, etc.
Example:
SELECT ?item ?itemLabel
WHERE
{
?item wdt:P50 wd:Q23434.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Items should be prefixed with wd: and properties with wdt: (for example wd:Q23434 and wdt:P50).
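A prefix is only a shorthand for the beginning of a full IRI. The following Python sketch shows what the expansion amounts to. The table PREFIXES and the function expand are made up for this illustration; the IRIs listed are the ones the query service predefines for these prefixes, to the best of our knowledge.

```python
# Prefix table for this sketch: each prefix stands for the start of an IRI.
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
    "p": "http://www.wikidata.org/prop/",
    "ps": "http://www.wikidata.org/prop/statement/",
    "bd": "http://www.bigdata.com/rdf#",
}


def expand(prefixed_name):
    """Turn a prefixed name such as 'wd:Q23434' into its full IRI."""
    prefix, _, local = prefixed_name.partition(":")
    return PREFIXES[prefix] + local


print(expand("wd:Q23434"))
print(expand("wdt:P50"))
```

Writing wd:Q23434 in a query is therefore equivalent to writing the full IRI in angle brackets, only shorter and easier to read.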
Namespaces in Wikidata include:
- Main namespace
- Property
- Wikidata: used for information and discussions about Wikidata itself
5.3 Try examples
Cats example
SELECT ?item ?itemLabel
WHERE
{
?item wdt:P31 wd:Q146. # Must be an instance of cat (Q146)
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } # Fetches labels in your interface language, falling back to English
}
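The language list "[AUTO_LANGUAGE],en" in the label service is a fallback chain. The Python sketch below shows the idea only and is not the service's code. The function pick_label and the example labels are hypothetical. Returning the bare Q-id when no label exists in any listed language follows our reading of the query service documentation and should be treated as an assumption.

```python
def pick_label(qid, labels, languages):
    """Return the label in the first listed language that has one.

    labels maps language codes to label strings. If none of the requested
    languages has a label, fall back to the bare Q-id (assumed behaviour).
    """
    for lang in languages:
        if lang in labels:
            return labels[lang]
    return qid


# Hypothetical item with labels in two languages.
example_labels = {"en": "example", "de": "Beispiel"}

print(pick_label("Q0", example_labels, ["de", "en"]))  # interface language first
print(pick_label("Q0", example_labels, ["fr", "en"]))  # no French label, use English
print(pick_label("Q0", {}, ["fr", "en"]))              # no label at all
```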
Map of libraries
#defaultView:Map
SELECT DISTINCT * WHERE {
?item wdt:P31/wdt:P279* wd:Q7075;
wdt:P625 ?geo .
}
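The path wdt:P31/wdt:P279* in this query reads as "instance of, followed by zero or more subclass of steps", so it also finds items whose class is only a subclass of library. A minimal Python sketch of that idea on a toy graph follows. All ex: names are hypothetical, and the functions subclasses_of and instances_of are made up for this illustration.

```python
P31, P279 = "wdt:P31", "wdt:P279"  # instance of, subclass of

# Toy graph; every "ex:" identifier is hypothetical.
TRIPLES = [
    ("ex:BranchA", P31, "ex:PublicLibrary"),   # instance of a subclass
    ("ex:PublicLibrary", P279, "wd:Q7075"),    # subclass of library
    ("ex:MainLibrary", P31, "wd:Q7075"),       # direct instance of library
    ("ex:Museum1", P31, "ex:Museum"),          # unrelated item
]


def subclasses_of(cls, triples):
    """Return cls plus every class that reaches it via zero or more P279 steps."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == P279 and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_of(cls, triples):
    """Items matching '?item wdt:P31/wdt:P279* cls' on the toy graph."""
    classes = subclasses_of(cls, triples)
    return sorted(s for s, p, o in triples if p == P31 and o in classes)


print(instances_of("wd:Q7075", TRIPLES))
```

Without the P279* part, only ex:MainLibrary would be found in this toy graph, because ex:BranchA is an instance of a subclass rather than of library itself.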
Scholarly articles by Alex Bateman
SELECT ?item ?itemLabel ?journalLabel
WHERE
{
?item wdt:P31 wd:Q13442814.
?item wdt:P50 wd:Q18921408.
?item wdt:P1433 ?journal.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Russian poets
SELECT ?item ?itemLabel ?place ?placeLabel ?coord
WHERE
{
?item wdt:P31 wd:Q5.
?item wdt:P106 wd:Q49757.
?item wdt:P19 ?place.
?place wdt:P17 wd:Q159.
?place wdt:P625 ?coord.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Chemicals example
SELECT ?item ?itemLabel WHERE {
?item wdt:P31 wd:Q11173, wd:Q12140.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
SELECT ?item ?itemLabel ?struc ?formula
WHERE {
?item wdt:P31 wd:Q11173, wd:Q12140.
?item wdt:P117 ?struc.
?item wdt:P274 ?formula.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
SELECT ?item ?itemLabel ?formula ?mass ?struc
WHERE {
?item wdt:P31 wd:Q11173, wd:Q12140.
?item wdt:P117 ?struc.
?item wdt:P274 ?formula.
?item wdt:P2067 ?mass.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY DESC(?mass)
LIMIT 10
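ORDER BY DESC(?mass) together with LIMIT 10 keeps only the ten heaviest results. The effect can be sketched in Python on rows that have already been downloaded. The rows and the function order_desc_limit are hypothetical and only illustrate the ordering step.

```python
# Hypothetical result rows; values arrive as strings in downloaded results.
ROWS = [
    {"item": "ex:A", "mass": "180.16"},
    {"item": "ex:B", "mass": "46.07"},
    {"item": "ex:C", "mass": "1355.37"},
]


def order_desc_limit(rows, key, limit):
    """Sort rows by a numeric column, largest first, and keep the top ones."""
    ranked = sorted(rows, key=lambda row: float(row[key]), reverse=True)
    return ranked[:limit]


for row in order_desc_limit(ROWS, "mass", 2):
    print(row["item"], row["mass"])
```

The conversion with float matters: sorted as plain strings, "46.07" would rank above "1355.37". Inside the query service the ordering is done on typed numeric values, so this problem only appears when post-processing text results.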
People born in Berlin, filtered to the birth year 1970
SELECT ?item ?itemLabel ?dob
WHERE
{
?item wdt:P31 wd:Q5.
?item wdt:P19 wd:Q64.
?item wdt:P569 ?dob.
FILTER(YEAR(?dob) = 1970)
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
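FILTER(YEAR(?dob) = 1970) keeps only the rows whose date of birth falls in 1970. A rough Python equivalent on already downloaded rows could look like the sketch below. The rows and the function born_in_year are hypothetical, and the timestamp layout is an assumption that covers ordinary four-digit years only.

```python
from datetime import datetime

# Hypothetical result rows with dates of birth as timestamp strings.
ROWS = [
    {"item": "ex:Person1", "dob": "1970-03-14T00:00:00Z"},
    {"item": "ex:Person2", "dob": "1969-12-31T00:00:00Z"},
    {"item": "ex:Person3", "dob": "1970-11-02T00:00:00Z"},
]


def born_in_year(rows, year):
    """Keep the rows whose 'dob' timestamp falls in the given year."""
    keep = []
    for row in rows:
        dob = datetime.strptime(row["dob"], "%Y-%m-%dT%H:%M:%SZ")
        if dob.year == year:
            keep.append(row)
    return keep


print([row["item"] for row in born_in_year(ROWS, 1970)])
```

Filtering inside the query is usually preferable, because only the matching rows are transferred; the sketch is meant to show what the filter does, not to replace it.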
5.4 More advanced queries
Further links
https://commons.wikimedia.org/wiki/File:Wikidata_Query_Service_in_Brief.pdf
https://www.uni-mannheim.de/media/Einrichtungen/dws/Files_Teaching/Semantic_Web_Technologies/SWT05-SPARQL-v1.pdf
https://www.wikidata.org/wiki/Wikidata:SPARQL_tutorial