search wikidata images by their checksums (content hashes) #14
Accordingly, I've manually annotated a Wikimedia Commons entry with its associated checksums, expressed as SHA-1, SHA-256, and MD5 hashes.
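For anyone who wants to reproduce such an annotation, here is a minimal sketch of how the three checksums could be computed for a locally downloaded file. The function name `file_checksums` and its parameters are illustrative assumptions, not part of any Wikimedia tooling; only Python's standard `hashlib` module is used.

```python
import hashlib
from pathlib import Path


def file_checksums(path, algorithms=("sha1", "sha256", "md5"), chunk_size=65536):
    """Return {algorithm: hex digest} for the file at `path`.

    Hypothetical helper for illustration; the file is read in chunks
    so that large media files do not have to fit in memory.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Calling `file_checksums("some-image.jpg")` (a placeholder file name) returns a dictionary with one hex digest per algorithm, which could then be entered by hand as checksum statements.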
A sample query:
SELECT ?item ?image WHERE {
  ?item wdt:P4092 "85379b346e61c06033a12720155f3bf13d2c6f5946625600f34edace55cb159d693a15aefab9e15691ff2402887985d559951327974206ccf85495e27b9ee56d";
        wdt:P18|wdt:P117 ?image .
}
LIMIT 10
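The query above could also be run programmatically. The following is a rough sketch, assuming the public Wikidata Query Service endpoint at `https://query.wikidata.org/sparql` and its JSON result format; the function names and the User-Agent string are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the public Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_checksum_query(checksum, limit=10):
    """Build the sample SPARQL query for a given hex checksum (hypothetical helper)."""
    if not checksum or any(c not in "0123456789abcdefABCDEF" for c in checksum):
        raise ValueError("checksum must be a non-empty hex string")
    return (
        "SELECT ?item ?image WHERE {\n"
        f'  ?item wdt:P4092 "{checksum}";\n'
        "        wdt:P18|wdt:P117 ?image .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def find_images_by_checksum(checksum, limit=10):
    """Send the query and return a list of (item URI, image URI) pairs."""
    params = urllib.parse.urlencode(
        {"query": build_checksum_query(checksum, limit), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WDQS_ENDPOINT}?{params}",
        # Illustrative User-Agent; replace with one that identifies your tool.
        headers={"User-Agent": "checksum-lookup-example/0.1 (illustrative)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        data = json.load(response)
    return [
        (row["item"]["value"], row["image"]["value"])
        for row in data["results"]["bindings"]
    ]
```

The hex check in `build_checksum_query` is there only to keep arbitrary text out of the query string; the checksum is otherwise passed through unchanged, since the match against the stored value is an exact string comparison.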
Note that structured queries against objects in Wikimedia Commons are still under development. See, for instance, https://diff.wikimedia.org/2020/10/29/sparql-in-the-shadow-of-structured-data-on-commons/ and the page it references, https://commons.wikimedia.org/wiki/Commons:Structured_data . Also note that annotating image properties on Wikidata items with checksum properties (see https://www.wikidata.org/wiki/Property:P4092 ) does not come naturally, because qualifiers on qualifiers appear to be too much nesting for the Wikidata model. For instance, adding a checksum (or content hash) for an image that supports a physical interaction ( https://www.wikidata.org/wiki/Q2747101#P129 ) of a specific taxon ( https://www.wikidata.org/wiki/Q2747101 ) is tricky with the existing UI editing tools. E.g., it is currently hard to add a "determined by" qualifier naming the SHA-1 algorithm to the checksum qualifier of the image attached to the physical interaction property.
So, as far as I can tell, querying Wikimedia Commons images by their checksums is possible, and a dedicated service / data product would have to be created to help answer questions like: "What are the checksums (or content hashes) associated with this Wikimedia Commons entity?" and "Please provide the content associated with this content id (or checksum) if you have it. Otherwise, say 'mweh, don't have it.'"
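To make the two questions concrete, here is a toy in-memory sketch of what such a service might answer. The class `ChecksumIndex`, its method names, and the entity identifier used in the usage note are hypothetical; a real data product would need persistent storage and an actual source of checksums.

```python
from collections import defaultdict

NOT_FOUND = "mweh, don't have it."


class ChecksumIndex:
    """Toy two-way index between entities and their checksums (illustrative only)."""

    def __init__(self):
        self._checksums_by_entity = defaultdict(dict)
        self._entity_by_checksum = {}

    def register(self, entity, algorithm, checksum):
        """Record that `entity` has the given checksum under `algorithm`."""
        checksum = checksum.lower()  # hex digests are compared case-insensitively
        self._checksums_by_entity[entity][algorithm] = checksum
        self._entity_by_checksum[checksum] = entity

    def checksums_for(self, entity):
        """Question 1: which checksums are associated with this entity?"""
        return dict(self._checksums_by_entity.get(entity, {}))

    def content_for(self, checksum):
        """Question 2: which entity has this checksum, if any?"""
        return self._entity_by_checksum.get(checksum.lower(), NOT_FOUND)
```

After `index.register("File:Example.jpg", "sha1", "<hex digest>")`, a call to `index.checksums_for("File:Example.jpg")` lists the known digests, and `index.content_for(...)` with an unknown digest returns the "don't have it" message.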
Internally, Wikimedia Commons uses SHA-1 hashes to alert users when duplicate digital data is already available via Wikimedia Commons.
However, as far as I can tell, these SHA-1 hashes are not yet exposed via structured data by default.
Methods already exist, though, to annotate digital content with its checksums.
For example, see https://www.wikidata.org/wiki/Q34852 , where https://www.wikidata.org/wiki/Property:P4092 is used to document the SHA-2 hash 8de979cbb1db728ef99debac8a516405a2088e4fa2816fda2769856a54029bcd49913a45494ce1cae4096413c49ae7da36f7bc2d20899fb216195b9eb365e55c associated with the digital content.
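Although the SHA-1 hashes are not part of the structured data, the MediaWiki Action API on Commons can, to my knowledge, list files by SHA-1 through `list=allimages` with the `aisha1` parameter. The sketch below only builds such a request URL; the helper name `sha1_lookup_url` is an assumption, and the parameters should be checked against the API documentation before relying on them.

```python
import re
import urllib.parse

# Assumed endpoint: the MediaWiki Action API on Wikimedia Commons.
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def sha1_lookup_url(sha1_hex):
    """Build a URL asking Commons which files have the given SHA-1 (hypothetical helper)."""
    if not re.fullmatch(r"[0-9a-fA-F]{40}", sha1_hex):
        raise ValueError("expected a 40-character hex SHA-1 digest")
    params = {
        "action": "query",
        "list": "allimages",
        "aisha1": sha1_hex.lower(),
        "aiprop": "url|sha1",  # ask for the file URL and its SHA-1 in the response
        "format": "json",
    }
    return f"{COMMONS_API}?{urllib.parse.urlencode(params)}"
```

Fetching the resulting URL should return a JSON document listing any matching files; an empty list would correspond to the "don't have it" case.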