Support discovery of datalad datasets on dataverse #46
Right, as we discussed at the Distribits hackathon, now that @yarikoptic has a published dataset in Harvard Dataverse that came from DataLad, we can find it with this query:
https://dataverse.harvard.edu/api/search?q=fileName:%22repo.zip%22
Here's how the search result looks:
As mentioned above, the dataset-level fields to focus on are these:
https://doi.org/10.7910/DVN/VMSH8U will resolve and redirect to the dataset at https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VMSH8U
@yarikoptic and I talked about different ways to identify DataLad datasets. This "search for repo.zip" approach seems promising but could probably be refined. It's a good start!
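The "search for repo.zip" query above could be generated per host along these lines. This is only a sketch: the helper name `build_search_url` is hypothetical, and it does nothing beyond reproducing the URL shape quoted in this thread (no request is sent, no paging or other search parameters are handled).

```python
from urllib.parse import quote


def build_search_url(hostname, filename):
    """Build a search URL of the form used in this thread.

    Hypothetical helper: it mirrors
    https://<hostname>/api/search?q=fileName:%22<filename>%22
    and does not perform any request itself.
    """
    # Percent-encode the double quotes around the file name, keep ':' readable.
    query = quote(f'fileName:"{filename}"', safe=":")
    return f"https://{hostname}/api/search?q={query}"


if __name__ == "__main__":
    print(build_search_url("dataverse.harvard.edu", "repo.zip"))
```

Keeping the URL construction separate from the HTTP call makes it easy to point the same query at another Dataverse installation.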
Sample dataset on demo node, in non-exported (key store) flavor of the special remote:
So it seems we need to search for datasets which have a file like `XDLRA-2D--2D-refs`, probably just one starting with `XDLRA-` and ending with `-refs`.
JSON file which lists all current Dataverse deployments (if we are greedy enough to search through all of them):
For now we could just go through https://demo.dataverse.org/ and https://dataverse.harvard.edu as "groups" (like organizations on GitHub) and not care about any others.
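The file-name heuristic mentioned above (starts with `XDLRA-`, ends with `-refs`) could be checked on the client side roughly as follows. The function name and the `HOSTS` tuple are hypothetical; the two hostnames are simply the two installations named above, and whether this prefix/suffix pair is specific enough is still an open question in this issue.

```python
# The two installations proposed above as the initial "groups".
HOSTS = ("demo.dataverse.org", "dataverse.harvard.edu")


def looks_like_datalad_refs_file(filename):
    """Return True if a file name matches the proposed heuristic.

    Assumption: a prefix/suffix check is sufficient; the thread only
    says "probably", so treat this as a first filter, not a proof.
    """
    return filename.startswith("XDLRA-") and filename.endswith("-refs")


if __name__ == "__main__":
    for name in ("XDLRA-2D--2D-refs", "repo.zip"):
        print(name, looks_like_datalad_refs_file(name))
```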
The search API example invocation to search for that exact filename (for now):
`_.datalad/dotgit/` but it does not seem to work). In the returned record we get
The "things" to record for each dataset would be `hostname` and `dataset_persistent_id`. The hyperlink for a dataset would be constructed as `https://{hostname}/dataset.xhtml?persistentId=doi:{dataset_persistent_id}`.
Note: for those URLs to become clonable, `datalad` should first be configured to load the `dataverse` and `next` extensions via changes to `~/.gitconfig`.
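The hyperlink construction from `hostname` and `dataset_persistent_id` described above can be sketched as below. The function name `dataset_url` is hypothetical. Stripping a leading `doi:` is an assumption added for robustness, in case the recorded identifier already carries that prefix; the template in this issue expects the bare identifier.

```python
def dataset_url(hostname, dataset_persistent_id):
    """Build the dataset landing-page URL from the two recorded fields.

    Follows the template from this issue:
    https://{hostname}/dataset.xhtml?persistentId=doi:{dataset_persistent_id}
    """
    # Assumption: tolerate identifiers that already start with "doi:".
    if dataset_persistent_id.startswith("doi:"):
        dataset_persistent_id = dataset_persistent_id[len("doi:"):]
    return (
        f"https://{hostname}/dataset.xhtml"
        f"?persistentId=doi:{dataset_persistent_id}"
    )


if __name__ == "__main__":
    print(dataset_url("dataverse.harvard.edu", "10.7910/DVN/VMSH8U"))
```

For the dataset mentioned earlier in the thread, this yields the same landing-page URL that the DOI resolves to.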