Feature Request: Simplify dataset metadata JSON files for dataset creation or import #10957

Open
DS-INRAE opened this issue Oct 23, 2024 · 5 comments
Labels
Type: Feature a feature request

Comments

@DS-INRAE
Member

Overview of the Feature Request
Remove superfluous elements from the dataset creation JSON file.

What kind of user is the feature intended for?
API User

What inspired the request?
JSON files are long, complex and intimidating for new users.

What existing behavior do you want changed?
Remove the need for the following attributes in the dataset JSON files:

  • typeClass for metadata fields
  • multiple for metadata fields
  • displayName for metadatablocks

JSON files comparison
Current Darwin's Finches JSON for the fields title, author, datasetContact, dsDescription, and subject:

{
  "datasetVersion": {
    "license": {
      "name": "CC0 1.0",
      "uri": "http://creativecommons.org/publicdomain/zero/1.0"
    },
    "metadataBlocks": {
      "citation": {
        "fields": [
          {
            "value": "Darwin's Finches",
            "typeClass": "primitive",
            "multiple": false,
            "typeName": "title"
          },
          {
            "value": [
              {
                "authorName": {
                  "value": "Finch, Fiona",
                  "typeClass": "primitive",
                  "multiple": false,
                  "typeName": "authorName"
                },
                "authorAffiliation": {
                  "value": "Birds Inc.",
                  "typeClass": "primitive",
                  "multiple": false,
                  "typeName": "authorAffiliation"
                }
              }
            ],
            "typeClass": "compound",
            "multiple": true,
            "typeName": "author"
          },
          {
            "value": [
              {
                "datasetContactEmail": {
                  "typeClass": "primitive",
                  "multiple": false,
                  "typeName": "datasetContactEmail",
                  "value": "[email protected]"
                },
                "datasetContactName": {
                  "typeClass": "primitive",
                  "multiple": false,
                  "typeName": "datasetContactName",
                  "value": "Finch, Fiona"
                }
              }
            ],
            "typeClass": "compound",
            "multiple": true,
            "typeName": "datasetContact"
          },
          {
            "value": [
              {
                "dsDescriptionValue": {
                  "value": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds.",
                  "multiple": false,
                  "typeClass": "primitive",
                  "typeName": "dsDescriptionValue"
                }
              }
            ],
            "typeClass": "compound",
            "multiple": true,
            "typeName": "dsDescription"
          },
          {
            "value": [
              "Medicine, Health and Life Sciences"
            ],
            "typeClass": "controlledVocabulary",
            "multiple": true,
            "typeName": "subject"
          }
        ],
        "displayName": "Citation Metadata"
      }
    }
  }
}

Simplified JSON file:

{
  "datasetVersion": {
    "license": {
      "name": "CC0 1.0",
      "uri": "http://creativecommons.org/publicdomain/zero/1.0"
    },
    "metadataBlocks": {
      "citation": {
        "fields": [
          {
            "value": "Darwin's Finches",
            "typeName": "title"
          },
          {
            "value": [
              {
                "authorName": {
                  "value": "Finch, Fiona",
                  "typeName": "authorName"
                },
                "authorAffiliation": {
                  "value": "Birds Inc.",
                  "typeName": "authorAffiliation"
                }
              }
            ],
            "typeName": "author"
          },
          {
            "value": [
              {
                "datasetContactEmail": {
                  "typeName": "datasetContactEmail",
                  "value": "[email protected]"
                },
                "datasetContactName": {
                  "typeName": "datasetContactName",
                  "value": "Finch, Fiona"
                }
              }
            ],
            "typeName": "datasetContact"
          },
          {
            "value": [
              {
                "dsDescriptionValue": {
                  "value": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds.",
                  "typeName": "dsDescriptionValue"
                }
              }
            ],
            "typeName": "dsDescription"
          },
          {
            "value": [
              "Medicine, Health and Life Sciences"
            ],
            "typeName": "subject"
          }
        ]
      }
    }
  }
}
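
To make the proposal concrete, here is a minimal Python sketch that derives the simplified form from the current one by dropping the three attributes listed above. `strip_redundant_keys` is a hypothetical helper written for this illustration, not part of Dataverse or any client library, and the trimmed output is what this issue asks for, not something the current API is known to accept. The sample input is a shortened copy of the Darwin's Finches example (title and author only).

```python
import json

# Attributes this issue proposes to make optional (from the list above).
REDUNDANT_KEYS = {"typeClass", "multiple", "displayName"}


def strip_redundant_keys(node):
    """Return a copy of `node` with the redundant keys removed at any depth.

    Hypothetical helper for illustration only.
    """
    if isinstance(node, dict):
        return {
            key: strip_redundant_keys(value)
            for key, value in node.items()
            if key not in REDUNDANT_KEYS
        }
    if isinstance(node, list):
        return [strip_redundant_keys(item) for item in node]
    return node  # strings, numbers, booleans and None are kept as-is


current = {
    "datasetVersion": {
        "metadataBlocks": {
            "citation": {
                "fields": [
                    {
                        "value": "Darwin's Finches",
                        "typeClass": "primitive",
                        "multiple": False,
                        "typeName": "title",
                    },
                    {
                        "value": [
                            {
                                "authorName": {
                                    "value": "Finch, Fiona",
                                    "typeClass": "primitive",
                                    "multiple": False,
                                    "typeName": "authorName",
                                },
                                "authorAffiliation": {
                                    "value": "Birds Inc.",
                                    "typeClass": "primitive",
                                    "multiple": False,
                                    "typeName": "authorAffiliation",
                                },
                            }
                        ],
                        "typeClass": "compound",
                        "multiple": True,
                        "typeName": "author",
                    },
                ],
                "displayName": "Citation Metadata",
            }
        }
    }
}

simplified = strip_redundant_keys(current)
print(json.dumps(simplified, indent=2))
```

The helper removes the keys wherever they occur, so it would also drop a child field that happened to be named `multiple` or `displayName`; none of the citation fields used here are.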

Are you thinking about creating a pull request for this feature?
Although this would help increase API adoption, we have other priorities at the moment.

@DS-INRAE DS-INRAE added the Type: Feature a feature request label Oct 23, 2024
@DS-INRAE
Member Author

Note: a more radical simplification would be very interesting, but hopefully this would be an easier quick win.

@qqmyers
Member

qqmyers commented Oct 23, 2024

Note that the metadata input for the semantic API would look like this (using a (~standard) @context for readability):

{
  "title": "Darwin's Finches",
  "author": {
    "citation:authorName": "Finch, Fiona",
    "citation:authorAffiliation": "Birds Inc."
  },
  "citation:datasetContact": {
    "citation:datasetContactName": "Finch, Fiona",
    "citation:datasetContactEmail": "[email protected]"
  },
  "citation:dsDescription": {
    "citation:dsDescriptionValue": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds."
  },
  "subject": "Medicine, Health and Life Sciences",
  "@context": {
    "author": "http://purl.org/dc/terms/creator",
    "citation": "https://dataverse.org/schema/citation/",
    "subject": "http://purl.org/dc/terms/subject",
    "termName": "https://schema.org/name",
    "title": "http://purl.org/dc/terms/title"
  }
}

or, even shorter (using full URIs instead of a @context),

{
  "http://purl.org/dc/terms/title": "Darwin's Finches",
  "http://purl.org/dc/terms/creator": {
    "https://dataverse.org/schema/citation/authorName": "Finch, Fiona",
    "https://dataverse.org/schema/citation/authorAffiliation": "Birds Inc."
  },
  "https://dataverse.org/schema/citation/datasetContact": {
    "https://dataverse.org/schema/citation/datasetContactName": "Finch, Fiona",
    "https://dataverse.org/schema/citation/datasetContactEmail": "[email protected]"
  },
  "https://dataverse.org/schema/citation/dsDescription": {
    "https://dataverse.org/schema/citation/dsDescriptionValue": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds."
  },
  "http://purl.org/dc/terms/subject": "Medicine, Health and Life Sciences"
}
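
The two documents above carry the same metadata: the second is the first with every key expanded through the @context. A rough Python sketch of that expansion follows. It handles only the two cases that appear here (a key defined as a whole term, and a `prefix:suffix` key) and is not a JSON-LD processor; `expand_key` and `expand_keys` are hypothetical names, and the sample is a shortened copy of the example (title, author, subject).

```python
def expand_key(key, context):
    """Expand one key using a flat @context of term/prefix -> URI strings."""
    if key in context:  # whole key is a defined term, e.g. "title"
        return context[key]
    prefix, sep, suffix = key.partition(":")
    if sep and prefix in context:  # compact form, e.g. "citation:authorName"
        return context[prefix] + suffix
    return key  # already a full URI, or unknown: leave untouched


def expand_keys(node, context):
    """Recursively expand the keys of `node`, dropping the @context itself."""
    if isinstance(node, dict):
        return {
            expand_key(key, context): expand_keys(value, context)
            for key, value in node.items()
            if key != "@context"
        }
    if isinstance(node, list):
        return [expand_keys(item, context) for item in node]
    return node


compact = {
    "title": "Darwin's Finches",
    "author": {
        "citation:authorName": "Finch, Fiona",
        "citation:authorAffiliation": "Birds Inc.",
    },
    "subject": "Medicine, Health and Life Sciences",
    "@context": {
        "author": "http://purl.org/dc/terms/creator",
        "citation": "https://dataverse.org/schema/citation/",
        "subject": "http://purl.org/dc/terms/subject",
        "title": "http://purl.org/dc/terms/title",
    },
}

expanded = expand_keys(compact, compact["@context"])
for key in expanded:
    print(key)
```

For anything beyond a quick check, a real JSON-LD library is the safer choice, since a @context can contain much more than plain string mappings.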

@pdurbin
Member

pdurbin commented Oct 23, 2024

This is what I've suggested to @JR-1991, who has slides ready about the gnarly, complicated native format: try the semantic API. 😄

See also discussion here:

@JR-1991
Contributor

JR-1991 commented Oct 23, 2024

@pdurbin, it is on my bucket list 😁 Can this also be passed to the dataset creation/edit endpoint?

@pdurbin
Member

pdurbin commented Oct 23, 2024

@JR-1991 well, you have to pass 'Content-Type: application/ld+json'. Please see the guides: https://guides.dataverse.org/en/6.4/developers/dataset-semantic-metadata-api.html
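
For anyone scripting this, here is a rough Python sketch of such a request using only the standard library. The server URL, collection alias and API token are placeholders, `build_create_request` is a hypothetical helper, and the `/api/dataverses/{alias}/datasets` path and `X-Dataverse-key` header are assumptions based on the linked guide, so please check them against the guide for your version. The request is only built here, not sent, and the metadata is a shortened example that a real server may reject as incomplete.

```python
import json
import urllib.request

# Placeholders, not real values.
SERVER_URL = "https://dataverse.example.org"
COLLECTION_ALIAS = "root"
API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"


def build_create_request(server_url, collection_alias, api_token, metadata):
    """Build (but do not send) a POST request carrying JSON-LD metadata.

    Hypothetical helper; the endpoint path is an assumption to verify
    against the semantic metadata API guide.
    """
    url = f"{server_url}/api/dataverses/{collection_alias}/datasets"
    body = json.dumps(metadata).encode("utf-8")
    headers = {
        # The header the comment above says is required.
        "Content-Type": "application/ld+json",
        "X-Dataverse-key": api_token,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


metadata = {
    "http://purl.org/dc/terms/title": "Darwin's Finches",
    "http://purl.org/dc/terms/subject": "Medicine, Health and Life Sciences",
}

request = build_create_request(SERVER_URL, COLLECTION_ALIAS, API_TOKEN, metadata)
print(request.get_method(), request.full_url)

# To actually send it (requires a reachable server and a valid token):
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```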

Projects
Status: 🔍 Interest
Development

No branches or pull requests

4 participants