
Create a single source for resources shared by data_store and data_store_serialization test cases. #894

Open
Pushkar-Bhuse opened this issue Aug 16, 2022 · 0 comments

Comments

@Pushkar-Bhuse
Collaborator

Is your feature request related to a problem? Please describe.
The resources used to define the test cases in data_store_test.py and data_store_serialization_test.py are identical or nearly identical in many cases. For example, the structure of _type_attributes is very similar for most entries in both files:

DataStore._type_attributes = {
    "ft.onto.base_ontology.Document": {
        "attributes": {
            "begin": {"index": 2, "type": (None, (int,))},
            "end": {"index": 3, "type": (None, (int,))},
            "payload_idx": {"index": 4, "type": (None, (int,))},
            "document_class": {"index": 5, "type": (list, (str,))},
            "sentiment": {"index": 6, "type": (dict, (str, float))},
            "classifications": {
                "index": 7,
                "type": (FDict, (str, Classification)),
            },
        },
        "parent_entry": "forte.data.ontology.top.Annotation",
    },
}

Describe the solution you'd like
To reduce this redundancy, these configurations should live in a single central file, with a clear format for the tests to access them. Although the configurations look similar, some of the subtle differences between them are intentional. For example, in data_store_serialization_test.py, the _type_attributes for Document is given by

"ft.onto.base_ontology.Document": {
    "attributes": {
        "begin": {"index": 2, "type": (None, (int,))},
        "end": {"index": 3, "type": (None, (int,))},
        "payload_idx": {"index": 4, "type": (None, (int,))},
        "sentiment": {"index": 5, "type": (dict, (str, float))},
        "classifications": {
            "index": 6,
            "type": (FDict, (str, Classification)),
        },
    },
    "parent_entry": "forte.data.ontology.top.Annotation",
},

Note that this configuration intentionally omits the document_class attribute, so the indices of the attributes after it shift down by one. Thus, the proposed solution needs provisions for handling such slight changes in structure.

Additional Context

  • This is part of the data efficiency project
  • This PR should be made to the master branch.