Refactoring Data Store Structure #882
This PR does apply changes to CV Ontologies since it is currently getting updated itself.
This is a large PR, maybe too large, so I am not too confident in reviewing it. There are currently many interface changes that won't be merged either. Maybe we should break this work down into multiple PRs instead of doing it all at once.
- We need to understand the DataPack interface layer and the DataStore interface layer; we don't want to make any changes to public PRs of DataPack.
- For safety, let's not add any `type: ignore` in this PR, since a lot of them are caused by real errors.
- Please double-check whether serialization/deserialization of data packs works after this PR.
Codecov Report
@@ Coverage Diff @@
## master #882 +/- ##
==========================================
+ Coverage 80.91% 80.95% +0.04%
==========================================
Files 254 254
Lines 19551 19569 +18
==========================================
+ Hits 15819 15843 +24
+ Misses 3732 3726 -6
I'm thinking we might want to make the behavior consistent. Right now we are maintaining two distinct approaches to setting dataclass attributes: one before and one after registering the property function. But that is out of the scope of this PR and not a high priority.
This PR fixes #874
Description of changes
This PR attempts to transform how data is stored in the `DataStore` entry. The main idea behind this new format is that all attributes of an entry other than its `tid` and `type_name` should be considered `dataclass` attributes. As a result, the DataStore format of any entry can be visualized as: `[tid, type_name, ....(dataclass attributes)...]`.

For example, the `type_attributes` of `Sentence` can be seen as:

The way this is implemented is by creating a class variable for `Entry` called `cached_attribute_data`. This is a `dict` that stores the initial values of `dataclass` attributes. The implementation makes sure that, before a data store entry is initialized for a given entry object, the `cached_attribute_data` dict stores all `dataclass` attributes and their initial values. There are two ways in which a `dataclass` attribute can be added to `cached_attribute_data`:

- Attributes with an `entry_setter` property: In this case, the attribute will automatically be added to `cached_attribute_data`.
- Attributes without an `entry_setter` property: In this case, attribute values are stored in the entry object. These are fetched before the creation of the data store entry to populate `cached_attribute_data`.
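The two population paths described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Forte's actual code: `ToyEntry`, `to_datastore_entry`, the attribute names, and the shape of the cache (keyed by `tid`) are all hypothetical. Only the `[tid, type_name, ...dataclass attributes...]` layout and the idea of a class-level `cached_attribute_data` dict come from the description.

```python
from typing import Any, Dict, List


class ToyEntry:
    """Hypothetical stand-in for an entry; not Forte's real Entry class."""

    # Class-level cache (assumed shape): tid -> {attribute name -> value}.
    cached_attribute_data: Dict[int, Dict[str, Any]] = {}
    # Assumed order of the dataclass attributes in the data store entry.
    attribute_names: List[str] = ["speaker", "sentiment"]

    def __init__(self, tid: int) -> None:
        self.tid = tid
        ToyEntry.cached_attribute_data[tid] = {}

    # Path 1: an attribute with a setter property writes straight to the cache.
    @property
    def speaker(self) -> Any:
        return ToyEntry.cached_attribute_data[self.tid].get("speaker")

    @speaker.setter
    def speaker(self, value: Any) -> None:
        ToyEntry.cached_attribute_data[self.tid]["speaker"] = value


def to_datastore_entry(entry: ToyEntry, type_name: str) -> List[Any]:
    """Build the list form [tid, type_name, ...dataclass attributes...]."""
    cache = ToyEntry.cached_attribute_data[entry.tid]
    # Path 2: attributes without a setter live on the object, so they are
    # fetched into the cache just before the data store entry is created.
    for name in ToyEntry.attribute_names:
        if name not in cache:
            cache[name] = getattr(entry, name, None)
    return [entry.tid, type_name] + [cache[n] for n in ToyEntry.attribute_names]


entry = ToyEntry(tid=7)
entry.speaker = "alice"       # goes through the setter into the cache
entry.sentiment = "positive"  # plain attribute stored on the object
row = to_datastore_entry(entry, "Sentence")
```

In this sketch both paths end in the same list layout, which is the consistency the description is aiming for.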
Possible influences of this PR

- Since all attributes other than `tid` and `type_name` are treated as `dataclass` attributes, we do not need to rely on `constants`. Instead, we use the function `get_datastore_attr_idx` to fetch the position in the datastore where a given attribute is stored.
- `DataStore` is more scalable now, since any new attribute can be added to the entry as well as its datastore by declaring it as a `dataclass` attribute.

Test Conducted

The main aim of this PR was to keep the outermost interface unchanged and still be able to pass the `data_store_test`, `data_pack_test` and `multi_pack_test`.
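To make the index lookup mentioned under "Possible influences" more concrete, here is a minimal sketch of resolving an attribute's position by name instead of through hard-coded constants. The real signature of `get_datastore_attr_idx` is not shown in this PR description, so the helper below is deliberately given a different, hypothetical name (`toy_attr_idx`). The registry `TYPE_ATTRIBUTES` and the attribute order are also assumptions made for illustration.

```python
from typing import Any, Dict, List

# Fixed leading slots of every entry: [tid, type_name] (from the description).
FIXED_SLOTS: List[str] = ["tid", "type_name"]

# Hypothetical registry: type name -> ordered dataclass attribute names.
TYPE_ATTRIBUTES: Dict[str, List[str]] = {
    "Sentence": ["speaker", "sentiment"],
}


def toy_attr_idx(type_name: str, attr: str) -> int:
    """Return the position of `attr` in a list-form data store entry."""
    if attr in FIXED_SLOTS:
        return FIXED_SLOTS.index(attr)
    # Dataclass attributes follow the fixed slots in declaration order.
    return len(FIXED_SLOTS) + TYPE_ATTRIBUTES[type_name].index(attr)


# Declaring a new attribute only extends the registry; no constant changes.
TYPE_ATTRIBUTES["Sentence"].append("part_id")

entry: List[Any] = [7, "Sentence", "alice", "positive", 2]
sentiment = entry[toy_attr_idx("Sentence", "sentiment")]
part_id = entry[toy_attr_idx("Sentence", "part_id")]
```

Because positions are derived from the declared attribute list, adding an attribute shifts nothing that callers have to maintain by hand, which is the scalability point made above.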