Is your feature request related to a problem? Please describe.
When multiple processors in the same pipeline each perform tokenization, they all generate entries of the same type, ft.onto.base_ontology.Token. This results in duplicate data. For example, a processor later in the pipeline would see these mixed entries when counting or predicting and produce confusing results.
Describe the solution you'd like
Several solutions could address this problem.
The solution currently under consideration is to create ontology subclasses to differentiate between these entries. For example, create a separate Token subclass for each processor that performs tokenization.
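A minimal sketch of the subclass idea, in plain Python rather than against the real Forte API. The `Token` class here is only a stand-in for ft.onto.base_ontology.Token, and `WhitespaceToken`, `SubwordToken`, and `get_entries` are hypothetical names invented for illustration:

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Stand-in for ft.onto.base_ontology.Token (not the real Forte class)."""
    begin: int
    end: int


class WhitespaceToken(Token):
    """Hypothetical subclass for entries created by a whitespace tokenizer."""


class SubwordToken(Token):
    """Hypothetical subclass for entries created by a subword tokenizer."""


def get_entries(entries, entry_type):
    """Return the entries that are instances of entry_type (hypothetical helper)."""
    return [e for e in entries if isinstance(e, entry_type)]


text = "unhappy cat"

# Two tokenizers wrote their results into the same collection of entries.
entries = [
    WhitespaceToken(0, 7), WhitespaceToken(8, 11),
    SubwordToken(0, 2), SubwordToken(2, 7), SubwordToken(8, 11),
]

# Querying the base type still mixes both tokenizations ...
all_tokens = get_entries(entries, Token)

# ... while querying a subclass returns only one processor's output.
word_tokens = get_entries(entries, WhitespaceToken)
words = [text[t.begin:t.end] for t in word_tokens]
```

With this layout, a downstream processor that needs only one tokenization would request the specific subclass, while code that requests the base Token type would still see every entry.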
Describe alternatives you've considered
Create another field, entry record, in the datapack file to keep the entries separate, but this is a bit bulky for the datapack file.
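For comparison, this alternative could look roughly like the following, again as a plain-Python sketch and not Forte's actual datapack format. The `component` field and the `tokens_from` helper are assumptions made up for this example:

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Token:
    """Stand-in for ft.onto.base_ontology.Token (not the real Forte class)."""
    begin: int
    end: int
    component: str  # hypothetical field naming the processor that created the entry


entries = [
    Token(0, 7, "whitespace_tokenizer"),
    Token(8, 11, "whitespace_tokenizer"),
    Token(0, 2, "subword_tokenizer"),
    Token(2, 7, "subword_tokenizer"),
    Token(8, 11, "subword_tokenizer"),
]


def tokens_from(entries, component):
    """Return only the tokens recorded as created by the given component."""
    return [e for e in entries if e.component == component]


# Every entry carries the extra field, which is what makes the stored data bulkier.
per_component = Counter(e.component for e in entries)
whitespace_tokens = tokens_from(entries, "whitespace_tokenizer")
```

The entry type stays the same here, so every consumer has to remember to filter on the extra field.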