-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Define ideal user experience and workflow for catalog-translate
#324
Comments
catalog-translate
For v1, you'd need the factory to index translators by all properties given to I am not able to predict what performance gain that would have in practice (I have been wrong once on this topic already), but probably it's worth a try to write a basic factory, and try it on a synthetically generated, large-ish set of metadata (either from one extractor, or from multiple, interleaved). I don't currently understand the argument why the v2 specification must be complex - why not entrypoint name alone? But if you feel v1 may be better, it won't hurt to try v1 first. |
good idea
An example of when entrypoint alone would not be enough would be if there are two translators compatible with an extractor of the same name, but different versions. One would either have to make some assumptions to e.g. select the latest, or the user would have to specify both name and version, or both translators would have to be deliberately instantiated and made part of the factory store and matched correctly at a later stage. All of these requires some decision trees that will turn into additional code, and I don't think this adds much benefit to the user in terms of improved UX (the v1 alternative actually feels easier because it's fewer arguments to be concerned with) nor in terms of improved performance over the v1 alternative (in both cases the matching and instantiation for a to-be-used translator would only happen once) |
Pragmatic approach implemented in #309. This might not be ideal forever, but closing the issue until such time as this becomes insufficient. |
Relevant existing issues:
As part of #309 I am creating a new interface for
Translate
, with one of the goals being to fix existing issues related to translation functionality. I am writing a summary here of the existing functionality and the proposed changes, to get feedback on it before I waste time on implementing the wrong stuff.Old
The current approach is:
datalad catalog translate --metadata <metalad-metadata-to-be-translated>
New
The envisioned approach v1:
datalad catalog-translate --metadata <metalad-metadata-to-be-translated>
The envisioned approach v2:
datalad catalog-translate --metadata <metalad-metadata-to-be-translated> --translator <name(s) of translator(s)>
--translator
argument which is introduced here in response to NF: add batch functionality to translation #264catalog-translate
process (if they are available as entrypoints)The problem with the v2 approach is that the manner in which the translators are supplied as arguments to
datalad catalog-translate
would have to include the same information that is used by the matching method of these translators, which could vary across translators since these are implemented on the inherited translator classes themselves (and not on the base class). So a call would have to be something like:datalad catalog-translate --metadata <metalad-metadata-to-be-translated> --translator metalad_core_v0.2.2
and that would have to be parsed in order to allow complete matching. I'm thinking more and more this is unnecessarily complicated and that we should stick with new v1 approach.
Opinions?
The text was updated successfully, but these errors were encountered: