
Collecting feature requests around a developmental feature for RAMP #250

Open
kegl opened this issue Oct 16, 2020 · 24 comments

Comments

@kegl
Contributor

kegl commented Oct 16, 2020

When RAMP is used for developing models for a problem, we may want to tag certain versions of a submission, and even problem.py, together with the scores. One idea is to use git tags. For example, after running ramp-test ... --save-output, one could run another script that git adds problem.py, the submission files, and the scores in training_output/fold_<i>, commit and tag with a user-defined tag (plus maybe a prefix indicating that it is a scoring tag, so later we may automatically search for all such tags).
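A rough sketch of what such a tagging script could look like (just an illustration, not an existing rampwf command; the helper name and the exact output paths under submissions/<submission>/training_output/ are assumptions):

# hypothetical helper to run after `ramp-test ... --save-output`
import subprocess
import sys
from glob import glob


def tag_scored_submission(submission, tag, prefix='score'):
    """git-add problem.py, the submission files and the saved fold scores, then tag."""
    paths = ['problem.py']
    paths += glob(f'submissions/{submission}/*.py')
    # assuming --save-output writes under submissions/<submission>/training_output/
    paths += glob(f'submissions/{submission}/training_output/fold_*/*')
    subprocess.run(['git', 'add'] + paths, check=True)
    subprocess.run(
        ['git', 'commit', '-m', f'scores for {submission} ({tag})'], check=True)
    # the prefix makes scoring tags easy to list later: git tag --list 'score/*'
    subprocess.run(['git', 'tag', f'{prefix}/{tag}'], check=True)


if __name__ == '__main__':
    tag_scored_submission(sys.argv[1], sys.argv[2])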

@zhangJianfeng

  1. When loading the data in RAMP, it seems the training data is read twice. When the data is big, this is a bit slow.
  2. Is it possible to parallelize the CV process?

@gabriel-hurtado
Collaborator

gabriel-hurtado commented Nov 9, 2020

Adding one feature that would be useful, at least to me: it would be great to be able to import more code from elsewhere in a submission, allowing multiple submissions to share some code. Right now this can be done by creating a library and importing it, which is a bit tedious.
@albertcthomas mentioned this could perhaps be done in a similar way to pytest, which has a conftest.py file for code that you want to reuse across different test modules.
#181

@albertcthomas
Collaborator

@albertcthomas mentioned this could perhaps be done in a similar way to pytest, which has a conftest.py file for code that you want to reuse across different test modules.

Well, it is more like "this makes me think of conftest.py, which can be used to share fixtures", but I don't know what happens when you run pytest and I am not sure the comparison goes very far :). As written in the pytest doc: "The next example puts the fixture function into a separate conftest.py file so that tests from multiple test modules in the directory can access the fixture function".
This feature is discussed in issue #181.

@albertcthomas albertcthomas changed the title Save scores for different versions of a submission during development Collecting feature requests around a developmental feature of RAMP Nov 9, 2020
@albertcthomas albertcthomas changed the title Collecting feature requests around a developmental feature of RAMP Collecting feature requests around a developmental feature for RAMP Nov 9, 2020
@illyyne

illyyne commented Nov 12, 2020

1- I find that the data-reading step takes too much time: it is slower than reading the data without RAMP.
2- It would be great if the mean result were also saved along with the bagged one.
3- Propose a LaTeX syntax for the results.
4- When the output is saved, it would be better to also save the experiment conditions (data label, tested hyperparameters, etc.) and keep everything somewhere, either locally or in the cloud, to check later.

@LudoHackathon

LudoHackathon commented Nov 19, 2020

Here are some features that could help:

  • Model selection: "early killing" (e.g. successive halving or even simpler schemes), which implies sharing information during hyperopt, or at least a way to compare the current model to the best one so far (either a global Python variable or saving it somehow on disk...).
  • Experimental protocol: having a parametrized problem.py. I'm keen on JSON (which could also be saved each time you launch a ramp-test); a sketch of what I mean follows this list. I'm not a big fan of using commit tags, since I may launch 10 different batches of experiments on different servers without wanting to commit each time just for an experiment's configuration file.
  • Logging:
    • Model saving and loading (path, hyperopt, ...)
    • Possibility to rename the output score folders. E.g., depending on the task and the models I've implemented, I'd rather save the results with a different directory hierarchy, say w.r.t. hyperparameters or more global options. It helps regex search (useful with TensorBoard, for example) or plotting when dealing with tens of thousands of experiments (and looking at parameter sensitivity).
  • Other:
    • Being able to modify the submissions while some experiments are running (it looks like the .py submission file is loaded several times; I'm in the habit of loading the class somewhere, which lets me do whatever I want while my experiments are running).
    • Same as Gabriel: ease the imports in a submission. Maybe I didn't find the right way to do it, but there is a lot of duplicated code in my submissions even though I've implemented a Pytorch2RAMP class. [RFC] importing files in submissions #181
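For the parametrized problem.py point, here is a rough sketch of what I mean (the RAMP_CONFIG variable and the config keys are made up for illustration; this is not an existing rampwf mechanism):

# problem.py (excerpt) -- hypothetical parametrization through a JSON file
import json
import os

from sklearn.model_selection import KFold

# read experiment options from a file chosen via an environment variable,
# so different batches can run with different configs without any commit
_config_path = os.environ.get('RAMP_CONFIG', 'ramp_config.json')
with open(_config_path) as f:
    _config = json.load(f)

problem_title = _config.get('problem_title', 'My problem')
_n_folds = _config.get('n_folds', 8)


def get_cv(X, y):
    cv = KFold(n_splits=_n_folds, shuffle=True, random_state=57)
    return cv.split(X, y)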

@LudoHackathon

From my (little) experience with RAMP, what made people a bit reluctant to use it was that it was too high level, meaning that we don't see the classical sequential process we are used to seeing in an ML script (load data, instantiate the model, train it, test it). As an example, Keras (not the same purpose as RAMP) embeds some parts of the script to minimize the main script, but keeps the overall spirit of the classical script, making it as understandable as the original one. Using ramp-test on the command line may make RAMP more obscure to new users. Maybe having a small script (like the one already in the documentation, for example) giving the user a more pythonic way to play with it, without having to use ramp-test as a command line, could make machine learners more willing to use it.

@agramfort
Contributor

agramfort commented Nov 23, 2020 via email

@kegl
Contributor Author

kegl commented Nov 23, 2020

Calling ramp-test from a notebook is as simple as

from rampwf.utils import assert_submission
assert_submission(submission='starting_kit')

This page https://paris-saclay-cds.github.io/ramp-docs/ramp-workflow/advanced/scoring.html now contains two code snippets that you can use to call lower-level elements of the workflow and emulate a simple train/test and cross-validation loop. @LudoHackathon, do you have a suggestion for what else would be useful? E.g. an example notebook in the library?

@agramfort
Contributor

agramfort commented Nov 23, 2020 via email

@albertcthomas
Collaborator

albertcthomas commented Nov 23, 2020

this should be explained in the kits to save some pain to students

wasn't this the purpose of the "Working in the notebook" section of the old titanic notebook starting kit?

@kegl
Contributor Author

kegl commented Nov 23, 2020

Yes, @albertcthomas is right, but the snippet in the doc is cleaner now. I'm doing this decomposition in every kit now, see for example line 36 here https://github.com/ramp-kits/optical_network_modelling/blob/master/optical_network_modelling_starting_kit.ipynb. This snippet is even simpler than the one in the doc but less general: it only works when the Predictions class does nothing with the input numpy array, which is the case most of the time (regression and classification). Feel free to reuse it.
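For reference, the decomposition looks roughly like this (a sketch assuming a standard problem.py that exposes get_train_data, get_test_data, workflow and score_types; the real snippets are in the doc page and the notebook linked above):

import problem  # the kit's problem.py, importable from the kit directory

X_train, y_train = problem.get_train_data()
X_test, y_test = problem.get_test_data()

# train and predict through the workflow; scoring directly on the arrays
# only works when Predictions does nothing with the input numpy array
trained_workflow = problem.workflow.train_submission(
    'submissions/starting_kit', X_train, y_train)
y_pred = problem.workflow.test_submission(trained_workflow, X_test)

score_type = problem.score_types[0]
print(score_type.name, score_type(y_test, y_pred))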

@albertcthomas
Collaborator

albertcthomas commented Nov 23, 2020

This page https://paris-saclay-cds.github.io/ramp-docs/ramp-workflow/advanced/scoring.html now contains two code snippets that you can use to call lower-level elements of the workflow and emulate a simple train/test and cross-validation loop. @LudoHackathon, do you have a suggestion for what else would be useful? E.g. an example notebook in the library?

The page does a good job of showing how you can call the different elements (and thus play with them, do plots, ...).

  1. For better visibility we might clearly say that there is a command-line interface based on ramp-test and a way of calling the needed functions easily in a Python script (or notebook). Of course we could add an example showing the Python script interface.

  2. More importantly, maybe think of what can break when you go from one interface to the other. For instance, imports from other modules located in the current working directory. This still forces us/the students to work with submission files. I think that using the "scikit-learn kits" eases the transfer of your scikit-learn estimator from your exploratory Python script/notebook to a submission file and makes sure that this works in most cases. I let @agramfort confirm this :)

  3. Instead of

from rampwf.utils import assert_submission
assert_submission(submission='starting_kit')

we could have something like

from rampwf import ramp_test
ramp_test(submission='starting_kit')

Debugging is a pain etc.

For debugging with the command line, I have to say that I rely a lot on adding a breakpoint where I want to enter the debugger. However, this cannot be done post-mortem, unlike %debug in IPython or Jupyter. For this we could have a --pdb or --trace flag as in pytest. But it's true that it's easier to try things and play with your models/pipelines when not using the command line.
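Roughly, what a --pdb-style flag could do (just a sketch around assert_submission; no such flag exists today):

# hypothetical post-mortem wrapper, mirroring what pytest --pdb does
import pdb
import sys

from rampwf.utils import assert_submission


def run_with_post_mortem(**kwargs):
    try:
        assert_submission(**kwargs)
    except Exception:
        # drop into the debugger at the frame that raised, like %debug in IPython
        pdb.post_mortem(sys.exc_info()[2])
        raise


run_with_post_mortem(submission='starting_kit')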

@albertcthomas
Collaborator

albertcthomas commented Nov 23, 2020

use your favorite env to inspect / debug / run (vscode, notebook, google colab etc.)
giving the user a more pythonic way to play with it, without having to use ramp-test as a command line

This is an important point. 2 or 3 years ago I was rarely using the command line and I always preferred staying in a Python environment. Users should be able to use their favorite tool to play with their models, and we should make sure that at the end it will work when calling ramp-test on the command line.

@kegl
Contributor Author

kegl commented Nov 23, 2020

  1. OK
  2. no comment
  3. OK. In fact we may put the focus on the Python call and tell users to use the command-line ramp-test as a final unit test, the same way one would use pytest. I think the cleanest way would be to have ramp_test defined in https://github.com/paris-saclay-cds/ramp-workflow/blob/advanced/rampwf/utils/cli/testing.py and main would just call ramp_test with the exact same signature (see the sketch below). In this way it's certain that the two calls do the same thing.
  4. I prefer not adding the command line feature if everything can be done from the python call.
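Sketch of what I mean for 3 (names and options are indicative only, assuming the CLI stays click-based):

# rampwf/utils/testing.py (sketch): the logic lives in a plain, importable function
def ramp_test(submission='starting_kit', ramp_kit_dir='.', ramp_data_dir='.',
              save_output=False):
    """Train and test one submission; callable from a script or a notebook."""
    # the body of the current command would move here unchanged
    ...


# rampwf/utils/cli/testing.py (sketch): the command only forwards its arguments
import click


@click.command()
@click.option('--submission', default='starting_kit')
@click.option('--ramp-kit-dir', default='.')
@click.option('--ramp-data-dir', default='.')
@click.option('--save-output', is_flag=True)
def main(submission, ramp_kit_dir, ramp_data_dir, save_output):
    ramp_test(submission=submission, ramp_kit_dir=ramp_kit_dir,
              ramp_data_dir=ramp_data_dir, save_output=save_output)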

@albertcthomas
Collaborator

albertcthomas commented Nov 23, 2020

3. I prefer not adding the command line feature if everything can be done from the python call.

is this for 4. and --pdb?

@agramfort
Contributor

agramfort commented Nov 23, 2020 via email

@kegl
Contributor Author

kegl commented Nov 24, 2020

import imp
feature_extractor = imp.load_source(
    '', 'submissions/starting_kit/feature_extractor.py')
fe = feature_extractor.FeatureExtractor()
classifier = imp.load_source(
    '', 'submissions/starting_kit/classifier.py')
clf = classifier.Classifier()

is to me too complex and should be avoided. We have a way suggested by @kegl based on the rampwf function.

I'm not sure what you mean here. We're using import_module_from_source now.
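For reference, the import_module_from_source way looks roughly like this (a sketch from memory; check the actual import path and signature in rampwf.utils.importing):

from rampwf.utils.importing import import_module_from_source

# load a submission file as a module and instantiate its class
classifier = import_module_from_source(
    'submissions/starting_kit/classifier.py', 'classifier')
clf = classifier.Classifier()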

@agramfort
Contributor

agramfort commented Nov 24, 2020 via email

@kegl
Contributor Author

kegl commented Nov 24, 2020

3. I prefer not adding the command line feature if everything can be done from the python call.

is this for 4. and --pdb?

yes

@gabriel-hurtado
Collaborator

Another feature that would be nice to have: an option to separate what is saved from what is printed to the console.
This would make it possible to save extensive metrics without flooding the terminal.
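As an illustration of the idea with the standard library only (not an existing rampwf option):

# sketch: send everything to a file, only the short summary to the terminal
import logging

logger = logging.getLogger('ramp')
logger.setLevel(logging.DEBUG)

file_handler = logging.FileHandler('full_metrics.log')
file_handler.setLevel(logging.DEBUG)        # extensive metrics go to the file

console_handler = logging.StreamHandler()
console_handler.setLevel(logging.INFO)      # the terminal only sees the summary

logger.addHandler(file_handler)
logger.addHandler(console_handler)

logger.debug('per-epoch metrics: %s', {'epoch': 3, 'loss': 0.12})  # file only
logger.info('bagged test score: %.3f', 0.87)                       # both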

@kegl
Contributor Author

kegl commented Jan 28, 2021

Partial fit for models where e.g. the number of trees or the number of epochs is a hyperparameter. This would mainly be a feature used by hyperopt (killing trainings early), but it may also be useful as a CLI parameter.
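As an illustration, the kind of loop this would enable (a sketch with scikit-learn's warm_start, not a rampwf API):

# grow a forest a few trees at a time so a hyperopt loop can stop it early
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)
clf = RandomForestClassifier(n_estimators=10, warm_start=True, random_state=0)
clf.fit(X, y)

for n_estimators in (20, 40, 80):
    clf.n_estimators = n_estimators
    clf.fit(X, y)                 # only the additional trees are trained
    if clf.score(X, y) > 0.99:    # an early-killing criterion would go here
        break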

@kegl
Contributor Author

kegl commented Feb 4, 2021

Standardized LaTeX tables computed from saved scores. Probably two steps: first collect all scores (of selected submissions and data labels) into a well-designed pandas table, then provide a set of tools to create LaTeX tables, scores with confidence intervals, and also paired tests. I especially like the plots and score presentation in https://link.springer.com/article/10.1007/s10994-018-5724-2.
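A rough sketch of the two steps with pandas (the column names and values are made up):

import pandas as pd

# step 1: collect saved scores into one tidy table
scores = pd.DataFrame([
    # hypothetical rows read from training_output/fold_<i>/scores.csv
    {'submission': 'starting_kit', 'fold': 0, 'rmse': 0.52},
    {'submission': 'starting_kit', 'fold': 1, 'rmse': 0.55},
    {'submission': 'my_model', 'fold': 0, 'rmse': 0.41},
    {'submission': 'my_model', 'fold': 1, 'rmse': 0.44},
])

# step 2: aggregate and export to LaTeX
table = scores.groupby('submission')['rmse'].agg(['mean', 'std'])
print(table.to_latex(float_format='%.3f'))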

@albertcthomas
Collaborator

albertcthomas commented Feb 26, 2021

When RAMP is used for developing models for a problem, we may want to tag certain versions of a submission, and even problem.py, together with the scores. One idea is to use git tags. For example, after running ramp-test ... --save-output, one could run another script that git adds problem.py, the submission files, and the scores in training_output/fold_<i>, commit and tag with a user-defined tag (plus maybe a prefix indicating that it is a scoring tag, so later we may automatically search for all such tags).

It would be great to have a look at MLflow; @agramfort pointed it out to me. There are some parts that we could use, for instance the tracking one.

@martin1tab
Contributor

martin1tab commented Mar 11, 2021

  1. When loading the data in RAMP, it seems the training data is read twice. When the data is big, this is a bit slow.
  2. Is it possible to parallelize the CV process?

  1. Yes, for the moment the training data is read twice, since X_train, y_train, X_test, y_test = assert_data(
    ramp_kit_dir, ramp_data_dir, data_label) is called twice in the testing.py module.
    The same issue appears with the 'problem' variable, which is loaded 5 times.
    It would be possible to fix this by making the testing module object oriented: attributes corresponding to each of these variables (X_train, X_test, ...) could be created and we would not need to repeat the calls (a lighter caching alternative is sketched below).
    But do we agree to add more object-oriented code?

  2. Yes, it is.
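Sketch of the caching alternative mentioned in 1 (the import paths are to be checked; this is not how testing.py currently works):

# memoize the expensive calls so repeated uses read the data and problem once
from functools import lru_cache

from rampwf.utils.testing import assert_data, assert_read_problem


@lru_cache(maxsize=None)
def get_data(ramp_kit_dir, ramp_data_dir, data_label=None):
    return assert_data(ramp_kit_dir, ramp_data_dir, data_label)


@lru_cache(maxsize=None)
def get_problem(ramp_kit_dir):
    return assert_read_problem(ramp_kit_dir)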

@rth rth mentioned this issue Jun 25, 2021