Fix bug with Lrnr_grf predictions #411

herbps10 · 2023-02-13T20:53:04Z

The .predict method for Lrnr_grf had an incorrect argument name for grf::predict.quantile_forest: the argument for supplying new data should be called newdata instead of new_data (see https://grf-labs.github.io/grf/reference/predict.quantile_forest.html).

The .predict method for Lrnr_grf had an incorrect argument name for grf::predict.quantile_forest: the argument for supplying new data should be called "newdata", not "new_data" (see https://grf-labs.github.io/grf/reference/predict.quantile_forest.html).

rachaelvp · 2023-06-16T09:17:22Z

Thank you! This is great. Could you please add test to “tests/testthat/“ to make sure this is functioning as expected? Then I’ll merge PR

nhejazi · 2023-07-17T19:16:18Z

i'm confused by the diff including changes beyond the fix to Lrnr_grf but this is an important catch (thanks @herbps10!); could we double check changes and merge to devel @rachaelvp? happy to help with this

herbps10 · 2023-07-18T08:03:13Z

Ah yes my mistake -- looks like I added an extra commit to the pull request by mistake, the only relevant one for this is 74f7538. Is there an easy way to fix this, or should I submit a new pull request? Thanks!

nhejazi · 2023-07-18T15:55:08Z

ok, thanks for clarifying @herbps10 -- depending on your level of comfort with git, you could rebase (e.g., https://stackoverflow.com/questions/36168839/how-to-remove-commits-from-a-pull-request), but opening up a new pull request is totally fine too

herbps10 · 2023-07-18T16:00:56Z

Great, thanks for the tip -- I just fixed the PR. I can work on adding a unit test as well.

nhejazi · 2023-07-18T16:05:40Z

great, thanks! adding a unit test would be very helpful. i may periodically make some edits here and/or to the devel branch to get the automated builds restored before this is merged

herbps10 · 2023-07-19T07:39:30Z

Quick question about how you would like Lrnr_grf to work:

With grf, if no new data are supplied for predictions (newdata = NULL), then out-of-bag predictions are returned for the training set (that is, predictions for the ith observation are made using only trees that did not use the ith training observation.)

As a consequence, if you supply the original training set as newdata (e.g. newdata = X.orig) then you get different predictions than if you predict with no new data (newdata = NULL), because the second one is out-of-bag and the first one is not.

The question is: when the predict method of Lrnr_grf is called with the same task as was used for training, should it return out-of-bag predictions or not?

(The current behavior, before the bugfix, is that the predict method of Lrnr_grf always returns the out-of-bag predictions for the training set, because newdata was always NULL).

nhejazi added the bug label Jul 17, 2023

herbps10 force-pushed the devel branch from e3c167d to 74f7538 Compare July 18, 2023 16:00

nhejazi self-assigned this Jul 18, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix bug with Lrnr_grf predictions #411

Fix bug with Lrnr_grf predictions #411

herbps10 commented Feb 13, 2023

rachaelvp commented Jun 16, 2023

nhejazi commented Jul 17, 2023

herbps10 commented Jul 18, 2023 •

edited

Loading

nhejazi commented Jul 18, 2023

herbps10 commented Jul 18, 2023

nhejazi commented Jul 18, 2023

herbps10 commented Jul 19, 2023

Fix bug with Lrnr_grf predictions #411

Are you sure you want to change the base?

Fix bug with Lrnr_grf predictions #411

Conversation

herbps10 commented Feb 13, 2023

rachaelvp commented Jun 16, 2023

nhejazi commented Jul 17, 2023

herbps10 commented Jul 18, 2023 • edited Loading

nhejazi commented Jul 18, 2023

herbps10 commented Jul 18, 2023

nhejazi commented Jul 18, 2023

herbps10 commented Jul 19, 2023

herbps10 commented Jul 18, 2023 •

edited

Loading