handle nan and inf float values #249
convert_options = ConvertOptions(
    strings_can_be_null=True, null_values=STR_NA_VALUES
)
This provides pandas-like null handling for CSV files.
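As a rough illustration of what pandas-like null handling means for a float column, the sketch below uses only the standard library rather than pyarrow. `NA_STRINGS` is a hypothetical, abbreviated stand-in for `STR_NA_VALUES`, and `read_float_column` is an illustrative helper, not part of DataChain.

```python
import csv
import io
import math

# Hypothetical, abbreviated stand-in for STR_NA_VALUES (assumption).
NA_STRINGS = {"", "NA", "N/A", "NaN", "nan", "NULL", "null"}


def read_float_column(text, column):
    """Parse one CSV column as floats, mapping NA-like strings to NaN."""
    reader = csv.DictReader(io.StringIO(text))
    values = []
    for row in reader:
        raw = row[column]
        # Missing or NA-like cells become NaN instead of raising an error.
        values.append(float("nan") if raw in NA_STRINGS else float(raw))
    return values


sample = "name,score\na,1.5\nb,NA\nc,\nd,inf\n"
scores = read_float_column(sample, "score")
```

Here the `NA` cell and the empty cell both end up as NaN, while `inf` is parsed as a regular (non-finite) float.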
@@ -45,12 +47,12 @@ class TeamMember(BaseModel):


def test_merge_objects(test_session):
    skip_if_not_sqlite()
See the note about joins in https://github.com/iterative/studio/pull/10429
@@ -103,14 +105,13 @@ def test_merge_similar_objects(test_session):


def test_merge_values(test_session):
    skip_if_not_sqlite()
See the note about joins in https://github.com/iterative/studio/pull/10429
Remaining test failures are either unrelated or require https://github.com/iterative/studio/pull/10429
Really cool 🪄
I didn't expect this could be supported so naturally.
def test_from_values_nan_inf(tmp_dir, catalog):
    vals = [float("nan"), float("inf"), float("-inf")]
that's impressive!
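A test over these values cannot rely on plain equality, because NaN never compares equal to itself. A minimal sketch of a NaN-aware comparison follows; `floats_match` is a hypothetical helper, and the `repr` round trip is only a stand-in for whatever storage the real test goes through.

```python
import math


def floats_match(a, b):
    """Treat NaN as matching NaN; everything else uses normal equality."""
    if math.isnan(a) and math.isnan(b):
        return True
    return a == b


vals = [float("nan"), float("inf"), float("-inf")]
# Stand-in for a storage round trip (assumption, not the real backend).
round_tripped = [float(repr(v)) for v in vals]

# Element-wise == fails on the NaN entry even though the data is intact.
naive_equal = all(a == b for a, b in zip(vals, round_tripped))
careful_equal = all(floats_match(a, b) for a, b in zip(vals, round_tripped))
```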
Going to merge but we should follow up on arrays and CH joins
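The array follow-up presumably comes down to JSON encoding of non-finite floats, since strict JSON has no literals for NaN or infinity. The sketch below uses Python's standard `json` module as a stand-in to show the problem; it does not use orjson, and how DataChain serializes arrays is an assumption here.

```python
import json

vals = [float("nan"), float("inf"), float("-inf")]

# By default the stdlib encoder emits non-standard tokens.
lenient = json.dumps(vals)

# With allow_nan=False it refuses, since strict JSON cannot represent them.
try:
    json.dumps(vals, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc
```

In the lenient case the output is `[NaN, Infinity, -Infinity]`, which many strict JSON parsers reject.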
This reverts commit 8e0f970.
@dberenbaum can we please review and create a ticket in the DataChain repo for this if needed?
Opened #386. CH was fixed by https://github.com/iterative/studio/pull/10471.
Partial fix for https://github.com/iterative/dvcx/issues/1697#issuecomment-2269457257:
Array(Float) support is not included and needs follow-up, since orjson does not support nan/inf.

This PR treats all missing values in float columns as NaN for a couple reasons: