Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[c++/python] Fixes for dense arrays and core 2.27/dev #3244

Closed
wants to merge 8 commits into from

Conversation

johnkerl
Copy link
Member

@johnkerl johnkerl commented Oct 25, 2024

Issue and/or context: As tracked on issue #2407 / [sc-51048].

Changes:

Some findings from TileDB-Inc/centralized-tiledb-nightlies#25. There is more to do though.

Notes for Reviewer:

Python read-OOM repro

#!/usr/bin/env python

import tiledbsoma
import pyarrow as pa
import sys
import os, shutil

# Any dense 2D array with small current domain but huge (~ 2**63) core domain
uri = 'dense-dnda-227'

print("RD-DNDA URI", uri)

dnda = tiledbsoma.open(uri)
# This next line will slowly OOM -- watch htop
o = dnda.read()

Fixed as of this commit: 0e6493a

R write-fail repro

$ cat wr-lldb.r
#!/usr/bin/env Rscript

library(tiledbsoma)
library(arrow)

uri <- "data-dnda-r"
element_type = arrow::float32()
shape = c(10, 20)

cat("CR-DNDA URI     ", uri, "\n")
cat("CR-DNDA SHAPE   ", shape, "\n")

if (dir.exists(uri)) unlink(uri, recursive=TRUE)
dnda <- tiledbsoma::SOMADenseNDArrayCreate(uri, type=element_type, shape=shape)

cat("WR-DNDA URI", uri, "\n")
dnda$write(matrix(1:12, nrow=3,ncol=4))

dnda$close()
$ ./wr-lldb.r
CR-DNDA URI      data-dnda-r
CR-DNDA SHAPE    10 20
WR-DNDA URI data-dnda-r
Error: WriterBase: Buffer sizes check failed; Invalid number of cells given for attribute 'soma_data' (12 != 18446744073709551615)
Execution halted

Fixed as of this commit: 0e6493a

@johnkerl johnkerl marked this pull request as draft October 25, 2024 22:08
@johnkerl johnkerl changed the title [c++/python] Fixes for dense arrays and core 2.27 [WIP] [c++/python] Fixes for dense arrays and core 2.27/dev [WIP] Oct 25, 2024
@johnkerl
Copy link
Member Author

johnkerl commented Oct 27, 2024

Investigation so far shows that Python dense-array write calls C++ set_array_data as well as set_dim_points, while R dense-array write calls only the former and not the latter.

cc @mojaveazure and your observation the other day that the dense-array writer doesn't handle coords. Furthermore, as presently coded, it can only write the full core domain of the array. And now that with the new-shape feature flag enabled, the core domain (soma maxdomain) is huge and the core current domain (soma domain) is small.

Here is a repro that fails even with the new-shape feature flag disabled, proving that this bug is older than the new-shape mod:

#!/usr/bin/env Rscript

library(tiledbsoma)
library(arrow)

uri <- "data-dnda-r"
element_type = arrow::float32()
shape = c(10, 20)

cat("CR-DNDA URI     ", uri, "\n")
cat("CR-DNDA SHAPE   ", shape, "\n")

if (dir.exists(uri)) unlink(uri, recursive=TRUE)

dnda <- tiledbsoma::SOMADenseNDArrayCreate(uri, type=element_type, shape=shape)

cat("WR-DNDA URI", uri, "\n")

dnda$write(matrix(1:12, nrow=3,ncol=4))

dnda$close()
CR-DNDA URI      data-dnda-r 
CR-DNDA SHAPE    10 20 
WR-DNDA URI data-dnda-r 
Error: WriterBase: Buffer sizes check failed; Invalid number of cells given for attribute 'soma_data' (12 != 200)
Execution halted

Also fails with

...
m <- matrix(1:12, nrow=3,ncol=4)
coords <- list( 4:6, 7:10)
dnda$write(m, coords)
...

Same error message.

@johnkerl johnkerl force-pushed the kerl/dense-227-fixes branch 3 times, most recently from 0c00c0f to a1e5b1c Compare October 27, 2024 06:13
@johnkerl
Copy link
Member Author

The typeguard-related failures are new and weird. I created #3245 to check.

@johnkerl
Copy link
Member Author

@johnkerl johnkerl force-pushed the kerl/dense-227-fixes branch 4 times, most recently from 4b10b61 to 0e6493a Compare October 28, 2024 22:10
Copy link

codecov bot commented Oct 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.92%. Comparing base (50c5d71) to head (0e6493a).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3244      +/-   ##
==========================================
+ Coverage   83.79%   83.92%   +0.12%     
==========================================
  Files          51       51              
  Lines        5560     5566       +6     
==========================================
+ Hits         4659     4671      +12     
+ Misses        901      895       -6     
Flag Coverage Δ
python 83.92% <100.00%> (+0.12%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
python_api 83.92% <100.00%> (+0.12%) ⬆️
libtiledbsoma ∅ <ø> (∅)

@johnkerl johnkerl marked this pull request as ready for review October 28, 2024 22:51
@johnkerl johnkerl requested a review from nguyenv October 28, 2024 22:51
@johnkerl
Copy link
Member Author

@nguyenv this is ready for review, and has green CI for this repo (i.e. with core 2.26).

There is still an issue or two remaining for core dev, which I have installed on one laptop to mimic @jdblischak 's nightlies. So, there will be at least one more PR. Meanwhile everything here on this PR is already a strict improvment.

@johnkerl johnkerl changed the title [c++/python] Fixes for dense arrays and core 2.27/dev [WIP] [c++/python] Fixes for dense arrays and core 2.27/dev Oct 28, 2024
@johnkerl johnkerl changed the base branch from main to kerl/pybind11-nda-sizing October 30, 2024 16:15
@johnkerl johnkerl marked this pull request as draft October 30, 2024 17:47
@johnkerl
Copy link
Member Author

Moving back to draft, likely to abandon later on.

On discussion with @nguyenv :

  • The main context of this PR is:
    • For dense arrays only
    • If any dim's subarray range isn't specified, core uses the core domain
    • With new-shape feature-flag enabled, and with core 2.27 where dense arrays can have current domain (they couldn't in 2.26), we are indeed using data-sized core current domain and huge core domain
    • Put those together and we get these problems
    • Note there are a few other issues on this PR, and a few not yet included in this PR, other than that -- and those should be split out into separate PRs
  • On this PR I've ben finding callsites, for write and for read, for Python and R, and touching up the ranges which end up being core subarrays on each dim slot
    • This is tedious and error-prone
  • What we should do instead is:
    • In one place -- in libtiledbsoma, in ManagedQuery, before submit_query
    • If any dim's range has not been set up to this point, and if the array has a current domain, simply set the ranges on that dim to be the current domain's (lo, hi)
    • We know (for a fact) that core will go ahead and use the too-big domain anyway so this is safe

Base automatically changed from kerl/pybind11-nda-sizing to main October 30, 2024 17:54
@johnkerl
Copy link
Member Author

Closing in favor of #3263 and #3268, and at least one more PR to come after those.

@johnkerl johnkerl closed this Oct 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant