idr0025-stadler-proteinatlas S-BIAD846 #647

Open
will-moore opened this issue Feb 22, 2023 · 11 comments

@will-moore
Member

No description provided.

@dominikl
Member

dominikl commented Mar 1, 2023

Export: 3.5 min / plate
Import: 3 hours

@dominikl
Member

Converted on pilot-zarr2-dev, under /data/ngff/idr0025

@will-moore will-moore self-assigned this Jun 14, 2023
@will-moore
Member Author

On local machine...

$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 mb s3://idr0025
make_bucket: idr0025
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-policy --bucket idr0025 --policy file://policy.json
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-cors --bucket idr0025 --cors-configuration file://cors.json
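
The contents of policy.json and cors.json are not shown in this thread. As a rough sketch only, assuming the goal is anonymous read access so that browser-based viewers such as vizarr can fetch the data, the two files might be generated like this. Every value in both files (the statement ID, the wildcard principal and origin, the allowed methods) is an assumption, not the configuration actually applied to the idr0025 bucket, and some S3-compatible stores may expect a different Principal form.

```shell
# ASSUMPTION: a public-read policy; the real policy.json is not shown in this issue.
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::idr0025/*"]
    }
  ]
}
EOF

# ASSUMPTION: permissive CORS for read-only requests from any origin.
cat > cors.json <<'EOF'
{
  "CORSRules": [
    {
      "AllowedOrigins": ["*"],
      "AllowedMethods": ["GET", "HEAD"],
      "AllowedHeaders": ["*"]
    }
  ]
}
EOF
```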

On idr-zarr2-dev...

$ cd /data/ngff
$ /home/wmoore/mc cp -r idr0025/ uk1s3/idr0025/zarr
...e 3.ome.zarr/OME/METADATA.ome.xml: 3.30 GiB / 3.30 GiB ━━━━━━━━━━━━━━━━━━━━━━━━━━

$ /home/wmoore/mc ls uk1s3/idr0025/zarr
[2023-06-14 16:41:56 UTC]     0B 10x images plate 1.ome.zarr/
[2023-06-14 16:41:56 UTC]     0B 10x images plate 2.ome.zarr/
[2023-06-14 16:41:56 UTC]     0B 10x images plate 3.ome.zarr/

https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/idr0025/zarr/10x+images+plate+3.ome.zarr

Screenshot 2023-06-14 at 17 45 18

@will-moore
Member Author

will-moore commented Jun 27, 2023

Imported metadata-only plates into idr0125-pilot:

$ for dir in *; do   omero import --transfer=ln_s --depth=100 --name="${dir/.ome.zarr/}" --skip=all "$dir" --file "/tmp/$dir.log"  --errs "/tmp/$dir.err"; done
2023-06-27 16:04:38,638 1229406    [l.Client-0] INFO   ormats.importer.cli.LoggingImportMonitor - METADATA_PROCESSED Step: 4 of 5  Logfile: 50491601
2023-06-27 16:04:38,668 1229436    [l.Client-2] INFO   ormats.importer.cli.LoggingImportMonitor - OBJECTS_RETURNED Step: 5 of 5  Logfile: 50491601
2023-06-27 16:04:38,908 1229676    [l.Client-0] INFO   ormats.importer.cli.LoggingImportMonitor - IMPORT_DONE Imported file: /ngff/idr0025/10x images plate 1.ome.zarr/OME/METADATA.ome.xml
Other imported objects:
Fileset:5287265

==> Summary
2509 files uploaded, 1 fileset, 1 plate created, 384 images imported, 0 errors in 0:20:25.673
2023-06-27 16:47:24,290 1249768    [l.Client-0] INFO   ormats.importer.cli.LoggingImportMonitor - IMPORT_DONE Imported file: /ngff/idr0025/10x images plate 3.ome.zarr/OME/METADATA.ome.xml
Other imported objects:
Fileset:5287267

==> Summary
2509 files uploaded, 1 fileset, 1 plate created, 384 images imported, 0 errors in 0:20:45.630
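
The --name value in the import loop above uses a bash pattern substitution to drop .ome.zarr from each directory name. A small sketch of the same idea using the POSIX suffix-removal form, which gives the same result for these plate names; the function name plate_name is hypothetical and not part of the import workflow.

```shell
# Hypothetical helper: derive the plate name from a zarr directory name.
plate_name() {
  # ${1%.ome.zarr} removes a trailing ".ome.zarr" if present
  printf '%s\n' "${1%.ome.zarr}"
}

plate_name "10x images plate 1.ome.zarr"
```

Quoting the argument keeps the spaces in the directory name intact, which is why the original loop also quotes "$dir".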
$ python idr-utils/scripts/managed_repo_symlinks.py Screen:3254 /idr0025/zarr --report

Fileset: 5287265 /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-13/2023-06/27/15-44-13.605/
Render Image 14835368
fileset_dirs {}
fs_contents ['10x images plate 1.ome.zarr']
Link from /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-13/2023-06/27/15-44-13.605/10x images plate 1.ome.zarr to /idr0025/zarr/10x images plate 1.ome.zarr

Fileset: 5287266 /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-6/2023-06/27/16-04-45.182/
Render Image 14835512
fileset_dirs {}
fs_contents ['10x images plate 2.ome.zarr']
Link from /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-6/2023-06/27/16-04-45.182/10x images plate 2.ome.zarr to /idr0025/zarr/10x images plate 2.ome.zarr

Fileset: 5287267 /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-10/2023-06/27/16-26-38.877/
Render Image 14836136
fileset_dirs {}
fs_contents ['10x images plate 3.ome.zarr']
Link from /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-10/2023-06/27/16-26-38.877/10x images plate 3.ome.zarr to /idr0025/zarr/10x images plate 3.ome.zarr

@will-moore
Member Author

Looks good in idr0125-pilot:

Image

@will-moore
Member Author

will-moore commented Jun 27, 2023

Create zips:

ssh pilot-zarr2-dev
cd /data/ngff/idr0025

for i in */; do zip -r "${i%/}.zip" "$i"; done
$ ./ascp -P33001 -i ../etc/asperaweb_id_dsa.openssh -d /data/ngff/idr0025/idr0025 [email protected]:5f/136e8d-xxxxxxx
10x images plate 1.ome.zarr.zip                     100% 1020MB  486Mb/s    00:19
10x images plate 2.ome.zarr.zip                     100%  729MB  439Mb/s    00:32
10x images plate 3.ome.zarr.zip                     100%  832MB  192Mb/s    00:48

@will-moore will-moore assigned francesw and unassigned will-moore Jun 28, 2023
@will-moore
Member Author

Deleted data

sudo rm -rf idr0025/

@francesw francesw removed their assignment Aug 14, 2023
@will-moore
Member Author

@will-moore will-moore changed the title idr0025-stadler-proteinatlas to NGFF idr0025-stadler-proteinatlas S-BIAD846 Aug 15, 2023
@will-moore
Member Author

will-moore commented Aug 29, 2023

Testing mkngff on idr0125-pilot...

Added Fileset IDs manually. NB: 10x images plate 2 has already had "swap Fileset" treatment to NGFF. Others are original.

idr0025/10x images plate 3.ome.zarr,S-BIAD846/3c534b4f-12be-4881-a84a-af6b65e142ea,23152
idr0025/10x images plate 1.ome.zarr,S-BIAD846/52304cdf-4eba-4f0a-84b1-690e0d66add9,23151
idr0025/10x images plate 2.ome.zarr,S-BIAD846/72cc291b-a4e0-4807-bd23-22e9ad75c0dd,5286921

The whitespace in these rows breaks processing with for r in $(cat idr0025.csv); do... because the for loop iterates over each whitespace-separated token rather than over each row of the table.
The simplest solution is to replace the whitespace with _ in the CSV, since we don't actually need the Fileset names anyway.
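
An alternative that leaves the CSV untouched is to read it line by line and split on commas only. This is a sketch, not what was run here: the function name csv_to_mkngff_args is hypothetical, and it only prints the Fileset ID and zarr path that would be passed to omero mkngff sql, so it can be checked as a dry run first.

```shell
# Hypothetical helper: print "<fsid> <zarr path>" for each name,biapath,fsid row.
# Reads stdin one line at a time, so spaces in the name column are harmless.
csv_to_mkngff_args() {
  while IFS=',' read -r name biapath fsid; do
    [ -n "$fsid" ] || continue   # skip blank or malformed rows
    uuid="${biapath#*/}"         # text after the first "/"
    printf '%s /bia-integrator-data/%s/%s.zarr\n' "$fsid" "$biapath" "$uuid"
  done
}

# Usage (dry run): csv_to_mkngff_args < idr0025.csv
```

Splitting with IFS=',' inside read applies only to that one command, so the rest of the shell session keeps its normal word splitting.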

# Assumes whitespace in idr0025.csv has been replaced with _ as described above
for r in $(cat idr0025.csv); do
  biapath=$(echo "$r" | cut -d',' -f2)
  uuid=$(echo "$biapath" | cut -d'/' -f2)
  fsid=$(echo "$r" | cut -d',' -f3)
  omero mkngff sql --secret="$SECRET" "$fsid" "/bia-integrator-data/$biapath/$uuid.zarr" > "$fsid.sql"
done
...
Creating symlink /data/OMERO/ManagedRepository/demo_2/2017-03/13/15-19-51.590_mkngff/52304cdf-4eba-4f0a-84b1-690e0d66add9.zarr -> /bia-integrator-data/S-BIAD846/52304cdf-4eba-4f0a-84b1-690e0d66add9/52304cdf-4eba-4f0a-84b1-690e0d66add9.zarr
...
Creating symlink /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-16/2023-04/12/10-20-20.483_mkngff/72cc291b-a4e0-4807-bd23-22e9ad75c0dd.zarr -> /bia-integrator-data/S-BIAD846/72cc291b-a4e0-4807-bd23-22e9ad75c0dd/72cc291b-a4e0-4807-bd23-22e9ad75c0dd.zarr
for r in $(cat idr0025.csv); do
  fsid=$(echo "$r" | cut -d',' -f3)
  psql -U omero -d idr -h "$DBHOST" -f "$fsid.sql"
done

BEGIN
 mkngff_fileset 
----------------
        5287455
(1 row)
COMMIT
BEGIN
 mkngff_fileset 
----------------
        5287456
(1 row)
COMMIT
BEGIN
 mkngff_fileset 
----------------
        5287457
(1 row)
COMMIT

@will-moore
Member Author

Looks good:

Image
