idr0012-fuchs-cellmorph S-BIAD845 #643
cc @dgault |
The error is the same as in the idr0011 case; see details below.
|
The import of the METADATA.ome.xml file was successful; see https://merge-ci.openmicroscopy.org/web/webclient/?show=plate-66557, user-3 |
Imported the idr0011 plate on pilot-idrtesting: all looks fine, thumbnails and full images are generated, no errors. OMEZarrReader 0.3.1 was used (see IDR/deployment#380). Import time for that one plate was 5h 28min. |
This is a good candidate for conversion of a whole study.
- 68 plates, approx 500 GB total
- Create a bash script with a bioformats2raw command for each Plate
- Run on pilot-zarr2-dev
NB: Monitor timestamps (start/end). Can investigate memory usage afterwards to decide on future conversion strategies. |
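A minimal sketch of how the per-Plate conversion script described above could be generated, with start/end timestamps echoed around each command. The plate names, source root and output root are hypothetical placeholders, and passing a plain input path and output path to bioformats2raw with no further options is an assumption, not the command that was actually run.

```python
import shlex


def build_conversion_script(plates, src_root, out_root):
    """Return bash script text with one bioformats2raw command per plate.

    plates, src_root and out_root are illustrative; adjust to the real layout.
    """
    lines = ["#!/bin/bash", "set -e"]
    for plate in plates:
        src = f"{src_root}/{plate}"
        dst = f"{out_root}/{plate}.zarr"
        # Timestamps before and after each conversion, for later analysis
        lines.append(f'echo "START {plate} $(date -Is)"')
        lines.append(f"bioformats2raw {shlex.quote(src)} {shlex.quote(dst)}")
        lines.append(f'echo "END {plate} $(date -Is)"')
    return "\n".join(lines) + "\n"


# Hypothetical plate names and paths
script = build_conversion_script(
    ["HT01", "HT02"], "/data/idr0012", "/data/ngff/idr0012"
)
print(script)
```

Redirecting the script's output to a log file keeps the START/END lines available for working out per-Plate durations afterwards.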
Then upload to a new temp idr0012 bucket on EBI uk1 s3. |
Dom:
"conversion of idr0012 is finished, the zarrs are on Need to setup
See e.g. Whole plate takes a long time to view but gets there in the end: |
Following #656 with idr0012... We have already copied the metadata-only plates to
And the data is mounted at
We want to test import and viewing with ZarrReader fix at ome/ZarrReader#53, get link from lastSuccessfulBuild...
Import... Created Screen: 3204 in webclient...
Sample import times for first 3 plates:
Updated symlinks for just the first Screen without waiting for all...
Then viewed the Plate in webclient - and thumbnails were eventually generated correctly! |
Once ALL 68 Plates were imported into the Screen, I ran the same command to update symlinks for ALL Plates:
|
Updated just the first Plate - testing the unlinking of new Images from Fileset.
Ran the psql:
Then deleted New Plate without deleting any Filesets:
So then ran same command over the whole Screen...
Ended up stopping this early because
Then ran
All Plates are expected to have the same number of Images. The script was probably part-way through Plate HT13 when interrupted. It would be nice to separate logging (print statements) from the generation of the sql file... Will update script. |
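One way to separate logging from sql generation, sketched under assumptions: progress messages go to stderr through the `logging` module, while sql statements are written only to an explicit file handle. The function name `write_swap_sql` and the shape of its input are hypothetical and not taken from the actual script.

```python
import io
import logging
import sys

# Progress messages go to stderr, so they never end up in the sql output
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
log = logging.getLogger("fileset_swap")


def write_swap_sql(plates, sql_out):
    """Write sql for each plate to sql_out; log progress separately.

    plates: iterable of (plate_name, list_of_sql_statements) pairs
    (a hypothetical input shape for this sketch).
    """
    count = 0
    for name, statements in plates:
        log.info("Processing plate %s (%d statements)", name, len(statements))
        for stmt in statements:
            # Normalise so every statement ends with exactly one semicolon
            sql_out.write(stmt.rstrip(";") + ";\n")
            count += 1
    return count


# Demo with placeholder statements and an in-memory file
buf = io.StringIO()
n = write_swap_sql([("HT13", ["SELECT 1", "SELECT 2;"])], buf)
```

With this split, an interrupted run leaves a sql file containing only complete statements, and the log shows which Plate was being processed at the time.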
Move NGFF Plates
Ran script again - failed to write to file in same dir.
Need to manually create sql for
|
/tmp/idr0012_filesetswap.sql
Seems to be Plates
|
For So, we can manually run the psql to complete Fileset swap for old plate...
NB: the UPDATE count includes the NGFF Images that weren't processed (Fileset not yet set to None): Need to unset Fileset from all Images in Plate.
|
Using the Seems that some Images didn't get their Fileset updated, so they weren't linked to the NGFF fileset and weren't updated.
Then ran the above command again:
Validated with
Check pixels took 44 hours to check 45692 images:
All the
(No other Errors!) |
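For planning later studies, the check pixels timing above (44 hours for 45692 images) can be turned into a per-image rate. This is plain arithmetic on the two figures quoted in the comment and works out at roughly 3.5 seconds per image.

```python
# Figures quoted above: 44 hours to check 45692 images
hours = 44
images = 45692

seconds_per_image = hours * 3600 / images
images_per_hour = images / hours

print(f"{seconds_per_image:.2f} s per image")
print(f"{images_per_hour:.0f} images per hour")
```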
Want to upload zip files to BioStudies, but we don't have the original data locally on zarr1-dev any more. Let's try to create zip from the data mounted via goofys on idr0125-pilot...
EDIT: zip creation took > 45 minutes!
|
Uploaded 1 plate to BioStudies (wmoore account):
|
as discussed today at IDR meeting: use |
Use the
|
Run zip for all other plates in a Screen
|
Install Aspera...
Uploaded 1 more plate to BioStudies (wmoore account). Working OK...
|
On |
Uploading zips to BioStudies IDR account...
|
Page available, but currently only 6 out of 68 plates are "viewable" |
Working on idr0125-pilot, where we previously imported NGFF Plates and swapped Filesets... With Fileset IDs:
NB: I had to manually update the Fileset IDs in the csv above from idr0125, since the IDs from IDR/idr-utils#56 are for vanilla IDR. This ran in a few seconds...
However, this gave sql errors as the sql was invalid (no rows in array[])
This is due to the bia paths having extra |
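The invalid sql above (an `array[]` with no rows) could be caught before it reaches psql, since PostgreSQL rejects an untyped empty `array[]`. The helper below is a hypothetical sketch, not part of the actual sql generation code: it refuses to build the array literal when no rows were found.

```python
def sql_array(values):
    """Build a PostgreSQL array[...] literal of strings.

    Raises ValueError instead of emitting an empty array[], which
    PostgreSQL cannot type without an explicit cast.
    """
    if not values:
        raise ValueError("no rows found: refusing to emit empty array[]")
    # Escape single quotes by doubling them, as in standard sql
    quoted = ", ".join("'" + v.replace("'", "''") + "'" for v in values)
    return f"array[{quoted}]"


print(sql_array(["0/0/0", "0/0/1"]))
```

Failing at generation time points directly at the mismatched paths, rather than surfacing later as a sql syntax error.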
This was caused by the zipping from s3-mounted directory above...
If we want to work with this data on BioStudies s3, we need to add
Checked HT20 Plate:
|
Blitz log...
E.g.
|
Ahh - I wonder if this is caused by the path having 2
If so, and we want to fix that path, maybe we need to resubmit - unless EBI can do it for us? |
That's a deep file path. Have we established this is truly necessary or could we truncate it up to the top-level |
I don't think it's necessary to have such a long path. |
Installed branch from that PR, on
As above... with extra path within the zip...
|
Ran the sql scripts...
But this didn't seem to update Filesets on idr0125-pilot. May have out-of-date Fileset IDs. Start again, setting these Fileset IDs from the webclient... idr0012.csv:
|
|
Seems to be that the Start again... on idr0125-pilot... Update all variables:
Check that we now get the correct Fileset ID from Image ID on plate
Unchanged idr0012.csv:
Ran this without
mkngff
Check image above has new Fileset...
|
To see how long memo file regeneration took...
Time of |
We want to recreate the zips as above (downloading from our own ebi s3 idr0012 bucket) but without the extra dirs introduced above.
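A sketch of creating a zip whose entries start at the Plate's top-level `.zarr` directory, so that no extra parent directories end up inside the archive. The function name and paths are hypothetical, and storing without compression is an assumption made here on the basis that zarr chunks are usually already compressed.

```python
import os
import zipfile


def zip_plate(zarr_dir, zip_path):
    """Zip zarr_dir so that archive paths begin with its own directory name."""
    zarr_dir = os.path.abspath(zarr_dir)
    parent = os.path.dirname(zarr_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_STORED) as zf:
        for root, _dirs, files in os.walk(zarr_dir):
            for name in files:
                full = os.path.join(root, name)
                # arcname is relative to the parent of the .zarr directory,
                # so entries look like "HT01.zarr/..." with no leading dirs
                zf.write(full, arcname=os.path.relpath(full, parent))
    return zip_path
```

Listing the archive with `zipfile.ZipFile(path).namelist()` before uploading is a quick way to confirm that every entry starts with the expected `.zarr` name.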
|
|
Deleted all zips at https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0012 Started to upload replacements...
|
Using latest updated
|
...all done!
|
Running on
|
Had to remount goofys (didn't need server restart) for the last 5 filesets...
|