Merge pull request #3 from imglib/remotes/origin/main

Updated blogposts added 20240227
imglib · Apr 25, 2024 · 5fe8d3e
2 parents 23bdf23 + 9c3a521
commit 5fe8d3e
Showing 7 changed files with 495 additions and 271 deletions.
diff --git a/blog/2022-05-02-juliaset-lambda/2022-05-02-juliaset-lambda.ipynb b/blog/2022-05-02-juliaset-lambda/2022-05-02-juliaset-lambda.ipynb
diff --git a/...2022-06-05-setup-ijava-jupyter-kernel.qmd → ...yter-kernel/setup-ijava-jupyter-kernel.md b/...2022-06-05-setup-ijava-jupyter-kernel.qmd → ...yter-kernel/setup-ijava-jupyter-kernel.md
@@ -1,14 +1,26 @@
 ---
-title: "Setup the IJava jupyter kernel"
-author: "Stephan Saalfeld"
-date: "2022-06-05"
-categories: [ jupyter, ijava, jshell , java  , kernel] 
-execute:
-  echo: true
+aliases:
+- /jupyter/ijava/jshell/java/kernel/2024/04/03/setup-ijava-jupyter-kernel
+author: Stephan Saalfeld
+badges: true
+branch: master
+categories:
+- jupyter
+- ijava
+- jshell
+- java
+- kernel
+date: '2022-06-05'
+date-modified: '2024-04-03'
+description: Follow these instructions to setup the IJava jupyter kernel by Spencer
+  Park.
+layout: post
+title: Setup the IJava jupyter kernel
+toc: false
 
 ---
 
-In this blog, we will show code snippets and examples to make the best use of [ImgLib2](https://github.com/imglib/imglib2), [BigDataViewer](https://github.com/bigdataviewer/bigdataviewer-core), and friends.  ImgLib2 is written to be fast and we will run code that needs to be compiled, so we cannot use any of the various interpreted scripting languages like Python, Groovy, or Javascript.  Instead, we will use the [JShell tool](https://docs.oracle.com/javase/9/jshell/introduction-jshell.htm#JSHEL-GUID-630F27C8-1195-4989-9F6B-2C51D46F52C8) that you can use directly in a terminal or through [Spencer Park's IJava jupyter kernel](https://github.com/SpencerPark/IJava).  You can also follow these tutorials in your own Java project and use your preferred IDE, but Jupyter notebooks are a great teaching tool.  Since jupyter is written in Python and most popular with the Python community, let's follow their ways and first thing create a virtual environment with conda.  The lack of version controlled dependency management for Python projects makes it necessary that practically every project must run in a container or virtual environment because the dependencies of different projects almost inevitably collide.  Conda is the most popular of several attempts to address this situation.  Conda cannot currently be installed from the default Ubuntu repositories, so much about that, but the [installation instructions](https://docs.conda.io/projects/conda/en/latest/user-guide/install/rpm-debian.html) are tolerable, there is a PPA.  Now let's create an environment for jupyter:
+In this blog, we will show code snippets and examples to make the best use of [ImgLib2](https://github.com/imglib/imglib2), [BigDataViewer](https://github.com/bigdataviewer/bigdataviewer-core), and friends.  ImgLib2 is written to be fast and we will run code that needs to be compiled, so we cannot use any of the various interpreted scripting languages like Python, Groovy, or Javascript.  Instead, we will use the [JShell tool](https://docs.oracle.com/javase/9/jshell/introduction-jshell.htm#JSHEL-GUID-630F27C8-1195-4989-9F6B-2C51D46F52C8) that you can use directly in a terminal or through [Spencer Park's IJava jupyter kernel](https://github.com/saalfeldlab/IJava).  You can also follow these tutorials in your own Java project and use your preferred IDE, but Jupyter notebooks are a great teaching tool.  Since jupyter is written in Python and most popular with the Python community, let's follow their ways and first thing create a virtual environment with conda.  The lack of version controlled dependency management for Python projects makes it necessary that practically every project must run in a container or virtual environment because the dependencies of different projects almost inevitably collide.  Conda is the most popular of several attempts to address this situation.  Conda cannot currently be installed from the default Ubuntu repositories, so much about that, but the [installation instructions](https://docs.conda.io/projects/conda/en/latest/user-guide/install/rpm-debian.html) are tolerable, there is a PPA.  Now let's create an environment for jupyter:
 
 ```
 conda create -n jshell-jupyter python=3
@@ -32,9 +44,8 @@ git checkout try-upgrade-gradle
 ./gradlew publishToMavenLocal
 
 cd ..
-git clone https://github.com/hanslovsky/IJava.git
+git clone https://github.com/saalfeldlab/IJava.git
 cd IJava/
-git checkout hanslovsky/gradle-7.4.2
 ./gradlew installKernel
 ```
 
@@ -54,7 +65,7 @@ You can now start the jupyter notebook server
 jupyter notebook --kernel=java
 ```
 
-And experiment with the examples.  [Spencer Park's IJava jupyter kernel](https://github.com/SpencerPark/IJava) makes it very easy to include dependencies.  You can include the relevant snippets from a Maven POM into a tagged code block, e.g.
+And experiment with the examples.  [Spencer Park's IJava jupyter kernel](https://github.com/saalfeldlab/IJava) makes it very easy to include dependencies.  You can include the relevant snippets from a Maven POM into a tagged code block, e.g.
 
 ```xml
 %%loadFromPOM
@@ -69,6 +80,13 @@ And experiment with the examples.  [Spencer Park's IJava jupyter kernel](https:/
 </dependency>
 ```
 
+or in gradle short notation
+
+```
+%mavenRepo scijava.public https://maven.scijava.org/content/groups/public
+%maven sc.fiji:bigdataviewer-vistools:1.0.0-beta-29
+```
+
 If you prefer to run [JShell](https://docs.oracle.com/javase/9/jshell/introduction-jshell.htm#JSHEL-GUID-630F27C8-1195-4989-9F6B-2C51D46F52C8) directly, you can pull in the dependencies from a complete Maven POM with John Pooth's Maven Jshell plugin
 
 ```
@@ -110,4 +128,3 @@ jupyter kernelspec list
 ```
 
 Done.
-
diff --git a/blog/2022-09-27-n5-imglib2/2022-09-27-n5-imglib2.ipynb b/blog/2022-09-27-n5-imglib2/2022-09-27-n5-imglib2.ipynb
diff --git a/blog/2024-02-27-n5-tutorial-basic/.Rhistory b/blog/2024-02-27-n5-tutorial-basic/.Rhistory
diff --git a/blog/2024-02-27-n5-tutorial-basic/index.qmd b/blog/2024-02-27-n5-tutorial-basic/index.qmd
@@ -0,0 +1,289 @@
+---
+title: "N5 API Basics"
+description: "Basics of the N5 API for Java developers. This tutorial shows how to read and write n-dimensional image data and structured metadata into HDF5, N5, and Zarr containers using the N5 API."
+author: 
+    - name: John Bogovic
+    - name: Caleb Hulbert
+date: "2/27/2024"
+date-modified: "4/23/2024"
+notebook-links: global
+image: n5-basic-tutorial-thumbnail.png
+categories:
+  - hdf5
+  - n5
+  - zarr
+  - imglib2
+  - tutorial
+format:
+  html:
+    toc: true
+---
+
+This tutorial for Java developers covers the most basic functionality of the [N5 API](https://github.com/saalfeldlab/n5)
+for storing large, chunked n-dimensional image data and structured metadata. The N5 API and documentation refer to n-dimensional images as
+"datasets", [terminology inherited from HDF5](https://docs.hdfgroup.org/hdf5/develop/_g_l_s.html#title3).  We will use this terminology in this tutorial.
+If you are used to working with Python and NumPy, an n-dimensional image or dataset is what you know as an [`ndarray`](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html).
+We will learn about:
+
+* creating readers and writers
+* modifying and inspecting the hierarchy ("folder structure")
+* saving and loading datasets
+* saving and loading metadata
+
+## Readers and writers
+
+[`N5Reader`](https://github.com/saalfeldlab/n5/blob/n5-3.2.0/src/main/java/org/janelia/saalfeldlab/n5/N5Reader.java)s and
+[`N5Writer`](https://github.com/saalfeldlab/n5/blob/n5-3.2.0/src/main/java/org/janelia/saalfeldlab/n5/N5Writer.java)s form
+the basis of the N5 API and allow you to read and write data, respectively. We generally recommend using an
+[`N5Factory`](https://github.com/saalfeldlab/n5-universe/blob/n5-universe-1.4.2/src/main/java/org/janelia/saalfeldlab/n5/universe/N5Factory.java) to create readers and writers:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#make-reader-writer echo=true >}}
+
+The N5 API gives you access to a number of different storage formats: HDF5, Zarr, and N5's own
+format. `N5Factory`'s convenience methods try to infer the storage format from the extension
+of the path you provide:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#factory-types echo=true >}}
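To make the idea of extension-based inference concrete, here is a minimal plain-Java sketch. It is not the `N5Factory` implementation: the class `StorageFormatGuess`, its `Format` enum, the extension list, and the fallback to N5 are all assumptions made for illustration, and the real rules may differ.

```java
import java.util.Locale;

/** Hypothetical stand-in; this is not how N5Factory is implemented. */
final class StorageFormatGuess {

    enum Format { HDF5, ZARR, N5 }

    /**
     * Guesses a storage format from the extension of a path.
     * The extensions checked here are an assumption for illustration.
     */
    static Format guess(final String path) {
        // Drop trailing slashes so "data.zarr/" is treated like "data.zarr".
        final String p = path.replaceAll("/+$", "").toLowerCase(Locale.ROOT);
        if (p.endsWith(".h5") || p.endsWith(".hdf5"))
            return Format.HDF5;
        if (p.endsWith(".zarr"))
            return Format.ZARR;
        // Assumption: fall back to N5 when no other extension matches.
        return Format.N5;
    }

    public static void main(final String[] args) {
        System.out.println(guess("/tmp/example.h5"));
        System.out.println(guess("/tmp/example.zarr/"));
        System.out.println(guess("/tmp/example.n5"));
    }
}
```

When the extension is missing or misleading, guessing is not reliable, so it is worth checking which format was actually opened.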
+
+In fact, it is possible to read with `N5Writer`s since every `N5Writer`
+is also an `N5Reader`, so from now on we'll just be using the
+`n5Writer`.
+
+::: {.callout-tip}
+## Try it!
+
+We use the N5 storage format for the rest of the tutorial, but it will work just as well over either
+an HDF5 file or Zarr container.
+:::
+
+## Groups
+
+N5 containers form hierarchies of *groups* - think "nested folders on your file system."
+It's easy to create groups and test if they exist:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#make-groups echo=true >}}
+
+The `list` method lists groups that are children of the given group:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#list echo=true >}}
+
+and `deepList` recursively lists every descendant of the given group:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#deep-list echo=true >}}
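The difference between listing direct children and listing all descendants can be sketched with the "nested folders" analogy, using only `java.nio.file` on a throwaway temporary directory. This does not call the N5 API; the class `ListVersusDeepList` and the group names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical file-system analogy; not part of the N5 API. */
final class ListVersusDeepList {

    /** Relative paths of directories below root, down to maxDepth levels, sorted. */
    static List<String> directories(final Path root, final int maxDepth) {
        try (Stream<Path> paths = Files.walk(root, maxDepth)) {
            return paths
                    .filter(Files::isDirectory)
                    .filter(p -> !p.equals(root)) // skip the root itself
                    .map(p -> root.relativize(p).toString().replace('\\', '/'))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Direct children only, similar in spirit to list. */
    static List<String> children(final Path root) {
        return directories(root, 1);
    }

    /** Every descendant, similar in spirit to deepList. */
    static List<String> descendants(final Path root) {
        return directories(root, Integer.MAX_VALUE);
    }

    /** Builds a small example hierarchy in a temporary directory. */
    static Path makeExample() {
        try {
            final Path root = Files.createTempDirectory("groups-sketch");
            Files.createDirectories(root.resolve("foo").resolve("bar"));
            Files.createDirectories(root.resolve("lorum").resolve("ipsum"));
            return root;
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        final Path root = makeExample();
        System.out.println(children(root));    // direct children
        System.out.println(descendants(root)); // all nested directories
    }
}
```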
+
+Notice that these methods *only* give information about what groups are
+present and do not provide information about metadata or datasets.
+
+::: {.callout-note}
+Some storage / access systems (AWS-S3) separate permissions for reading and listing, meaning
+it may be possible to access data but not list.
+:::
+
+## Datasets
+
+N5 stores datasets (n-dimensional arrays) in particular groups in the hierarchy. 
+
+::: {.callout-warning}
+Datasets must be terminal (leaf) nodes in the container hierarchy - i.e. a dataset cannot contain
+another group or dataset.  (Is this strictly true? May be confusing with names like multiscale "datasets")
+:::
+
+We recommend using code from [n5-ij](https://github.com/saalfeldlab/n5-ij) or [n5-imglib2](https://github.com/saalfeldlab/n5-imglib2)
+to write datasets. The examples in this post will use the latter.
+
+The [`N5Utils`](https://github.com/saalfeldlab/n5-imglib2/blob/241dc2b503d01007ec6aec72dacecc9706f023ab/src/main/java/org/janelia/saalfeldlab/n5/imglib2/N5Utils.java)
+class in n5-imglib2 has many useful methods, but in this post, we'll cover simple methods for reading and writing. First, 
+[`N5Utils.save`](https://github.com/saalfeldlab/n5-imglib2/blob/241dc2b503d01007ec6aec72dacecc9706f023ab/src/main/java/org/janelia/saalfeldlab/n5/imglib2/N5Utils.java#L1440) 
+writes a dataset and required metadata to the container at a group that you specify. The group will be created if it does
+not already exist. The parameters will be discussed in more detail below. 
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#n5-imglib2-save echo=true >}}
+
+You can write in parallel by providing an [`ExecutorService`](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ExecutorService.html) to this variant of
+[`N5Utils.save`](https://github.com/saalfeldlab/n5-imglib2/blob/241dc2b503d01007ec6aec72dacecc9706f023ab/src/main/java/org/janelia/saalfeldlab/n5/imglib2/N5Utils.java#L1514):
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#n5-imglib2-save-exec echo=true >}}
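For readers new to `ExecutorService`, the general pattern behind parallel block writing can be sketched in plain Java: submit one task per block, then wait for all of them. This is only an illustration of the pattern, not what `N5Utils.save` does internally; the class `ParallelBlockSketch` is hypothetical and the task body is a placeholder for real per-block work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of the one-task-per-block pattern. */
final class ParallelBlockSketch {

    /**
     * Submits one task per block of a blocksX x blocksY grid, waits for all
     * of them, and returns the number of blocks processed.
     */
    static int processBlocks(final int blocksX, final int blocksY, final int numThreads) {
        final ExecutorService exec = Executors.newFixedThreadPool(numThreads);
        final AtomicInteger processed = new AtomicInteger();
        try {
            final List<Future<?>> futures = new ArrayList<>();
            for (int y = 0; y < blocksY; ++y) {
                for (int x = 0; x < blocksX; ++x) {
                    // Placeholder: a real task would compress and write block (x, y).
                    futures.add(exec.submit(() -> { processed.incrementAndGet(); }));
                }
            }
            // Wait for every block; get() surfaces a failure from any task.
            for (final Future<?> f : futures) {
                f.get();
            }
            return processed.get();
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for blocks", e);
        } catch (final ExecutionException e) {
            throw new IllegalStateException("a block task failed", e.getCause());
        } finally {
            // Always release the worker threads.
            exec.shutdown();
        }
    }

    public static void main(final String[] args) {
        System.out.println(processBlocks(2, 2, 4));
    }
}
```

Because blocks are independent units, they are a natural granularity for parallel work; remember to shut the executor down once writing has finished.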
+
+Reading the dataset from the container is also easy with
+[`N5Utils.open`](https://github.com/saalfeldlab/n5-imglib2/blob/241dc2b503d01007ec6aec72dacecc9706f023ab/src/main/java/org/janelia/saalfeldlab/n5/imglib2/N5Utils.java#L428) :
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#n5-imglib2-open echo=true >}}
+
+::: {.callout-warning}
+
+## Overwriting data is possible
+
+This save method *DOES NOT* perform any checks prior to writing data and will overwrite data that exists in the specified location.
+Be sure to check and take appropriate action if it is possible that data could already be at a particular location and 
+container to avoid data loss or corruption.
+:::
+
+This example shows that data can be overwritten:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#n5-imglib2-overwrite echo=true >}}
+
+### Parameter details
+
+#### `groupPath` 
+
+is the location inside the container that will store the dataset. You can store a dataset at the
+root of a container by specifying `""` or `"/"` as the `groupPath`. In this case, the container
+will only be able to store one dataset ([see the warning above](#datasets)).
+
+#### `blockSize` 
+
+is a very important parameter. HDF5, N5, and Zarr all break up the datasets they store
+into equally sized blocks or "chunks". The block size parameter specifies the size of these blocks.
+
+For the example above, we stored an image of size `64 x 64` using blocks sized `32 x 32`. As a result, N5 uses
+four blocks to store the entire image:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#four-blocks echo=true >}}
+
+*Quiz:* How many blocks would there be if the block size was `64 x 8`? 
+
+<details>
+<summary>Click here to show the answer.</summary>
+
+There would be eight blocks.
+
+One block covers the first dimension, but it takes 8 blocks to cover the second dimension ($8 \times 8 = 64$).
+Also demonstrated by the code below:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#eight-blocks echo=true >}}
+
+</details>
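The block counts discussed here follow from a ceiling division per dimension: each dimension needs `ceil(size / blockSize)` blocks, and the total is the product over all dimensions. A minimal plain-Java sketch, with the hypothetical helper `BlockCount.numBlocks` standing in for this arithmetic:

```java
/** Hypothetical helper that reproduces the block-count arithmetic. */
final class BlockCount {

    /** Number of blocks needed to cover an image of the given dimensions. */
    static long numBlocks(final long[] dimensions, final int[] blockSize) {
        long n = 1;
        for (int d = 0; d < dimensions.length; ++d) {
            // Ceiling division in integer arithmetic: ceil(dimensions[d] / blockSize[d]).
            n *= (dimensions[d] + blockSize[d] - 1) / blockSize[d];
        }
        return n;
    }

    public static void main(final String[] args) {
        System.out.println(numBlocks(new long[] {64, 64}, new int[] {32, 32}));   // 4
        System.out.println(numBlocks(new long[] {64, 64}, new int[] {64, 8}));    // 8
        System.out.println(numBlocks(new long[] {64, 64}, new int[] {128, 128})); // 1
    }
}
```

The ceiling matters when the image size is not a multiple of the block size: a `70 x 64` image with `32 x 32` blocks needs `3 x 2 = 6` blocks, with the last block along the first dimension only partly filled.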
+
+::: {.callout-tip}
+## Try it!
+
+N5 lets you store your image in a single file if you want - just provide a block size that 
+is equal to or larger than the image size.
+:::
+
+#### `compression` 
+
+Each block is compressed independently, using the specified compression.
+Use [`RawCompression`](https://github.com/saalfeldlab/n5/blob/n5-3.1.3/src/main/java/org/janelia/saalfeldlab/n5/RawCompression.java)
+to store blocks without compression.
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#no-compression echo=true >}}
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#no-compression-blk-sizes echo=true >}}
+
+Notice that blocks were previously ~1700-2000 bytes and are now ~4100 without compression.
+
+The available compression options at the time of this writing are:
+
+* [`BloscCompression`](https://github.com/saalfeldlab/n5-blosc/blob/n5-blosc-1.1.1/src/main/java/org/janelia/saalfeldlab/n5/blosc/BloscCompression.java)
+* [`Bzip2Compression`](https://github.com/saalfeldlab/n5/blob/n5-3.1.3/src/main/java/org/janelia/saalfeldlab/n5/Bzip2Compression.java)
+* [`GzipCompression`](https://github.com/saalfeldlab/n5/blob/n5-3.1.3/src/main/java/org/janelia/saalfeldlab/n5/GzipCompression.java)
+* [`Lz4Compression`](https://github.com/saalfeldlab/n5/blob/n5-3.1.3/src/main/java/org/janelia/saalfeldlab/n5/Lz4Compression.java)
+* [`RawCompression`](https://github.com/saalfeldlab/n5/blob/n5-3.1.3/src/main/java/org/janelia/saalfeldlab/n5/RawCompression.java)
+* [`XzCompression`](https://github.com/saalfeldlab/n5/blob/n5-3.1.3/src/main/java/org/janelia/saalfeldlab/n5/XzCompression.java)
+* [`ZstandardCompression`](https://github.com/JaneliaSciComp/n5-zstandard/blob/n5-zstandard-1.0.2/src/main/java/org/janelia/scicomp/n5/zstandard/ZstandardCompression.java)
+
+## Metadata
+
+N5 can also store rich structured metadata in addition to array data. This tutorial will discuss basic, low-level metadata operations.
+Advanced operations and metadata standards may be described in a future tutorial.
+
+### Basics
+
+`N5Writer`s have a
+[`setAttribute`](https://github.com/saalfeldlab/n5/blob/n5-3.1.3/src/main/java/org/janelia/saalfeldlab/n5/N5Writer.java#L55) 
+method for writing metadata to the storage backend. It takes three arguments:
+
+```java
+<T> void setAttribute(String groupPath, String attributePath, T attribute)
+```
+
+* `groupPath` : the group in which to store this metadata
+* `attributePath` : the name of this attribute
+* `attribute` : the metadata attribute to be stored. Can be an arbitrary type (denoted `T`).
+
+::: {.callout-note}
+There are differences between an attribute "name" and an attribute "path", but attribute "paths" are an advanced topic
+and will be covered elsewhere.
+:::
+
+Similarly, `N5Reader`s have a 
+[`getAttribute`](https://github.com/saalfeldlab/n5/blob/n5-3.1.3/src/main/java/org/janelia/saalfeldlab/n5/N5Reader.java#L241-L244)
+method:
+
+```java
+<T> T getAttribute(String groupPath, String attributePath, Class<T> clazz)
+```
+
+The last argument (`Class<T>`) lets you specify the type that `getAttribute` should return.
+An `N5Exception` will be thrown if the requested type cannot be created from the requested attribute.
+If an attribute does not exist, `null` will be returned (see the last example of this section).
+Consider these examples:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#attributes-1 echo=true >}}
+
+Sometimes it is possible to interpret an attribute as multiple different types:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#attr-types echo=true >}}
+
+### Rich metadata
+
+It is possible to save attributes of arbitrary types, enabling you to structure your
+metadata into classes that are easy to save and load directly. For example, if we define a metadata class `FunWithMetadata`:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#fun-with-metadata echo=true >}}
+
+then make an instance and save it:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#rich-metadata echo=true >}}
+
+To retrieve all the metadata in a group as JSON:
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#all-metadata echo=true >}}
+
+### Removing metadata
+
+You can remove attributes by their name as well. To return the element that was removed, just provide the class for that element
+(this mirrors the [remove method](https://docs.oracle.com/javase/8/docs/api/java/util/List.html#remove-int-) for `List`s in Java).
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#remove-attrs echo=true >}}
+
+### Working with Dataset Metadata
+
+Metadata used to describe datasets can be `get` and `set` the same as all other metadata. 
+However, there are special [`DatasetAttributes`](https://github.com/saalfeldlab/n5/blob/8e14d529276b57e1817ff21df9cac9fb1a517d59/src/main/java/org/janelia/saalfeldlab/n5/DatasetAttributes.java)
+methods to safely work with dataset metadata.
+[`N5Reader.getDatasetAttributes`](https://github.com/saalfeldlab/n5/blob/8e14d529276b57e1817ff21df9cac9fb1a517d59/src/main/java/org/janelia/saalfeldlab/n5/N5Reader.java#L276) and 
+[`N5Writer.setDatasetAttributes`](https://github.com/saalfeldlab/n5/blob/8e14d529276b57e1817ff21df9cac9fb1a517d59/src/main/java/org/janelia/saalfeldlab/n5/N5Writer.java#L134)
+ensure the metadata is always a valid representation of dataset metadata.
+Setting `DatasetAttributes`, however, should only be done when the dataset is initially saved. This ensures the required metadata is tightly coupled with the data.
+For example, `set`ting dataset metadata should be done through the
+[`N5Writer.createDataset`](https://github.com/saalfeldlab/n5/blob/8e14d529276b57e1817ff21df9cac9fb1a517d59/src/main/java/org/janelia/saalfeldlab/n5/N5Writer.java#L200) 
+methods (or indirectly through the `N5Utils.save` [methods mentioned above](#datasets)).
+
+{{< embed ../../_notebooks/N5-Basics-Tutorial.ipynb#array-metadata echo=true >}}
+
+::: {.callout-warning}
+## Warning
+
+The attributes that N5 uses to read datasets can be set with `setAttribute`, and modifying them could corrupt your data.
+**Do not manually set these attributes unless you absolutely know what you're doing!**
+
+* `dimensions`
+* `blockSize`
+* `dataType`
+* `compression`
+
+The attributes that describe datasets are also accessible using `getAttribute`, try running: 
+
+```java
+n5Writer.getAttribute("data", "dimensions", long[].class);
+```
+
+though using `getDatasetAttributes().getDimensions()` is generally recommended.
+:::
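One way to avoid touching the four dataset attributes by accident is to check attribute names before writing them. The following plain-Java sketch assumes a hypothetical helper class `ReservedAttributeGuard`; it is not part of the N5 API, and it compares plain attribute names only, so it does not account for attribute "paths".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical guard; not part of the N5 API. */
final class ReservedAttributeGuard {

    /** The four attribute names listed in the warning. */
    static final Set<String> RESERVED = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("dimensions", "blockSize", "dataType", "compression")));

    /** True if the name is one of the attributes N5 uses to read datasets. */
    static boolean isReserved(final String attributeName) {
        return RESERVED.contains(attributeName);
    }

    /** Throws if the name is reserved; call this before writing an attribute. */
    static void requireNotReserved(final String attributeName) {
        if (isReserved(attributeName)) {
            throw new IllegalArgumentException(
                    "'" + attributeName + "' describes the dataset and should not be set manually");
        }
    }

    public static void main(final String[] args) {
        System.out.println(isReserved("dimensions")); // true
        System.out.println(isReserved("resolution")); // false
    }
}
```

Calling `requireNotReserved(name)` before a `setAttribute` call in your own code would turn an accidental overwrite into an immediate, visible error.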
+
+## What to try next
+
+* [How to work with the N5 API and ImgLib2](https://imglib.github.io/imglib2-blog/posts/2022-09-27-n5-imglib2.html)
+
diff --git a/blog/2024-02-27-n5-tutorial-basic/n5-basic-tutorial-thumbnail.png b/blog/2024-02-27-n5-tutorial-basic/n5-basic-tutorial-thumbnail.png
diff --git a/ecosystem/mastodon-sc_mastodon/index.qmd b/ecosystem/mastodon-sc_mastodon/index.qmd