
Applying Cellpose models in arivis Vision4D

This article explains how to integrate Cellpose into an arivis Vision4D pipeline.


Cellpose is a deep-learning (DL) based algorithm for cell and nucleus segmentation. It was created by the Stringer and Pachitariu groups and was originally published in Stringer et al., Nature Methods, 2021.

Cellpose detects cells by predicting a flow representation of each object, which is well suited to approximating and defining the complex borders of cells in microscopy images. These representations are used both for DL model training and for prediction (inference).
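
To make the idea of a flow representation concrete, here is a deliberately simplified sketch. It is not the Cellpose implementation: Cellpose derives its flows from a simulated diffusion process inside each annotated object, whereas this stand-in simply assigns every object pixel a unit vector pointing towards the object's centroid. The function name `center_pointing_flows` is hypothetical and used only for illustration.

```python
import numpy as np

def center_pointing_flows(labels):
    """Return a (2, H, W) array of (dy, dx) unit vectors.

    Simplified stand-in for a flow representation (assumption, not the
    Cellpose algorithm): each pixel of a labeled object points towards
    that object's centroid. Background pixels (label 0) keep zero flow.
    """
    flows = np.zeros((2,) + labels.shape, dtype=np.float32)
    for obj_id in np.unique(labels):
        if obj_id == 0:
            continue  # skip background
        ys, xs = np.nonzero(labels == obj_id)
        dy = ys.mean() - ys
        dx = xs.mean() - xs
        norm = np.hypot(dy, dx)
        norm[norm == 0] = 1.0  # the centroid pixel itself gets zero flow
        flows[0, ys, xs] = dy / norm
        flows[1, ys, xs] = dx / norm
    return flows
```

Following such vectors from any pixel leads towards the center of its object, which is why pixels belonging to the same cell can be grouped together even when neighboring cells touch.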


Cellpose is an advanced method for detecting objects with complex shapes such as cells and nuclei, especially in crowded fields where the objects are very close to each other, but it is limited to these cases. The new frontiers of image analysis in life science require the capability to analyze the complex interactions between biological structures.

Vision4D can be configured to execute Cellpose segmentation within analysis pipelines, thereby enabling users to take advantage of both the advanced segmentation enabled by Cellpose and the image and segment processing and visualization tools offered by Vision4D. This article explains how to download and install the necessary tools, and how to configure the pipeline Python Segmenter operation to segment objects using Cellpose.

By integrating cellpose into a pipeline, users can take advantage of the full functionality of the Vision4D pipeline concept to:

  • Process large multidimensional images
  • Enable segmentation in an easy-to-use interface
  • Enable the visualization of objects segmented using cellpose in 4D with advanced object display options
  • Facilitate complex further analysis like parent-child relationships and tracking

Preliminary Remarks

Vision4D runs deep learning applications for instance segmentation such as Cellpose and StarDist using external and arivis-independent Python libraries and tools produced by third parties.

These tools must be installed by the user under their own responsibility, strictly following the instructions in this document. arivis has tested the setup protocol on several computers; however, due to the different and unpredictable hardware and software configurations of any given computer system, the results may vary on a case-by-case basis. Therefore, arivis declines any responsibility concerning the correct tools, installation, and setup on the individual user’s workstation. arivis cannot be held responsible for any malfunction or failure of the deep learning environment setup. arivis does not guarantee technical support for the setup task or for any deep learning application. Furthermore, arivis also declines any responsibility regarding the validity of the scientific results gathered from the deep learning application.

How does it work?

Object annotation

This task consists of manually drawing the shape of the objects over a set of representative images (2D or 3D). The reference objects should cover all the variation present within the reference samples. The annotations are then used to create a binary mask image (ground truth). Both the annotations and the related binary masks are subsequently used by the training task to build the neural network.

The annotations are often done manually and are therefore very time-consuming. The number of annotations needed must be estimated in advance to get an accurate training result. During subsequent rounds of training, the number of annotations can be increased if required.
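
As an illustration of how a drawn outline becomes a binary mask, the sketch below rasterizes one closed polygon with NumPy only, using the even-odd rule evaluated at pixel centers. The function name `polygon_to_mask` and the (row, column) vertex format are assumptions made for this example; annotation tools typically perform this step themselves.

```python
import numpy as np

def polygon_to_mask(vertices, shape):
    """Rasterize a closed polygon into a boolean mask (even-odd rule).

    vertices: sequence of (row, col) points outlining one annotated object
              (hypothetical input format chosen for this sketch).
    shape:    (height, width) of the output mask.
    """
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    inside = np.zeros(shape, dtype=bool)
    verts = np.asarray(vertices, dtype=float)
    n = len(verts)
    for i in range(n):
        r0, c0 = verts[i]
        r1, c1 = verts[(i + 1) % n]
        # Does this edge cross the horizontal line through each pixel row?
        crosses = (r0 > rows) != (r1 > rows)
        # Column at which the edge intersects that row (unused for
        # horizontal edges, where `crosses` is False everywhere).
        with np.errstate(divide="ignore", invalid="ignore"):
            c_int = c0 + (rows - r0) * (c1 - c0) / (r1 - r0)
        # Toggle pixels lying to the left of the crossing point.
        inside ^= crosses & (cols < c_int)
    return inside

# Example: a square outline drawn between pixel centers.
square = [(1.5, 1.5), (1.5, 4.5), (4.5, 4.5), (4.5, 1.5)]
mask = polygon_to_mask(square, (6, 6))
```

The masks of several annotated objects can be combined with a logical OR to obtain the binary ground truth image for one reference image.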

The number of annotations depends heavily on the project. In general, it is recommended to annotate at least 200 objects of interest from 10-20 different images, and potentially many more in complex projects. The number of ground truth images can be artificially increased by data augmentation. This method applies image rotations, reflections, and non-rigid transformations to increase the amount of ground truth provided for the DL training.
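
A minimal sketch of the rotation and reflection part of data augmentation is given below. It covers only 90-degree rotations and mirror flips (non-rigid transformations are left out), and it applies the same transform to the image and its mask so that the ground truth stays aligned. The function name `augment_pair` is hypothetical.

```python
import numpy as np

def augment_pair(image, mask):
    """Return the 8 rotation/reflection variants of an (image, mask) pair.

    The same transform is applied to both arrays so that every augmented
    mask still matches its augmented image.
    """
    variants = []
    for k in range(4):  # 0, 90, 180 and 270 degree rotations
        img_r = np.rot90(image, k)
        msk_r = np.rot90(mask, k)
        variants.append((img_r, msk_r))
        # Mirror each rotation left-right to get the reflected variant.
        variants.append((np.fliplr(img_r), np.fliplr(msk_r)))
    return variants

# Example: one small image and its binary mask yield 8 training pairs.
image = np.arange(12).reshape(3, 4)
mask = image > 5
pairs = augment_pair(image, mask)
```

With this scheme a single annotated image contributes eight ground truth pairs, although the augmented pairs do not replace the need for genuinely different reference images.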

Creating and training a neural network

During the training process, the deep learning network is trained to recognize features and patterns in the images so that it can predict the positions of the objects of interest and thereby segment them.

The progress of training can be evaluated by comparing the training loss with the validation loss. During training, both values should decrease until they reach a minimum that does not change significantly with further cycles. Comparing the development of the validation loss with that of the training loss gives insight into the model’s performance. A decrease in both training and validation loss indicates that further training is still needed. If the validation loss starts to increase again while the training loss keeps decreasing towards zero, the network is overfitting the training data.

Deep learning training is based on mathematical operations that are repetitive and time-consuming but can easily be parallelized. Using GPU resources therefore reduces the total training time. On the CPU alone, complex training can take 7 to 10 days, while with a GPU the total time may drop to 10 to 12 hours.
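
The comparison of the two loss curves can be sketched in a few lines of Python. This is an illustrative heuristic only: the function name `diagnose_training`, the window of 3 cycles, and the tolerance are assumptions chosen for the example, not values prescribed by Cellpose or Vision4D.

```python
def diagnose_training(train_loss, val_loss, window=3, tol=1e-3):
    """Classify the recent trend of two loss curves.

    Compares the latest loss values with those `window` cycles earlier.
    `window` and `tol` are illustrative defaults (assumptions).
    """
    if len(train_loss) != len(val_loss):
        raise ValueError("loss curves must have the same length")
    if len(train_loss) <= window:
        raise ValueError("need more cycles than the window size")
    d_train = train_loss[-1] - train_loss[-1 - window]
    d_val = val_loss[-1] - val_loss[-1 - window]
    if d_val > tol and d_train < -tol:
        # Validation loss rises while training loss keeps falling.
        return "overfitting"
    if d_train < -tol and d_val < -tol:
        # Both curves still decrease: more training is worthwhile.
        return "still improving"
    return "plateau"

# Example: validation loss turns upwards after the third cycle.
status = diagnose_training(
    [1.0, 0.8, 0.6, 0.5, 0.4, 0.3],
    [1.1, 0.9, 0.7, 0.75, 0.85, 0.95],
)
```

In practice the loss values would come from the training log, and a model saved near the validation minimum would usually be preferred over the final one when overfitting is detected.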

Image analysis

Once the neural network is trained, it can be used to analyze sample images. Currently, Vision4D cannot train the Cellpose deep neural network model internally. Cellpose offers at least three robust pre-trained models for segmenting nuclei and cells, primarily in fluorescence microscopy images. These models can precisely segment a wide range of image types out of the box and do not require retraining or parameter adjustments.

The Cellpose operator generates segments that the subsequent pipeline steps can use for compartmentalization analysis, tracking, and several other quantitative measurements. The images used in this application are courtesy of Dr. Masahiro Narimatsu (Dr. Wrana’s lab), Lunenfeld-Tanenbaum Research Institute.

Application examples:


The image is courtesy of Dr. Masahiro Narimatsu (Dr. Wrana’s lab), Lunenfeld-Tanenbaum Research Institute. Optical sections through an embryonic body grown from R1 mouse embryonic stem cells. Staining: Blue: nuclei (DAPI); Green: actin (phalloidin-Alexa 488); Magenta: E-cadherin (antibody staining); Red: Oct3/4 (antibody staining).


Nuclei segmentation:

The Cellpose nuclei model is applied to the 3D stack to segment all the nuclei, and a region growing operation can then be used to identify the cell boundaries based on the membrane staining. Note the complex shapes and close proximity of the nuclei.



Cell segmentation:

The Cellpose cyto2 model is applied to the 3D stack to segment all the cytoplasm objects. This operation can be applied over multiple timepoints, and further operations can then take the segmentation result to create tracks and measure their properties.
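
Outside of Vision4D, applying a 3D segmentation to every timepoint can be sketched as below. This is not the script used by the Vision4D Python Segmenter operation. The helper names (`segment_timelapse`, `make_cellpose_segmenter`), the (T, Z, Y, X) axis order, and the diameter value are assumptions for illustration, and the Cellpose call assumes the Cellpose 2.x Python API (`models.Cellpose` and its `eval` method); other Cellpose versions may differ.

```python
import numpy as np

def segment_timelapse(stack_tzyx, segment_volume):
    """Apply `segment_volume` to each timepoint of a (T, Z, Y, X) array.

    `segment_volume` is any callable mapping one (Z, Y, X) volume to a
    label volume of the same shape.
    """
    labels = [np.asarray(segment_volume(stack_tzyx[t]))
              for t in range(stack_tzyx.shape[0])]
    return np.stack(labels, axis=0)

def make_cellpose_segmenter(model_type="cyto2", diameter=30.0, use_gpu=False):
    """Build a per-volume segmenter around a pre-trained Cellpose model.

    ASSUMPTION: Cellpose 2.x API. `diameter` is a placeholder value and
    must be set to the expected object size in pixels for real data.
    """
    from cellpose import models  # imported lazily; requires cellpose
    model = models.Cellpose(gpu=use_gpu, model_type=model_type)

    def segment_volume(volume_zyx):
        # channels=[0, 0] treats the input as a single grayscale channel.
        masks, flows, styles, diams = model.eval(
            volume_zyx, diameter=diameter, channels=[0, 0], do_3D=True)
        return masks

    return segment_volume

# Example with a stand-in threshold segmenter (no Cellpose required):
rng = np.random.default_rng(0)
movie = rng.random((2, 3, 4, 5))
labels = segment_timelapse(movie, lambda v: (v > 0.5).astype(np.int32))
```

With Cellpose installed, `segment_timelapse(movie, make_cellpose_segmenter("cyto2"))` would run the cyto2 model on each timepoint. Note that label values are assigned independently per timepoint, so linking objects over time remains the job of a separate tracking step.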




Download the full "How to: install and run predictions with Cellpose" PDF