The method is an image processing pipeline that samples a large image into tiles while taking the texture of those tiles into account. The goal is an optimal, content-aware split that avoids "cutting up" homogeneous structures.
The method was developed in a big data / deep learning context, out of the need to sample gigapixel medical images into minimally overlapping, homogeneous sub-parts for training a multiple instance learning model based on convolutional neural networks. It nevertheless has potential for broader use.
The use case is described in: D. Mandache, E. B. à La Guillaume, Y. Badachi, J-C. Olivo-Marin and V. Meas-Yedid, The Lifecycle of a Neural Network in the Wild: A Multiple Instance Learning Study on Cancer Detection from Breast Biopsies Imaged with Novel Technique, 2022 IEEE International Conference on Image Processing (ICIP), 2022, pp. 3601-3605, doi: 10.1109/ICIP46576.2022.9897596.
The method relies heavily on the SLIC superpixel segmentation algorithm as implemented in scikit-image (skimage.segmentation.slic).
Pipeline: image -> convert to grayscale -> downscale -> Gaussian blur -> estimate number of superpixels -> segment into superpixels -> filter out background superpixels -> extract centers of mass from superpixels -> upscale -> define corresponding patches
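Two of the pipeline steps are plain arithmetic and can be illustrated without any image library: estimating the number of superpixels and mapping superpixel centers back to full resolution. The sketch below is an assumption about how these steps could work, not the repository's actual code; the helper names `estimate_n_superpixels` and `upscale_center` are hypothetical.

```python
import math


def estimate_n_superpixels(height, width, patch_size, overlap, scale):
    """Rough superpixel count: one superpixel per patch footprint (assumed heuristic)."""
    # Work in the downscaled coordinate frame, as the pipeline does.
    small_h = height / scale
    small_w = width / scale
    # Each patch contributes about (patch_size - overlap) new pixels per axis.
    step = (patch_size - overlap) / scale
    return max(1, math.ceil(small_h / step) * math.ceil(small_w / step))


def upscale_center(center, scale):
    """Map a (row, col) center from the downscaled image to full resolution."""
    row, col = center
    return (round(row * scale), round(col * scale))


if __name__ == "__main__":
    print(estimate_n_superpixels(14208, 18080, 2048, 256, 10))
    print(upscale_center((12.4, 30.5), 10))
```

The resulting count would typically be passed as the `n_segments` argument of skimage.segmentation.slic, which treats it as an approximate target rather than an exact number.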
- image: 2D or 3D matrix, input image to subsample
- patch_size: integer, size of the resulting square patches
- overlap: integer, approximate number of pixels common to two patches; note that while this value is exact for the regular grid sampling given as baseline, it is only approximate for the Sleek method
- scale: integer, downscaling factor for speeding up the execution

From SLIC (see skimage.segmentation.slic for more details):
- sigma: width of the Gaussian smoothing kernel for pre-processing
- compactness: float, between 0 and 1, balances color proximity and space proximity (higher values give more weight to space proximity, making superpixel shapes more square)
- min_size_factor: proportion of the minimum superpixel size to be removed with respect to the supposed initial square size
- max_size_factor: proportion of the maximum connected superpixel size
- slic_zero: boolean flag, if True runs the zero-parameter mode of SLIC
- mask: boolean 2D array given as mask for the area of interest to patchify

- remove_background: boolean flag, should be False if a mask is already provided
- background_removal_strategy: thresholding strategy applied to the mean intensity of the obtained pixel clusters, accepted values: isodata, otsu, li, yen, triangle, quantile
- background_is: specifies whether the background is lighter or darker than the foreground, accepted values: light, dark

- debug: boolean flag, if True saves images of the intermediary steps, such as the result of the SLIC algorithm, the background mask, etc.
- logdir: path to the directory where the debugging files are saved
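To illustrate how background_removal_strategy and background_is interact, here is a minimal sketch of the filtering step, assuming the mean intensity of each superpixel is already known. The function `foreground_labels` and its dictionary input are hypothetical. The threshold is passed in directly here, whereas in practice it would presumably come from one of the listed strategies (scikit-image provides `threshold_isodata`, `threshold_otsu`, `threshold_li`, `threshold_yen` and `threshold_triangle` in skimage.filters).

```python
def foreground_labels(mean_intensity, threshold, background_is="light"):
    """Return the sorted labels of superpixels kept as foreground.

    mean_intensity: dict mapping superpixel label -> mean gray level (hypothetical input).
    threshold: gray level separating background from foreground.
    """
    if background_is == "light":
        # Light background: keep superpixels darker than the threshold.
        return sorted(l for l, m in mean_intensity.items() if m < threshold)
    if background_is == "dark":
        # Dark background: keep superpixels lighter than the threshold.
        return sorted(l for l, m in mean_intensity.items() if m > threshold)
    raise ValueError("background_is must be 'light' or 'dark'")


if __name__ == "__main__":
    means = {1: 0.95, 2: 0.40, 3: 0.55, 4: 0.98}
    print(foreground_labels(means, threshold=0.8, background_is="light"))
```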
- list of extracted patches
- list of coordinates for the centers of the patches inside the image
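Because each center is a position inside the image, the corresponding patch can be recovered by cropping a square window around it. The sketch below shows one way to compute that window, assuming (row, col) centers and assuming that windows are shifted inward at the image border; the helper `patch_bounds` is hypothetical and the library may handle borders differently.

```python
def patch_bounds(center, patch_size, image_shape):
    """Return (top, left, bottom, right) of a square patch around a center.

    The window is shifted inward when it would cross the image border
    (an assumed convention). The image is assumed to be at least
    patch_size pixels along both axes.
    """
    row, col = center
    height, width = image_shape[:2]
    half = patch_size // 2
    # Clamp the top-left corner so the whole patch stays inside the image.
    top = min(max(row - half, 0), max(height - patch_size, 0))
    left = min(max(col - half, 0), max(width - patch_size, 0))
    return top, left, top + patch_size, left + patch_size


if __name__ == "__main__":
    print(patch_bounds((100, 5000), 2048, (14208, 18080)))
```

With a NumPy-style array, the patch would then be `image[top:bottom, left:right]`.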
regular grid sampling with the same background removal strategy as above
reconstruct the image from the sampled patches and their position
draws sampled patches over the image
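For comparison with the Sleek method, the regular grid baseline can be sketched as a fixed-stride scan in which neighboring patches share exactly `overlap` pixels. This is a simplified illustration, not the repository's implementation: the function name `grid_centers` is hypothetical, background removal is omitted, and any border remainder smaller than one stride is left uncovered.

```python
def grid_centers(height, width, patch_size, overlap):
    """Centers (row, col) of patches on a regular grid with an exact overlap."""
    stride = patch_size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than patch_size")
    half = patch_size // 2

    def axis_centers(length):
        # Patch start positions along one axis, keeping only full patches.
        starts = range(0, length - patch_size + 1, stride)
        return [start + half for start in starts]

    return [(r, c) for r in axis_centers(height) for c in axis_centers(width)]


if __name__ == "__main__":
    centers = grid_centers(14208, 18080, 2048, 256)
    print(len(centers), centers[0], centers[-1])
```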
The example image comes from the Early Breast Cancer Core-Needle Biopsy WSI (BCNB) Dataset, freely available at https://bupt-ai-cz.github.io/BCNB/; the foreground mask was produced by the author using the Icy platform.
The WSI is 14208 x 18080 pixels and is sampled with patches of 2048 x 2048 pixels with an overlap of 256 pixels. The Sleek method is applied to the grayscale image downscaled by a factor of 10.
Image | Mask |
Regular Grid Sampling | Sleek Patchification | Masked Sleek Patchification |
- download repository
pip install -e /path/to/repository
import sleek
- load image
patches, centers = sleek.sleek_patchify(image, ...)
For more details, see the demo.