:py:mod:`wsinfer.patchlib`
==========================

.. py:module:: wsinfer.patchlib


Submodules
----------

.. toctree::
   :titlesonly:
   :maxdepth: 1

   patch/index.rst
   segment/index.rst


Package Contents
----------------

Functions
~~~~~~~~~

.. autoapisummary::

   wsinfer.patchlib.get_avg_mpp
   wsinfer.patchlib.get_multipolygon_from_binary_arr
   wsinfer.patchlib.get_patch_coordinates_within_polygon
   wsinfer.patchlib.segment_tissue
   wsinfer.patchlib.segment_and_patch_one_slide
   wsinfer.patchlib.save_hdf5
   wsinfer.patchlib.draw_contours_on_thumbnail
   wsinfer.patchlib.segment_and_patch_directory_of_slides


Attributes
~~~~~~~~~~

.. autoapisummary::

   wsinfer.patchlib.WSI
   wsinfer.patchlib.logger
   wsinfer.patchlib.MASKS_DIR
   wsinfer.patchlib.PATCHES_DIR


.. py:data:: WSI

.. py:function:: get_avg_mpp(slide_path: pathlib.Path | str) -> float

   Return the average MPP of a whole slide image.

   The value is in units of micrometers per pixel and is the average of the
   X and Y dimensions.

   :raises CannotReadSpacing: if the spacing cannot be read.


.. py:function:: get_multipolygon_from_binary_arr(arr: numpy.typing.NDArray[numpy.int_], scale: tuple[float, float] | None = None) -> tuple[shapely.MultiPolygon, Sequence[numpy.typing.NDArray[numpy.int_]], numpy.typing.NDArray[numpy.int_]] | None

   Create a Shapely MultiPolygon from a binary array.

   :param arr: Binary array where non-zero values indicate presence of tissue.
   :type arr: array
   :param scale: If specified, this is the factor by which coordinates are
                 multiplied to recover the coordinates at the base resolution
                 of the whole slide image.
   :type scale: tuple of two floats, optional

   :returns: * *polygon* -- A shapely ``MultiPolygon`` object representing tissue regions.
             * *contours* -- A sequence of arrays representing unscaled contours of tissue.
             * *hierarchy* -- An array of the hierarchy of contours.


.. py:function:: get_patch_coordinates_within_polygon(slide_width: int, slide_height: int, patch_size: int, half_patch_size: int, polygon: shapely.Polygon, overlap: float = 0.0) -> numpy.typing.NDArray[numpy.int_]

   Get coordinates of patches within a polygon.

   :param slide_width: The width of the slide in pixels at base resolution.
   :type slide_width: int
   :param slide_height: The height of the slide in pixels at base resolution.
   :type slide_height: int
   :param patch_size: The size of a patch in pixels.
   :type patch_size: int
   :param half_patch_size: Half of the length of a patch in pixels.
   :type half_patch_size: int
   :param polygon: A shapely Polygon representing the presence of tissue.
   :type polygon: Polygon
   :param overlap: The proportion of ``patch_size`` by which adjacent patches
                   overlap. A value of 0.5 gives an overlap of 50%, and a value
                   of 0.2 gives an overlap of 20%. Negative values add space
                   between patches; a value of -1 skips every other patch. The
                   value must be in (-inf, 1). The default value of 0.0
                   produces non-overlapping patches.
   :type overlap: float

   :returns: *coordinates* -- Array with shape (N, 2), where N is the number of
             tiles. Each row in this array contains the coordinates of the
             top-left of a tile: (minx, miny).


.. py:function:: segment_tissue(im_arr: numpy.typing.NDArray, median_filter_size: int = 7, binary_threshold: int = 7, closing_kernel_size: int = 6, min_object_size_px: int = 512, min_hole_size_px: int = 1024) -> numpy.typing.NDArray[numpy.bool_]

   Create a binary tissue mask from an image.

   :param im_arr: RGB image array (uint8) with shape (rows, cols, 3).
   :type im_arr: array-like
   :param median_filter_size: The kernel size for median filtering. Must be odd
                              and greater than one.
   :type median_filter_size: int
   :param binary_threshold: The pixel threshold for image binarization.
   :type binary_threshold: int
   :param closing_kernel_size: The kernel size for morphological closing (in
                               pixel units).
   :type closing_kernel_size: int
   :param min_object_size_px: The minimum area of an object in pixels. An object
                              smaller than this area is removed and becomes
                              background.
   :type min_object_size_px: int
   :param min_hole_size_px: The minimum area of a hole in pixels. A hole smaller
                            than this area is filled and becomes foreground.
   :type min_hole_size_px: int

   :returns: *mask* -- Boolean array, where True values indicate presence of tissue.


.. py:data:: logger

.. py:data:: MASKS_DIR
   :value: 'masks'

.. py:data:: PATCHES_DIR
   :value: 'patches'

.. py:function:: segment_and_patch_one_slide(slide_path: str | pathlib.Path, save_dir: str | pathlib.Path, patch_size_px: int, patch_spacing_um_px: float, thumbsize: tuple[int, int] = (2048, 2048), median_filter_size: int = 7, binary_threshold: int = 7, closing_kernel_size: int = 6, min_object_size_um2: float = 200**2, min_hole_size_um2: float = 190**2, overlap: float = 0.0) -> None

   Get patch coordinates in tissue regions of a whole slide image. With the
   default ``overlap`` of 0.0, the patches do not overlap.

   Patch coordinates are saved to an HDF5 file in ``{save_dir}/patches/``, and a
   tissue detection image is saved to ``{save_dir}/masks/`` for quality control.

   In general, this function takes the following steps:

   1. Get a low-resolution thumbnail of the image.
   2. Binarize the image to identify tissue regions.
   3. Process this binary image to remove artifacts.
   4. Create a regular grid of patches of the specified size (non-overlapping
      by default).
   5. Keep patches whose centroids are in tissue regions.

   :param slide_path: The path to the whole slide image file.
   :type slide_path: str or Path
   :param save_dir: The directory in which to save patching results.
   :type save_dir: str or Path
   :param patch_size_px: The length of one side of a square patch in pixels.
   :type patch_size_px: int
   :param patch_spacing_um_px: The physical spacing of patches in micrometers
                               per pixel. This value multiplied by
                               ``patch_size_px`` gives the physical length of a
                               patch in micrometers.
   :type patch_spacing_um_px: float
   :param thumbsize: The size of the thumbnail to use for tissue detection. This
                     specifies the largest possible bounding box of the
                     thumbnail, and a thumbnail is taken to fit this space while
                     maintaining the original aspect ratio of the whole slide
                     image. Larger thumbnails take longer to process but result
                     in better tissue masks.
   :type thumbsize: tuple of two integers
   :param median_filter_size: The size of the kernel for median filtering. This
                              value must be odd and greater than one. This is in
                              units of pixels in the thumbnail.
   :type median_filter_size: int
   :param binary_threshold: The value at which the image is binarized. A higher
                            value will keep less tissue.
   :type binary_threshold: int
   :param closing_kernel_size: The size of the kernel for a morphological
                               closing operation. This is in units of pixels in
                               the thumbnail.
   :type closing_kernel_size: int
   :param min_object_size_um2: The minimum area of an object to keep, in units
                               of micrometers squared. Any disconnected objects
                               smaller than this area will be removed.
   :type min_object_size_um2: float
   :param min_hole_size_um2: The minimum area of a hole to keep, in units of
                             micrometers squared. Any hole smaller than this
                             area will be filled and considered tissue.
   :type min_hole_size_um2: float
   :param overlap: The proportion of the patch size by which adjacent patches
                   overlap. The default value of 0.0 produces non-overlapping
                   patches.
   :type overlap: float

   :rtype: None


.. py:function:: save_hdf5(path: str | pathlib.Path, coords: numpy.typing.NDArray[numpy.int_], patch_size: int, patch_spacing_um_px: float, compression: str | None = 'gzip') -> None

   Write patch coordinates to an HDF5 file.

   This is designed to be interoperable with HDF5 files created by CLAM.

   :param path: Path to save the HDF5 file.
   :type path: str or Path
   :param coords: Nx2 array of coordinates, where N is the number of patches.
                  Each row of the array must be (minx, miny), which specifies
                  the top-left of the patch.
   :type coords: array
   :param patch_size: The size of patches in pixels at level 0 of the slide
                      (base resolution).
   :type patch_size: int
   :param patch_spacing_um_px: The physical spacing of the patch in micrometers
                               per pixel.
   :type patch_spacing_um_px: float
   :param compression: Compression to use for storing coordinates. Default is
                       "gzip".
   :type compression: str, optional

   :rtype: None


.. py:function:: draw_contours_on_thumbnail(thumb: PIL.Image.Image, contours: Sequence[numpy.typing.NDArray[numpy.int_]], hierarchy: numpy.typing.NDArray[numpy.int_]) -> PIL.Image.Image

   Draw contours onto an image.

   :param thumb: The thumbnail of the whole slide, of the same size as the
                 binary image used during contour detection.
   :type thumb: Image.Image
   :param contours: The contours result of ``cv.findContours``.
   :type contours: sequence of arrays
   :param hierarchy: The hierarchy result of ``cv.findContours``.
   :type hierarchy: array

   :returns: An image with contours burned in.
   :rtype: Image.Image


.. py:function:: segment_and_patch_directory_of_slides(wsi_dir: str | pathlib.Path, save_dir: str | pathlib.Path, patch_size_px: int, patch_spacing_um_px: float, thumbsize: tuple[int, int] = (2048, 2048), median_filter_size: int = 7, binary_threshold: int = 7, closing_kernel_size: int = 6, min_object_size_um2: float = 200**2, min_hole_size_um2: float = 190**2, overlap: float = 0.0) -> None

   Get patch coordinates in tissue regions for a directory of whole slide
   images. With the default ``overlap`` of 0.0, the patches do not overlap.

   Patch coordinates are saved to HDF5 files in ``{save_dir}/patches/``, and
   tissue detection images are saved to ``{save_dir}/masks/`` for quality
   control.

   In general, this function takes the following steps for each whole slide
   image:

   1. Get a low-resolution thumbnail of the image.
   2. Binarize the image to identify tissue regions.
   3. Process this binary image to remove artifacts.
   4. Create a regular grid of patches of the specified size (non-overlapping
      by default).
   5. Keep patches whose centroids are in tissue regions.

   :param wsi_dir: The directory of whole slide images. This must only contain
                   whole slide images.
   :type wsi_dir: str or Path
   :param save_dir: The directory in which to save patching results.
   :type save_dir: str or Path
   :param patch_size_px: The length of one side of a square patch in pixels.
   :type patch_size_px: int
   :param patch_spacing_um_px: The physical spacing of patches in micrometers
                               per pixel. This value multiplied by
                               ``patch_size_px`` gives the physical length of a
                               patch in micrometers.
   :type patch_spacing_um_px: float
   :param thumbsize: The size of the thumbnail to use for tissue detection. This
                     specifies the largest possible bounding box of the
                     thumbnail, and a thumbnail is taken to fit this space while
                     maintaining the original aspect ratio of the whole slide
                     image. Larger thumbnails take longer to process but result
                     in better tissue masks.
   :type thumbsize: tuple of two integers
   :param median_filter_size: The size of the kernel for median filtering. This
                              value must be odd and greater than one. This is in
                              units of pixels in the thumbnail.
   :type median_filter_size: int
   :param binary_threshold: The value at which the image is binarized. A higher
                            value will keep less tissue.
   :type binary_threshold: int
   :param closing_kernel_size: The size of the kernel for a morphological
                               closing operation. This is in units of pixels in
                               the thumbnail.
   :type closing_kernel_size: int
   :param min_object_size_um2: The minimum area of an object to keep, in units
                               of micrometers squared. Any disconnected objects
                               smaller than this area will be removed.
   :type min_object_size_um2: float
   :param min_hole_size_um2: The minimum area of a hole to keep, in units of
                             micrometers squared. Any hole smaller than this
                             area will be filled and considered tissue.
   :type min_hole_size_um2: float
   :param overlap: The proportion of the patch size by which adjacent patches
                   overlap. The default value of 0.0 produces non-overlapping
                   patches.
   :type overlap: float

   :rtype: None
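
The grid and centroid steps, together with the ``overlap`` parameter, can be illustrated with a minimal pure-Python sketch. This is not the code used by ``wsinfer.patchlib``: the helper names ``grid_top_left_coords`` and ``keep_patches_in_box`` are hypothetical, the tissue region is simplified to an axis-aligned box instead of a shapely polygon, and the step formula ``patch_size * (1 - overlap)`` is an assumption that is consistent with the documented behavior (0.5 halves the step, -1 doubles it).

```python
from __future__ import annotations


def grid_top_left_coords(
    slide_width: int,
    slide_height: int,
    patch_size: int,
    overlap: float = 0.0,
) -> list[tuple[int, int]]:
    """Return (minx, miny) for every patch that fits fully inside the slide."""
    if overlap >= 1:
        raise ValueError("overlap must be in (-inf, 1)")
    # Assumed relationship: the step between patches shrinks as overlap grows
    # and grows when overlap is negative.
    step = round(patch_size * (1 - overlap))
    if step < 1:
        raise ValueError("overlap is too close to 1 for this patch size")
    xs = range(0, slide_width - patch_size + 1, step)
    ys = range(0, slide_height - patch_size + 1, step)
    return [(x, y) for y in ys for x in xs]


def keep_patches_in_box(
    coords: list[tuple[int, int]],
    patch_size: int,
    box: tuple[int, int, int, int],
) -> list[tuple[int, int]]:
    """Keep patches whose centroid lies inside box = (minx, miny, maxx, maxy).

    The box is a stand-in for a tissue polygon (hypothetical simplification).
    """
    half = patch_size // 2
    minx, miny, maxx, maxy = box
    return [
        (x, y)
        for x, y in coords
        if minx <= x + half <= maxx and miny <= y + half <= maxy
    ]


# Usage on a hypothetical 1000 x 500 pixel slide with 100-pixel patches.
no_overlap = grid_top_left_coords(1000, 500, 100)               # 50 patches
half_overlap = grid_top_left_coords(1000, 500, 100, overlap=0.5)  # 171 patches
every_other = grid_top_left_coords(1000, 500, 100, overlap=-1)    # 15 patches
in_tissue = keep_patches_in_box(no_overlap, 100, (0, 0, 250, 250))
```

Each returned pair is the top-left corner of a patch, which matches the ``(minx, miny)`` row layout documented for ``get_patch_coordinates_within_polygon`` and ``save_hdf5``.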