New research from the UK has proposed an improved machine learning method for downsampling images based on the perceived value of different parts of the image content, rather than indiscriminately reducing the resolution (and therefore the quality and the extractable features) across all the pixels in the image.
As part of a growing interest in AI-driven compression systems, it’s an approach that could eventually inform new codecs for general image compression. The work itself, however, is motivated by medical imaging, where arbitrary downsampling of high-resolution scans can destroy life-saving information.
Representational architecture of the new system. The interstitial deformation module produces a deformation map corresponding to areas of interest in the image, indicated by the density and direction of the red dots. The map is used not only to downsample, but also to reconstruct the areas of primary interest when the image content is non-uniformly re-upscaled on the other side of the training process. Source: https://arxiv.org/pdf/2109.11071.pdf
The system applies semantic segmentation to the images – broad regions, represented as colored blocks in the image above, that encompass recognized entities in the picture, such as ‘road’, ‘bike’, or ‘lesion’. The disposition of the semantic segmentation maps is then used to determine which parts of the photo should not be excessively downsampled.
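The paper's pipeline is learned end-to-end, but the core idea of letting a segmentation map steer non-uniform downsampling can be sketched with a simple, non-learned analogue. The snippet below is an illustrative assumption rather than the authors' method: it converts semantic labels into a per-pixel importance map, then picks output rows and columns by inverse-CDF sampling of the importance marginals, so high-interest regions retain more of their pixels. The function names and label weights (`importance_from_labels`, `nonuniform_downsample`) are hypothetical.

```python
import numpy as np

def importance_from_labels(labels, weights):
    """Map each semantic label (e.g. 'lesion' id) to a scalar importance.

    `weights` is a hypothetical dict {label_id: importance}; in the real
    system the weighting is learned, not hand-assigned.
    """
    imp = np.zeros(labels.shape, dtype=np.float64)
    for lab, w in weights.items():
        imp[labels == lab] = w
    return imp

def nonuniform_downsample(image, importance, out_h, out_w, eps=1e-3):
    """Downsample with denser sampling where importance is high.

    Rows/columns holding more total importance receive proportionally
    more output samples; `eps` keeps low-importance regions from
    vanishing entirely.
    """
    h, w = importance.shape
    # marginal importance along each axis
    row_d = importance.sum(axis=1) + eps
    col_d = importance.sum(axis=0) + eps
    row_cdf = np.cumsum(row_d) / row_d.sum()
    col_cdf = np.cumsum(col_d) / col_d.sum()
    # invert the CDFs at evenly spaced quantiles -> non-uniform coords
    qr = (np.arange(out_h) + 0.5) / out_h
    qc = (np.arange(out_w) + 0.5) / out_w
    rows = np.searchsorted(row_cdf, qr).clip(0, h - 1)
    cols = np.searchsorted(col_cdf, qc).clip(0, w - 1)
    return image[np.ix_(rows, cols)], rows, cols

# toy example: a 64x64 image with a small high-importance "lesion"
labels = np.zeros((64, 64), dtype=np.int64)
labels[20:30, 20:30] = 1                      # hypothetical lesion label
imp = importance_from_labels(labels, {0: 1.0, 1: 50.0})
img = np.random.default_rng(0).random((64, 64))
small, rows, cols = nonuniform_downsample(img, imp, 16, 16)
```

In this toy run, far more than the uniform share of the 16 sampled rows lands inside the ten lesion rows, which is the qualitative behavior the deformation map provides in the actual system; the learned version additionally records the warp so the image can be non-uniformly re-upscaled later.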