OpenPhi: an interface to access Philips iSyntax whole slide images for computational pathology

Abstract

Summary: Digital pathology enables applying computational methods, such as deep learning, in pathology for improved diagnostics and prognostics, but lack of interoperability between whole slide image formats of different scanner vendors is a challenge for algorithm developers. We present OpenPhi (Open PatHology Interface), an Application Programming Interface for seamless access to the iSyntax format used by the Philips Ultra Fast Scanner, the first digital pathology scanner approved by the United States Food and Drug Administration. OpenPhi is extensible and easily interfaced with existing vendor-neutral applications.

Availability and implementation: OpenPhi is implemented in Python and is available as open source under the MIT license at: https://gitlab.com/BioimageInformaticsGroup/openphi. The Philips Software Development Kit is required and available at: https://www.openpathology.philips.com. OpenPhi version 1.1.1 is additionally provided as Supplementary Data.

Supplementary information: Supplementary data are available at Bioinformatics online.


Introduction
Pathology is transitioning into a digital discipline (Pantanowitz et al., 2018). Digitization of microscopy slides into Whole Slide Images (WSI) has been feasible for over 20 years, but the milestone event of regulatory approval by the United States Food and Drug Administration (FDA) for the Philips Ultra Fast Scanner (UFS), based on non-inferiority to conventional microscopy (Mukhopadhyay et al., 2018), is expected to promote the clinical adoption of the technology. Application of image analysis and deep learning to WSI data has enabled computational pathology, which aims, for example, at improved diagnosis and prognosis of diseases and at reduced inter-observer variability among pathologists (Bera et al., 2019).
However, the nature of WSI data imposes challenges to developers. The multi-gigapixel WSIs frequently exceed memory capacity, a problem typically tackled by resolution pyramid approaches implemented in image formats specific to each scanner vendor. As a consequence, standard image processing libraries cannot be used with WSIs. The Digital Imaging and Communications in Medicine (DICOM) Standards Committee Working Group 26 has released a standard to improve the interoperability of WSIs (Singh et al., 2011) which, despite its potential, has still not been widely adopted (Herrmann et al., 2018). Reverse engineering of proprietary formats has resulted in vendor-neutral WSI libraries such as OpenSlide (Goode et al., 2013). However, none of these libraries support the Philips iSyntax format (Hulsken, 2020). With increasing clinical use of the FDA-approved Philips UFS, it is crucial for developers to have efficient ways of accessing the growing number of iSyntax WSIs. One approach for circumventing this issue is converting iSyntax to an open format (https://github.com/glencoesoftware/isyntax2raw), e.g. OME-TIFF (Besson et al., 2019). However, conversion is problematic in the case of large datasets, since typically one is also required to archive the original data. Moreover, in view of clinical use, utilizing the native format of the scanner may be desirable due to regulatory considerations.
Philips recently released a Software Development Kit (SDK) for iSyntax, but several challenges remain. Firstly, the SDK

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Methodology and key features
The structure of OpenPhi follows the iSyntax data model (Fig. 1). The names, parameters and output of the API methods adhere to the OpenSlide Python API, so existing vendor-neutral code can be adapted for iSyntax compatibility by replacing OpenSlide with OpenPhi. OpenPhi itself has no strict requirements for the operating system or Python version. To meet the current requirements of the Philips SDK 2.0, we developed OpenPhi on Ubuntu 18.04 and Python 3.6.9. We verified the practical usability of OpenPhi by integrating it with a previously developed deep learning framework for Gleason grading of prostate biopsies (Ström et al., 2020).
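Because the method names and signatures follow the OpenSlide Python API, downstream code can be written against that interface without knowing which reader sits behind it. The following is a minimal sketch of this idea, not code from OpenPhi. It assumes only the OpenSlide-style attributes level_dimensions and level_downsamples and the read_region(location, level, size) call, with location given in level-0 coordinates. FakeSlide, tile_locations and read_tiles are hypothetical names introduced for illustration so that the example runs without either library installed.

```python
from typing import Iterator, List, Tuple


class FakeSlide:
    """Hypothetical stand-in exposing OpenSlide-style attributes (not OpenPhi)."""

    def __init__(self, width: int, height: int, levels: int) -> None:
        # Each level halves the previous one, as in a typical resolution pyramid.
        self.level_downsamples = [float(2 ** i) for i in range(levels)]
        self.level_dimensions = [
            (max(1, width >> i), max(1, height >> i)) for i in range(levels)
        ]

    def read_region(self, location, level, size):
        # A real reader returns pixel data; this stub only echoes the request.
        return {"location": location, "level": level, "size": size}


def tile_locations(slide, level: int, tile: int) -> Iterator[Tuple[int, int]]:
    """Yield top-left tile corners in level-0 coordinates."""
    width, height = slide.level_dimensions[level]
    scale = slide.level_downsamples[level]
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            # Convert level coordinates to the level-0 reference frame.
            yield int(x * scale), int(y * scale)


def read_tiles(slide, level: int, tile: int) -> List[dict]:
    """Read every tile of one level through the OpenSlide-style interface."""
    return [
        slide.read_region(location, level, (tile, tile))
        for location in tile_locations(slide, level, tile)
    ]
```

Any object exposing the same three members could be passed to read_tiles in place of FakeSlide, which is the property that makes swapping one reader for another a small change.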

Reading label and macro images
The label image associated with an iSyntax file typically contains some form of sample identification numbers in a format specific to each clinical laboratory, and can be processed, e.g. using optical character recognition algorithms, to extract this information. The macro image, representing a low-resolution view of the glass slide acquired before scanning, may be utilized e.g. for coarse segmentation tasks or for troubleshooting purposes.

Reading regions of interest
The read_region method extracts pixel data from a rectangular region in the WSI at a desired resolution. Typically, downstream applications require only a limited number of pixels at any given time, which calls for efficient random access across different locations and resolution levels. Unlike most WSI formats, iSyntax achieves this using a recursive Discrete Wavelet Transform to avoid storing downsampled copies of the image. The extracted pixel arrays can be used e.g. as input to deep neural networks.
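To illustrate how a recursive wavelet decomposition can provide several resolution levels without storing downsampled copies, the sketch below applies a 2D Haar transform to a small grayscale array. The Haar wavelet is chosen here purely for simplicity and is an assumption of this example; the wavelet and coefficient layout actually used by iSyntax are not described in this text. All function names are hypothetical.

```python
def haar_step(img):
    """One 2D Haar level on an even-sized grid: returns (approximation, details)."""
    approx, details = [], []
    for y in range(0, len(img), 2):
        approx_row, detail_row = [], []
        for x in range(0, len(img[0]), 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            approx_row.append((a + b + c + d) / 4)  # 2x2 block average
            detail_row.append((
                (a - b + c - d) / 4,  # horizontal detail
                (a + b - c - d) / 4,  # vertical detail
                (a - b - c + d) / 4,  # diagonal detail
            ))
        approx.append(approx_row)
        details.append(detail_row)
    return approx, details


def haar_inverse(approx, details):
    """Undo one Haar level, doubling the resolution."""
    out = []
    for approx_row, detail_row in zip(approx, details):
        top, bottom = [], []
        for s, (lh, hl, hh) in zip(approx_row, detail_row):
            top += [s + lh + hl + hh, s - lh + hl - hh]
            bottom += [s + lh - hl - hh, s - lh - hl + hh]
        out += [top, bottom]
    return out


def decompose(img, levels):
    """Recursively transform the approximation band; details are kept fine to coarse."""
    pyramid = []
    for _ in range(levels):
        img, detail = haar_step(img)
        pyramid.append(detail)
    return img, pyramid


def reconstruct(coarse, pyramid, stop_level=0):
    """Rebuild the image at stop_level (0 is full resolution)."""
    img = coarse
    for detail in reversed(pyramid[stop_level:]):
        img = haar_inverse(img, detail)
    return img
```

Only the coarsest approximation and the detail coefficients are stored. An intermediate resolution is obtained by stopping the inverse transform early, so reading a low-resolution view never touches the finest detail coefficients.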

Reading a whole slide image
The get_thumbnail and read_wsi methods read the entire WSI at desired maximum dimensions or a specified resolution, respectively. This is typically performed to obtain a more detailed view of the entire scanned area than the macro image. Typical use cases include segmentation of tissue regions and multi-step analysis with progressively increased resolution.
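Reading a whole slide within maximum dimensions involves two small calculations: the output size that preserves the aspect ratio, and the resolution level from which to read. The sketch below shows one plausible way to compute both. It is an assumption about how such a method can work, not a description of OpenPhi's implementation, and the function names are hypothetical.

```python
def fit_within(dimensions, max_size):
    """Largest size with the same aspect ratio that fits in max_size, never upscaling."""
    width, height = dimensions
    max_width, max_height = max_size
    scale = min(1.0, max_width / width, max_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def best_level_for_downsample(level_downsamples, downsample):
    """Index of the coarsest level whose downsample does not exceed the request."""
    best = 0
    for index, level_downsample in enumerate(level_downsamples):
        if level_downsample <= downsample:
            best = index
    return best


def plan_thumbnail(level_dimensions, level_downsamples, max_size):
    """Return (level to read, final thumbnail size) for a whole-slide overview."""
    target = fit_within(level_dimensions[0], max_size)
    # The required reduction is set by the tighter of the two axes.
    needed = max(level_dimensions[0][0] / target[0],
                 level_dimensions[0][1] / target[1])
    return best_level_for_downsample(level_downsamples, needed), target
```

Choosing a level that is at least as detailed as the target means the final resize only ever shrinks the image, which avoids interpolating pixels that were never read.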

Extracting metadata
DICOM-compliant metadata attributes are extracted as DICOM tags. All other metadata is provided in the OpenSlide generic properties format. Metadata properties include e.g. dimensions and pixel sizes of each resolution level, imaging parameters and device identifiers. The DICOM metadata tags can for example be utilized by developers building downstream applications capable of generating DICOM-compliant output WSIs.
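As an illustration of consuming metadata in the OpenSlide generic properties format, the sketch below builds a per-level table of dimensions and pixel sizes from a properties mapping. The key names are those documented by OpenSlide (openslide.level-count, openslide.level[N].width and so on, with string values). Whether OpenPhi populates every one of these keys is an assumption of this example, and level_table is a hypothetical helper.

```python
def level_table(properties):
    """Summarize each resolution level from OpenSlide-style string properties."""
    count = int(properties["openslide.level-count"])
    mpp_x = float(properties["openslide.mpp-x"])  # microns per pixel at level 0
    mpp_y = float(properties["openslide.mpp-y"])
    rows = []
    for level in range(count):
        prefix = "openslide.level[{}]".format(level)
        downsample = float(properties[prefix + ".downsample"])
        rows.append({
            "level": level,
            "width": int(properties[prefix + ".width"]),
            "height": int(properties[prefix + ".height"]),
            # Pixel size grows in proportion to the downsample factor.
            "mpp": (mpp_x * downsample, mpp_y * downsample),
        })
    return rows
```

With a real reader the mapping would come from its properties attribute. For a quick check, a plain dictionary with made-up values such as "openslide.mpp-x": "0.25" is enough.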

Conclusion
We present OpenPhi, an API that provides developers streamlined access to the Philips iSyntax image format. Building on the API, developers can implement algorithms with minimal effort spent on dealing with the intricacies of the proprietary format and on adapting existing vendor-neutral code. This can help the computational pathology community utilize the growing resources of WSI data stored in iSyntax format.