Selected Publications

Publications Related to Technology Development

  • Deep Learning for the Prediction of Early On-Treatment Response in Metastatic Colorectal Cancer From Serial Medical Imaging. Lu L, Dercle L, Zhao B, and Schwartz LH. Nat Commun 12, 6654 (2021).


In current clinical practice, tumor response assessment is usually based on tumor size change on serial computerized tomography (CT) scan images. However, evaluation of tumor response to anti-vascular endothelial growth factor therapies in metastatic colorectal cancer (mCRC) is limited because morphological change in the tumor may occur earlier than tumor size change. Here we present an analysis utilizing a deep learning (DL) network to characterize tumor morphological change for response assessment in mCRC patients. We retrospectively analyzed 1,028 mCRC patients who were prospectively included in the VELOUR trial (NCT00561470). We found that the DL network was able to predict early on-treatment response in mCRC and showed better performance than its size-based counterpart, with C-Index: 0.649 (95% CI: 0.619, 0.679) vs. 0.627 (95% CI: 0.567, 0.638), p = 0.009, z-test. The integration of the DL network with size-based methodology could further improve the prediction performance to C-Index: 0.694 (95% CI: 0.661, 0.720), which was superior to size/DL-based-only models (all p < 0.001, z-test). Our study suggests that the DL network could provide a noninvasive means for quantitative and comprehensive characterization of tumor morphological change, which may potentially benefit personalized early on-treatment decision making. Full text
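The C-Index reported above measures how well a model ranks patients by risk. As a minimal sketch of Harrell's concordance index (not the authors' code), the function below counts comparable patient pairs and the fraction the risk score orders correctly. The function name, the toy data, and the simplification of skipping pairs with tied times are assumptions made for illustration only.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if censored
    risks  : model risk score (higher = expected to fail sooner)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's event or censoring time.
            # (Assumption: pairs with tied times are skipped.)
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier failure, higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort, not data from the study.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the 0.649, 0.627, and 0.694 figures are read.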

  • Automatic Liver Segmentation by Integrating Fully Convolutional Networks into Active Contour Models. Guo X, Schwartz LH, and Zhao B. Med Phys. 2019.


PURPOSE: Automatic and accurate three-dimensional (3D) segmentation of liver with severe diseases from computed tomography (CT) images is a challenging task. Fully convolutional networks (FCNs) have emerged as powerful tools for automatic semantic segmentation, with multiple potential applications in medical imaging. However, the use of a large receptive field and multiple pooling layers in the network leads to poor localization around object boundaries. The network usually makes pixel-wise prediction independently, making it difficult to respect local label consistency and enforce the smoothness of the object boundary.

METHODS: We have developed an automatic liver segmentation method based on a novel framework that integrates fully convolutional network predictions into active contour models (ACM). We use only a single network architecture to generate a pixel label map containing spatial regional information (foreground and background) as well as layered boundary information. We exploit the structured network outcome to define an external constraint force of active contour models. A unique property of the designed force is that both its strength and direction are adaptive to its position and relative distance to the object boundary. The resulting integrated active contour models have the advantages of incorporating both high-level and low-level image information simultaneously, while enforcing the smoothness of the contour. Because the external constraint force can push the evolving contour to the liver boundary and exists everywhere in the image domain, it allows us to place the initial contour far away from the liver boundary. It potentially allows us to control the evolution of the contour in order to preserve the topology of the liver.

RESULTS: We have trained and evaluated our model on 73 liver CT scans from a clinical study. The integrated ACM model yields a mean Dice coefficient (DICE) of 95.8 ± 1.4%. Without further fine-tuning of the network weights for two independent datasets, it yields a mean DICE of 96.2 ± 0.9% for the SLIVER07 training dataset and 94.3 ± 2.7% for the LiTS training dataset. In comparison with the FCN-alone model, the integrated ACM model yields improvements in surface distance and DICE values for almost all cases. Furthermore, the initialization of the active contour can be very far away from the liver boundary.

CONCLUSIONS: Experimental results for segmenting livers (with severe diseases on CT images resulting in shape and density abnormalities) have revealed that our proposed model improves segmentation results in comparison with FCN alone. Without further fine-tuning the network weights for two independent datasets, the model is capable of handling image variations from different datasets due to its inherent deformable nature. It is relatively easy to integrate more advanced (either existing or future) FCN architecture into our framework to further improve the segmentation performance.  Full text
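For readers unfamiliar with the DICE values quoted in this abstract, the following is a minimal sketch of the Dice coefficient between two binary segmentations. Representing each mask as a set of voxel coordinates, and the function and variable names, are assumptions for illustration; this is not the evaluation code used in the paper.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|), in the range [0, 1].

    Each mask is an iterable of voxel coordinates, e.g. (z, y, x) tuples.
    """
    a = set(mask_a)
    b = set(mask_b)
    if not a and not b:
        return 1.0  # two empty masks are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical toy masks: 4 and 3 voxels, sharing 2 voxels.
predicted = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
reference = {(0, 0, 1), (0, 1, 1), (1, 0, 0)}
toy_dice = dice_coefficient(predicted, reference)  # 2*2 / (4+3) = 4/7
```

Multiplying the result by 100 gives the percentage form used in the results above.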

  • Implementation Strategy of a CNN Model Affects the Performance of CT Assessment of EGFR Mutation Status in Lung Cancer Patients. Xiong J, Li X, Lu L, Schwartz LH, Fu X, Zhao J, Zhao B. IEEE Access. 2019. 7:64583-91.


This study compares CNN models implemented using different strategies in the CT assessment of EGFR mutation status in patients with lung adenocarcinoma. A total of 1,010 consecutive lung adenocarcinoma patients with known EGFR mutation status were randomly divided into a training set (n = 810) and a testing set (n = 200). The CNN models were constructed based on the ResNet-101 architecture but implemented using different strategies: dimension filters (2D/3D), input sizes (small/middle/large and their fusion), slicing methods (transverse plane only and arbitrary multi-view planes), and training approaches (from scratch and fine-tuning a pre-trained CNN). The performance of the CNN models was compared using AUC. The fusion approach yielded consistently better performance than other input sizes, although the effect often did not reach statistical significance. Multi-view slicing was significantly superior to the transverse method when fine-tuning a pre-trained 2D CNN but not a CNN trained from scratch. The 3D CNN was significantly better than the 2D transverse plane method but only marginally better than the multi-view slicing method when trained from scratch. The highest performance (AUC = 0.838) was achieved for the fine-tuned 2D CNN model when built using the fusion input size and multi-view slicing method. The assessment of EGFR mutation status in patients is more accurate when CNN models use more spatial information and are fine-tuned by transfer learning. Our findings on the implementation strategy of a CNN model could serve as a guide for other 3D medical imaging applications. Compared with other published studies that used medical images to identify EGFR mutation status, our CNN model achieved the best performance in the biggest patient cohort. Full text

  • Automated Identification of Optimal Portal Venous Phase Timing with Convolutional Neural Networks. Ma J, Dercle L, Lichtenstein P, Wang D, Chen A, Zhu J, Yang H, Piessevaux H, Zhao J, Schwartz LH, Lu L, Zhao B. Acad Radiol 2019 May 28. pii: S1076-6332(19)30171-0.


Objectives: To develop a deep learning-based algorithm to automatically identify optimal portal venous phase timing (PVP-timing) so that image analysis techniques can be accurately performed on post contrast studies.

Methods: 681 CT-scans (training: 479 CT-scans; validation: 202 CT-scans) from a multicenter clinical trial in patients with liver metastases from colorectal cancer were retrospectively analyzed for algorithm development and validation. An additional external validation was performed on a cohort of 228 CT-scans from gastroenteropancreatic neuroendocrine cancer patients. Image acquisition was performed according to each center’s standard CT protocol for single portal venous phase, portal venous acquisition. The reference gold standard for the classification of PVP-timing as either optimal or nonoptimal was based on experienced radiologists' consensus opinion. The algorithm performed automated localization (on axial slices) of the portal vein and aorta, upon which a novel dual-input Convolutional Neural Network calculated a probability of optimal PVP-timing.

Results: The algorithm automatically computed a PVP-timing score in 3 seconds and reached area under the curve of 0.837 (95% CI: 0.765, 0.890) in validation set and 0.844 (95% CI: 0.786, 0.889) in external validation set.

Conclusion: A fully automated, deep-learning derived PVP-timing algorithm was developed to classify scans’ contrast-enhancement timing and identify scans with optimal PVP-timing. The rapid identification of such scans will aid in the analysis of quantitative (radiomics) features used to characterize tumors and changes in enhancement with treatment in a multitude of settings including quantitative response criteria such as Choi and MASS which rely on reproducible measurement of enhancement.  Full text

  • Lymph node segmentation by dynamic programming and active contours. Tan Y, Lu L, Bonde A, Wang D, Qi J, Schwartz LH, and Zhao B. Medical Physics 2018; 45(5):2054-2062.


Purpose: Enlarged lymph nodes are indicators of cancer staging, and the change in their size is a reflection of treatment response. Automatic lymph node segmentation is challenging, as the boundary can be unclear and the surrounding structures complex. This work communicates a new three‐dimensional algorithm for the segmentation of enlarged lymph nodes.

Methods: The algorithm requires a user to draw a region of interest (ROI) enclosing the lymph node. Rays are cast from the center of the ROI, and the intersections of the rays and the boundary of the lymph node form a triangle mesh. The intersection points are determined by dynamic programming. The triangle mesh initializes an active contour, which evolves to a low‐energy boundary. Three radiologists independently delineated the contours of 54 lesions from 48 patients. The Dice coefficient was used to evaluate the algorithm's performance.

Results: The mean Dice coefficient between computer and the majority vote results was 83.2%. The mean Dice coefficients between the three radiologists’ manual segmentations were 84.6%, 86.2%, and 88.3%.

Conclusions: The performance of this segmentation algorithm suggests its potential clinical value for quantifying enlarged lymph nodes.  Full text
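The abstract states that boundary points along the cast rays are chosen by dynamic programming but does not give the formulation. The sketch below shows one common way such a step can be set up, under stated assumptions: a simplified 2D case, an open chain of rays rather than a closed surface, a precomputed cost per candidate point (for example, low cost at strong edges), and a linear smoothness penalty between neighboring rays. The function name, cost table, and penalty are hypothetical and are not taken from the paper.

```python
def trace_boundary(cost, smoothness=1.0):
    """Pick one radial sample per ray with a Viterbi-style recursion.

    cost[r][k] is the (assumed) cost of placing the boundary on ray r at
    radial sample k. The returned path minimizes the sum of per-ray costs
    plus smoothness * |k_r - k_(r-1)| between consecutive rays.
    """
    n_rays = len(cost)
    n_samples = len(cost[0])
    best = list(cost[0])      # best total cost ending at each sample
    back_pointers = []        # best predecessor per ray and sample

    for r in range(1, n_rays):
        new_best = []
        pointers = []
        for k in range(n_samples):
            # Cheapest predecessor on the previous ray, including the jump.
            prev = min(range(n_samples),
                       key=lambda p: best[p] + smoothness * abs(k - p))
            new_best.append(cost[r][k] + best[prev]
                            + smoothness * abs(k - prev))
            pointers.append(prev)
        best = new_best
        back_pointers.append(pointers)

    # Backtrack from the cheapest end state to recover the full path.
    k = min(range(n_samples), key=lambda p: best[p])
    path = [k]
    for pointers in reversed(back_pointers):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path


# Hypothetical cost table: 3 rays, 4 radial samples each.
toy_cost = [
    [9, 1, 9, 9],
    [9, 9, 1, 9],
    [0, 9, 2, 9],
]
smooth_path = trace_boundary(toy_cost, smoothness=2.0)
greedy_path = trace_boundary(toy_cost, smoothness=0.0)
```

With the penalty switched off, each ray simply takes its cheapest sample; with the penalty on, the last ray gives up its cheapest sample to avoid a large radial jump, which is the behavior that keeps the traced boundary coherent from ray to ray.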

  • A Response Assessment Platform for Development and Validation of Imaging Biomarkers in Oncology. Yang H, Schwartz LH, and Zhao B. 2016; 2(4):406-410. 


Quantitative imaging biomarkers are increasingly used in both oncology clinical trials and clinical practice to aid evaluation of tumor response to novel therapies. To obtain these biomarkers, and to ensure smooth clinical adoption once they have been validated, it is critical to develop reliable computer-aided methods and a workflow-efficient imaging platform for integration in research and clinical settings. Here, we present a volumetric response assessment system developed based on an open-source image-viewing platform (WEASIS). Our response assessment system is designed using the Model–View–Controller concept, and it offers standard image-viewing and -manipulation functions, efficient tumor segmentation and quantification algorithms, and a reliable database containing tumor segmentation and measurement results. This prototype system is currently used in our research laboratory to foster the development and validation of new quantitative imaging biomarkers, including the volumetric computed tomography technique as a more accurate and earlier assessment method of solid tumor response to targeted therapy and immunotherapy. Full text

  • Semi-automatic segmentation of liver metastases on volumetric CT images. Yan J, Schwartz LH, and Zhao B. Med Phys. 2015 Nov;42(11):6283. (Article chosen as 2015 Editor’s Picks) 


PURPOSE: Accurate segmentation and quantification of liver metastases on CT images are critical to surgery/radiation treatment planning and therapy response assessment. To date, there are no reliable methods to perform such segmentation automatically. In this work, the authors present a method for semiautomatic delineation of liver metastases on contrast-enhanced volumetric CT images.

METHODS: The first step is to manually place a seed region-of-interest (ROI) in the lesion on an image. This ROI will (1) serve as an internal marker and (2) assist in automatically identifying an external marker. With these two markers, lesion contour on the image can be accurately delineated using traditional watershed transformation. Density information will then be extracted from the segmented 2D lesion and help determine the 3D connected object that is a candidate of the lesion volume. The authors have developed a robust strategy to automatically determine internal and external markers for marker-controlled watershed segmentation. By manually placing a seed region-of-interest in the lesion to be delineated on a reference image, the method can automatically determine dual threshold values to approximately separate the lesion from its surrounding structures and refine the thresholds from the segmented lesion for the accurate segmentation of the lesion volume. This method was applied to 69 liver metastases (1.1-10.3 cm in diameter) from a total of 15 patients. An independent radiologist manually delineated all lesions and the resultant lesion volumes served as the "gold standard" for validation of the method's accuracy.

RESULTS: The algorithm achieved a median overlap, overestimation ratio, and underestimation ratio of 82.3%, 6.0%, and 11.5%, respectively, and a median average boundary distance of 1.2 mm.

CONCLUSIONS: Preliminary results have shown that volumes of liver metastases on contrast-enhanced CT images can be accurately estimated by a semiautomatic segmentation method. Full text

  • Segmentation of Lung Tumors on CT Scans using Watershed and Active Contours. Tan Y, Schwartz LH, and Zhao B. Med Phys. 2013; 40(4):043502. 


PURPOSE: Lung lesions vary considerably in size, density, and shape, and can attach to surrounding anatomic structures such as chest wall or mediastinum. Automatic segmentation of the lesions poses a challenge. This work communicates a new three-dimensional algorithm for the segmentation of a wide variety of lesions, ranging from tumors found in patients with advanced lung cancer to small nodules detected in lung cancer screening programs.

METHODS: The authors' algorithm uniquely combines the image processing techniques of marker-controlled watershed, geometric active contours as well as Markov random field (MRF). The user of the algorithm manually selects a region of interest encompassing the lesion on a single slice and then the watershed method generates an initial surface of the lesion in three dimensions, which is refined by the active geometric contours. MRF improves the segmentation of ground glass opacity portions of part-solid lesions. The algorithm was tested on an anthropomorphic thorax phantom dataset and two publicly accessible clinical lung datasets. These clinical studies included a same-day repeat CT (prewalk and postwalk scans were performed within 15 min) dataset containing 32 lung lesions with one radiologist's delineated contours, and the first release of the Lung Image Database Consortium (LIDC) dataset containing 23 lung nodules with 6 radiologists' delineated contours. The phantom dataset contained 22 phantom nodules of known volumes that were inserted in a phantom thorax.

RESULTS: For the prewalk scans of the same-day repeat CT dataset and the LIDC dataset, the mean overlap ratios of lesion volumes generated by the computer algorithm and the radiologist(s) were 69% and 65%, respectively. For the two repeat CT scans, the intra-class correlation coefficient (ICC) was 0.998, indicating high reliability of the algorithm. The mean relative difference was -3% for the phantom dataset.

CONCLUSIONS: The performance of this new segmentation algorithm in delineating tumor contour and measuring tumor size illustrates its potential clinical value for assisting in noninvasive diagnosis of pulmonary nodules, therapy response assessment, and radiation treatment planning. Full text

  • Malignant lesion segmentation in contrast-enhanced breast MR images based on the marker-controlled watershed. Cui Y, Tan Y, Zhao B, Liberman L, Parbhu R, Kaplan J, Theodoulou M, Hudis C, Schwartz L. Medical Physics 2009; 36:4359-4369.


Breast tumor volume measured on MRI has been used to assess response to neoadjuvant chemotherapy. However, accurate and reproducible delineation of breast lesions can be challenging, since the lesions may have complicated topological structures and heterogeneous intensity distributions. In this article, the authors present an advanced computerized method to semiautomatically segment tumor volumes on T1-weighted, contrast-enhanced breast MRI. The method starts with manual selection of a region of interest (ROI) that contains the lesion to be segmented in a single image, followed by automated separation of the lesion volume from its surrounding breast parenchyma by using a unique combination of the image processing techniques including Gaussian mixture modeling and a marker-controlled watershed transform. Explicitly, the Gaussian mixture modeling is applied to an intensity histogram of the pixels inside the ROI to distinguish the tumor class from other tissues. Based on the ROI and the intensity distribution of the tumor, internal and external markers are determined and the tumor contour is delineated using the marker-controlled watershed transform. To obtain the tumor volume, the segmented tumor in one slice is propagated to the adjacent slice to form an ROI in that slice. The marker-controlled watershed segmentation is then used again to obtain a tumor contour in the propagated slice. This procedure is terminated when there is no lesion in an adjacent slice. To reduce measurement variations possibly caused by the manual selection of the ROI, the segmentation result is refined based on an automatically determined ROI based on the segmented volume. The algorithm was applied to 13 patients with breast cancer, prospectively accrued prior to beginning neoadjuvant chemotherapy. Each patient had two MRI scans, a baseline MRI examination prior to commencing neoadjuvant chemotherapy and a 1 week follow-up after receiving the first dose of neoadjuvant chemotherapy. 
Blinded to the computer segmentation results, two experienced radiologists manually delineated all tumors independently. The computer results were then compared with the manually generated results using the volume overlap ratio, defined as the intersection of the computer- and radiologist-generated tumor volumes divided by the union of the two. The algorithm reached overall overlap ratios of 62.6% +/- 9.1% and 61.0% +/- 11.3% in comparison to the two manual segmentation results, respectively. The overall overlap ratio between the two radiologists' manual segmentations was 64.3% +/- 10.4%. Preliminary results suggest that the proposed algorithm is a promising method for assisting in tumor volume measurement in contrast-enhanced breast MRI. Full text
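The volume overlap ratio defined in this abstract (intersection of the two tumor volumes divided by their union) can be written in a few lines. This is a minimal sketch, assuming each segmentation is given as a set of voxel identifiers; the function name and the toy data are illustrative and not from the paper.

```python
def volume_overlap_ratio(computer_voxels, radiologist_voxels):
    """Overlap ratio = |A ∩ B| / |A ∪ B| for two voxel sets."""
    a = set(computer_voxels)
    b = set(radiologist_voxels)
    union = a | b
    if not union:
        raise ValueError("both segmentations are empty")
    return len(a & b) / len(union)


# Hypothetical toy example: 2 shared voxels out of 5 distinct voxels.
computer = {1, 2, 3, 4}
radiologist = {3, 4, 5}
toy_overlap = volume_overlap_ratio(computer, radiologist)  # 2 / 5
```

Because the union sits in the denominator, this ratio is stricter than the Dice coefficient for the same pair of segmentations, which is worth keeping in mind when comparing the percentages here with Dice values reported elsewhere.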

  • Semi-automated Segmentation of Multimodal Brain Tumor Using Active Contours. Guo X, Schwartz LH and Zhao B. Proceedings of MICCAI 2013, BRATS: 17-30. 

Publications Related to Reproducibility and Proof-of-Concept Clinical Studies 

  • A Novel Imaging Analysis Provides An Early Readout on the Overall Survival of Patients with a Diagnosis of Melanoma Treated with Immunotherapy. Dercle L, Zhao B, Goenen M, et al. JAMA Oncol. Published online January 20, 2022.


IMPORTANCE: Existing criteria to estimate the benefit of a therapy in patients with cancer rely almost exclusively on tumor size, an approach that was not designed to estimate survival benefit and is challenged by the unique properties of immunotherapy. More accurate prediction of survival by treatment could enhance treatment decisions.

OBJECTIVE: To validate, using radiomics and machine learning, the performance of a signature of quantitative computed tomography (CT) imaging features for estimating overall survival (OS) in patients with advanced melanoma treated with immunotherapy.

DESIGN, SETTING, AND PARTICIPANTS: This prognostic study used radiomics and machine learning to retrospectively analyze CT images obtained at baseline and first follow-up and their associated clinical metadata. Data were prospectively collected in the KEYNOTE-002 (Study of Pembrolizumab [MK-3475] Versus Chemotherapy in Participants With Advanced Melanoma; 2017 analysis) and KEYNOTE-006 (Study to Evaluate the Safety and Efficacy of Two Different Dosing Schedules of Pembrolizumab [MK-3475] Compared to Ipilimumab in Participants With Advanced Melanoma; 2016 analysis) multicenter clinical trials. Participants included 575 patients with a diagnosis of advanced melanoma who were randomly assigned to training and validation sets. Data for the present study were collected from November 20, 2012, to June 3, 2019, and analyzed from July 1, 2019, to September 15, 2021.

INTERVENTIONS: KEYNOTE-002 featured trial groups testing intravenous pembrolizumab, 2 mg/kg or 10 mg/kg every 2 or every 3 weeks based on randomization, or investigator-choice chemotherapy; KEYNOTE-006 featured trial groups testing intravenous ipilimumab, 3 mg/kg every 3 weeks and intravenous pembrolizumab, 10 mg/kg every 2 or 3 weeks based on randomization.

MAIN OUTCOMES AND MEASURES: The performance of the signature CT imaging features for estimating OS at the month 6 posttreatment landmark in patients who received pembrolizumab was measured using an area under the time-dependent receiver operating characteristics curve (AUC).

RESULTS: A random forest model combined 25 imaging features extracted from tumors segmented on CT images to identify the combination (signature) that best estimated OS with pembrolizumab in 575 patients. The signature combined 4 imaging features, 2 related to tumor size and 2 reflecting changes in tumor imaging phenotype. In the validation set (287 patients treated with pembrolizumab), the signature reached an AUC for estimation of OS status of 0.92 (95% CI, 0.89-0.95). The standard method, Response Evaluation Criteria in Solid Tumors 1.1, achieved an AUC of 0.80 (95% CI, 0.75-0.84) and classified tumor outcomes as partial or complete response (93 of 287 [32.4%]), stable disease (90 of 287 [31.3%]), or progressive disease (104 of 287 [36.2%]).

CONCLUSIONS AND RELEVANCE: The findings of this prognostic study suggest that the radiomic signature discerned from conventional CT images at baseline and on first follow-up may be used in clinical settings to provide an accurate early readout of future OS probability in patients with melanoma treated with single-agent programmed cell death 1 blockade.

  • Understanding Sources of Variation to Improve the Reproducibility of Radiomics. Zhao B. Front. Oncol. 29 March 2021. doi: 10.3389/fonc.2021.633176.


Radiomics is the method of choice for investigating the association between cancer imaging phenotype, cancer genotype and clinical outcome prediction in the era of precision medicine. The fast dispersal of this new methodology has benefited from the existing advances of the core technologies involved in the radiomics workflow: image acquisition, tumor segmentation, feature extraction and machine learning. However, despite the rapidly increasing body of publications, there is no real clinical use of a developed radiomics signature so far. The reasons are multifaceted. One of the major challenges is the lack of reproducibility and generalizability of the reported radiomics signatures (features and models). Sources of variation exist in each step of the workflow; some are controllable or can be controlled to certain degrees, while others are uncontrollable or even unknown. Insufficient transparency in reporting radiomics studies further prevents translation of the developed radiomics signatures from the bench to the bedside. This review article first addresses sources of variation, which are illustrated using demonstrative examples. Then, it reviews a number of published studies and the progress made to date in the investigation and improvement of feature reproducibility and model performance. Lastly, it discusses potential strategies and practical considerations to reduce feature variability and improve the quality of radiomics studies. This review focuses on CT image acquisition, tumor segmentation, quantitative feature extraction, and the disease of lung cancer. Full text

  • Uncontrolled confounders may lead to false or overvalued radiomics signature: A proof of concept using survival analysis in a multicenter cohort of kidney cancer. Lu L, Ahmed FS, Akin O, Lyndon L, Guo X, Yang H, Yoon J, Hakimi A, Schwartz LH, and Zhao B. Front. Oncol. 06 April 2021.


Purpose: We aimed to explore potential confounders of prognostic radiomics signature predicting survival outcomes in clear cell renal cell carcinoma (ccRCC) patients and demonstrate how to control for them.

Materials and Methods: Preoperative contrast enhanced abdominal CT scan of ccRCC patients along with pathological grade/stage, gene mutation status, and survival outcomes were retrieved from The Cancer Imaging Archive (TCIA)/The Cancer Genome Atlas—Kidney Renal Clear Cell Carcinoma (TCGA-KIRC) database, a publicly available dataset. A semi-automatic segmentation method was applied to segment ccRCC tumors, and 1,160 radiomics features were extracted from each segmented tumor on the CT images. Non-parametric principal component decomposition (PCD) and unsupervised hierarchical clustering were applied to build the radiomics signature models. The factors confounding the radiomics signature were investigated and controlled sequentially. Kaplan–Meier curves and Cox regression analyses were performed to test the association between radiomics signatures and survival outcomes.

Results: 183 patients of TCGA-KIRC cohort with available imaging, pathological, and clinical outcomes were included in this study. All 1,160 radiomics features were included in the first radiomics signature. Three additional radiomics signatures were then modelled in successive steps removing redundant radiomics features first, removing radiomics features biased by CT slice thickness second, and removing radiomics features dependent on tumor size third. The final radiomics signature model was the most parsimonious, unbiased by CT slice thickness, and independent of tumor size. This final radiomics signature stratified the cohort into radiomics phenotypes that are different by cancer-specific and recurrence-free survival; HR (95% CI) = 3.0 (1.5–5.7), p <0.05 and HR (95% CI) = 6.6 (3.1–14.1), p <0.05, respectively.

Conclusion: Radiomics signature can be confounded by multiple factors, including feature redundancy, image acquisition parameters like slice thickness, and tumor size. Attention to and proper control for these potential confounders are necessary for a reliable and clinically valuable radiomics signature. Full text

  • Identifying robust radiomics features for lung cancer by using in-vivo and phantom lung lesions. Lu L, Sun SH, Afran A, Yang H, Lu ZF, So J, Schwartz LH, Zhao B. Tomography 2021, 7(1), 55-64.


We propose a novel framework for determining radiomics feature robustness by considering the effects of both biological and noise signals. This framework is preliminarily tested in a study predicting the epidermal growth factor receptor (EGFR) mutation status in non-small cell lung cancer (NSCLC) patients. Pairs of CT images (baseline, 3-week post therapy) of 46 NSCLC patients with known EGFR mutation status were collected, and an FDA-customized anthropomorphic thoracic phantom was scanned on two vendors’ scanners at four different tube currents. Delta radiomics features were extracted from the NSCLC patient CTs, and reproducible, non-redundant, and informative features were identified. The feature value differences between EGFR mutant and EGFR wildtype patients were quantitatively measured as the biological signal. Similarly, radiomics features were extracted from the phantom CTs. A pairwise comparison between settings resulted in a feature value difference that was quantitatively measured as the noise signal. Biological signals were compared to noise signals at each setting by two-sample t-test; a feature was considered robust when the two distributions were significantly different. Four optimal features were selected to predict EGFR mutation status: Tumor-Mass, Sigmoid-Offset-Mean, Gabor-Energy and DWT-Energy, which quantified tumor mass, tumor-parenchyma density transition at boundary, line-like pattern inside tumor and intratumoral heterogeneity, respectively. The first three variables showed robustness across the majority of studied CT acquisition parameters. The textural feature DWT-Energy was less robust. The proposed framework was able to determine robustness of radiomics features at specific settings by comparing biological signal to noise signal. Identification of robust radiomics features may improve the generalizability of radiomics models in future studies. Full text

  • Towards radiomics for assessment of response to systemic therapies in lung cancer. Sun S, Besson FL, Zhao B, Schwartz LH, and Dercle L. Oncotarget 2020; 11 (51), 4677-4680.


This editorial comment explains recent developments in radiomics regarding the use of quantitative imaging biomarkers to predict lung cancer sensitivity to a variety of cancer therapies. Tumor response assessment has been a crucial component guiding cancer treatment. Evaluation of treatment response was standardized and classically based on measuring changes in tumor lesion size. Recent breakthroughs in artificial intelligence pave the way for the use of radiomics in tumor response assessment. Such objective techniques would bring a remarkable transformation to conventional methods, which can be inherently subjective. Successful implementation of these technologies would allow for faster and more accurate predictions of treatment efficacy, which will be critical to the advancement of personalized medicine. Full text

  • Radiomics Prediction of EGFR Status in Lung Cancer—Our Experience in Using Multiple Feature Extractors and The Cancer Imaging Archive Data. Lu L, Sun SH, Yang H, E L, Schwartz LH, Zhao B. Tomography. 2020 Jun; 6(2): 223–230.


We investigated the performance of multiple radiomics feature extractors/software on predicting epidermal growth factor receptor mutation status in 228 patients with non–small cell lung cancer from publicly available data sets in The Cancer Imaging Archive. The imaging and clinical data were split into training (n = 105) and validation cohorts (n = 123). Two of the most cited open-source feature extractors, IBEX (1563 features) and Pyradiomics (1319 features), and our in-house software, Columbia Image Feature Extractor (CIFE) (1160 features), were used to extract radiomics features. Univariate and multivariate analyses were performed sequentially to predict EGFR mutation status using each individual feature extractor. Our univariate analysis integrated an unsupervised clustering method to identify nonredundant and informative candidate features for the creation of prediction models by multivariate analyses. In training, unsupervised clustering-based univariate analysis identified 5, 6, and 4 features from IBEX, Pyradiomics, and CIFE as candidate features, respectively. Multivariate prediction models using these features from IBEX, Pyradiomics, and CIFE yielded similar areas under the receiver operating characteristic curve of 0.68, 0.67, and 0.69. However, in validation, areas under the receiver operating characteristic curve of multivariate prediction models from IBEX, Pyradiomics, and CIFE decreased to 0.54, 0.56 and 0.64, respectively. Different feature extractors select different radiomics features, which leads to prediction models with varying performance. However, correlation between those selected features from different extractors may indicate these features measure similar imaging phenotypes associated with similar biological characteristics. Overall, attention should be paid to the generalizability of individual radiomics features and radiomics prediction models. Full text

  • Differentiation of Focal-Type Autoimmune Pancreatitis From Pancreatic Ductal Adenocarcinoma Using Radiomics Based on Multiphasic Computed Tomography. E L, Xu Y, Wu Z, Li L, Zhang N, Yang H, Schwartz LH, Lu L, Zhao B. J Comput Assist Tomogr. 2020;44(4):511-518.


Objectives: The aim of this study was to develop a radiomics model for a differential diagnosis of focal-type autoimmune pancreatitis (AIP) from pancreatic ductal adenocarcinoma.

Methods: A total of 96 patients, 45 with AIP and 51 with pancreatic ductal adenocarcinoma, were retrospectively evaluated. All patients underwent pretreatment abdominal computed tomography imaging acquired at noncontrast, arterial, and venous phases. Furthermore, 1160 radiomics features were extracted from each phasic image to build radiomics models. The performance of radiomics model was evaluated by sensitivity, specificity, and accuracy. The results of radiomics model were also compared with those of radiologists' visual assessments.

Results: The sensitivity, specificity, and accuracy of the optimal radiomics model were 93.3%, 96.1%, and 94.8%, respectively. They were higher than those of the radiologists' assessments with sensitivity of 57.78% and 73.33%, specificity of 88.24% and 90.20%, and accuracy of 75.00% and 81.25%, respectively.

Conclusion: Radiomics is helpful for a differential diagnosis of AIP in clinical practice as a noninvasive and quantitative method. 
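
As a worked check of the radiomics model results above, sensitivity, specificity, and accuracy follow directly from the confusion-matrix counts. The sketch below treats AIP as the positive class and assumes 42 of 45 AIP cases and 49 of 51 pancreatic ductal adenocarcinoma cases were classified correctly. These counts are back-calculated from the reported percentages and are not stated in the abstract. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positives among all positives
    specificity = tn / (tn + fp)          # true negatives among all negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (inferred, not reported): 42/45 AIP and 49/51 PDAC correct.
sens, spec, acc = diagnostic_metrics(tp=42, fn=3, tn=49, fp=2)
print(f"{sens:.1%} {spec:.1%} {acc:.1%}")  # prints "93.3% 96.1% 94.8%"
```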

  • Identification of Non-Small Cell Lung Cancer Sensitive to Systemic Cancer Therapies using Radiomics. Dercle L, Fronheiser M, Lu L, Du S, Hayes W, Leung DK, Roy A, Wilkerson J, Guo P, Fojo T, Schwartz LH, Zhao B. Clinical Cancer Research 2020.


Purpose: Using standard-of-care CT images obtained from patients with a diagnosis of non–small cell lung cancer (NSCLC), we defined radiomics signatures predicting the sensitivity of tumors to nivolumab, docetaxel, and gefitinib.

Experimental Design: Data were collected prospectively and analyzed retrospectively across multicenter clinical trials [nivolumab, n = 92, CheckMate017 (NCT01642004), CheckMate063 (NCT01721759); docetaxel, n = 50, CheckMate017; gefitinib, n = 46, (NCT00588445)]. Patients were randomized to training or validation cohorts using either a 4:1 ratio (nivolumab: 72T:20V) or a 2:1 ratio (docetaxel: 32T:18V; gefitinib: 31T:15V) to ensure an adequate sample size in the validation set. Radiomics signatures were derived from quantitative analysis of early tumor changes from baseline to first on-treatment assessment. For each patient, 1,160 radiomics features were extracted from the largest measurable lung lesion. Tumors were classified as treatment sensitive or insensitive; reference standard was median progression-free survival (NCT01642004, NCT01721759) or surgery (NCT00588445). Machine learning was implemented to select up to four features to develop a radiomics signature in the training datasets and applied to each patient in the validation datasets to classify treatment sensitivity.

Results: The radiomics signatures predicted treatment sensitivity in the validation dataset of each study group with AUC (95% confidence interval): nivolumab, 0.77 (0.55–1.00); docetaxel, 0.67 (0.37–0.96); and gefitinib, 0.82 (0.53–0.97). Using serial radiographic measurements, the magnitude of exponential increase in signature features deciphering tumor volume, invasion of tumor boundaries, or tumor spatial heterogeneity was associated with shorter overall survival.

Conclusions: Radiomics signatures predicted tumor sensitivity to treatment in patients with NSCLC, offering an approach that could enhance clinical decision-making to continue systemic therapies and forecast overall survival. Full text.

  • Radiomics Response Signature for Identification of Metastatic Colorectal Cancer Sensitive to Therapies Targeting EGFR Pathway. Dercle L, Lu L, Schwartz LH, Qian M, Tejpar S, Eggleton P, Zhao B, Piessevaux H. Journal of the National Cancer Institute 2020.


Background: To forecast survival and enhance treatment decisions for patients with colorectal cancer liver metastases (mCRC) by using on-treatment radiomics signature to predict tumor sensitiveness to FOLFIRI±cetuximab.

Methods: We retrospectively analyzed 667 mCRC patients treated with FOLFIRI alone [F] or in combination with cetuximab [FC]. CT quality was classified as high (HQ) or standard (SD). Four datasets were created using the nomenclature [treatment]-[quality]. Patients were randomly assigned (2:1) to training or validation sets: FCHQ: 78:38, FCSD: 124:62, FHQ: 78:51, FSD: 158:78. Four tumor imaging biomarkers measured quantitative radiomics changes between standard of care CT scans at baseline and 8 weeks. Using machine learning, the performance of the signature to classify tumors as treatment-sensitive or treatment-insensitive was trained and validated using ROC curves. Hazard Ratio (HR) and Cox Regression models evaluated association with overall survival (OS).

Results: The signature (AUC[95CI]) used temporal decrease in tumor spatial heterogeneity plus boundary infiltration to successfully predict sensitivity to anti-EGFR therapy (FCHQ: 0.80 [0.69-0.94], FCSD: 0.72 [0.59-0.83]) but failed with chemotherapy (FHQ: 0.59 [0.44-0.72], FSD: 0.55 [0.43-0.66]). In cetuximab-containing sets, radiomics signature outperformed existing biomarkers (KRAS-mutational status, and tumor shrinkage by RECIST 1.1) for detection of treatment-sensitivity and was strongly associated with OS (two-sided P < 0.005).

Conclusions: Radiomics response signature can serve as an intermediate surrogate marker of overall survival. The signature outperformed known biomarkers in providing an early prediction of treatment-sensitivity and could be used to guide cetuximab treatment continuation decisions. Full text

  • A Quantitative Imaging Biomarker for Predicting Disease-free-survival Associated Histologic Subgroups in Lung Adenocarcinoma. Lu L, Wang D, Wang L, E L, Guo P, Lie Z, Xiang J, Yang H, Li H, Yin S, Schwartz LH, Xie C, Zhao B. European Radiology 2020.


Objectives: Classification of histologic subgroups has significant prognostic value for lung adenocarcinoma patients who undergo surgical resection. However, clinical histopathology assessment is generally performed on only a small portion of the overall tumor from biopsy or surgery. Our objective is to identify a noninvasive quantitative imaging biomarker (QIB) for the classification of histologic subgroups in lung adenocarcinoma patients.

Methods: We retrospectively collected and reviewed 1313 CT scans of patients with resected lung adenocarcinomas from two geographically distant institutions who were seen between January 2014 and October 2017. Three study cohorts, the training, internal validation, and external validation cohorts, were created, within which lung adenocarcinomas were divided into two disease-free-survival (DFS)-associated histologic subgroups, the mid/poor and good DFS groups. A comprehensive machine learning– and deep learning–based analytical system was adopted to identify reproducible QIBs and help to understand QIBs’ significance.

Results: Intensity-Skewness, a QIB quantifying tumor density distribution, was identified as the optimal biomarker for predicting histologic subgroups. Intensity-Skewness achieved high AUCs (95% CI) of 0.849 (0.813, 0.881), 0.820 (0.781, 0.856) and 0.863 (0.827, 0.895) on the training, internal validation, and external validation cohorts, respectively. A criterion of Intensity-Skewness ≤ 1.5, which indicated high tumor density, showed high specificity of 96% (sensitivity 46%) and 99% (sensitivity 53%) on predicting the mid/poor DFS group in the training and external validation cohorts, respectively.

Conclusions: A QIB derived from routinely acquired CT was able to predict lung adenocarcinoma histologic subgroups, providing a noninvasive method that could potentially benefit personalized treatment decision-making for lung cancer patients.
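
The abstract does not give the formula behind Intensity-Skewness. As an illustration only, the sketch below assumes the standard moment-based skewness of the voxel intensities inside the tumor, so its values may not be directly comparable with the 1.5 cutoff reported above. The function name `intensity_skewness` and the voxel values are hypothetical.

```python
def intensity_skewness(values):
    """Moment-based skewness: third central moment over variance**1.5."""
    n = len(values)
    if n == 0:
        raise ValueError("no voxel intensities given")
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant intensities")
    return m3 / m2 ** 1.5


# Hypothetical voxel intensities: mostly low density with one dense voxel.
# The long right tail gives a positive skewness.
mostly_low_density = [0] * 9 + [10]
print(intensity_skewness(mostly_low_density))
```

Under this assumed definition, a tumor whose voxels are mostly dense has a shorter right tail and therefore a lower skewness, which is consistent with the abstract's reading of low Intensity-Skewness as high tumor density.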

  • Radiomics for Classifying Histological Subtypes of Lung Cancer based on Multiphasic Contrast-Enhanced Computed Tomography. E L, Lu L, Li L, Yang H, Schwartz LH, Zhao B. J Comput Assist Tomogr. 2019;43(2):300-306.


OBJECTIVES: The aim of this study was to evaluate the performance of the radiomics method in classifying lung cancer histological subtypes based on multiphasic contrast-enhanced computed tomography (CT) images.

METHODS: A total of 229 patients with pathologically confirmed lung cancer were retrospectively recruited. All recruited patients underwent nonenhanced and dual-phase chest contrast-enhanced CT; 1160 quantitative radiomics features were calculated to build a radiomics classification model. The performance of the classification models was evaluated by the receiver operating characteristic curve.

RESULTS: The areas under the curve of radiomics models in classifying adenocarcinoma and squamous cell carcinoma, adenocarcinoma and small cell lung cancer, and squamous cell carcinoma and small cell lung cancer were 0.801, 0.857, and 0.657 (nonenhanced); 0.834, 0.855, and 0.619 (arterial phase); and 0.864, 0.864, and 0.664 (venous phase), respectively. Moreover, the application of contrast-enhanced CT may affect the selection of radiomics features.

CONCLUSIONS: Our study indicates that radiomics may be a promising tool for noninvasively predicting histological subtypes of lung cancer based on multiphasic contrast-enhanced CT images. Full text

  • Radiomics machine-learning signature for diagnosis of hepatocellular carcinoma in cirrhotic patients with indeterminate liver nodules. Mokrane F, Lu L, Vavasseur A, Otal P, Peron J, Yang H, Rousseau H, Zhao B, Schwartz LH, Dercle L. European Radiology 2019.


Purpose: To enhance clinicians’ decision-making by diagnosing hepatocellular carcinoma (HCC) in cirrhotic patients with indeterminate liver nodules using quantitative imaging features extracted from triphasic CT scans.

Material and Methods: We retrospectively analyzed 178 cirrhotic patients from 27 institutions, with biopsy-proven liver nodules classified as indeterminate using the European Association for the Study of the Liver (EASL) guidelines. Patients were randomly assigned to a discovery cohort (142 patients (pts.)) and a validation cohort (36 pts.). Each liver nodule was segmented on each phase of triphasic CT scans, and 13,920 quantitative imaging features (12 sets of 1160 features each reflecting the phenotype at one single phase or its change between two phases) were extracted. Using machine-learning techniques, the signature was trained and calibrated (discovery cohort), and validated (validation cohort) to classify liver nodules as HCC vs. non-HCC. Effects of segmentation and contrast enhancement quality were also evaluated.

Results: Patients were predominantly male (88%) and CHILD A (65%). Biopsy was positive for HCC in 77% of patients. LI-RADS scores were not different between HCC and non-HCC patients. The signature included a single radiomics feature quantifying changes between arterial and portal venous phases: DeltaV-A_DWT1_LL_Variance-2D and reached area under the receiver operating characteristic curve (AUC) of 0.70 (95%CI 0.61–0.80) and 0.66 (95%CI 0.64–0.84) in discovery and validation cohorts, respectively. The signature was influenced neither by segmentation nor by contrast enhancement.

Conclusion: A signature using a single feature was validated in a multicenter retrospective cohort to diagnose HCC in cirrhotic patients with indeterminate liver nodules. Artificial intelligence could enhance clinicians’ decision-making by identifying a subgroup of patients with high HCC risk.

Key Points:
• In cirrhotic patients with visually indeterminate liver nodules, expert visual assessment using current guidelines cannot accurately differentiate HCC from differential diagnoses. Current clinical protocols do not entail biopsy due to procedural risks. Radiomics can be used to non-invasively diagnose HCC in cirrhotic patients with indeterminate liver nodules, which could be leveraged to optimize patient management.
• Radiomics features contributing the most to a better characterization of visually indeterminate liver nodules include changes in nodule phenotype between arterial and portal venous phases: the “washout” pattern appraised visually using EASL and AASLD guidelines.
• A clinical decision algorithm using radiomics could be applied to reduce the rate of cirrhotic patients requiring liver biopsy (EASL guidelines) or wait-and-see strategy (AASLD guidelines) and therefore improve their management and outcome.

  • Vol-PACT: An FNIH public-private partnership supporting sharing of clinical trial data for development of improved imaging biomarkers in oncology. Dercle L, Connors DE, Tang Y, Adam SJ, Gönen M, Hilden P, Karovic S, Maitland M, Moskowitz CS, Kelloff G, Zhao B, Oxnard GR, Schwartz LH. JCO Clin Cancer Inform. 2018 Dec; 2:1-12. PMID: 30559455 


PURPOSE: To develop a public-private partnership to study the feasibility of a new approach in collecting and analyzing clinically annotated imaging data from landmark phase III trials in advanced solid tumors.

PATIENTS AND METHODS: The collection of clinical trials fulfilled the following inclusion criteria: completed randomized trials of > 300 patients, highly measurable solid tumors (non-small-cell lung cancer, colorectal cancer, renal cell cancer, and melanoma), and required sponsor and institutional review board sign-offs. The new approach in analyzing computed tomography scans was to transfer to an academic image analysis laboratory, draw contours semi-automatically by using in-house-developed algorithms integrated into the open source imaging platform Weasis, and perform serial volumetric measurement.

RESULTS: The median duration of contracting with five sponsors was 12 months. Ten trials in 7,085 patients that covered 12 treatment regimens across 20 trial arms were collected. To date, four trials in 3,954 patients were analyzed. Source imaging data were transferred to the academic core from 97% of trial patients (n = 3,837). Tumor imaging measurements were extracted from 82% of transferred computed tomography scans (n = 3,162). Causes of extraction failure were nonmeasurable disease (n = 392), single imaging time point (n = 224), and secondary captured images (n = 59). Overall, clinically annotated imaging data were extracted in 79% of patients (n = 3,055), and the primary trial end point analysis in each trial remained representative of each original trial end point.

CONCLUSION: The sharing and analysis of source imaging data from large randomized trials is feasible and offers a rich and reusable, but largely untapped, resource for future research on novel trial-level response and progression imaging metrics. Full text

  • CT Slice Thickness and Convolution Kernel Affect Performance of a Radiomic Model for Predicting EGFR Status in Non-Small Cell Lung Cancer: A Preliminary Study. Li Y, Lu L, Xiao M, Dercle L, Huang Y, Zhang Z, Schwartz LH, Li D, and Zhao B. Sci Rep. 2018 Dec 17;8(1):17913. PubMed PMID: 30559455 


We evaluated whether the optimal selection of CT reconstruction settings enables the construction of a radiomics model to predict epidermal growth factor receptor (EGFR) mutation status in primary lung adenocarcinoma (LAC) using standard of care CT images. Fifty-one patients (EGFR:wildtype = 23:28) with LACs of clinical stage I/II/IIIA were included in the analysis. The LACs were segmented in four conditions, two slice thicknesses (Thin: 1 mm; Thick: 5 mm) and two convolution kernels (Sharp: B70f/B70s; Smooth: B30f/B31f/B31s), which constituted four groups: (1) Thin-Sharp, (2) Thin-Smooth, (3) Thick-Sharp, and (4) Thick-Smooth. Machine learning algorithms selected and combined 1,695 quantitative image features to build prediction models. The performance of prediction models was assessed by calculating the area under the curve (AUC). The best prediction model yielded AUC (95%CI) = 0.83 (0.68, 0.92) using the Thin-Smooth reconstruction setting. The AUC of models using thick slices was significantly lower than that of thin slices (P < 10⁻³), whereas the impact of reconstruction kernel was not significant. Our study showed that the optimal prediction of EGFR mutational status in early stage LACs was achieved by using thin CT-scan slices, independently of convolution kernels. Results from the prediction model suggest that tumor heterogeneity is associated with EGFR mutation. Full text

  • Reproducibility of radiomics for deciphering tumor phenotype with imaging. Zhao B, Tan Y, Tsai W-Y, Qi J, Xie C, Lu L, Schwartz LH. Sci Rep. 2016; 6:23428. doi:10.1038/srep23428.


Radiomics (radiogenomics) characterizes tumor phenotypes based on quantitative image features derived from routine radiologic imaging to improve cancer diagnosis, prognosis, prediction and response to therapy. Although radiomic features must be reproducible to qualify as biomarkers for clinical care, little is known about how routine imaging acquisition techniques/parameters affect reproducibility. To begin to fill this knowledge gap, we assessed the reproducibility of a comprehensive, commonly-used set of radiomic features using a unique, same-day repeat computed tomography data set from lung cancer patients. Each scan was reconstructed at 6 imaging settings, varying slice thicknesses (1.25 mm, 2.5 mm and 5 mm) and reconstruction algorithms (sharp, smooth). Reproducibility was assessed using the repeat scans reconstructed at identical imaging setting (6 settings in total). In separate analyses, we explored differences in radiomic features due to different imaging parameters by assessing the agreement of these radiomic features extracted from the repeat scans reconstructed at the same slice thickness but different algorithms (3 settings in total). Our data suggest that radiomic features are reproducible over a wide range of imaging settings. However, smooth and sharp reconstruction algorithms should not be used interchangeably. These findings will raise awareness of the importance of properly setting imaging acquisition parameters in radiomics/radiogenomics research. Full text

  • Variability in assessing treatment response: metastatic colorectal cancer as a paradigm. Zhao B, Lee S, Lee HJ, Tan Y, Qi J, Persigehl T, Mozley PD and Schwartz LH. Clin Cancer Res. 2014; 20(13):3560-8. (Article featured in highlights of this issue). 


Purpose: The cutoff values currently used to categorize tumor response to therapy are neither biologically based nor tailored for measurement reproducibility with contemporary imaging modalities. Sources and magnitudes of discordance in response assessment in metastatic colorectal cancer (mCRC) are unknown.

Experimental Design: A subset of patients' CT images of chest, abdomen, and pelvis were randomly chosen from a multicenter clinical trial evaluating insulin-like growth factor receptor type 1–targeted therapy in mCRC. Using Response Evaluation Criteria in Solid Tumors (RECIST), three radiologists selected target lesions and measured “uni” (maximal diameter), “bi” (product of maximal diameter and maximal perpendicular diameter), and “vol” (volume) on baseline and 6-week posttherapy scans in the following ways: (i) each radiologist independently selected and measured target lesions and (ii) one radiologist's target lesions were blindly remeasured by the others. Variability in relative change of tumor measurements was analyzed using linear mixed effects models.

Results: Three radiologists independently selected 138, 101, and 146 metastatic target lesions in the liver, lungs, lymph nodes, and other organs (e.g., peritoneal cavity) in 29 patients. Of 198 target lesions total, 33% were selected by all three, 28% by two, and 39% by one radiologist. With independent selection, the variability in relative change of tumor measurements was 11% (uni), 19% (bi), and 22% (vol), respectively. When measuring the same lesions, the corresponding numbers were 8%, 14%, and 12%.

Conclusions: The relatively low variability in change of mCRC measurements suggests that response criteria could be modified to allow more accurate and sensitive CT assessment of anticancer therapy efficacy. Full text

  • Abdominal fat is associated with lower bone formation and inferior bone quality in healthy premenopausal women: a transiliac bone biopsy study. Cohen A, Dempster DW, Recker RR, Lappe JM, Zhou H, Wirth AJ, van Lenthe GH, Zwahlen A, Müller R, Zhao B, Guo X, Lang T, Saeed I, Liu XS, Guo XE, Cremers S, Rosen CJ, Stein EM, Nickolas TL, McMahon DJ, Young P, Shane E. J Clin Endocrinol Metab. 2013; 98(6):2562-72. (Study interviewed and reported by Prevention Magazine).


CONTEXT: The conventional view that obesity is beneficial for bone strength has recently been challenged by studies that link obesity, particularly visceral obesity, to low bone mass and fractures. It is controversial whether effects of obesity on bone are mediated by increased bone resorption or decreased bone formation.

OBJECTIVE: The objective of the study was to evaluate bone microarchitecture and remodeling in healthy premenopausal women of varying weights.

DESIGN: We measured bone density and trunk fat by dual-energy x-ray absorptiometry in 40 women and by computed tomography in a subset. Bone microarchitecture, stiffness, remodeling, and marrow fat were assessed in labeled transiliac bone biopsies.

RESULTS: Body mass index (BMI) ranged from 20.1 to 39.2 kg/m². Dual-energy x-ray absorptiometry-trunk fat was directly associated with BMI (r = 0.78, P < .001) and visceral fat by computed tomography (r = 0.79, P < .001). Compared with women in the lowest tertile of trunk fat, those in the highest tertile had inferior bone quality: lower trabecular bone volume (20.4 ± 5.8 vs 29.1 ± 6.1%; P = .001) and stiffness (433 ± 264 vs 782 ± 349 MPa; P = .01) and higher cortical porosity (8.8 ± 3.5 vs 6.3 ± 2.4%; P = .049). Bone formation rate (0.004 ± 0.002 vs 0.011 ± 0.008 mm²/mm·year; P = .006) was 64% lower in the highest tertile. Trunk fat was inversely associated with trabecular bone volume (r = -0.50; P < .01) and bone formation rate (r = -0.50; P < .001). The relationship between trunk fat and bone volume remained significant after controlling for age and BMI.

CONCLUSIONS: At the tissue level, premenopausal women with more central adiposity had inferior bone quality and stiffness and markedly lower bone formation. Given the rising levels of obesity, these observations require further investigation. Full text

  • A pilot study of volume measurement as a method of tumor response evaluation to aid biomarker development. Zhao B, Oxnard GR, Moskowitz CS, Kris MG, Pao W, Guo P, Rusch VM, Ladanyi M, Rizvi NA, Schwartz LH.  Clin Cancer Res 2010; 16:4647-53. (Article highlighted in an accompanying commentary). 


PURPOSE: Tissue biomarker discovery is potentially limited by conventional tumor measurement techniques, which have an uncertain ability to accurately distinguish sensitive and resistant tumors. Semiautomated volumetric measurement of computed tomography imaging has the potential to more accurately capture tumor growth dynamics, allowing for more exact separation of sensitive and resistant tumors and a more accurate comparison of tissue characteristics.

EXPERIMENTAL DESIGN: Forty-eight patients with early stage non-small cell lung cancer and clinical characteristics of sensitivity to gefitinib were studied. High-resolution computed tomography was done at baseline and after 3 weeks of gefitinib. Tumors were then resected and molecularly profiled. Unidimensional and volumetric measurements were done using a semiautomated algorithm. Measurement changes were evaluated for their ability to differentiate tumors with and without sensitizing mutations.

RESULTS: Forty-four percent of tumors had epidermal growth factor receptor-sensitizing mutations. Receiver operating characteristic curve analysis showed that volumetric measurement had a higher area under the curve than unidimensional measurement for identifying tumors harboring sensitizing mutations (P = 0.009). Tumor volume decrease of >24.9% was the imaging criteria best able to classify tumors with and without sensitizing mutations (sensitivity, 90%; specificity, 89%).

CONCLUSIONS: Volumetric tumor measurement was better than unidimensional tumor measurement at distinguishing tumors based on presence or absence of a sensitizing mutation. Use of volume-based response assessment for the development of tissue biomarkers could reduce contamination between sensitive and resistant tumor populations, improving our ability to identify meaningful predictors of sensitivity. Full text
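
To make the volume-based criterion above concrete, the sketch below computes the percent change between two volume measurements and applies the reported cutoff of a decrease greater than 24.9%. It also converts a volume change into the equivalent diameter change under the simplifying assumption of a spherical tumor, which the abstract does not make. All function names and the example volumes are hypothetical.

```python
VOLUME_CUTOFF_PCT = -24.9  # reported criterion: volume decrease of >24.9%


def percent_change(baseline, follow_up):
    """Relative change from baseline, in percent (negative means shrinkage)."""
    if baseline <= 0:
        raise ValueError("baseline measurement must be positive")
    return 100.0 * (follow_up - baseline) / baseline


def meets_volume_criterion(baseline_mm3, follow_up_mm3):
    """True when the tumor volume decreased by more than 24.9%."""
    return percent_change(baseline_mm3, follow_up_mm3) < VOLUME_CUTOFF_PCT


def equivalent_diameter_change(volume_change_pct):
    """Diameter change (%) of a sphere whose volume changes by the given %."""
    ratio = 1.0 + volume_change_pct / 100.0
    if ratio < 0:
        raise ValueError("volume cannot decrease by more than 100%")
    return 100.0 * (ratio ** (1.0 / 3.0) - 1.0)


# Hypothetical tumor volumes in mm^3 at baseline and after 3 weeks.
print(percent_change(4000.0, 2800.0))          # -30.0
print(meets_volume_criterion(4000.0, 2800.0))  # True
print(round(equivalent_diameter_change(VOLUME_CUTOFF_PCT), 1))
```

For an idealized sphere, the 24.9% volume cutoff corresponds to a diameter decrease of roughly 9%, which illustrates how a volumetric criterion can register changes that look small on a unidimensional measurement.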

  • Evaluating variability in tumor measurements from same-day repeat CT scans in patients with non-small cell lung cancer. Zhao B, James LP, Moskowitz C, Guo P, Ginsberg MS, Lefkowitz RA, Qin Y, Riely GJ, Kris MG, Schwartz LH. Radiology 2009; 252:263-272. (RADIOLOGY Select Podcast Interview) 


PURPOSE: To evaluate the variability of tumor unidimensional, bidimensional, and volumetric measurements on same-day repeat computed tomographic (CT) scans in patients with non-small cell lung cancer.

MATERIALS AND METHODS: This HIPAA-compliant study was approved by the institutional review board, with informed patient consent. Thirty-two patients with non-small cell lung cancer, each of whom underwent two CT scans of the chest within 15 minutes by using the same imaging protocol, were included in this study. Three radiologists independently measured the two greatest diameters of each lesion on both scans and, during another session, measured the same tumors on the first scan. In a separate analysis, computer software was applied to assist in the calculation of the two greatest diameters and the volume of each lesion on both scans. Concordance correlation coefficients (CCCs) and Bland-Altman plots were used to assess the agreements between the measurements of the two repeat scans (reproducibility) and between the two repeat readings of the same scan (repeatability).

RESULTS: The reproducibility and repeatability of the three radiologists' measurements were high (all CCCs, ≥0.96). The reproducibility of the computer-aided measurements was even higher (all CCCs, 1.00). The 95% limits of agreements for the computer-aided unidimensional, bidimensional, and volumetric measurements on two repeat scans were (-7.3%, 6.2%), (-17.6%, 19.8%), and (-12.1%, 13.4%), respectively.

CONCLUSION: Chest CT scans are well reproducible. Changes in unidimensional lesion size of 8% or greater exceed the measurement variability of the computer method and can be considered significant when estimating the outcome of therapy in a patient. Full text
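
The two agreement statistics named in the methods above can be sketched in a few lines of Python. The concordance correlation coefficient follows Lin's definition with population (1/n) moments. The Bland-Altman limits are computed on the relative difference between the two scans, expressed as a percentage of their mean, and that choice of denominator is an assumption because the abstract does not specify it. Function names and measurements are hypothetical, not study data.

```python
from statistics import mean, stdev


def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient between paired measurements."""
    n = len(x)
    mx, my = mean(x), mean(y)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Penalizes both scatter (via cov) and systematic shift (via mx - my).
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)


def limits_of_agreement(x, y):
    """Bland-Altman 95% limits of agreement on the relative difference (%)."""
    rel = [100 * (b - a) / ((a + b) / 2) for a, b in zip(x, y)]
    centre, spread = mean(rel), stdev(rel)
    return centre - 1.96 * spread, centre + 1.96 * spread


# Hypothetical lesion diameters (mm) measured on two same-day repeat scans.
scan1 = [21.0, 35.5, 48.2, 60.1, 27.3]
scan2 = [20.4, 36.1, 47.5, 61.0, 27.9]

print(concordance_ccc(scan1, scan2))
print(limits_of_agreement(scan1, scan2))
```

A measured change in an individual patient that falls outside such limits is larger than what repeat scanning alone would be expected to produce, which is the reasoning behind the 8% threshold stated in the conclusion.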