1 Introduction
Many medical conditions are indicated by pathological shapes, such as abnormal skull shape (craniosynostosis), hip deformities (femoroacetabular impingement), and shoulder dislocation (scapula and humerus injury). Current medical imaging methods aim to identify abnormal anatomical configurations and guide physicians toward diagnosis and treatment plans; severe cases may warrant surgical intervention, while observation might suffice for mild deformities. But how can one objectively quantify the severity of a pathological threedimensional (3d) shape deformity? There is no easy answer, because ”normal” anatomy represents a spectrum, comprised of a variety of different shapes and sizes due to genetic and environmental factors, that does not necessarily result in pathology or compromised function.
Current clinical practice frequently relies on severity assessment by medical professionals. However, despite the best efforts of clinicians and their rigorous medical training, there is a concern for bias in individual clinicians[25],due to variables such as specific training sites or years of practice. For example, there is a wide disparity of opinion regarding the diagnosis of mild nonsyndromic metopic synostosis. [31]
. The concern of bias applies also to supervised machine learning approaches
[2], because the bias in the training data may translate to bias in the model predictions. Some types of biases are well studied (e.g., discrimination due to gender or ethnicity), but the biases in evaluating shape pathologies are more subtle and are poorly understood. Limited number of pathological samples, especially with relatively uncommon diseases, such as craniosynostosis, poses a problem of data skewness in supervised analysis.
To circumvent these issues, in this paper we explore an unsupervised approach to quantify the severity of shape pathologies. By unsupervised, we mean that our method relies solely on examples of shapes of normal anatomical variations. They are provided as point correspondences that describe normal shape variations in a population (as in point distribution models [9, 6, 23]). We don’t use any example shapes of a given pathology (such as metopic craniosynostosis), which means that our method is fully general and agnostic to any specific type of pathologies. Because it merely quantifies the distance from normal anatomical variations, we call it “Shape Normality Metric” (SNM).
We found that even without being informed by examples of specific pathologies, SNM is very useful in predicting severity of shape deformity, as we validated by retroactively comparing the SNM results with evaluations by human experts. However, compared with supervised methods, the SNM method has distinct advantages: no pathological training data required, thus immune to any biases present in such data. We foresee a practical utility of SNM in clinical decision making – not as a method to automatically issue a diagnosis, but rather by providing a new tool to the clinicians, akin to a more sophisticated ”measuring tape” (one which measures statistical shape differences).
2 Related work
Unsupervised severity quantification makes heavy use of statistical shape modeling [27]. It involves representing a set of 2D/3D shapes such that it captures the population statistics, followed by applicationspecific analysis. A popular way of shape representation is via correspondence based methods/particle distribution models [14]. Correspondences are geometrically consistent points placed manually or automatically on the set of shapes, simple to use and convenient to parametrize shape for subsequent statistics. Recent years have seen a rise of methods that automatically places a dense set of correspondences on a population of anatomical shapes [26, 11, 6]
. The usual methodology of analyzing dense correspondence representation is to express them in a lowdimensional space via principal component analysis (PCA)
[5] and use that as the shape descriptor. These methods have proved useful in many fronts of medical imaging such as orthopedics [17] and cardiology [13]. They have also been used in parametrizing skull shape [10], serving as a platform for our proposed modeling on pediatric skulls.Also related is the field of unsupervised anomaly detection. These methods usually learn the distribution of normal samples and detect the anomaly by measuring its distance/dissimilarity. One of the most common ways to detect anomalies from a given distribution is via the use of Mahalanobis distance. A robust variant of the Mahalanobis distance based on the Minimum Covariance Determinant has been proposed, tackling the sensitivity to outliers in observed normal samples
[20]. Another branch adopts deep learning related techniques while avoiding its preconditions on a large number of pathological samples. AnoGAN is proposed to identify anomalous images, composed of a Generative Adversarial Networks together with a novel anomaly scoring scheme
[24]. Further, Autoencoder based methods are utilized for the detection of lesion regions by learning data distribution of healthy brain MRIs [7]. Recently, Variational AutoEncoders have been used to check and localize suspicious parts within images [32].There have been efforts to quantify the severity of metopic craniosynostosis by parametrizing cranial shapes. Diagnosis is particularly challenging as the metopic suture normally closes before 1 year of age [29] which can limit the methods for detecting suture fusion (used for other forms of craniosynostosis) [12, 21]. Common approaches thus far involve defining a simple geometric measure on the skull and correlate that with the craniosynostosis severity, the most popular of these being the interfrontal angle (IFA) [18], which captures the degree of trigonocephaly (triangular frontal head shape). IFA can be calculated from the CT scans and has been effective in distinguishing normal and metopic head shapes [30, 18], though its utility is limited in predicting the severity among abnormal shapes. Recent work used expert supervision with an expressive shape descriptor to measure the craniosynostosis severity [2], however, concerns exist about biases in the export ratings.
3 Materials and methods
By denoting the th 3D shape as , where indicates whether the shape is normal or pathological , we have the shape observations .
Shapes are continuous surfaces and thus inconvenient for calculation. With Shapeworks [4], we obtain the particlebased representation. Specifically, each shape is described by a cloud of ordered particles serving as correspondences across different shape samples. This particle cloud is denoted as
, a flattened vector of these
correspondences’ coordinates .As shown in Figure 1
, there is 1) training, and 2) inference (testing) pipeline. In training, we used Shapeworks to obtain correspondences of present normal shapes which will be frozen during inference. Meanwhile, Shapeworks estimates density function of these correspondences
. Next for each new shape where is unknown, we run Shapeworks on jointly with previously frozen normal correspondences . This way we obtain the correspondences for shape , as the optimal landmark points under the density function of , which both faithfully encode the shape information of and are consistent with .With the shapes represented in the form of coordinate vectors, we can apply probabilistic methods to model the shape variations. To consider both seen and unseen variations with limited normal samples for training, we adopt the Probabilistic Principal Component Analysis (PPCA) [3].
3.1 Probabilistic Principle Component Analysis
Assume we have normal observations , where the th observation is a dimensional feature vector . In the generative view of PPCA, we assume that the principle normal shape variations originate from a lowdimensional vector in a compact latent space , with . Specifically,
(1) 
where obeys standard multivariate Gaussian distribution ,
is the weighting matrix controlling the linear transformation,
is the expectation of normal shape’s correspondences, and is the dimensional noise.With Maximum Likelihood Estimation, we have the latent parameters estimated from normal observations:
(2) 
are the descending eigenvalues of the empirical covariance matrix
, , is the diagonal matrix with largest , has the corresponding eigenvectors as its columns, andis an arbitrary orthogonal matrix. As
does not affect we simply set to the identity.3.2 Shape Normality Metric
3.2.1 Overall Mahalanobis Distance
To measure the abnormality of , we calculate its Mahalanobis distance as the Shape Normality Metric (SNM):
(3) 
The distance in Equation 3 consists of 2 components. First is the Mahalanobis distance in the dimensional latent space , measuring deformity in normal variations:
(4) 
The other is the normalized distance to in the dimensional null space, measuring deformity in abnormal variations undetected by the training normal shapes:
(5) 
where has columns as orthonormal null vectors to . It generalizes SNM for unseen deformity, with its superiority corroborated in our experiments.
3.2.2 Pointwise Mahalanobis Distance.
Equation 3 can also be regarded as a norm, of the new shape’s whitened deviations from mean:
(6) 
With whitened deviations in th correspondence coordinates expressed as , we use the whitened deviation’s norm to measure the shape’s local deformity:
(7) 
Equation 7 normalize the deviation on different correspondences to lie in the same variance scale so that they are comparable.
4 Results
In this section, we present experimental results on various medical conditions demonstrating that SNM provides accurate deformity scores, which are strongly indicative of general pathology deformations. For Metopic Craniosynostosis, SNM outperforms deformity ratings issued by human experts and is comparable with their aggregation, according to Area Under Curve (AUC) [16] with the known diagnosis as ground truth.
4.1 Metopic Craniosynostosis
Metopic Craniosynostosis is characterized by skull deformation due to the premature fusion of the metopic suture. Affected individuals typically exhibit trigonocephaly with the classic triad of a narrow ”keelshaped” forehead, biparietal widening, and hypotelorism.
4.1.1 Metopic Craniosynostosis dataset
We aggregated 124 head CT scans of patients between 616 months old. 30 (24%) of these patients are diagnosed with Metopic Craniosynostosis; while the remaining 94 (76%) are trauma patients with no known intracranial or calvarial abnormality (normal patients). All 124 scans were segmented as outer skull surfaces. We used 74 randomly selected normal skull shapes for training, and left 20 normal shapes, as well as all 30 pathological shapes, for testing.
To collect the ground truth deformity degrees of the 50 testing skulls, we displayed their 3d segmentation on our website and asked physicians to estimate their deformity on a 5 point Likert scale. In total, 31 physicians participated in the survey, among them 14 rated all of the 50 skulls and the other 17 rated 20 skulls randomly. We modeled the expert ratings by Latent trait theory [28, 22]
which accounts for potential raters’ bias and inconsistency and estimated the continuous latent severity of the 50 skulls using Maximum Likelihood Estimation (MLE). The AUC of diagnosis and ratings from the 14 physicians as individuals has mean of 0.8860 and standard deviation of 0.03426, with the highest one being 0.9458. AUC of diagnosis and MLE severity is 0.9850, indicating that aggregation of physicians estimations is more accurate than individuals.
4.1.2 Severity predictions
We trained SNM on 74 normal skulls and predicted deformity scores for the 50 testing skulls. To unify the skulls as inputs for SNM, we cropped the skulls above the plane defined by the nasion and two porion points. Next, they went through 2 pipelines in Figure 1. In the training pipeline, we ran Shapeworks optimization on the 74 normal skulls jointly, obtained their correspondences of 2048 particles, calculated their mean and covariance matrix using PPCA. PPCA requires specification of the latent subspace dimension and by convention we set it as the dimension where it explains 95% of the variance. In the testing pipeline, for every testing skull, we fed it into Shapeworks together with the frozen normal correspondences, got its correspondences and applied deformity analysis by calculating Mahalanobis Distance.
4.1.3 Correlation analysis
The agreement between SNM and the continuous latent severity is evaluated by Pearson Correlation Coefficient [1] and Spearman Correlation Coefficient [19]. The agreement between SNM and the diagnosis labels is evaluated by AUC.
The results are shown in Figure 2. Our SNM is [PPCATrunc@0.95], which achieves the Pearson Correlation Coefficient of 0.7501, Spearman Correlation Coefficient of 0.7881 and AUC of 0.965, showing strong agreement with the ground truth. Also, AUC is even higher than the highest of our 14 physicians, which is encouragingly showing that our SNM is a better indicator in the case of Craniosynostosis compared with deformity estimations by individual physicians.
We also compare the proposed SNM metric with several baselines. First, [PPCANull@0] is essentially calculating the Mahalanobis distance with an isotropic covariance matrix and it has relatively poor performance. This indicates that it is disadvantageous to treat all the deformation components with the same variance. It can also be shown in Figure 3, where different pointwise deviations of the same testing skull are compared using MeshLab [8]
. In the one whitened by SNM, pointwise signed distance is more symmetric and similar to root of Chisquared distribution, and abnormal variations are accentuated on the forehead (Metopic Craniosynostosis pattern). Figure
2 also shows that SNM is not very sensitive to hyperparameter settings, as 3 different explained variance ratios generate similar performance which is close to the best presented.4.2 CamFAI and Shoulder Dislocation
Camtype femoroacetabular impingement can cause hip pain and is characterized by Pincer morphology with acetabular overcoverage [15]. We collected 37 femur bones with CamFAI and 59 normal ones.
The type of shoulder dislocation we studied is characterized by aberrant positioning of the superior portion of the humerus away from its usual location in the glenohumeral joint (the ball and socket joint of the shoulder). It can result in bone deformation located primarily at the joint. We collected 53 humerus scans and 54 scapula scans from patients diagnosed with shoulder dislocation, as well as 41 normal humerus and scapula scans. These respectively formed our humerus and scapula datasets.
For the 3 datasets above, we segmented the scans to extract the outer bone surfaces and cropped them uniformly. We evaluated SNM on each of the 3 datasets using 3repeat 3fold crossvalidation. Specifically, we randomly shuffled the normal samples data, divided it into 3 folds, trained SNM on each pair of 2 folds, predicted severity on the remaining fold along with pathological samples and calculated the AUC with the predictions and diagnosis. This entire process is repeated 3 times. For each dataset, we report the average of these 9 AUC scores.
As is shown in Figure 4, on all the 3 datasets, [PPCAColumn@d] methods are much worse than the others, indicating that the informative variations mainly lie in the null space which is not captured by the training samples. This indicates it is an advantage of SNM to calculate Mahalanobis distance in the entire space. The results also show that SNM is not sensitive to the hyperparameters. For the explained variance ratio, all 3 different explained variance ratios are generating nearly the best scores. For the number of particles in Shapeworks, we used 1024 particles for Femur and Humerus, and we did 2 parallel experiments using 2048 and 4096 particles for Scapula. All the numbers are providing visually adequate and smooth coverage of the shape. As shown in Figure 4, SNM is not sensitive to the number of particles as well. And the whitened deviations are shown in Figure 3, where the pathological variations are highlighted after whitening. While it is hard to find a predominant pattern for the scapula, maybe due to its complicated structure and deformation patterns.
5 Conclusions
We proposed Shape Normality Metric (SNM) as an objective and unsupervised method to measure shape deformity. SNM employs particlebased shape representation and applies Probabilistic Principle Component Analysis to calculate Mahalanobis distance in the whole space, properly accounting for both seen and unseen variations. SNM does not require any data or information about pathological samples and is thus unaffected by pathological population size limitations and/or bias in ratings issued by human experts. Extensive experiments on realworld datasets validated that SNM provides an accurate and reproducible metric for a wide range of shape deformity quantification. Therefore, SNM has the potential to be applied to a variety of clinical scenarios, with particular promise in objectifying communication on shape deformations and planning interventions.
References
 [1] (2009) Pearson correlation coefficient. In Noise reduction in speech processing, pp. 1–4. Cited by: §4.1.3.
 [2] (2020) Quantifying the severity of metopic craniosynostosis: a pilot study application of machine learning in craniofacial surgery. Journal of Craniofacial Surgery. Cited by: §1, §2.
 [3] (2006) Pattern recognition and machine learning. springer. Cited by: §3.
 [4] (2017) Shapeworks: particlebased shape correspondence and visualization software. In Statistical Shape and Deformation Analysis, pp. 257–298. Cited by: §3.
 [5] (2008) Particlebased shape analysis of multiobject complexes. In International Conference on Medical Image Computing and ComputerAssisted Intervention, pp. 477–485. Cited by: §2.
 [6] (2007) Shape modeling and analysis with entropybased particle systems. In Biennial International Conference on Information Processing in Medical Imaging, pp. 333–345. Cited by: §1, §2.
 [7] (2018) Unsupervised detection of lesions in brain mri using constrained adversarial autoencoders. arXiv preprint arXiv:1806.04972. Cited by: §2.

[8]
(2008)
Meshlab: an opensource mesh processing tool.
. In Eurographics Italian chapter conference, Vol. 2008, pp. 129–136. Cited by: §4.1.3. 
[9]
(2004)
Statistical models of appearance for computer vision
. Technical report, University of Manchester. Cited by: §1.  [10] (2009) Particle based shape regression of open surfaces with applications to developmental neuroimaging. In International Conference on Medical Image Computing and ComputerAssisted Intervention, pp. 167–174. Cited by: §2.
 [11] (200205) A minimum description length approach to statistical shape modeling. IEEE Transactions on Medical Imaging 21 (5), pp. 525–537. External Links: Document, ISSN 02780062 Cited by: §2.
 [12] (2014) Black Bone MRI: a potential alternative to ct with threedimensional reconstruction of the craniofacial skeleton in the diagnosis of craniosynostosis. European Radiology 24 (10), pp. 2417–2426. Cited by: §2.
 [13] (201304) A pointcorrespondence approach to describing the distribution of image features on anatomical surfaces, with application to atrial fibrillation. In 2013 IEEE 10th International Symposium on Biomedical Imaging, Vol. , pp. 226–229. External Links: Document, ISSN 19457928 Cited by: §2.
 [14] (1991) Hands: A pattern theoretic study of biological shapes. Springer, New York. Note: Cited by: §2.
 [15] (2010) Prevalence of camtype femoroacetabular impingement morphology in asymptomatic volunteers. JBJS 92 (14), pp. 2436–2444. Cited by: §4.2.
 [16] (1982) The meaning and use of the area under a receiver operating characteristic (roc) curve.. Radiology 143 (1), pp. 29–36. Cited by: §4.
 [17] (2013) Statistical shape modeling of cam femoroacetabular impingement. Journal of Orthopaedic Research 31 (10), pp. 1620–1626. External Links: ISSN 1554527X, Link, Document Cited by: §2.
 [18] (2012) Interfrontal angle for characterization of trigonocephaly: part 1: development and validation of a tool for diagnosis of metopic synostosis. Journal of Craniofacial Surgery 23 (3), pp. 799–804. Cited by: §2.

[19]
(2000)
CRC standard probability and statistics tables and formulae
. Crc Press. Cited by: §4.1.3.  [20] (2018) Detecting multivariate outliers: use a robust variant of the mahalanobis distance. Journal of Experimental Social Psychology 74, pp. 150–156. Cited by: §2.
 [21] (2014) Personalized assessment of craniosynostosis via statistical shape modeling. Medical Image Analysis 18 (4), pp. 635–646. Cited by: §2.
 [22] (1998) Statistical analysis with latent variables. Mplus User’s guide 2012. Cited by: §4.1.1.
 [23] (2016) Entropybased particle correspondence for shape populations. International journal of computer assisted radiology and surgery 11 (7), pp. 1221–1232. Cited by: §1.
 [24] (2017) Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In International conference on information processing in medical imaging, pp. 146–157. Cited by: §2.
 [25] (2006) Bias in research studies. Radiology 238 (3), pp. 780–789. Cited by: §1.
 [26] (200607) Framework for the statistical shape analysis of brain structures using spharmpdm. Note: http://hdl.handle.net/1926/215 Cited by: §2.
 [27] (1942) On growth and form.. On growth and form.. Cited by: §2.
 [28] (1993) A latent trait finite mixture model for the analysis of rating agreement. Biometrics, pp. 823–835. Cited by: §4.1.1.
 [29] (2003) Metopic synostosis: defining the temporal sequence of normal suture fusion and differentiating it from synostosis on the basis of computed tomography images. Plastic and Reconstructive Surgery 112 (5), pp. 1211–1218. Cited by: §2.
 [30] (2016) What’s in a name? accurately diagnosing metopic craniosynostosis using a computational approach. Plastic and Reconstructive Surgery 137 (1), pp. 205–213. Cited by: §2.
 [31] (2015) Classification and management of metopic craniosynostosis. Journal of Craniofacial Surgery 26 (6), pp. 1812–1817. Cited by: §1.
 [32] (2019) Unsupervised anomaly localization using variational autoencoders. In International Conference on Medical Image Computing and ComputerAssisted Intervention, pp. 289–297. Cited by: §2.
Comments
There are no comments yet.