Identification of Race: A Three-Dimensional Geometric Morphometric and Conventional Analysis of Human Fourth Cervical Vertebrae in Adult Malaysian Population

Introduction: Estimation of race plays a significant role in establishing personal identity in forensic anthropology. The cervical vertebrae are among the least researched bones in forensic applications. Our study aims to investigate the morphological variations of the fourth cervical vertebra (C4) among the major races in the adult Malaysian population using a three-dimensional (3D) geometric morphometric method. Methods: Computed tomography images of the C4 vertebra from 386 subjects (169 Malay, 82 Chinese, and 135 Indian) were collected retrospectively from the University of Malaya. Twenty-eight landmarks were placed on the images. Procrustes MANOVA, canonical variate analysis (CVA), discriminant function analysis (DFA), and linear measurements were performed using Planmeca Romexis, Checkpoint Stratovan, MorphoJ, and GraphPad Prism software to analyze the morphological variations of C4. Results: Procrustes MANOVA showed significant differences in the shape (p < 0.0001) and centroid size (p = 0.0003) of the C4 vertebra between races. Canonical variate analysis showed significant differences in Mahalanobis (p < 0.0001) and Procrustes (p < 0.0001) distances among races. In addition, discriminant function analysis yielded a cross-validated classification accuracy of 66.5%. Linear measurements (vertebral body height, anterior-posterior length of the vertebral body, length of the superior articular facet, and spinous process length) revealed no significant differences between the races. Both intra- and interobserver reliability analyses showed acceptable human error for measurement accuracy. Conclusions: Morphological variations in the shape of C4 can assist in race estimation of the adult Malaysian population using the 3D geometric morphometric approach.


INTRODUCTION
Identification of an individual in forensic science is accomplished by establishing biological profiles such as sex, age, and race. In cases of burning, decomposition, fragmentation, or commingling of human remains, forensic chemistry techniques are ineffective as the biological tissues found during postmortem may be decomposed and deemed nonviable [1]. Under these situations, forensic anthropology plays an important role in identifying the deceased [2].
predicting the race of the deceased [4].
In forensic anthropology, bone length and width are conventionally measured directly on the bone using calipers [3]. However, this conventional method has lower validity and reliability [5], and it limits the visualization of human skeletal morphology [6]. In contrast, the geometric morphometric approach provides more information on shape and shows better reliability, accuracy, and validity, with high reproducibility [7]. Furthermore, in geometric morphometrics, the morphology of an object is captured as three-dimensional (3D) landmark coordinates for visualization of shape using imaging modalities such as plain radiography, computerized tomography (CT), and magnetic resonance imaging (MRI) [8].
The identification of human skeletal remains is challenging, especially if only a few bones remain or if ideal bones, such as long and robust bones, are broken or missing [9]. Thus, assessment of other bones can serve as an alternative and as confirmation for forensic identification [10]. It has been reported that the cervical vertebrae can be used to establish biological profiles, such as age, sex, race, and stature, in a forensic investigation [11][12][13]. However, in most of these cases, the morphology of the cervical vertebrae was examined qualitatively rather than quantitatively [14]. One example of using cervical vertebrae in race estimation was reported by Duray et al., who used the frequency of bifidity of the cervical spinous process at different vertebral levels [15]. A metric analysis of C3 to C7 in the Chinese population showed a significant correlation between the ratio of lower endplate width to lower endplate depth and ancestry determination [4]. A study by Chazono et al. evaluated the pedicle and spinal canal dimensions from C3 to C7 and showed a significant correlation between spinal canal diameters and race estimation among European, American, and Asian populations [16]. However, the identification of race using cervical vertebrae has not been adequately explained, especially in the Malaysian adult population.
In addition, there is a lack of studies evaluating C4 in Malaysian and other populations. By contrast, more studies have addressed sex assessment than race using different levels of the cervical vertebrae, such as identification of sex using C1 in a white Scottish population [17], C2 in a Portuguese population [18], and C7 in a Spanish population [19]. Thus, there is a need to assess C4 for application in forensic identification in the Malaysian adult population, especially for race identification.
This study aimed to investigate the morphological variations of the C4 among the three main races in the Malaysian adult population, namely the Malay, Chinese, and Indian, using 3D geometric morphometric techniques. To the best of our knowledge, this is the first study to apply the 3D geometric morphometric method to C4 in the adult Malaysian population. Our findings may contribute to a new method for race estimation in adults in the Malaysian population.

MATERIALS AND METHODS
This retrospective study was conducted in the Department of Biomedical Imaging, University of Malaya Medical Centre, from 2018 to 2019. Computed tomography (CT) images of the cervical spine with a 0.75 mm slice thickness were extracted from the picture archiving and communication system (PACS). The CT images were processed into DICOM format with Planmeca Romexis software and converted into 3D multi-planar reconstructions with Checkpoint Stratovan software. Ethical approval was obtained from the Institution of Ethics Committee, University of Malaya (Ethics number: 201944-7288) and MAHSA University (Ethics number: RMC/EC34/2020).

Study population
A total of 386 CT cervical spine images were collected. The sample included both males and females from the three main racial groups in the Malaysian population, namely 169 Malay (43.8%), 82 Chinese (21.2%), and 135 Indian (35.0%) subjects aged 18 to 70 years. The entire sample comprised "known individuals" whose sex, race, and age were documented in the hospital's registry. CT images from individuals with cervical congenital malformations or a history of cervical surgery, and from non-Malaysian individuals, were excluded.

Superimposition
In this study, Generalized Procrustes Analysis (GPA) was used to superimpose the landmark configurations obtained from the CT images with the MorphoJ software. GPA fits the configurations of specimens in a given sample to a mean configuration by eliminating differences in size, orientation, and location among specimens through translation, rotation, reflection, and scaling to a best fit [20].
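The superimposition steps described above (translation, scaling, and rotation or reflection toward a mean configuration) can be illustrated with a minimal NumPy sketch. This is not the MorphoJ implementation; the function names, the iteration limit, and the convergence tolerance are assumptions made for illustration only.

```python
import numpy as np

def center_and_scale(config):
    """Translate a (k, 3) landmark array to the origin and scale it to unit centroid size."""
    centered = config - config.mean(axis=0)
    return centered / np.sqrt((centered ** 2).sum())

def rotate_onto(config, reference):
    """Least-squares orthogonal fit (rotation or reflection) of config onto reference."""
    u, _, vt = np.linalg.svd(config.T @ reference)
    return config @ (u @ vt)

def generalized_procrustes(configs, n_iter=10, tol=1e-10):
    """Iteratively align all configurations to their (re-estimated) mean shape."""
    shapes = np.array([center_and_scale(c) for c in configs])
    mean_shape = shapes[0]  # assumption: the first specimen seeds the mean
    for _ in range(n_iter):
        shapes = np.array([rotate_onto(s, mean_shape) for s in shapes])
        new_mean = center_and_scale(shapes.mean(axis=0))
        converged = np.sqrt(((new_mean - mean_shape) ** 2).sum()) < tol
        mean_shape = new_mean
        if converged:
            break
    return shapes, mean_shape
```

In this sketch each specimen would be passed as a 28 x 3 array of landmark coordinates; the returned aligned shapes are the kind of input that the shape analyses in this study operate on.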

Landmarking application
The outline of C4 was traced using cubic spline curves. A total of 28 landmarks were placed on the vertebral outlines in predefined positions using Checkpoint Stratovan software (Figure 1) [21]. Landmarks were chosen to correspond to those commonly used in traditional metric and geometric morphometric systems for a complete description and illustration [14,22].
The 28 landmarks of C4 were used for shape analysis. On the left side, four landmarks were placed on the body (LC4bd i-iv), four on the articular process (LC4ap i-iv), three on the spinous process (LC4sp i-iii), two on the lamina (LC4lm i-ii), and one on the transverse process (LC4tp i). On the right side, another four landmarks were placed on the body (RC4bd i-iv), four on the articular process (RC4ap i-iv), three on the spinous process (RC4sp i-iii), two on the lamina (RC4lm i-ii), and one on the transverse process (RC4tp i) (Table 1).

Outliers
All outliers in this study were identified with the MorphoJ software, and the landmarks flagged as outliers were re-applied. The final data set included only specimens within the normal range of variation, and the histogram of the data showed a normal distribution.

Software used in data collection and analysis
Planmeca Romexis software was used to view, process, and store volumetric images of CT scans. The Checkpoint Stratovan software was used to apply landmarks to the CT images. The MorphoJ (Version 1.06d) and GraphPad Prism software were employed for data analysis, including Procrustes MANOVA, canonical variate analysis (CVA), discriminant function analysis (DFA), and the t-test.
Procrustes MANOVA is a method used to analyse differences in the variance of centroid size or shape. Centroid size is a measure of size calculated as the square root of the sum of squared distances of all the landmarks of an object from their centroid [23]. CVA is an approach used to identify and compare the variation among two or more groups in a given population [24]. Mahalanobis distance (MD) is a multivariate metric that measures the distance between a point and a distribution. Procrustes distance is the square root of the sum of squared differences in the landmark configurations between groups [25]. It is widely used to evaluate the similarity or dissimilarity of the morphology of objects and indicates the difference in mean shape between groups [26]. Discriminant function analysis (DFA) measures the Mahalanobis distance from an unknown individual to the centroids of the groups included in an analysis for classification [27].
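The three quantities defined above can be written out directly. The following NumPy sketch is illustrative only; the function names are hypothetical. It assumes that shapes passed to procrustes_distance have already been superimposed, and that the sample covariance used for the Mahalanobis distance is invertible (for full landmark data this would normally require working with a reduced set of scores, which is an assumption and not a description of the software used in this study).

```python
import numpy as np

def centroid_size(config):
    """Square root of the summed squared distances of all landmarks from their centroid."""
    centered = config - config.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum()))

def procrustes_distance(shape_a, shape_b):
    """Square root of the summed squared landmark differences between two aligned shapes."""
    return float(np.sqrt(((shape_a - shape_b) ** 2).sum()))

def mahalanobis_distance(point, sample):
    """Distance from a point to the distribution estimated from the rows of sample."""
    diff = point - sample.mean(axis=0)
    cov = np.cov(sample, rowvar=False)  # assumed invertible
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))
```

Because the Mahalanobis distance rescales by the covariance of the data, it is expressed in units of standard deviations, whereas the Procrustes distance stays in the units of the superimposed shape coordinates.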

Linear measurements
Four linear measurements of C4 were based on the locations of the landmarks. The measurements were: 1) vertebral body height (the distance between LC4bd i and LC4bd iv), 2) anterior-posterior length of the vertebral body (the distance between LC4bd i and LC4bd ii), 3) length of the superior articular facet (the distance between LC4ap i and LC4ap ii), and 4) spinous process length (the distance between LC4sp i and LC4sp ii). The measurements were blinded and taken once as inter-landmark distances in millimeters. The measurement method was adopted from a previous study [28].
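Each of the four measurements above is a straight-line distance between two landmarks. A minimal sketch follows, using the landmark labels from the text; the coordinates in the usage example are invented purely for illustration and are not study data.

```python
import math

# Landmark pairs that define each linear measurement, as listed in the text.
MEASUREMENT_PAIRS = {
    "vertebral body height": ("LC4bd i", "LC4bd iv"),
    "anterior-posterior body length": ("LC4bd i", "LC4bd ii"),
    "superior articular facet length": ("LC4ap i", "LC4ap ii"),
    "spinous process length": ("LC4sp i", "LC4sp ii"),
}

def linear_measurements(landmarks):
    """Return each inter-landmark Euclidean distance (mm) from a {label: (x, y, z)} mapping."""
    return {
        name: math.dist(landmarks[start], landmarks[end])
        for name, (start, end) in MEASUREMENT_PAIRS.items()
    }

# Hypothetical coordinates in millimeters (illustration only).
example_landmarks = {
    "LC4bd i": (0.0, 0.0, 0.0),
    "LC4bd ii": (3.0, 4.0, 0.0),
    "LC4bd iv": (0.0, 0.0, 11.0),
    "LC4ap i": (10.0, 0.0, 0.0),
    "LC4ap ii": (10.0, 6.0, 8.0),
    "LC4sp i": (0.0, 20.0, 0.0),
    "LC4sp ii": (0.0, 32.0, 5.0),
}
example_result = linear_measurements(example_landmarks)
```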

Reliability analysis
Twenty samples were taken from the same database and analyzed twice on separate occasions by two observers. Inter- and intra-observer reliability were evaluated from centroid size using dependent and independent t-tests.
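The two test statistics named above can be sketched with the Python standard library. The sketch assumes that repeated measurements of the same specimens are compared with the dependent (paired) form and that separate sets of values are compared with the pooled-variance independent form; the text does not specify these details, and the function names are hypothetical. Converting each statistic to a p-value requires the t distribution from a statistics package.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Dependent (paired) t statistic for two measurement sessions on the same specimens."""
    diffs = [a - b for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def independent_t(group_a, group_b):
    """Independent-samples t statistic with pooled variance (Student's form)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2
                  + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
```

A t statistic close to zero, and therefore a large p-value, is what would be expected when repeated centroid size measurements agree.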

RESULTS
Reliability analysis
The reliability analysis showed no significant differences between the two observers (p = 0.3141) or between two measurements made by a single observer over a two-month period (p = 0.1417), indicating that human error was negligible for the specified measurement.

Procrustes MANOVA
Procrustes MANOVA outputs for centroid size and shape are presented in separate tables. Both the centroid size and the shape of the C4 differed significantly between the racial groups in the study population: centroid size (p = 0.0003, SS% = 4.19) (Table 2) and shape (p < 0.0001, SS% = 1.48) (Table 3).

Canonical Variate Analysis (CVA)
CVA produces canonical variates (CV) from the rotation and scaling of the centroids and generates the Mahalanobis distance (MD) between groups based on the centroids of samples [29]. In this study, CVA showed that the races did not overlap substantially, with the greatest separation found between the Chinese and Indian groups.
A comparison of the mean shape in the positive and negative directions is shown in the wireframe graph (Figure 2), with scale factors from -4 to 4 for CV1 and CV2 for race. The positive direction on the x-axis showed a narrower body and a shorter and wider spinous process of the C4 than the average shape. Both Chinese and Indian showed a greater tendency toward this variation, followed by Malay and Chinese. Conversely, the negative direction on the x-axis indicated a wider body and a longer and narrower spinous process of the C4. Interestingly, the Chinese tended to deviate more toward the negative direction than the Malay and Indian.

Figure 2
Scatter plot of between-group principal components (bgPCs) of C4 shape from CVA analysis for race. Malay group is shown by red filled region, Chinese group by green filled region, and Indian group by blue region. Shape differences associated with canonical variate axes are visualized by wireframe graphs illustrating the shape changes corresponding to scores of -4 and 4 for CV1 and -4 and 4 for CV2.
Meanwhile, the positive direction on the y-axis showed a wider body and a shorter and wider spinous process with a posteriorly deviated posterior articular facet in the C4 compared with the average shape. Similarly, both Chinese and Indian showed a greater tendency toward this variation than Malay. The negative direction on the y-axis, located in the lower part of the left panel of the CV graph, represented a narrower body and a longer and narrower spinous process with an anteriorly deviated posterior articular facet compared with the average shape. In this case, the Malay deviated relatively more toward the negative direction of CV2 on the y-axis than the other races.
MD (Mahalanobis distance) measures how many standard deviations a point lies from the mean of a distribution. A larger distance indicates a greater separation between a sample and a distribution. Our results showed significant differences in distance between all races, with the greatest distance found between Chinese and Indian (MD = 1.9941), followed by Malay and Indian (MD = 1.3933) and Malay and Chinese (MD = 1.1757) (Table 4).
The Procrustes distance showed that the shape of the C4 was significantly different between all races. The Procrustes distance between Chinese and Indian (PD = 0.0327) was the greatest, compared with Malay and Chinese (PD = 0.0258) and Malay and Indian (PD = 0.0214) (Table 5).

Discriminant function analysis (DFA)
DFA produces the greatest contrast among reference groups by selecting variables that produce the greatest variation between groups [30]. It also provides the accuracy of sample classification [31] according to a grouping factor, which in this study is race. The mean classification accuracy for the C4 is shown in Table 6. For the comparison between Malay and Chinese, the classification accuracy measured by MorphoJ software was 79.3% for Malay and 68.3% for Chinese, falling to 68% and 50%, respectively, after cross-validation. The mean percentage between Malay and Chinese after cross-validation was 59% (Table 6). Similarly, Malay and Indian had accuracy rates of 75.1% and 75.6%, respectively, and 65.9% and 66.7% after cross-validation. After cross-validation, the mean percentage between them was 66.2%. Our results also showed that Chinese and Indian could be correctly classified, with accuracy rates of 81.7% and 86.7%. After cross-validation, the classification accuracy was 70.7% for Chinese and 77.8% for Indian. The mean percentage after cross-validation between Chinese and Indian was 74.3%.
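The drop in accuracy after cross-validation reported above arises because each specimen is classified by a rule built without it. A minimal sketch of a two-group, leave-one-out procedure follows. It is not the MorphoJ implementation: the pooled-covariance distance rule, the pseudo-inverse, and all function names are assumptions chosen for illustration, and each group is assumed to be an array with one row per specimen and at least two variables.

```python
import numpy as np

def classify(x, group_a, group_b):
    """Assign x to the group whose mean is nearer in pooled-covariance Mahalanobis distance."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * np.cov(group_a, rowvar=False)
              + (nb - 1) * np.cov(group_b, rowvar=False)) / (na + nb - 2)
    inv = np.linalg.pinv(np.atleast_2d(pooled))  # pseudo-inverse guards against singularity
    diff_a = x - group_a.mean(axis=0)
    diff_b = x - group_b.mean(axis=0)
    return "a" if diff_a @ inv @ diff_a <= diff_b @ inv @ diff_b else "b"

def leave_one_out_accuracy(group_a, group_b):
    """Percentage of each group correctly classified when each specimen is held out in turn."""
    hits_a = sum(
        classify(group_a[i], np.delete(group_a, i, axis=0), group_b) == "a"
        for i in range(len(group_a))
    )
    hits_b = sum(
        classify(group_b[i], group_a, np.delete(group_b, i, axis=0)) == "b"
        for i in range(len(group_b))
    )
    return 100.0 * hits_a / len(group_a), 100.0 * hits_b / len(group_b)
```

Holding out the specimen being classified removes its influence on the group mean and covariance, which is why cross-validated percentages are usually lower than the resubstitution percentages.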
In terms of race determination by DFA, the overall mean classification accuracy after cross-validation was 66.5%. The prediction accuracy of ancestral identification from the shape of the C4 was highest for Chinese and Indian (74.3%) after cross-validation, followed by Malay and Indian (66.2%) and Malay and Chinese (59%).
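The overall figure of 66.5% can be reproduced as the mean of the three reported pairwise cross-validated means. The short check below uses only values quoted in the text; the variable names are illustrative. The mean recomputed from the rounded Malay and Indian group values (65.9 and 66.7) comes to 66.3 rather than the reported 66.2, a small difference that may reflect rounding of the underlying values.

```python
from statistics import mean

# Cross-validated classification accuracy (%) per group, as quoted in the text.
pairwise_cv = {
    ("Malay", "Chinese"): (68.0, 50.0),
    ("Malay", "Indian"): (65.9, 66.7),
    ("Chinese", "Indian"): (70.7, 77.8),
}

# Mean of the two group accuracies within each pairwise comparison.
recomputed_pair_means = {pair: mean(values) for pair, values in pairwise_cv.items()}

# Pairwise means as reported in the text, and their overall mean.
reported_pair_means = [59.0, 66.2, 74.3]
overall_mean = round(mean(reported_pair_means), 1)  # 66.5
```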

Linear measurements
The results revealed no significant differences (p > 0.05) between races in the linear measurements used for race identification in the study population. Generally, the Chinese had larger inter-landmark distances in the C4 vertebra than the Malay and Indian. The mean ± SEM of vertebral body height was highest in Chinese (11.56 ± 0.292 mm), followed by Malay (11.07 ± 0.124 mm) and Indian (10.99 ± 0.147 mm). Similarly, the Chinese also had a relatively higher mean anterior-posterior length of the vertebral body (

DISCUSSION
Race determination is a major component of identifying unknown individuals, and the traits used for it are known to be population-specific. Moreover, the deceased's population affinity is one of the factors that help define the circumstances of death in a forensic investigation [32]. Race is estimated from morphological and metric features of human skeletal remains that correspond to genetic and geographical origins [32].
In the present study, Procrustes MANOVA, CVA, and DFA indicated that the C4 differed significantly among Malay, Chinese, and Indian in the adult Malaysian population, whereas conventional linear measurements did not. The results confirmed our hypothesis that the C4 shape differs significantly between Malay, Chinese, and Indian as analyzed by the 3D geometric morphometric method, but the distances between anatomical landmarks do not.
Procrustes MANOVA indicated that the C4 differed significantly between races, with a Goodall's F statistic (F) of 8.38 and SS% of 4.19 for centroid size, and F = 14.64 and SS% of 1.48 for shape. SS% was greater for centroid size than for shape, whereas Goodall's F was greater for shape. We compared our centroid size and shape findings for the cervical vertebra with those reported for other bones. A study on the shape of maxillary and mandibular first molars in Europeans and Asians showed significant differences in both maxillary and mandibular first molars between Europeans and Asians in terms of centroid size and shape [33]. However, another study assessing the morphology of the lumbar spine in a Mediterranean and a South African population did not reveal a statistically significant difference in mean centroid size [34].
CVA showed that the shape of C4 was significantly different, with the greatest separation between Chinese and Indian (MD = 1.99, PD = 0.03), followed by Malay and Indian (MD = 1.39, PD = 0.02). This finding is supported by a study in which cervical vertebrae differed significantly between Hispanics and both Whites and African Americans (F = 2.748), using a method based on six cervical vertebral maturation stages [35]. Another study conducted in the Chinese population showed significant differences in the sagittal developmental diameter of the cervical vertebrae between Chinese and other races, such as Japanese, Indian, and Whites [36]. In addition, a study on the endplate morphology of C3 to C7 in the Chinese population indicated that "there are racial differences between Chinese and Whites in terms of linear parameters including upper endplate width (EPWu), upper endplate depth (EPDu), lower endplate width (EPWl), and lower endplate depth (EPDl) and area parameters including upper endplate area (EPAu) and lower endplate area (EPAl)" [37]. Lastly, the bifidity of cervical spinous processes from C3 to C6 in the American population differed significantly between Whites and Blacks [38]. Such differences among races in the C4 are consistent with the anthropological view that different races have distinct characteristics [39].
Our study also showed statistically significant differences among races in terms of discriminant functions, with an overall accuracy of 66.5% after cross-validation. The highest classification accuracy was found between Chinese and Indian, followed by Malay and Indian, and Malay and Chinese. By comparison with other vertebrae, a study of measurements of the complete vertebral column and the sacrum in the South African population reported an accuracy rate of 98% in males and 93.5% in females, which is higher than in our study [40].
We also compared classification accuracy between cervical vertebrae and other bones used for race or ethnic identification. A study by Murphy and Garvin evaluated ancestry classification between American Whites and Blacks using a 3D surface scan method [41]. They found that the crania in that population were correctly classified with an accuracy of 92.4% [41]. The discriminant function analyses performed on the cranial outlines in that study produced higher correct classification rates for ancestry determination [41]. Furthermore, a study using dental shape variation to estimate race in the American population showed that teeth were correctly classified as African or European American vs. Hispanic American with accuracy rates ranging from 66.7% to 89.3%, while the classification accuracy for African Americans vs. European Americans was 71.4% to 100% [42]. The classification accuracies reported in these other populations were higher than in our study of the Malaysian population. This suggests that the accuracy of race identification using cervical vertebrae is population-specific.
In addition, our study showed that the means of all four measurements (vertebral body height, anterior-posterior length of the vertebral body, length of the superior articular facet, and spinous process length) in the C4 were greatest for Chinese compared with Malay and Indian. Several studies have reported similar results. For instance, a metric analysis of the complete vertebral column, except for the atlas and axis, and the sacrum in South African Black and White populations showed that Whites had larger and longer vertebrae than Blacks, and the difference in size was apparent in males [40]. Another study assessed pedicle dimensions and spinal canal diameters in cervical vertebrae, including pedicle width (PW), pedicle transverse angle (PTA), anterior-posterior diameter of the spinal canal (APD), and transverse diameter of the spinal canal (TD), from C3 to C7 between European/American and Asian populations. The results showed that spinal canal morphology differed between races [43].
It has been reported that skeletal variation between races is influenced by genetic and geographical factors, socioeconomic status [44], nutrition, or region [45]. Stull et al. demonstrated the relationship of geographic and genetic distance to morphological differences of the skeleton in the South African population [46]. According to that study, skeletal shape variations arise between races because geographical distance limits gene flow, thus decreasing group interaction and increasing morphological differences between races [46]. As the subjects in our study population shared similar environmental conditions and socioeconomic status, the differences between the studied races may be mainly influenced by genetic differences.

CONCLUSION
In conclusion, our study proposes that 3D geometric morphometric techniques are a useful alternative method for determining race in the Malaysian adult population using C4. The study compiled an extensive population database derived from multiple landmarks applied to the outlines of C4 from CT scans using geometric morphometrics. The morphology of C4 varies significantly with ethnicity, with statistically significant differences among the major racial groups in the Malaysian adult population. The shape variations of C4 were most significantly different between the Chinese and Indian groups in the study population. Thus, the results of this study can aid future victim identification among Malaysians in forensic scenarios. We suggest that further data from other minority races, such as Dusun, Kadazan, and Iban, may help to represent the Malaysian population more completely.