Materials
The collection and use of the data in this study, as well as the ethical review, were approved by the Institutional Review Board of Korea National Sport University (20220411-021). All methods were carried out in accordance with relevant guidelines and regulations. Informed consent was obtained from all subjects and/or legal guardians for the inclusion of their information/images in an online open-access publication. We confirm that informed consent for participation was obtained from all subjects. This experiment faithfully followed the strict regulations and guidelines of the Institutional Review Board. The dataset used in this study was collected between 2022-04-11 and 2022-06-30. Subjects were recruited by BMI so that the sample's BMI distribution resembled that of Size Korea. There were 160 subjects in total: 73 women and 87 men. As shown in Fig. 1, the men wore tight-fitting bottoms and swimming caps, and the women wore tight-fitting tops and bottoms and swimming caps. The subjects were measured using a 3D scanner, DXA, and BIA, yielding 160 sets of 3D body data, DXA data, and BIA data. Statistics for the collected data are presented in Table 1.

Subjects’ attire and posture.
First, we verified whether our 160 samples fairly represented general Korean anthropometric data. The 8th Anthropometric Data, collected between 2020 and 2021 by Size Korea, was used for validation32. As our participants were mainly in their 20s and 30s, we compared against the 1547 women and 1306 men in their 20s and 30s from the 8th Anthropometric Data. To verify whether our sample viably represents the anthropometric characteristics of Koreans, a two-sample Kolmogorov–Smirnov (KS) test was conducted to check whether the two sample distributions were identical; this test detects differences between two empirical distributions. For men, the KS test yielded a p value of 0.926; for women, 0.052. In both cases, the null hypothesis of identical distributions could not be rejected at the 0.01 significance level. A t-test was then conducted to compare the mean BMI of our data with the mean BMI of the Size Korea data. For men, the t-test yielded a p value of 0.5807, and for women, 0.0532; at the 0.01 significance level, we could not reject the null hypothesis that the mean BMI of our sample equals that of Koreans in their 20s and 30s. Thus, we verified that the data we collected was representative of Koreans in their 20s and 30s.
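The two checks above can be sketched as follows. This is a minimal illustration using SciPy's two-sample KS test and Welch's t-test on synthetic stand-in samples; the values below are random placeholders, not the study's BMI data, and the sample sizes merely mirror the men's comparison (87 collected vs. 1306 reference).

```python
# Sketch of the representativeness checks: a two-sample KS test for
# distributional equality and a t-test for equal means, both judged
# at the 0.01 significance level as in the text.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic stand-ins (illustrative only, not the study's BMI data).
collected = rng.normal(24.0, 3.0, 87)     # our sample
reference = rng.normal(24.0, 3.0, 1306)   # Size Korea-sized reference

ks_stat, ks_p = stats.ks_2samp(collected, reference)
t_stat, t_p = stats.ttest_ind(collected, reference, equal_var=False)

alpha = 0.01
# Failing to reject means the samples are statistically compatible.
same_distribution = ks_p > alpha
same_mean = t_p > alpha
```

A large p value here does not prove the distributions are identical; it only means the data give no grounds to reject that hypothesis, which is how the text's conclusion should be read.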
3D scanner
The 3D scanner was a PFS-304A model (PMT innovation company, Gyeonggi-do, Korea, PFS software ver. 1.3). Figure 2a shows the 3D scanner used in this study: a motor mounted on the top rotates the camera module 360° during a scan, so subjects can be measured in a stationary position. Subjects were measured in the A-pose recommended by ISO-725033. If the A-pose is not taken (e.g., if an arm is close to or touching the body), a blind spot forms and a correct mesh shape cannot be created. As shown in Fig. 2a, the 3D scanner measurements were taken in an indoor lighting environment; the measurements are summarized in Table 2.

Equipment used in the experiment. (a) 3D scanner (PFS-304 of PMT), (b) DXA (Lunar of GE), (c) BIA (Inbody770).
Dual-energy X-ray absorptiometry (DXA)
Lunar (GE Healthcare, Madison, Wisconsin, USA, EnCore software ver. 13.60.03), shown in Fig. 2b, was used for DXA. DXA is a body composition instrument that measures body fat, lean body mass, and bone mass, and it has long been regarded as the gold standard for measuring body composition. The DXA device was operated by an expert; during measurement, the subject maintained an immobile posture while lying down, and the following body regions were examined: arm (left, right), leg (left, right), trunk (left, right), android, and gynoid. For each region, BMD, BMC, fat%, fat (g), and lean (g) were obtained through DXA. Here, BMD denotes bone mineral density and BMC denotes bone mineral content; fat% denotes fat percentage, fat (g) denotes fat mass, and lean (g) denotes mass excluding fat. This study classified individuals' obesity based on the total body fat percentage (bf%) from DXA.
Bioelectrical impedance analysis (BIA)
BIA is a method of measuring body composition through the impedance difference between body fat and lean body mass34. As shown in Fig. 2c, an Inbody770 was used as the measuring device, and the subjects were measured in the same clothing and immobile posture as in the previous measurements. Total Body Water (TBW), IntraCellular Water (ICW), ExtraCellular Water (ECW), protein, minerals, Body Fat percentage (bf%), and Fat-Free Mass (FFM) were obtained through BIA. Among these, bf% was used for comparison with the methodology proposed in this study. The BIA obesity groups were divided using the bf% criteria presented in Table 3.
Data preprocessing
Of the data obtained through DXA and BIA, only bf% was used. Based on the bf% obtained from each subject's DXA measurement, the obesity class label was derived using the cutoffs in Table 3. The bf% cutoff for obesity was set in accordance with the WHO35 classification, and the cutoffs from Lohman et al. were used for the remaining classes, namely underweight, normal, and overweight36. Currently, there is no universally accepted obesity category for DXA37, so the obesity cutoff was set at 25% (bf%) for men and 35% (bf%) for women per the Korean Society for Obesity, and the remaining groups were classified by reference to McArdle and Chang38,39. Subjects were therefore labeled into four groups: underweight, normal, overweight, and obese. The distribution of labels based on bf% measured by DXA (Table 3) is depicted in Table 4. As shown in Fig. 3, body measurements were extracted from the 3D scanner's mesh data based on five landmarks: the back of the neck, the umbilicus, the groin, and the armpits (left, right).
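The labeling step can be sketched as a simple sex-specific threshold function. Only the obesity cutoffs (25% bf% for men, 35% for women) are stated in the text; the underweight and overweight thresholds below are illustrative placeholders standing in for the Table 3 values, not the actual cutoffs.

```python
# Sketch of mapping DXA body-fat percentage (bf%) to one of the four
# obesity classes. The obesity cutoffs (25% men / 35% women) come from
# the text; the lower two cutoffs per sex are PLACEHOLDERS, not Table 3.
def label_obesity(bf_pct: float, sex: str) -> str:
    # (underweight upper bound, overweight lower bound, obesity cutoff)
    cuts = {"M": (8.0, 20.0, 25.0), "F": (15.0, 30.0, 35.0)}
    under, over, obese = cuts[sex]
    if bf_pct < under:
        return "underweight"
    if bf_pct < over:
        return "normal"
    if bf_pct < obese:
        return "overweight"
    return "obese"
```

For example, with these placeholder thresholds, `label_obesity(27.0, "M")` returns `"obese"` because 27% exceeds the 25% male cutoff.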

Sample of the 3D mesh data and standard landmarks for measurement.
Framework
We propose a machine learning-based methodology to classify obesity groups. Figure 4 shows the proposed framework. Body measurements are first obtained from the 3D scanner, and data preprocessing is then performed by matching them with the bf% from DXA and labeling them. After that, the final model is selected through the "Choose ML model" and "Feature selection Genetic Algorithm" processes.

Overall framework of this study.
Choose ML model
The preprocessed 3D body measurements and sex are used as input to Logistic Regression40, Decision Tree41, Random Forest42, Support Vector Machine (SVM)43, Gradient Boosting44, and AdaBoost45, and fivefold cross-validation is performed; the data are divided into 120 training samples and 40 test samples. Accuracy, F1 score, Recall, and Precision were compared across these models, and the best-performing model was selected as the classifier for the "Feature selection Genetic Algorithm" process. Because the data are few and imbalanced, as shown in Table 4, the model was selected by Precision, Recall, and F1 score rather than by Accuracy alone. Accuracy is the ratio of correct predictions to all predictions. Precision is the ratio of true positives to the sum of true positives and false positives. Recall is the ratio of true positives to the sum of true positives and false negatives. The F1 score is the harmonic mean of Precision and Recall.
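The model-comparison step can be sketched as below. This is a minimal illustration on random stand-in data (160 samples, as in the study, but with synthetic features and labels); macro-averaged versions of the metrics are assumed for the four-class problem, since the text does not state the averaging scheme.

```python
# Sketch of "Choose ML model": compare the six classifiers with
# fivefold cross-validation on Accuracy, Precision, Recall, and F1.
# Data here is a random stand-in, not the study's measurements.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import (RandomForestClassifier,
                              GradientBoostingClassifier, AdaBoostClassifier)
from sklearn.svm import SVC
from sklearn.model_selection import cross_validate

rng = np.random.default_rng(0)
X = rng.normal(size=(160, 10))        # stand-in body measurements + sex
y = rng.integers(0, 4, size=160)      # four obesity classes

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "DecisionTree": DecisionTreeClassifier(),
    "RandomForest": RandomForestClassifier(),
    "SVM": SVC(),
    "GradientBoosting": GradientBoostingClassifier(),
    "AdaBoost": AdaBoostClassifier(),
}
scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]
results = {}
for name, model in models.items():
    cv = cross_validate(model, X, y, cv=5, scoring=scoring)
    results[name] = {s: cv["test_" + s].mean() for s in scoring}

# Select by F1 rather than Accuracy alone, per the class imbalance.
best = max(results, key=lambda n: results[n]["f1_macro"])
```

On real, imbalanced labels, ranking by `f1_macro` rather than `accuracy` matters: a classifier that ignores a rare class can still score high on accuracy.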
Logistic regression predicts the probability of class membership using a linear combination of the input variables, and the result is classified into a specific class; this study used multiclass logistic regression with a cross-entropy loss. Decision Tree is one of the most preferred machine learning models when explainability matters. It classifies input data through a tree structure in which each node queries an input variable and branches accordingly; its performance is not as good as other models, and it is vulnerable to overfitting. Random Forest is an ensemble of basic decision trees whose predictions are averaged to improve performance; this generalizes better and is more robust against overfitting than a single decision tree. Support Vector Machine determines a hyperplane that maximizes the margin between support vectors; its purpose is to separate the classes by finding a hyperplane as far as possible from the closest training data. The main idea of Gradient Boosting is to chain together multiple shallow decision trees, that is, weak learners. As each basic tree can classify some data well, performance improves as trees are added; a loss function is defined, and gradient descent determines what the next tree should correct. AdaBoost stands for Adaptive Boosting. Unlike Gradient Boosting, this model is trained by re-weighting samples: poorly classified samples receive larger weights when the next model is trained.
Feature selection Genetic Algorithm
This process selects the input features of the previously selected machine learning model through a Genetic Algorithm. Selectively choosing the input features of the machine learning model not only improves the model’s performance but also identifies whether a specific value among the 3D body measurements in Table 2 affects the classification of obesity.
While selecting input features, comparing all possible combinations of input features to find the global optimum is practically impossible. Therefore, a meta-heuristic approach was chosen to find a solution close to the global optimum. Previous studies have demonstrated that the Genetic Algorithm is superior to other meta-heuristic algorithms for variable selection46,47, so in this study the Genetic Algorithm (GA) was used as the feature selection method. GA takes a meta-heuristic approach to solving complex problems through efficient trial and error48, mimicking Charles Darwin's theory of natural selection and biological reproduction. Here, GA searches for the best input features through repeated generational reproduction, in six steps. In Step 1, the chromosomes, i.e., the initial combinations of input features, are initialized and the parameters are set. These parameters include the population size and the mutation ratio: the population is the number of chromosomes per generation, i.e., the number of input-feature combinations, and the mutation ratio is the proportion of genes mutated across all chromosomes, which corresponds to flipping whether a feature is selected. We set the population to 100 and the mutation ratio to 20%. In Step 2, a Random Forest is trained on each chromosome's input features. In Step 3, the fitness of each chromosome is evaluated, with accuracy as the fitness function. In Step 4, the best chromosomes of the current generation are selected by accuracy; in this study, the top 80% were kept. In Step 5, next-generation chromosomes are generated through crossover and mutation, where crossover combines halves of two selected parent chromosomes. The algorithm returns to Step 2 and repeats until the 100th generation has passed. In Step 6, the final model is selected: the one that achieved the highest accuracy.
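The six GA steps can be sketched as a binary-mask search over features. The population and generation counts below are shrunk from the paper's 100/100 so the sketch runs quickly, the data is a random stand-in, and details the text leaves open (e.g., the cross-validation inside the fitness function, parent pairing) are assumptions.

```python
# Sketch of the GA feature-selection loop (Steps 1-6): a boolean
# chromosome marks which features feed a Random Forest, fitness is
# cross-validated accuracy, the top 80% survive, and children are
# built by half-and-half crossover plus 20% bit-flip mutation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(160, 10))    # stand-in measurements
y = rng.integers(0, 4, size=160)  # four obesity classes

POP, GENS, MUT, ELITE = 8, 3, 0.2, 0.8   # paper uses POP=100, GENS=100

def fitness(mask):
    """Step 2-3: accuracy of a Random Forest on the masked features."""
    if not mask.any():
        return 0.0
    clf = RandomForestClassifier(n_estimators=30, random_state=0)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

pop = rng.random((POP, X.shape[1])) < 0.5            # Step 1: init
for _ in range(GENS):
    scores = np.array([fitness(m) for m in pop])
    order = np.argsort(scores)[::-1]
    parents = pop[order[: int(POP * ELITE)]]         # Step 4: top 80%
    children = []
    while len(children) < POP:
        a, b = parents[rng.integers(len(parents), size=2)]
        child = np.concatenate([a[: len(a) // 2], b[len(b) // 2:]])
        child ^= rng.random(len(child)) < MUT        # Step 5: mutate
        children.append(child)
    pop = np.array(children)

best_mask = max(pop, key=fitness)                    # Step 6: best features
```

The boolean XOR in the mutation line flips each selected/unselected bit with 20% probability, which matches the text's reading of the mutation ratio as the rate at which feature-selection genes change.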
A total of 100 generations were produced, and the input features of the generation with the highest Accuracy, Recall, Precision, and F1 score were selected. The best generation achieved an Accuracy, Recall, Precision, and F1 score of 0.8, 0.767, 0.842, and 0.792, respectively, and its input features were selected as the final input features. Accuracy reached 80% around the 50th generation, after which it converged or even decreased. Figure 5 shows accuracy by generation. Table 5 presents the final selected features.

Accuracy by generation.