A Brazilian cohort of pregnant women with overt diabetes: analyses of risk factors using a machine learning technique

ABSTRACT Objective: Pregnancy complicated by type 2 diabetes is rising, while data on type 2 diabetes first diagnosed in pregnancy (overt diabetes) are scarce. We aimed to describe the frequency and characteristics of pregnant women with overt diabetes, compare them to those with known pregestational diabetes, and evaluate the potential predictors for the diagnosis of overt diabetes. Subjects and methods: A retrospective cohort study including all pregnant women with type 2 diabetes evaluated in two public hospitals in Porto Alegre, Brazil, from May 20, 2005, to June 30, 2021. Classic and obstetric factors associated with type 2 diabetes risk were compared between the two groups, using machine learning techniques and multivariable analysis with Poisson regression. Results: Overt diabetes occurred in 33% (95% confidence interval: 29%-37%) of 646 women. Characteristics of women with known or unknown type 2 diabetes were similar; excessive weight was the most common risk factor, affecting ~90% of women. Age >30 years and positive family history of diabetes were inversely related to a diagnosis of overt diabetes, while previous delivery of a macrosomic baby behaved as a risk factor in younger multiparous women; previous gestational diabetes and chronic hypertension were not relevant risk factors. Conclusion: Characteristics of women with overt diabetes are similar to those of women with pregestational diabetes. Classic risk factors for diabetes not included in current questionnaires can help identify women at risk of type 2 diabetes before they become pregnant.

INTRODUCTION
Pregnancy associated with type 2 diabetes is rising, following the burden of excessive weight in women of childbearing age (1). The glycemic status of a woman is crucial to reduce unfavorable outcomes for mothers and fetuses. Overt diabetes, hyperglycemia reaching non-pregnant criteria for diabetes and first diagnosed in pregnancy (2), can be as hazardous as the already known type 2 diabetes (1). Undiagnosed diabetes is not uncommon in adults (3), but few data exist regarding this condition in pregnancy. In a Canadian study, 2.6% of 68 163 women evaluated up to one year after gestational diabetes presented a diagnosis of type 2 diabetes, pointing to a likely diagnosis of overt diabetes in pregnancy (4); in a Brazilian cohort, 48 of the 224 pregnant women with hyperglycemia (21.4%) fulfilled diagnostic criteria for overt diabetes (5). Larger studies on type 2 diabetes characteristics and outcomes in pregnancy excluded women with overt diabetes (6,7).
Several predictors have been proposed to assess the risk of diabetes in adults (8-10). Many questionnaires set 40 (9) or 45 years (8,10) as the lowest age, which limits their applicability to most women of childbearing age. Moreover, except for previous gestational diabetes, no other questions on obstetric antecedents appear in those questionnaires. Therefore, we aimed to: describe the frequency of diabetes first diagnosed in pregnancy, i.e., overt diabetes; compare characteristics of women with overt diabetes to those with known pregestational type 2 diabetes; and evaluate factors that could identify, before pregnancy, women at risk of presenting a diagnosis of overt diabetes.
Risk factors for overt diabetes Arch Endocrinol Metab, 2023, v.67(5), 1-9, e000628.

SUBJECTS AND METHODS
In this retrospective cohort study, we included all pregnant women receiving high-risk antenatal care in the two major public hospitals in Porto Alegre, Brazil (Hospital de Clínicas de Porto Alegre [HCPA] (11) and Hospital Nossa Senhora da Conceição (12)), from May 20, 2005, to June 30, 2021.
The study protocol was approved on July 28, 2016 (number 16-0331) and registered in Plataforma Brasil, CAAE 57365016.3.0000.5327; all authors signed a data use agreement form to ensure the privacy of data collected from medical registries.
We included all women with known pregestational type 2 diabetes and all those fulfilling the 2013 World Health Organization criteria for overt diabetes (13) and/or with glycated hemoglobin (HbA1c) ≥ 6.5% (9). We did not exclude women with twin pregnancies. For women who became pregnant more than once during the study period, we included data from the first pregnancy only. We excluded women with type 1 diabetes, gestational diabetes, or an unclear diagnosis of hyperglycemia. A multi-professional team provided care at both hospitals.
We followed the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) statement to write the manuscript (14).
Duration of diabetes and pre-pregnancy weight were self-reported. We categorized skin color as white or nonwhite, and education as more than 11 years or 11 years or less of formal schooling. Diabetes complications, smoking, family history of diabetes or chronic hypertension, personal history of hypertension, previous gestational diabetes, and previous macrosomia (birth weight ≥ 4,000 g) were considered positive when reported in the hospital charts. Family history of diabetes or hypertension referred to first- or second-degree relatives. Absent information on these variables was labeled as negative.
Height and weight were measured at the first prenatal appointment. Pregestational BMI was calculated as the self-reported pre-pregnancy weight, in kilograms, divided by the square of height, in meters, and women were classified as having normal BMI, overweight, or obesity (15).
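The BMI calculation and three-level grouping described above can be sketched as follows. This is an illustration only: the cut points (25 and 30 kg/m²) are the widely used World Health Organization thresholds and are assumed here, since the text defers to reference (15) without stating them; the function names are hypothetical.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Three-level grouping used in the text.

    Assumed cut points: >= 30 obesity, 25 to < 30 overweight, < 25 normal.
    Values below 18.5 are not separated out, mirroring the three groups named
    in the text.
    """
    if value >= 30.0:
        return "obesity"
    if value >= 25.0:
        return "overweight"
    return "normal"


# Hypothetical example: 80 kg reported pre-pregnancy weight, 1.60 m measured height.
example_bmi = bmi(80.0, 1.60)
example_group = bmi_category(example_bmi)
```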
HbA1c was measured at booking and labeled as initial HbA1c, regardless of gestational age. Assays were conducted with high-performance liquid chromatography (Variant II Turbo HbA1c; BioRad, Hercules, CA, USA) in line with the National Glycohemoglobin Standardization Program guidelines (http://www.ngsp.org/index.asp).
We calculated the frequency of women with overt diabetes and compared their baseline characteristics to those with known pregestational type 2 diabetes using univariable analysis.
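A frequency with its 95% confidence interval can be computed along the lines below. The interval method is not named in the text, so the Wilson score interval is an assumption. The count of 212 women with overt diabetes is not stated directly either; it is inferred from the percentages reported in the Results and should be read as illustrative.

```python
import math
from typing import Tuple


def wilson_interval(events: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (assumed method, see lead-in)."""
    if total <= 0 or not 0 <= events <= total:
        raise ValueError("need 0 <= events <= total and total > 0")
    p = events / total
    denom = 1.0 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half_width, center + half_width


# Inferred, not reported: 212 of 646 women with overt diabetes.
low, high = wilson_interval(212, 646)
print(f"{212 / 646:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Under these assumptions the bounds round to 29% and 37%, in line with the interval given in the abstract.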
We applied two approaches to evaluate possible risk factors with overt diabetes as the dependent variable. The machine learning technique was used as an exploratory tool, while multivariable analyses estimated relative risks for each factor.
The machine learning analyses included all risk factors in the model, and the program arranged them in layers. Data were tested using either continuous or categorized variables to generate the models. Cross-validation was used as the sampling method. In cross-validation, the dataset is divided into a selected number of folds; each fold is used in turn to test a model trained on the remaining folds. We chose the number of folds based on the best resulting area under the curve (AUC). The decision tree model was chosen as the algorithm. Besides AUC, precision (positive predictive value) and recall (sensitivity) were also retrieved.
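The fold mechanics and the two retrieved metrics can be sketched in plain Python as below. This is not the Orange workflow used in the study: the interleaved, unshuffled fold assignment and the function names are assumptions made for illustration, and no decision tree is fitted here.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, n_folds: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train, test) index lists; each fold serves as the test set once."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    # Interleaved assignment without shuffling (an assumption for this sketch).
    folds = [list(range(start, n_samples, n_folds)) for start in range(n_folds)]
    splits = []
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((train, test))
    return splits


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision (positive predictive value) and recall (sensitivity) for labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In use, a model would be trained on each `train` list and scored on the matching `test` list, and the metrics would then be pooled or averaged across folds.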
After this preliminary analysis, we ran models using the ADA's "Are you at risk for type 2 diabetes" questionnaire (9) as the template. We entered age, continuous or dichotomized; previous gestational diabetes (no/yes); family history of diabetes (no/yes); personal history of chronic hypertension (no/yes); and pregestational BMI, continuous or categorized as normal, overweight, or obesity. Two items of the questionnaire were excluded: question 2 (gender) and question 6 (physical activity, information not available in the dataset). Obstetric variables included the number of deliveries, continuous or dichotomized; previous miscarriage (no/yes); and previous macrosomia (no/yes). Women with overt diabetes were compared to those with pregestational type 2 diabetes by estimating relative risks for the main risk factors.
Three models were chosen. The first included all risk factors; then, we included only variables of the ADA's risk questionnaire; and finally, we evaluated a combination of ADA's risk score plus risk factors related to pregnancy: number of deliveries dichotomized as ≥ 2 and history of previous macrosomia.
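The predictor sets for the second and third models can be written down as in the sketch below. All field and variable names are hypothetical. Age is dichotomized at > 30 years, the split reported in the Results, and BMI is kept continuous for brevity. The first model (all risk factors) is omitted because its full variable list is not enumerated in the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Record:
    """One woman's risk-factor data (hypothetical field names)."""
    age_years: float
    previous_gdm: bool
    family_history_diabetes: bool
    chronic_hypertension: bool
    pregestational_bmi: float
    deliveries: int
    previous_macrosomia: bool


# Model 2: the five retained items of the ADA questionnaire.
MODEL_2: List[str] = [
    "age_over_30", "previous_gdm", "family_history_diabetes",
    "chronic_hypertension", "pregestational_bmi",
]
# Model 3: Model 2 plus the two pregnancy-related factors.
MODEL_3: List[str] = MODEL_2 + ["deliveries_ge_2", "previous_macrosomia"]


def encode(record: Record) -> Dict[str, float]:
    """Turn a record into numeric predictors (no/yes coded as 0/1)."""
    return {
        "age_over_30": float(record.age_years > 30),
        "previous_gdm": float(record.previous_gdm),
        "family_history_diabetes": float(record.family_history_diabetes),
        "chronic_hypertension": float(record.chronic_hypertension),
        "pregestational_bmi": float(record.pregestational_bmi),
        "deliveries_ge_2": float(record.deliveries >= 2),
        "previous_macrosomia": float(record.previous_macrosomia),
    }


def model_row(record: Record, predictors: List[str]) -> List[float]:
    """Select the predictors of one model, in the model's own order."""
    encoded = encode(record)
    return [encoded[name] for name in predictors]
```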
Statistical analyses were performed with SPSS version 18.0 (SPSS, Chicago, IL, USA). Results were expressed as mean ± standard deviation (SD) or median (interquartile range, IQR), depending on whether the distribution was normal according to the Shapiro-Wilk test, or as number (percentage). The Student t-test, the chi-square test (coupled with the Z test for comparison of proportions, with Bonferroni correction when appropriate), and the Mann-Whitney U test were used to compare baseline characteristics of women with overt diabetes to those with pregestational diabetes.
We used the Orange Workflow version 3.30.2 for machine learning analyses; relative risks (RR) with 95% confidence intervals (CI) were estimated using Poisson regression in SPSS.
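To illustrate what a relative risk with a 95% CI expresses, the sketch below computes a crude (unadjusted) RR from a 2x2 table with a log-scale interval. It is a simplification and not the adjusted Poisson regression run in SPSS; the function name and the counts in the example are hypothetical.

```python
import math
from typing import Tuple


def crude_relative_risk(
    events_exposed: int, n_exposed: int,
    events_unexposed: int, n_unexposed: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Unadjusted relative risk with a log-scale confidence interval.

    Returns (rr, lower, upper). All four counts must be positive.
    """
    if min(events_exposed, n_exposed, events_unexposed, n_unexposed) <= 0:
        raise ValueError("all counts must be positive")
    rr = (events_exposed / n_exposed) / (events_unexposed / n_unexposed)
    # Standard error of log(RR) for two independent binomial proportions.
    se = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Hypothetical counts: 30 of 100 exposed vs. 20 of 100 unexposed with the outcome.
rr, lower, upper = crude_relative_risk(30, 100, 20, 100)
print(f"RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An RR below 1 corresponds to a lower risk in the exposed group (for example, 0.67 means a 33% lower risk), and an RR above 1 to a higher risk.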

RESULTS
We enrolled 648 women; we excluded two due to missing information on diabetes duration. Data on 127 women with type 2 diabetes from HCPA have been previously described (16).
The baseline characteristics of women are in Table 2. The median number of deliveries was 1.0 [IQR 1.0-2.0]. One hundred and one women (47.6%) with overt diabetes and 206 (47.5%) women with pregestational diabetes had ≥ 2 deliveries (p = 1.000).
Results of the machine learning algorithm are displayed in Table 3 and illustrated in Figure 1. AUC was under 0.6 in all models; we chose the three models with the highest AUCs. Model 1 included, besides the classical risk factors of ADA's questionnaire, demographic characteristics plus previous macrosomia. In Model 1, in women ≤ 30 years, the number of deliveries appeared at the second level, followed by previous macrosomia in those with ≥ 2 deliveries. A family history of diabetes appeared at the second level in women > 30 years, followed by BMI category in those without a family history of diabetes, and by previous macrosomia in those with a family history of diabetes. Demographic characteristics appeared at the fourth level.
Model 2 comprised the five items of ADA's questionnaire; AUC was 0.581. In this model, chronic hypertension was the second risk factor in women ≤ 30 years, while in those > 30 years, a family history of diabetes was the second.
In Model 3, we added previous macrosomia and ≥ 2 deliveries to Model 2. AUC was 0.576. In women ≤ 30 years, the number of deliveries appeared at the second level, and previous macrosomia appeared at the third level in those with ≥ 2 deliveries. Family history of diabetes remained at the second level in women > 30 years; previous macrosomia appeared at the third level in those with a family history of diabetes, while BMI was at the third level in those without.
The results of the multivariable analysis are in Table 4. Age > 30 years conferred a 33% lower risk of overt diabetes, and a family history of diabetes a 20% lower risk, while a history of previous macrosomia increased the risk by 32%. Chronic hypertension was not significant after adjustments.

DISCUSSION
Diabetes was unveiled for the first time in one-third of this cohort of women with diabetes. Baseline characteristics of women with known or unknown pregestational diabetes were similar; ~6.0% of women presented chronic complications of the disease, mainly those with pregestational diabetes. An age of > 30 years and a positive family history of diabetes were inversely associated with the diagnosis of overt diabetes; in younger women with two or more deliveries, and in those > 30 years with a family history of diabetes, previous macrosomia was a predictor of overt diabetes. Previous gestational diabetes and chronic hypertension did not discriminate between the groups.

In Brazil, diabetes occurs in 0.9% of women aged 18-24 years and in 5.7% of those aged 35-44 years (17). In the ELSA-Brasil study, 5.1% of women aged 35-44 years were diagnosed with diabetes; 56.9% were unaware of having hyperglycemia (18), in contrast to lower figures found in the American population (3). One-third of the participants were unaware of the diagnosis in our study, mirroring numbers reported for non-pregnant women. As expected, women with overt diabetes arrived later to specialized prenatal care, adding to the risk of hyperglycemia-related fetal malformations. Why were they not diagnosed earlier? A delayed diagnosis may be explained by the low socioeconomic and/or educational profiles in this sample, as illustrated by the high frequency of low schooling and the low frequency of pregnancy planning. Non-planned pregnancies are not exclusive to women with diabetes; in a cohort of Southern Brazilian women, 52.2% said their pregnancies were unplanned (19), mainly in the lower socioeconomic stratum. Could diabetes have been diagnosed before pregnancy? We believe so, had some risk factors been sought.
Age is a relevant risk factor for type 2 diabetes and the first query in most questionnaires (8-10). Women presenting overt diabetes were only one year younger than those with pregestational diabetes here, a difference that does not explain why these women were not diagnosed before pregnancy. Current guidelines recommend that women < 35 years be screened for diabetes only in the presence of excessive weight or when pregnant. Age was the first risk factor to be dichotomized by the machine learning technique in our study; it was an independent risk factor in most multivariable analyses.
Excessive weight is probably the most relevant risk factor related to diabetes, both running in parallel in several world regions (20). Overweight and obesity rates are rapidly growing among women of childbearing age. A Brazilian survey revealed that obesity was present in 11.2% of women aged 18 to 24 years and in ~26.0% of those aged 35 to 44 years; overweight ranged from 31.7% to 61.9% (17). More than 70% of women presented with obesity here. BMI is part of all risk scores for diabetes screening in women of childbearing age, both those designed to detect diabetes by universal screening (21) and those designed to predict the risk of diabetes in women with prior gestational diabetes (22). The dyad of excessive weight and lower age recently prompted the United States Preventive Services Task Force to lower the age for diabetes screening to 35 years (23); had this rule been applied here, one-third of the women with overt diabetes would have been screened earlier.
Gestational diabetes is a well-known risk factor for developing type 2 diabetes in the future (24). A history of gestational diabetes, although present in ~30% of women here, was not discriminative, in contrast to the findings of others (24): in machine learning models, it appeared only at the third or fourth level, and it was not significant in multivariable analyses. A positive history of gestational diabetes was a relevant factor in a Mexican
A family history of diabetes points to an underlying genetic and/or environmental factor. In adjusted models, a positive family history of diabetes was inversely associated with overt diabetes, probably because having cases of diabetes in the family led women to seek earlier screening; it appeared only at the second or third levels in the machine learning algorithms. Family history almost doubled the chance of undiagnosed diabetes in one study that compared people with and without diabetes (26).
Chronic hypertension was more common in women with pregestational diabetes, although they were neither older nor heavier. They had had diabetes for at least four years, and diabetes complications were more frequent among them. Chronic hypertension was present in 17.6% of women aged 35 to 44 years in Brazil (17), similar to that found in younger women with overt diabetes, but lower than the frequency in women with pregestational diabetes. Hypertension was not relevant in adjusted models here, in contrast to the findings of others (26), which included only non-pregnant women and older participants; it was also not relevant in a cohort of Mexican women of childbearing age (25).
Delivery of macrosomic babies is linked to maternal weight and hyperglycemia (27) and was associated with an increased risk of future type 2 diabetes, irrespective of earlier gestational diabetes (28). The frequency of macrosomic babies born to women without type 2 diabetes was 5.9% in one study (29) and 7.8% in another (30). In our cohort, previous macrosomia occurred in 11.5% of the women without previous gestational diabetes and in 38.2% of those with prior gestational diabetes, and it was associated with an increased risk of overt diabetes.
The main message of this study is that it is not enough to measure glycemia or HbA1c in the first trimester of pregnancy. We dare to say that screening for type 2 diabetes has to begin before conception to ensure the benefits that the knowledge of having diabetes can provide to women of childbearing age (31). Chronic complications were probably unexpected given the short duration of diabetes in both groups; nevertheless, they occurred in ~6.0% of women. Diabetes complications were reported in 0.7% of women with overt diabetes in another study (32), compared to 0.9% here, and in 3.2% of those with known pre-pregnancy type 2 diabetes (32), compared to 5.5% here.
Based on risk factors, women could be diagnosed before they become pregnant. Risk scores based on ADA's questionnaire and including a lower age stratum (30 years) plus the information on the delivery of a macrosomic baby might help to identify diabetes in childbearing-age women.
Our study has strengths: we evaluated a large group of women with hyperglycemia in pregnancy and compared them to those with known pregestational diabetes. Specific risk factors were found according to the age of women, despite the similarity of the groups. The results also suggest that screening for type 2 diabetes should start earlier in women of childbearing age.
Limitations of the study must be cited. We included data retrieved from medical registries; we assumed risk factors not recorded in medical charts as absent, probably underestimating their actual frequency. Regarding the lack of information on lifestyle, in a Brazilian survey, only ~35% of women of childbearing age declared they exercised regularly, and less than 40% ate fruits and vegetables five or more days a week (17). We therefore assumed that lifestyle information would not significantly impact our results. Low precision estimates, like the low AUCs found for the risk factor models, may reflect the similarity between the two groups. We could not re-evaluate women with overt diabetes after delivery to confirm the diagnosis of diabetes, nor could we diagnose potential cases of Maturity-Onset Diabetes of the Young (MODY) among women classified as having type 2 diabetes, owing to technical limitations in carrying out genetic tests as part of routine care. MODY accounts for only ~1% of pregnancy-associated diabetes (33), and we believe this limitation did not impact our results. Lastly, we did not test our risk score in pregnant women without diabetes; it would probably have performed better had we included these women.
In conclusion, classic risk factors could identify women at risk of type 2 diabetes before they become pregnant. Setting a lower age cut point and including the previous delivery of a macrosomic baby in the current screening questionnaires could improve their performance in reproductive-aged women.