The prediction of colorectal cancer: Perspective of smoking and socioeconomic influence of culture in East Java with SEM analysis

Purpose: Colorectal cancer (CRC) accounts for 10.2% of cancer cases worldwide, ranking third among all cancers. According to Indonesia Health Research, cancer prevalence in Indonesia has risen over the past 5 years, reaching 1.8 per mille in 2018. This study explores the relationship model between smoking, socioeconomics, and CRC, considering cultural moderation. Design/Methodology/Approach: Conducted as a retrospective study with


INTRODUCTION
Colorectal cancer (CRC) ranks third among malignancies worldwide, with 1,849,518 cases (10.2%). It is the second leading cause of cancer death after lung cancer, with 880,792 deaths (9.2%) (Bishehsari et al., 2018; Sung et al., 2021). Indonesian Health Research (Riskesdas) reports that cancer cases in Indonesia have increased in recent years, from 1.4 per mille in 2013 to 1.8 per mille in 2018 (Kesehatan, 2018). According to Globocan 2020, the incidence of CRC in Indonesia ranks second in men, after lung cancer, and in general number (Sung et al., 2021). Smoking, both active and passive, is one of the risk factors for CRC. The pathophysiological link between smoking and colorectal cancer has been established in biomedical research, whereas epidemiological studies report differing results. Research in Singapore found that smoking increases the risk of CRC, with OR = 1.43 in light smokers and 2.64 in heavy smokers (Tsong et al., 2007). Research in Thailand showed no significant

Concept
This study uses a case-control design. There are two independent variables (smoking and socioeconomics), one dependent variable (colorectal cancer), and one moderator variable (culture). The collected data were subjected to model fit and hypothesis tests. The inclusion criteria were a CRC diagnosis confirmed by adenocarcinoma histopathology (cases), age 40 to 70 years, and being a resident of and born in East Java. The exclusion criteria were dementia, breast cancer, prostate cancer, and cervical cancer.

Smoking
Smoking is the activity of deliberately smoking a cigarette (active) or being in the same room as an active smoker (passive). The smoking variable comprises four indicators: smoking status, age at smoking initiation, passive smoking, and type of cigarette. Each is measured on an ordinal scale, with the highest score indicating the poorest smoking behavior.

Socioeconomics
Socioeconomics refers to the environmental conditions surrounding the patients and controls. This variable comprises four indicators: CRC knowledge, income, poverty, and education. Each is measured on an ordinal scale, with the highest value indicating the poorest economic conditions.

Culture
Culture serves as the moderating factor in this study. It is inherent to the study participants and unalterable, and it covers the three main cultural areas of East Java Province, Indonesia. It is a nominal variable, where 1 represents Arek culture, 2 represents Mataraman culture, and 3 represents Pendalungan culture.
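The coding scheme described above can be sketched as follows. This is only an illustration: the culture codes come from the text, whereas the numeric scores for smoking status are an assumption (the text states only that a higher score means poorer behavior), and the names `Culture`, `SMOKING_STATUS_SCORE`, and `encode_respondent` are hypothetical.

```python
from enum import IntEnum


class Culture(IntEnum):
    """Nominal codes for the moderator, as stated in the text."""
    AREK = 1
    MATARAMAN = 2
    PENDALUNGAN = 3


# Hypothetical ordinal scores (assumption): the text only says that a higher
# score means poorer smoking behavior, not which numbers were used.
SMOKING_STATUS_SCORE = {
    "not smoking": 1,
    "ex-smoker": 2,
    "rare smoker": 3,
    "active smoker": 4,
}


def encode_respondent(culture_name: str, smoking_status: str) -> dict:
    """Turn questionnaire labels into numeric codes for modelling."""
    return {
        "culture": int(Culture[culture_name.upper()]),
        "smoking_status": SMOKING_STATUS_SCORE[smoking_status.lower()],
    }


print(encode_respondent("Mataraman", "Active smoker"))
```

Because culture is nominal, its codes 1 to 3 are labels rather than magnitudes, while the ordinal scores carry an order but not necessarily equal spacing.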

Ethics
The Committee ensures that this research adheres to ethical guidelines, emphasizing voluntary participation, informed consent, data security, confidentiality, and sensitivity to cultural aspects. Approval was granted after a thorough consideration of the ethical implications of the proposal, demonstrating alignment with ethical standards and a commitment to safeguarding participant rights and well-being. On June 14, 2022, the Ethics Committee at Saiful Anwar General Hospital in Malang, East Java, Indonesia, reviewed and approved this research proposal titled "Predictive Model of Colorectal Cancer Risk Factors in the Perspective of Cultural Dietary Patterns, Smoking, and Socioeconomic Factors in East Java".

Statistics
The sample was selected by non-random (purposive) sampling, taking all subjects who met the criteria from several hospitals in East Java, including RSUD (General Hospital) dr. Saiful Anwar Malang, RSUD dr. Soetomo Surabaya, RSUD dr. M Soewandi Surabaya, RSUD dr. M Iskak Tulungagung, and RSUD dr. Soebandi Jember. According to Hair, Hult, Ringle, and Sarstedt (2017), the sample size should ideally reach 100 or more. In this study, there are 4 items related to smoking, 6 items for socioeconomic questions, and 1 each for cultural and cancer diagnosis questions, resulting in a minimum required sample size of 180.
Before conducting the analysis, it is necessary to check the model variables and perform prerequisite tests. The variable model checks are carried out with the WarpPLS 7.0 application. They test outer weights and collinearity (VIF) for the formative variables (smoking and socioeconomic) and, for the reflective variables (culture and CRC), the outer loadings, average variance extracted (AVE), and composite reliability (Hair et al., 2017). After the variable model check, the next step is the prerequisite test, namely the Goodness of Fit (GoF) with a minimum of two model fit indicators such as Sympson's paradox ratio (SPR), R-squared contribution ratio (RSCR), Statistical suppression ratio (SSR), and Nonlinear bivariate causality direction ratio (NLBCDR). These indicators are used to evaluate the quality of the model (Kenny, 2020).
Subsequently, the SEM model test is conducted using the WarpPLS 7.0 application to determine the relationship values between the independent and dependent variables in the previously designed model. The relationships form a single model, with colorectal cancer (CRC) as the focal point and culture as the moderator across all cultural groups. This model is illustrated in Figure 1.
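As a rough illustration of the measurement checks named above, the sketch below computes the textbook versions of three quantities: a VIF for the special case of two indicators (where R-squared equals the squared correlation), AVE, and composite reliability. It is not WarpPLS's implementation, which may differ in detail. The function names and the example inputs are made up for illustration and are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_indicators(x, y):
    """VIF = 1 / (1 - R^2); with two indicators R^2 is the squared correlation."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized outer loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings) ** 2
    error = sum(1.0 - l * l for l in loadings)
    return total / (total + error)


# Made-up illustrative inputs, not data from this study.
example_loadings = [0.8, 0.7, 0.9]
print(round(average_variance_extracted(example_loadings), 3))
print(round(composite_reliability(example_loadings), 3))
print(round(vif_two_indicators([1, 2, 3, 4], [2, 1, 4, 3]), 3))
```

With more than two indicators, each VIF would instead come from regressing one indicator on all the others.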

Characteristic of Study
Table 1 shows the distribution of CRC patients among the hospitals in the region, and Table 2 provides various statistics on the demographic and health characteristics of the study population. Table 2 includes the average age of individuals from the three cultural groups, the percentage of individuals who smoke, the type of cigarettes they consume, and their socioeconomic status. It also reports the percentage of individuals with cancer and their CRC status. This information can help characterize the health profile of the study population and inform public health interventions. The CRC status data show that, of the 212 cases, 40 individuals (19%) have right-sided CRC, distributed as follows: caecum (C18.0) with 10 cases (5%), ascending colon (C18.2) with 22 cases (10%), hepatic flexure (C18.3) with 1 case (<1%), and transverse colon (C18.4) with 7 cases (3%). The majority of CRC cases, 172 individuals (81%), occur on the left side, including descending colon (C18.6) with 9 cases (4%), sigmoid colon (C18.7) with 34 cases (16%), rectosigmoid junction (C19) with 37 cases (17%), and rectum (C20) with 92 cases (43%).
Smoking status was recorded for the same 212 individuals. In terms of current smoking habits, 96 individuals (45%) are categorized as non-smokers, 23 (11%) as ex-smokers, 36 (17%) as rare smokers, and 57 (27%) as active smokers. The socioeconomic data outline the poverty levels within the study population: 102 individuals (48%) are categorized as low poverty, 66 (31%) as middle-low poverty, 32 (15%) as middle-high poverty, and 12 (6%) as high poverty.
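The counts and percentages above can be cross-checked with a few lines of arithmetic. The sketch below uses only counts printed in the text; the helper name `share_of_total` is hypothetical, and rounding to whole percentages is an assumption about how the reported figures were produced.

```python
def share_of_total(count, total):
    """Percentage of the total, rounded to a whole number."""
    return round(100 * count / total)


TOTAL_CASES = 212  # total reported in the text

# Counts by tumour site as reported in the text.
right_sided = {"caecum": 10, "ascending colon": 22,
               "hepatic flexure": 1, "transverse colon": 7}
left_sided = {"descending colon": 9, "sigmoid colon": 34,
              "rectosigmoid": 37, "rectum": 92}

right_total = sum(right_sided.values())
left_total = sum(left_sided.values())
assert right_total + left_total == TOTAL_CASES

print("right-sided:", right_total, share_of_total(right_total, TOTAL_CASES))
print("left-sided:", left_total, share_of_total(left_total, TOTAL_CASES))

# A remaining category can be cross-checked as the remainder of the total.
other_smoking = {"ex-smoker": 23, "rare smoker": 36, "active smoker": 57}
non_smokers = TOTAL_CASES - sum(other_smoking.values())
print("non-smokers:", non_smokers, share_of_total(non_smokers, TOTAL_CASES))
```

The same subtraction applies to the poverty categories, where the three printed counts (102, 32, and 12) leave a remainder for the middle-low group.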

Analysis of Study
The analysis of this study was conducted using the WarpPLS application and consists of two steps: a prerequisites test and the PLS analysis.

Prerequisites Test
Based on Table 3, two indicators do not meet the criteria and must therefore be excluded from the analysis, namely X1.1 (smoking status) and X1.4 (cigarette type), because they fail the requirement of a VIF value below 5 with a P-value < 0.05. Specifically, X1.1 (smoking status) has a VIF of 15.177 (P-value < 0.001), while X1.4 (cigarette type) has a VIF of 8.361 (P-value < 0.001). Excluding them is necessary because these two indicators could introduce errors or lead to misinterpretation during the analysis.

Model fit indices (Table 5):
Tenenhaus GoF (GoF) = 0.324; small >= 0.1, medium >= 0.25, large >= 0.36; reported as: medium
Sympson's paradox ratio (SPR) = 1.000; acceptable if >= 0.7, ideally = 1; reported as: ideal
R-squared contribution ratio (RSCR) = 1.000; acceptable if >= 0.9, ideally = 1; reported as: ideal
Statistical suppression ratio (SSR) = 0.636; acceptable if >= 0.7; reported as: acceptable
Nonlinear bivariate causality direction ratio (NLBCDR) = 0.818; acceptable if >= 0.7; reported as: acceptable

Based on Table 5, the model fit is supported by the SPR value of 1, which is ideal, and the NLBCDR value of 0.818, which falls within the acceptable range. Therefore, it can be concluded that the model meets the requirements for conducting PLS analysis.
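The screening and model fit rules quoted above can be expressed as simple threshold checks. The sketch below is an illustration under the cut-offs stated in the text, not output from WarpPLS; the function names are hypothetical. Applied literally, the 0.7 cut-off places SSR = 0.636 below the acceptable range, so that row may be worth rechecking against the original Table 5.

```python
import math


def passes_vif_screen(vif, limit=5.0):
    """An indicator is retained only if its VIF is below the limit."""
    return vif < limit


def classify_gof(value):
    """Tenenhaus GoF bands quoted in the text: small, medium, large."""
    if value >= 0.36:
        return "large"
    if value >= 0.25:
        return "medium"
    if value >= 0.1:
        return "small"
    return "below small"


def classify_ratio(value, acceptable, ideal=None):
    """'ideal' at the ideal value, 'acceptable' at or above the cut-off."""
    if ideal is not None and math.isclose(value, ideal):
        return "ideal"
    return "acceptable" if value >= acceptable else "below cut-off"


# VIF values reported for the two excluded indicators.
screen = {"X1.1": passes_vif_screen(15.177), "X1.4": passes_vif_screen(8.361)}

# Values and cut-offs copied from the model fit output in the text.
fit = {
    "GoF": classify_gof(0.324),
    "SPR": classify_ratio(1.000, acceptable=0.7, ideal=1.0),
    "RSCR": classify_ratio(1.000, acceptable=0.9, ideal=1.0),
    "SSR": classify_ratio(0.636, acceptable=0.7),
    "NLBCDR": classify_ratio(0.818, acceptable=0.7),
}

print(screen)
for name, label in fit.items():
    print(name, label)
```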

PLS Analysis
The following are the results of the PLS analysis using the WarpPLS application. Of the hypothesized paths, three are significant and one is non-significant at the 5% significance level, as explained below. Figure 2 and Table 6 cover four hypotheses (H1-H4) testing the relationships between the study variables and CRC. The first two hypotheses (H1-H2) test the direct paths from smoking and from socioeconomic status to CRC. H1 shows a significant positive relationship between smoking and CRC (sig. 0.040 < α 0.05). H2 shows a significant positive relationship between socioeconomic status and CRC (sig. 0.001 < α 0.05). The last two hypotheses (H3-H4) test the indirect paths in which culture moderates the effects of smoking habits and socioeconomic status, respectively, on CRC. H3 is non-significant (sig. 0.318 > α 0.05), while H4 is significant (sig. 0.047 < α 0.05).
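The decision rule applied to each path is a comparison of its P-value with α = 0.05. A minimal sketch using the values reported in this article follows; the dictionary layout and the function name are hypothetical, and the H2 coefficient is left as None because this section does not print it.

```python
ALPHA = 0.05

# (beta, p-value) per path; beta is None where the text does not print it.
paths = {
    "H1 smoking -> CRC": (0.081, 0.040),
    "H2 socioeconomic -> CRC": (None, 0.001),
    "H3 culture x smoking -> CRC": (0.022, 0.318),
    "H4 culture x socioeconomic -> CRC": (0.078, 0.047),
}


def is_significant(p_value, alpha=ALPHA):
    """A path is treated as significant when its P-value is below alpha."""
    return p_value < alpha


decisions = {name: is_significant(p) for name, (_, p) in paths.items()}

for name, significant in decisions.items():
    print(name, "significant" if significant else "not significant")
```

This reproduces the pattern described in the text: three significant paths (H1, H2, H4) and one non-significant path (H3).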

Association of Smoking with CRC
Hypothesis H1, which posits a positive association between smoking habits and CRC prediction in East Java Province (direct path), yields a significant result (P-value = 0.040 < α 0.05) with a positive coefficient of β = 0.081. This value is derived solely from the sub-variables age at first smoking (X1.2) and passive smoking (X1.3), because the sub-variables smoking status (X1.1) and cigarette type (X1.4) did not qualify for analysis. As a result, the prognosis of CRC deteriorates as smoking behavior becomes more detrimental, with early smoking initiation and exposure to passive smoke acting as the determining indicators in East Java. Hypothesis H3, which posits that culture positively moderates the association between smoking habits and CRC prediction in East Java Province (indirect path), is not supported: the culture variable is not robust enough to moderate or strengthen the relationship between smoking and the occurrence of CRC in East Java Province (P-value = 0.318 > α 0.05, β = 0.022).
The meta-analysis conducted by Botteri et al. (2020), involving 106 studies and 40,719 cancer cases, indicates that smoking is directly associated with an increased risk of colorectal cancer, while quitting smoking may reduce that risk. This study also examines the relationship of smoking and alcohol consumption with the risk of CRC by molecular subtype and pathological pathway. The results show that smoking (P-value < 0.001) and alcohol consumption (P-value < 0.012) are associated with a higher risk of CRC, particularly CRC with high microsatellite instability (MSI), BRAF mutation, KRAS wild-type, and high CpG island methylator phenotype (CIMP).
Another study investigated the relationship between smoking and the risk of CRC while taking into account sex, age, and the anatomical subsite of the cancer. It observed that both male and female smokers had a higher risk of CRC than non-smokers, and that smoking was associated with an increased risk of left colon and rectal cancer in both sexes. Although the association between smoking and right colon cancer was not statistically significant for men, women who had ever smoked showed a 20% higher risk of right colon cancer. Moreover, smoking dose and duration were directly linked to CRC risk, with no significant differences between the sexes. Based on these results, smoking was identified as an important risk factor for CRC, particularly for left colon and rectal cancer. Consequently, both male and female smokers should be encouraged to quit smoking to minimize their risk of developing CRC (Gram, Park, Wilkens, Haiman, & Le Marchand, 2020).
Research on how smoking status (active smoking) affects the prognosis of CRC remains scarce. The available research focuses on the influence of smoking duration on patient survival. Patients who smoked for more than 30 years have the highest risk of mortality (HR = 1.14; p = 0.0076) (Huang et al., 2023).
higher among individuals with higher levels of education, particularly among men, but there was no significant difference for rectal cancer. Trends in the incidence of colon cancer by education level varied over time, and the differences between groups decreased in recent years. The study also found that the incidence of colon cancer was higher among individuals with higher socioeconomic status, particularly among men, with the largest differences observed in distal colon cancer. However, the differences between socioeconomic groups were smaller in women.

Moderation of Culture with CRC
The cultural variable is a moderating variable, which affects (strengthens or weakens) the relationship between the independent and dependent variables. For hypothesis H3 (culture positively moderates the association between smoking habit and CRC prediction in East Java Province; indirect path), the relationship is not significant, with P-value 0.318 > α 0.05 and a coefficient of β = 0.022. This indicates that culture does not moderate the relationship between smoking and CRC. For hypothesis H4 (culture positively moderates the association between socioeconomic status and CRC prediction in East Java Province; indirect path), the relationship is significant, with P-value 0.047 < α 0.05 and a coefficient of β = 0.078, and it strengthens the existing relationship. Overall, the two moderation results differ from one another.
Treatment plans also vary across races and cultures. For example, Hispanic patients are less likely to receive surgery than White patients, while Asian patients are more likely to receive chemotherapy. Patients listed as "Other" in the race/ethnicity category are also less likely to receive surgery and chemotherapy. Therefore, the characteristics of each ethnicity differ significantly in terms of habits and physiological adaptation (Tramontano et al., 2020). Culture should also enhance the strength of community groups. A meta-analysis shows that peer support can significantly increase awareness of, and the intention to receive, colorectal cancer screening among ethnic minorities, making it an ideal choice for promoting screening, especially in diverse communities. Peer support intervention is recommended to promote screening among Asian Americans, who tend to build close-knit communities. The concept of peer counseling is worth promoting, for example through church-based peer counseling programs, but the challenges encountered require stronger management to keep such programs sustainable (Hu, Wu, Ji, Fang, & Chen, 2020).
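To make "strengthens or weakens" concrete, the sketch below shows how a moderation coefficient changes the slope of a predictor under a simple linear interaction model, y = b1*x + b3*(x*m). This is an assumption made for illustration: WarpPLS may fit nonlinear relationships, and culture is a nominal variable, so the standardized moderator value used here is hypothetical. The coefficients are those reported for smoking (β = 0.081 for the direct path and β = 0.022 for the moderation path), and because H3 was not significant, the slope differences shown should not be read as established effects.

```python
def conditional_slope(direct_beta, moderation_beta, moderator_z):
    """Slope of predictor -> outcome at a given standardized moderator value,
    assuming a linear interaction model y = b1*x + b3*(x*m)."""
    return direct_beta + moderation_beta * moderator_z


# Coefficients reported in the text for smoking (H1 direct, H3 moderation).
DIRECT_BETA = 0.081
MODERATION_BETA = 0.022

# Hypothetical moderator values one standard deviation below and above the mean.
for z in (-1.0, 0.0, 1.0):
    slope = conditional_slope(DIRECT_BETA, MODERATION_BETA, z)
    print(f"moderator z = {z:+.1f}: slope = {slope:.3f}")
```

A positive moderation coefficient raises the slope as the moderator increases, which is the sense in which a moderator strengthens a relationship; a negative one would weaken it.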

CONCLUSION
The article discusses the association between smoking, socioeconomic factors, and colorectal cancer (CRC) prediction, and its moderation by culture, in East Java Province. The study found a positive association between smoking habit and CRC prediction in men in East Java Province. The meta-analysis conducted by Botteri et al. (2020) supports this finding, demonstrating that smoking is associated with an increased risk of CRC, while smoking cessation may reduce the risk. Moreover, smoking and alcohol consumption are associated with a higher risk of CRC, particularly MSI-high, BRAF-mutated, KRAS wild-type, and CIMP-high CRC. The study results also indicate that smoking is associated with a higher risk of cancer that develops through the traditional or serrated pathological pathways. The study recommends that both male and female smokers be encouraged to quit smoking to minimize their risk of developing CRC.
The study also found a positive association between socioeconomic factors and CRC prediction in East Java Province. Socioeconomic factors and inequities such as poverty and lack of insurance affect the prediction of CRC, especially through the out-of-pocket expenses associated with the rising cost of chemotherapy, which make it very difficult for those in need to access screening, care, and further management of CRC. However, different findings were reported by Savijärvi et al. (2019), who found that the prediction of colon cancer was higher among individuals with higher levels of education, particularly among men, with no significant difference for rectal cancer. That study also found that the prediction of colon cancer was higher among individuals with higher socioeconomic status, particularly among men, with the largest differences observed in distal colon cancer. However, the differences between socioeconomic groups were smaller in women.
In conclusion, the study provides evidence that smoking and unfavorable socioeconomic conditions are factors exacerbating CRC in East Java Province. Detrimental cultural practices further compound the relationship between weak socioeconomic conditions and the severity of CRC. The study recommends that individuals be encouraged to quit smoking and that socioeconomic factors be addressed to minimize the risk of developing CRC. Cultural factors must be considered in preventing adverse CRC incidence, particularly under weak socioeconomic conditions. Further research is needed to confirm these findings and to investigate other potential risk factors for CRC prediction.

Figure 1. Relationship models using SEM with WarpPLS. Note: Figure 1 illustrates the proposed hypothesis structure of the SEM model to be conducted. It outlines the SEM relationship model between the independent variables, smoking and socioeconomic status, and the dependent variable, colorectal cancer (CRC). This is moderated by cultural factors in East Java Province, including Mataraman, Arek, and Pendalungan.

Figure 2. Results of relationship analysis using SEM with WarpPLS. Note: Figure 2 illustrates the layout of the hypothesis structure and the relationship values generated by the SEM model. It shows the SEM relationship model between the independent variables, smoking and socioeconomic status, and the dependent variable, colorectal cancer (CRC). Moderation is carried out by cultural factors in East Java Province, involving Mataraman, Arek, and Pendalungan.

Table 1. Spread of colorectal patients. The table depicts the distribution of colorectal cancer (CRC) patients among hospitals in the region, including Saiful Anwar General Hospital Malang, Iskak General Hospital Tulungagung, RSAL (Marine Forces Hospital) Ramelan, RS DKT (Army Hospital) Jember, and other hospitals. Saiful Anwar General Hospital Malang treats 96 patients with CRC, accounting for 55% of the total number of CRC patients in the region. Iskak General Hospital Tulungagung treats 31 patients (18%), RSAL Ramelan treats 20 patients (11%), DKT Hospital Jember treats 11 patients (6%), and other hospitals treat 17 patients (10%).

Table 2. Characteristic data of the control and sample groups. The table presents the mean age of individuals within three distinct cultural groups: Arek, Mataraman, and Pendalungan. The highest average age is in the Arek group at 54.4 years, followed by the Mataraman group at 53.6 years, with the lowest in the Pendalungan group at 51.4 years. The Arek culture group comprised 47 males (12%) and 46 females (12%), totaling 93 individuals. The Mataraman culture group comprised 40 males (10%) and 30 females (8%), totaling 70 individuals. The Pendalungan culture group comprised 26 males (7%) and 23 females (6%), totaling 49 individuals. The table also gives the percentage and count of individuals by CRC status, categorized into right-sided and left-sided CRC groups.

Table 3. The factor loading test results were used to determine the qualifying indicators for the variable.

Table 4. The variables determined after filtering out unqualified indicator variables. After that, the input variables in the application are reduced by the two unqualified variables. Subsequently, the analysis is conducted again, and the results are obtained as shown in Table 4. This table provides an overview of the variables in the analysis, including VIF, P-value, outer loading, AVE (average variance extracted), and composite reliability. The Culture variable is reflective, with an outer weight of 1.000, a VIF of 1.000, and a P-value < 0.001, qualifying it for further analysis. The Smoking variable (X1.2 and X1.3) is formative, with outer weights, VIF values, and P-values that all qualify for continued analysis. Similarly, the Socioeconomic variable (X2.1 to X2.4) is formative, with outer loading, VIF, and P-value meeting the criteria for further analysis. The Colorectal Cancer (CRC) variable and the moderation variables (Culture x Smoking and Culture x Socioeconomic) are reflective, each meeting the criteria for outer loading, VIF, and P-value. These results indicate that the variables in the analysis have acceptable levels of reliability and validity, supporting their inclusion in further assessments. The next step is to test the model fit by checking at least 5 significant indicators, such as:

Table 5. The model fit results were used to determine the goodness of fit (GoF) for the model used.

Table 6. The path model testing results were obtained using SEM with WarpPLS.