The inverse problem algorithm (IPA) uses mathematical calculations to estimate the expectation value of a specific index according to patient risk factor groups. The contributions of particular risk factors or their cross-interactions can be evaluated and ranked by their importance.
This paper quantified the potential risks from multiple biological factors by integrated case studies in clinical diagnosis via the IPA technique. Acting as artificial intelligence field component, this technique constructs a quantified expectation value from multiple patients’ biological index series, e.g., the optimal trigger timing for CTA, a particular drug in blood concentration data, the risk for patients with clinical syndromes.
Common biological indices such as age, body surface area, mean artery pressure, and others are treated as risk factors upon their normalization to the range from 1.0 to 1.0, with a non-dimensional zero point 0.0 corresponding to the average risk factor index. The patients’ quantified indices are re-arranged into a large data matrix. Next, the inverse and column matrices of the compromised numerical solution are constructed.
This paper discusses quasi-Newton and Rosenbrock analyses performed via the STATISTICA program to solve the above inverse problem, yielding the specific expectation value in the form of a multiple-term nonlinear semi-empirical equation. The extensive background, including six previous publications of these authors’ team on IPA, was comprehensively re-addressed and scrutinized, focusing on limitations, stumbling blocks, and validity range of the IPA approach as applied to various tasks of preventive medicine. Other key contributions of this study are detailed estimations of the effect of risk factors’ coupling/cross-interactions on the IPA computations and the convergence rate of the derived semi-empirical equation viz. the final constant term.
The main findings and practical recommendations are considered useful for preventive medicine tasks concerning potential risks of patients with various clinical syndromes.
This study quantified the potential risk from multiple biological factors by integrating six case studies on clinical diagnosis based on the IPA (inverse problem algorithm) approach. The latter has been widely used in modern research in the last decade due to its unique feature in predicting the biological index from a group of patients’ risk factors, immediately alerting medical doctors to take precautions against acute syndromes. The IPA uses mathematical calculations to estimate the expectation value of a specific index according to patient risk factor groups. Furthermore, the contributions of particular risk factors or their cross-interactions can be evaluated and ranked by their importance [1, 2, 3, 4, 5, 6]. Unlike the Taguchi optimization algorithm, which is used to optimize one purpose from a combination of multiple factors [7, 8, 9, 10], IPA directly estimates the expectation value from a group of data. Thus, IPA provides more quantified information than Taguchi’s approach, suggesting just the optimal combination of factors.
A practical application of IPA usually starts from the expectation value definition, which can be the concentration of a particular drug in the patient’s blood , the severity of coronary artery , carotid stenosis , or even the optimal CTA timing for head and neck scanning [5, 6] as well. The IPA is operated according to numerous data of the patient’s biological index. Thus, the precise estimation of expectation value is made by the numerical analysis of a specific inverse matrix. The rank of the original data matrix can be defined as [N M]. Then, the corresponding inverse matrix is expanded to [N M N M], yielding the solution via the IPA technique. For instance, for 300 patients with six risk factors, the original data matrix will be [300 22], with six risk factors corresponding to 22 terms of the semi-empirical formula. The corresponding inverse matrix has to be expanded to [6600 6600] in the computer’s memory buffer, yielding the compromised solution via quasi-Newton analysis  or Rosenbrock  analyses. However, such computations are hindered by the CPU’s analytical ability limitations. Further development succeeded in the last decade due to extra-computational powers obtained, so that IPA could be executed to proceed with the respective research.
Six IPA-related research topics are evaluated in this study. Each study used five, six, or seven biological indices, and the adopted patients’ number varied from 100 to 1001. In the overview of the IPA technique, we provided a flowchart to illustrate the general idea of how researchers apply IPA in application of artificial intelligence and how to convert the risk factor into dimensionless numerical digits with the normalization process. In the discussion section, we elaborate on the outcomes of the STATISTICA program, interpret risk factors’ cross-interactions, and discuss the IPA procedure’s convergence.
2.An overview of IPA
Figure 1 illustrates the flowchart of IPA operation in application of artificial intelligence. As depicted, the quantified expectation value of the project must be defined first, and then the number of risk factors should be preset. Noteworthy is that the adopted factors have to maintain their orthogonality to each other. In addition, the estimated expectation value must be verified through another group of patients’ data to ensure its accuracy. Any failure in verifying or checking the program outcomes (loss function, variance, or correlation coefficient) from STATISTICA necessitates to go back to the preliminary stage in re-defining the risk factors or increasing the number of patients’ data because the program may not converge to an acceptable range due to the limited data scope.
2.2Assigning the essential risk factors
The semi-empirical formula, as recommended by IPA, includes five to seven individual factors that cannot be derived from each other. For instance, weight and body mass index (BMI) cannot be assigned as two independent factors in one study since BMI can be derived from the patient’s weight and height W/H [kg/m]. In contrast, the systolic blood pressure (SBP) and diastolic one (DBP) can be integrated as MAP (SBP 2 DBP)/3, with the mean arterial pressure (MAP) is a resulting biological index for the risk factor . If the number of risk factors is less than five, the researchers are recommended to add body surface area (BSA) as additional factor since the pharmacokinetic model always relies heavily on the patient’s body surface according to the metabolic mechanism. In contrast, BMI dominantly attenuates the radioactive dose or imaging quality from the external point source. Thus, it is quite rarely used in most IPA-related studies.
2.3Defining the semi-empirical formula
Multiple terms are also fixed once the number of risk factors is determined. The semi-empirical formula contains only contributions from the factor and the cross-interactions between two factors. Therefore, neither triple (, or , etc.) nor quadruple (, or , etc.) cross-interactions among factors are considered, while all residual multiple cross-interactions are merged into the final constant term as minor oscillation to reach convergence of the numerical solution. The semi-empirical formulas for seven, six, and five terms, yielding the respective expectation values , , and , are given below:
The coefficient matrix  can be constructed via the inverse problem algorithm as defined below.
First assume that in a linear equation , is the input value, is the expected value, and is the sensitivity of to . Next, consider a similar correlation between the input data matrix (V) and its expected column matrix (Y) via the following column matrix of coefficients (A):
Assuming that is a standard loss function, we get:
Here the dataset matrix-vector is [case number coefficient term], while its transpose matrix is denoted as . Using L’Hospital rule, which implies that extreme values of any function correspond to zero values of its differential, this kernel function’s extremes are derived by assuming a zero first-order differential of the loss function in Eq. (7). Next, the coefficients’ column matrix is constructed via Eqs (5)–(9) and the inverse matrix (). The STATISTICA program facilitates deriving the minimum loss function (customized by the user) and the sought-for compromised solution.
2.4Normalization of risk factors
Each risk factor must be normalized to become dimensionless before its input into the STATISTICA program. This ensures that the contribution from each factor can be equally dealt with although technically the program can be run without normalization. Each risk factor needs to be converted into the same range between 1.0 and 1.0. Therefore, the averaged normalized data of all patients’ equals exactly 0.0. In addition, the minimal and maximal values become 1.0 and 1.0, respectively, after the normalization. The respective conversion has the following form:
As denoted, the is restricted from 1.0 to 1.0 and becomes dimensionless. For instance, the recorded weights for all patients’ groups are 45, 110, and 77.5 kg for minimal, maximal, and average values, respectively. Then, the newly converted values become 1.0, 1.0, and 0.0 after normalization. In addition, the original patient’s weight of 90 kg is eventually converted to 0.385. The normalization is crucial to eliminate the physical meaning and become a pure numerical digit for further computation in the STATISTICA program.
2.5Running the STATISTICA default program
STATISTICA 7.1 version  was run to realize the inverse program algorithm, yielding the kernel function . Correlations and cross-interactions among the risk factors were analyzed via user-defined regressions and first-order nonlinear models. Data from the primary group of specific patients were normalized and fed into the numerical tests for the loss function customization. Alternatively, Simplex or Rosenbrock pattern search techniques could yield converged solutions for these loss functions, while deriving the minimum loss function necessitated involving the Rosenbrock or quasi-Newton integrated approaches. Figure 2 depicts a typical STATISTICA program in function. The user needs to follow the suggested option and define the unique loss function to obtain the coefficients’ matrix according to the IPA technique. The STATISTICA program is fully compatible with EXCEL, so the original data can be calculated and arranged in EXCEL and then copied to STATISTICA for further analysis to save the processing time.
3.Integrated review of the IPA technique
3.1Integrated study of quantified prediction of expectation
|Expectation value||Serum creatinine after administering CMS||Coronary artery stenosis||Digoxin reading||Carotid stenosis||LRA/US||CTA-TT|
|Serum creatinine before administering CMS||MAP||Creatinine||MAP||Heart rate||HR|
|BUN||Sugar AC||Na ions||Sugar AC||CMS||CMD|
|Systolic blood pressure||LDL-C||K ions||CRP||Pre||Pre|
|Original patient’s number||70||93||168||217||216||802|
|Verified patient’s number||30||45||45||55||35||199|
|of actual vs. predicted line||0.986||0.795||0.892||0.875||0.931||0.897|
Table 1 shows the integrated evaluation of the IPA-related papers. As depicted, the expectation value may be the serum creatinine index after contrast administration for the patient with cardiac diagnosis  or digoxin reading after the patient was administered digoxin . Nevertheless, the expectation values in the same study could be either the maximal ratio of both left-and-right-arterial-to-upper sinuses (LRA/US) or suitable triggered timing of computer tomography angiography (CTA-TT) [5, 6]. Yet, the preliminary study of LRA/US helps to confirm the correlation among CTA-TT with other risk factors. Thus, a large group of patients was recommended to collect the data for further analysis of CTA-TT in the follow-up study. Eventually, the derived semi-empirical formula can provide instant estimation for patients who have undergone CTA examination. Most risk factors are biological indices collected in routine examination, such as Age, BSA, Sugar AC, or MAP. The number of patients for further verification is strongly suggested as 1/5 to the original patient’s number to create the database of STATISTICA 7.1. The loss function is defined as (OBS-PRED), whereas and are observed (OBS) and predicted (PRED) expectation values, respectively, in a clinical study (cf. Eq. (6)). A small loss function is always preferable to imply an excellent numerical analysis outcome and conclude with high variance, and coefficient of correlation, . Accordingly, serum creatinine can be precisely estimated after CM administration  since reaches as high as 0.986 (1.00 is the maximal), and even in the worst case among all, still holds 0.795 in the study of Pan et al. . The accurate estimation of either coronary artery or carotid stenosis helps cardiac doctors to grasp the principle in clinical diagnosis before having any interventional examination [2, 4]. Nevertheless, an appropriate trigger timing for CTA preset essentially reduces the exposed dose (CTA trigger timing range from 20 down to 2 sec) for patients who underwent routine examination .
3.2Interpretation of coefficients of risk factors
The obtained coefficients of risk factors from STATISTICA running imply the importance of the specific risk factor. The personal biological examination’s original data of risk factors are normalized to eliminate their dimensionality. Therefore, Age, BSA, MAP, and all other factors’ data become converted into integer values between 1.0 and 1.0. Thus, the derived coefficient of any specific risk factor can reflect its dominance in the semi-empirical formula. For instance, if coefficient of factor A is four times more significant than that of factor B, then the contribution of A exceeds that of B by four times. Moreover, factor B can be treated as a minor factor in the expectation value in preventive medicine.
3.3Cross-interactions among factors
In some special cases, the individual factor may not offer a dominant contribution to the expectation value. In contrast, cross-interactions among factors can strongly dominate the performing. According to IPA computational presumption, the cross-interaction between two factors (A and B) can be interpreted as A B and mathematically defined as a cross-product (A B) with a vertical vector to both A and B, as depicted in Fig. 3A. As clearly illustrated, A B has a vertical vector to both A and B, which points upward, whereas the directional vector of C D points downward, as shown in Fig. 3A. Furthermore, the assigned vector of either factor itself or cross-interaction among factors creates a specific path along the vector to follow for optimizing the compromised solution in the computational model . Thus, additional terms in the semi-empirical formula provide more alternative paths for optimizing the compromised solution.
3.4The IPA prediction convergence
A sizeable constant term in the semi-empirical formula is always preferable in the IPA predictions. Although it reduces the accuracy of estimation, it indicates the system stability in pursuing a compromised solution via numerical analysis. As illustrated in Fig. 3B, a constant term in the semi-empirical formula can be treated as the average expectation value of all individual patients. In contrast, contributions of other terms in the formula are either dominant (terms with large coefficients) or minor (terms with small ones) respectively to the outcome. Specifically, Fig. 3A shows a comparatively small constant value. In other words, other large coefficients might dominate the formula’s performance and cause more time to optimize the compromised solution. Figure 4 interprets the mathematical phenomena of convergence in IPA presumption. Once a large constant dominates the IPA performance, the compromised solution series may rapidly damp to a stable value. Otherwise, a long computational time is required for its convergence. However, a large constant also indicates a minor alignment that can be achieved in optimizing the solution. Noteworthy is that only in the study of Pan et al. , the rank of constant (rank 14/16) was less than any others, namely, ranks 6/16, 7/29, 5/16, 5/22, and 6/22, respectively in the studies [2, 3, 4, 5, 6]. Thus, the optimized loss function, (cf. Eq. (6)) in  was as low as 0.235, yielding a high correlation coefficient 0.986, whereas other loss function fluctuated about 2.014–4.408 values (cf. Table 1). In addition, the low loss function might be due to small constant term to have a large oscillated range in optimizing the final solution, whereas high loss function can be effectively suppressed by increasing the number of patient’s data, since more original data help greatly in constructing the correct coefficient matrix for solving IPA.
3.5The application of IPA in preventive medicine
A simple visualization of the IPA technique prospects in preventive medicine can be obtaining by plotting the IPA calculated outcomes via a ladder diagram [4, 6]. In doing so, three dominant risks are assigned as X-, Y-, and Z-axis, respectively. The other risk factor is set as 0.0 after normalization because 0.0 implies an average value of that specific factor (cf. Eq. (10)). Thus, the preset scenario describes a general case of patients. Figure 5, (A) presents IPA-based timing optimization of head and neck CTA for 1001 patients in 2020–2021, whereas the respective ladder diagram represents BSA (from 1.19 to 2.36 m) in the X-axis, MAP (from 40 to 158 mmHg) in the Y-axis, and age (from 30 to 96 years) in the X-axis, with HR, CMD, and Pre preset at 69/min, 52 cc, and 132 mmHg, respectively (i.e., 0.0 after normalization) ; (B) The carotid stenosis risks for 272 patients with ischemic stroke symptoms were analyzed via the IPA technique, with the dominant risk factors (C-reactive protein, glucose AC, and MAP) aligned in seven frames along the Z-, X-, and Y-axes, respectively . The benefit of drawing a ladder diagram is that it can visualize the IPA-provided information and furnish the medical staff with an instant quantified index for referring before having precise computation from the STATISTICA program.
This paper re-addressed and integrated six previous studies of these authors, focusing on the validity range and stumbling blocks of the inverse problem algorithm implementation into artificial intelligence and computer-aided medical applications. The integral part of this technique’s practical realization was normalizing several risk factors’ indices, thus eliminating their various dimensionalities and yielding a quantified integer data interval from 1.0 to 1.0, with the middle point, 0.0, corresponding to the average risk factor. The IPA technique recommends five to seven factors to ensure the expectation value. The value could be a digoxin reading or any index of the clinical syndrome. Within framework of the robust designation procedure, either expectation value or risk factor must be quantified to create the digital data matrix for the STATISTICA program to analyze and then interpret with medical definition. The IPA technique expands the horizon of exploring potential syndromes in application of artificial intelligence.
This study was financially supported by the Taichung Armed Forces General Hospital, Taiwan (Contract no. TCAFGH-D-111037).
Conflict of interest
None to report.
Pan LF, Davaa O, Chen CY, Pan LK. Quantitative evaluation of contrast-induced-nephropathy in vascular post-angiography patients: Feasibility study of a semi-empirical model. Bio-Medical Materials and Engineering. (2015) ; 26: : s851-860.
Pan LF, Chiu SW, Xiao MF, Chen CH, Pan LK. Revised inverse problem algorithm-based prediction of coronary artery stenosis readings from the clinical data of patients with coronary heart diseases. Computer Assisted Surgery. (2017) ; 22: (s1): 70-78.
Lin YH, Hsiao KY, Chang YT, Kittipayak S, Pan LF, Pan LK. Assessment of effective blood concentration readings from clinical data on patients with heart failure diseases after digoxin intake: A projection based on the inverse problem algorithm. JMMB. (2019) ; 19: (8): 1940061.
Lin YH, Chiu SW, Lin YC, Lin CC, Pan LK. Inverse problem algorithm application to semi-quantitative analysis of 272 patients with ischemic stroke symptoms: Carotid stenosis risk assessment for five risk factors. JMMB. (2020) ; 20: (9): 2040021.
Lin CS, Chen YF, Deng J, Yang DH, Pan LF, Pan LK. A six-parameter semi-quantitative analysis of 251 patients for the enhanced triggered timing of head and neck CT angiography scanning via the inverse problem algorithm. JMMB. (2020) ; 20: (10): 2040045.
Liang CC, Pan LF, Chen MH, Deng J, Yang DH, Lin CS, Pan LK. Timing optimization of head and neck CT angiography via the inverse problem algorithm: In-vivo survey for 1001 patients in 2020–2021. JMMB. (2021) ; 21: (10): 2140055.
Lin CS, Chen YF, Deng J, Yang DH, Chen MH, Lin YH, Pan LK. Taguchi-based optimization of head and neck CT angiography: In-vivo enhanced triggered timing for 600 patients. JMMB. (2021) ; 21: (9): 2140041.
Chiang CY, Chen YH, Pan LF, Cho CC, Peng BR, Pan LK. The minimum detectable difference of C.T. angiography scans at various cardiac beats: Evaluation via a customized oblique V-shaped line gauge and PMMA phantom. JMMB. (2021) ; 21: (10): 2140066.
Pan LF, Chen YH, Wang CC, Peng BR, Kittipayak S, Pan LK. Optimizing cardiac C.T. angiography minimum detectable difference via Taguchi’s dynamic algorithm, a V-shaped line gauge, and three PMMA phantoms. THC. (2022) ; 30: (S1): 91-103.
Ke CH, Liu WJ, Peng BR, Pan LF, Pan LK. Optimizing the minimum detectable difference of gamma camera SPECT images via the Taguchi analysis: A feasibility study with a V-shaped slit gauge. Applied Sciences. (2022) ; 12: : 2708.
Huang W, Gallivan KA, Absil PA. A Broyden class of quasi-Newton methods for Riemannian optimization. SIAM J. OPTIM. (2015) ; 25: (3): 1660-1685.
Luan VT, Ostermann A. Exponential Rosenbrock methods of order five construction, analysis and numerical comparisons. Journal of Computational and Applied Mathematics. (2014) ; 255: : 417-431.
Sesso HD, Stampfer MJ, Rosner B, et al. Systolic and diastolic blood pressure, pulse pressure, and mean arterial pressure as predictors of cardiovascular disease risk in men. Hypertension. (2000) ; 36: : 801-807.
STATISTICA version 220.127.116.11, http://www.statsoft.com.
Kreyszig E. Advanced Engineering Mathematics, 10ed. Wiley Co. Ld., USA. 2011 ISBN: 0470646136.