# Application of genetic algorithm combined with improved SEIR model in predicting the epidemic trend of COVID-19, China

### Traditional SEIR infectious disease prediction model

Because COVID-19 has an incubation period, with an average of 7 days, we took the SEIR infectious disease model, which accounts for latent (exposed) persons, as the basis for modeling. In the traditional SEIR model, the population is divided into the following four categories [28]:

Susceptible (S), healthy people who may be infected.

Exposed (E), people who have been infected and have not shown pathological features.

Infected (I), people who have been infected and show pathological features.

Removed (R), people who have died or been cured, who are no longer contagious and will not be infected again.

The following assumptions are commonly made [29,30]:

1. The total population of the region is constant: natural births and deaths are not considered, nor is the movement of people between regions.

2. Recovered persons develop antibodies and will not be reinfected in the short term.

3. The exposed (E) are not contagious.

Based on the traditional SEIR model, the conversion relationships among the four populations are shown in Fig. 2. A susceptible person has a certain probability of becoming exposed after contact with an infected person; an exposed person becomes infected after the incubation period; and an infected person is cured or dies and becomes removed. Here $$r_1$$ is the number of effective contacts of an infected person, $$\beta_1$$ is the probability that a susceptible person is infected at each contact with an infected person, $$\alpha$$ is the rate at which exposed persons become infected, and $$\gamma$$ is the removal rate of infected persons, i.e. the reciprocal of the treatment cycle. Let the total population of the region be $$N$$, so that $$S + E + I + R = N$$. These relationships give the differential equations:

$$\left\{ \begin{array}{l} \frac{\mathrm{d}S}{\mathrm{d}t} = -r_1 \beta_1 I \frac{S}{N} \\ \frac{\mathrm{d}E}{\mathrm{d}t} = r_1 \beta_1 I \frac{S}{N} - \alpha E \\ \frac{\mathrm{d}I}{\mathrm{d}t} = \alpha E - \gamma I \\ \frac{\mathrm{d}R}{\mathrm{d}t} = \gamma I \end{array} \right.$$

(1)
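
As a consistency check, summing the four equations in (1) shows that every transfer term cancels, so the total population stays constant, in line with the constant-population assumption and $$S + E + I + R = N$$:

$$\frac{\mathrm{d}(S + E + I + R)}{\mathrm{d}t} = -r_1 \beta_1 I \frac{S}{N} + \left( r_1 \beta_1 I \frac{S}{N} - \alpha E \right) + \left( \alpha E - \gamma I \right) + \gamma I = 0$$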

Discretizing (1) with a unit time step gives the iterative form:

$$\left\{ \begin{array}{l} S_{t+1} = S_t - r_1 \beta_1 I_t \frac{S_t}{N} \\ E_{t+1} = E_t + r_1 \beta_1 I_t \frac{S_t}{N} - \alpha E_t \\ I_{t+1} = I_t + \alpha E_t - \gamma I_t \\ R_{t+1} = R_t + \gamma I_t \end{array} \right.$$

(2)
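
The iteration in (2) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `seir_step` and `simulate_seir`, the population size, the initial state and the values of `r1`, `beta1` and `gamma` are all assumptions chosen for demonstration; only `alpha = 1/7` follows the 7-day incubation period used in the text.

```python
def seir_step(state, r1, beta1, alpha, gamma, n):
    """Advance (S, E, I, R) by one time step using the iterative form (2)."""
    s, e, i, r = state
    new_exposed = r1 * beta1 * i * s / n  # S -> E
    new_infected = alpha * e              # E -> I
    new_removed = gamma * i               # I -> R
    return (
        s - new_exposed,
        e + new_exposed - new_infected,
        i + new_infected - new_removed,
        r + new_removed,
    )


def simulate_seir(state, steps, **params):
    """Return the trajectory [state_0, ..., state_steps]."""
    trajectory = [state]
    for _ in range(steps):
        state = seir_step(state, **params)
        trajectory.append(state)
    return trajectory


# Illustrative values only (assumed, not fitted to any data set).
N = 1_000_000
initial = (N - 110.0, 100.0, 10.0, 0.0)  # (S, E, I, R)
traj = simulate_seir(
    initial, steps=60, r1=10, beta1=0.05, alpha=1 / 7, gamma=0.1, n=N
)
```

Each entry of `traj` is one `(S, E, I, R)` tuple, so `traj[t][2]` is the number of infected persons at step `t`. Because every term removed from one compartment is added to another, the four values always sum to `N`.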

### Improved SEIR model A: infectious disease prediction model considering the infectivity of incubation period

According to the “COVID-19 Diagnosis and Treatment Program (Trial Eighth Edition)” issued by the National Health Commission on August 18, 2020, the sources of infection include not only patients infected with the novel coronavirus but also asymptomatic infected persons; that is, COVID-19 is infectious during the incubation period. The traditional SEIR model therefore needs to be modified as follows:

As shown in Fig. 3, the traditional SEIR model is extended with transmission from latent (exposed) persons to susceptible persons: a susceptible person may be infected and become newly exposed after contact with either an exposed or an infected person. Here $$r_2$$ is the number of effective contacts of an exposed person, and $$\beta_2$$ is the probability that a susceptible person is infected at each contact with an exposed person. This gives the new differential equations:

$$\left\{ \begin{array}{l} \frac{\mathrm{d}S}{\mathrm{d}t} = -r_1 \beta_1 I \frac{S}{N} - r_2 \beta_2 E \frac{S}{N} \\ \frac{\mathrm{d}E}{\mathrm{d}t} = r_1 \beta_1 I \frac{S}{N} + r_2 \beta_2 E \frac{S}{N} - \alpha E \\ \frac{\mathrm{d}I}{\mathrm{d}t} = \alpha E - \gamma I \\ \frac{\mathrm{d}R}{\mathrm{d}t} = \gamma I \end{array} \right.$$

(3)

Discretizing (3) with a unit time step gives the iterative form:

$$\left\{ \begin{array}{l} S_{t+1} = S_t - r_1 \beta_1 I_t \frac{S_t}{N} - r_2 \beta_2 E_t \frac{S_t}{N} \\ E_{t+1} = E_t + r_1 \beta_1 I_t \frac{S_t}{N} + r_2 \beta_2 E_t \frac{S_t}{N} - \alpha E_t \\ I_{t+1} = I_t + \alpha E_t - \gamma I_t \\ R_{t+1} = R_t + \gamma I_t \end{array} \right.$$

(4)
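
To make the effect of the extra transmission term concrete, the sketch below applies one step of (4) twice from the same starting state, once with latent transmission and once with $$\beta_2 = 0$$, which reduces model A to the traditional model. The function name `seir_a_step`, the starting state and all parameter values except `alpha = 1/7` are assumptions for illustration.

```python
def seir_a_step(state, r1, beta1, r2, beta2, alpha, gamma, n):
    """Advance (S, E, I, R) by one step using the iterative form (4)."""
    s, e, i, r = state
    infected_by_i = r1 * beta1 * i * s / n  # S -> E via contact with the infected
    infected_by_e = r2 * beta2 * e * s / n  # S -> E via contact with the exposed
    onset = alpha * e                       # E -> I
    removed = gamma * i                     # I -> R
    return (
        s - infected_by_i - infected_by_e,
        e + infected_by_i + infected_by_e - onset,
        i + onset - removed,
        r + removed,
    )


# Illustrative values only (assumed); alpha follows the 7-day incubation period.
N = 1_000_000
start = (N - 110.0, 100.0, 10.0, 0.0)  # (S, E, I, R)
common = dict(r1=10, beta1=0.05, alpha=1 / 7, gamma=0.1, n=N)

with_latent = seir_a_step(start, r2=10, beta2=0.02, **common)
without_latent = seir_a_step(start, r2=10, beta2=0.0, **common)
```

With these assumed values the exposed compartment grows in `with_latent` and shrinks in `without_latent`, because the 100 exposed persons contribute new infections only when $$\beta_2 > 0$$.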

### Improved SEIR model B: predicting infectious diseases considering the infectivity of incubation period and isolation measures

In the early stage of the epidemic, the government acted quickly and introduced many isolation measures, such as treating infected people in isolation in Fangcang shelter hospitals, centralized observation, and home isolation. Isolation measures depend on close-contact status with COVID-19 cases. The policy provides for different levels of isolation of the susceptible, the exposed and the infected, which plays a key role in the trend of the epidemic. Based on the existing epidemic prevention and control measures in China, the SEIR model was further improved into model B in order to simulate the trend of the epidemic more accurately:

As shown in Fig. 4, isolation of the various groups is added to the previous model. First, infected persons are quarantined after diagnosis as infected isolators $$I_g$$ and are no longer contagious [31]. When exposed (E) or infected (I) persons come into contact with susceptible persons (S), the close contacts who become infected move to the exposed (E), while close contacts who are not infected either remain in the susceptible population (S) or are isolated as susceptible isolators $$S_g$$. Exposed persons (E) are quarantined once their nucleic acid test is positive and are then called exposed isolators $$E_g$$. Susceptible isolators $$S_g$$ cannot be infected during isolation and return to the susceptible population (S) at the end of the isolation period $$\mu$$, after which they can again be infected. When the incubation period ends, exposed isolators $$E_g$$ become confirmed COVID-19 cases and move to the infected isolators $$I_g$$; neither $$E_g$$ nor $$I_g$$ is infectious, because both are isolated. Finally, the infected and the infected isolators are cured or die and become removed (R). Here $$q_S$$ is the proportion of close contacts isolated among the susceptible, $$q_E$$ is the probability that an exposed person is isolated, $$q_I$$ is the probability that an infected person is isolated, and $$\gamma_1$$ is the removal rate of infected isolators. Based on these population relationships, the original differential equations can be extended as follows:

$$\left\{ \begin{array}{l} \frac{\mathrm{d}S}{\mathrm{d}t} = -\frac{S\left[ r_1 I\left( \beta_1 + q_S - q_S \beta_1 \right) + r_2 E\left( \beta_2 + q_S - q_S \beta_2 \right) \right]}{N - S_g - E_g - I_g} \\ \frac{\mathrm{d}S_g}{\mathrm{d}t} = \frac{S q_S \left[ r_1 I\left( 1 - \beta_1 \right) + r_2 E\left( 1 - \beta_2 \right) \right]}{N - S_g - E_g - I_g} \\ \frac{\mathrm{d}E}{\mathrm{d}t} = \frac{S\left( r_1 \beta_1 I + r_2 \beta_2 E \right)}{N - S_g - E_g - I_g} - \alpha E\left( 1 - q_E \right) - E q_E \\ \frac{\mathrm{d}E_g}{\mathrm{d}t} = E q_E - \alpha E_g \\ \frac{\mathrm{d}I}{\mathrm{d}t} = \alpha E\left( 1 - q_E \right) - I q_I - I \gamma \left( 1 - q_I \right) \\ \frac{\mathrm{d}I_g}{\mathrm{d}t} = \alpha E_g + I q_I - \gamma_1 I_g \\ \frac{\mathrm{d}R}{\mathrm{d}t} = I \gamma \left( 1 - q_I \right) + \gamma_1 I_g \end{array} \right.$$

(5)

For convenience of calculation, let $$A = \frac{S}{N - S_g - E_g - I_g}$$ in (5). Discretizing with a unit time step, and returning susceptible isolators to the susceptible population at the end of the isolation period $$\mu$$, gives the iterative form:

$$\left\{ \begin{array}{l} A_t = \frac{S_t}{N - S_{g,t} - E_{g,t} - I_{g,t}} \\ S_{t+1} = S_t - A_t r_1 I_t \left( \beta_1 + q_S - q_S \beta_1 \right) - A_t r_2 E_t \left( \beta_2 + q_S - q_S \beta_2 \right) + S_{g,t-\mu} \\ S_{g,t+1} = S_{g,t} + A_t q_S \left[ r_1 I_t \left( 1 - \beta_1 \right) + r_2 E_t \left( 1 - \beta_2 \right) \right] - S_{g,t-\mu} \\ E_{t+1} = E_t + A_t \left( r_1 \beta_1 I_t + r_2 \beta_2 E_t \right) - \alpha E_t \left( 1 - q_E \right) - E_t q_E \\ E_{g,t+1} = E_{g,t} + E_t q_E - \alpha E_{g,t} \\ I_{t+1} = I_t + \alpha E_t \left( 1 - q_E \right) - I_t q_I - I_t \gamma \left( 1 - q_I \right) \\ I_{g,t+1} = I_{g,t} + \alpha E_{g,t} + I_t q_I - \gamma_1 I_{g,t} \\ R_{t+1} = R_t + I_t \gamma \left( 1 - q_I \right) + \gamma_1 I_{g,t} \end{array} \right.$$
The results of the improved SEIR models A and B are strongly affected by the initial parameters. To establish the SEIR infectious disease model, appropriate values must be selected for the key parameters, namely $$E$$, $$q_S$$, $$q_E$$, $$q_I$$, $$r_1$$, $$\beta_1$$, $$r_2$$, $$\beta_2$$, $$\alpha$$, $$\gamma$$, $$\gamma_1$$ and $$\mu$$. The incidence probability $$\alpha$$ of the exposed is taken as the reciprocal of the incubation period, which is set to 7 days in line with most reports [32,33], so $$\alpha = \frac{1}{7} \approx 0.1429$$. According to the isolation policy of Wuhan and Beijing, the isolation period is 14 days, so $$\mu = 14$$. $$q_E$$ is the isolation rate of the exposed and $$q_I$$ is the isolation probability of the infected. Under the current epidemic prevention and control strategy, all confirmed patients are isolated, so $$q_I = 1$$, and $$q_E$$ is taken as the accuracy of nucleic acid detection. According to a report on April 18, 2020, the accuracy of nucleic acid detection is about 50% to 70%. With the epidemic under control, the number of existing cases is decreasing and nucleic acid kits are in sufficient supply, so the accuracy should be higher than in the initial stage of the epidemic; the maximum value is therefore used, $$q_E = 0.7$$. The other parameters are estimated separately for each region according to local conditions.
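
The iterative form of model B, together with the fixed parameter values above, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code. The names `seir_b_step` and `simulate_seir_b` are hypothetical, and the values of `N`, `r1`, `beta1`, `r2`, `beta2`, `qS`, `gamma`, `gamma1` and the initial state are placeholders, since the text estimates them per region. The sketch also assumes that the delayed term returning susceptible isolators refers to the cohort that entered isolation $$\mu$$ steps earlier; the source notation does not spell this out.

```python
from collections import deque

KEYS = ("S", "Sg", "E", "Eg", "I", "Ig", "R")


def seir_b_step(state, pending, p):
    """Advance the model B compartments by one step.

    state   -- dict with keys S, Sg, E, Eg, I, Ig, R
    pending -- deque of the last mu inflows into Sg, oldest first
    p       -- dict of model parameters
    """
    s, sg, e, eg, i, ig, r = (state[k] for k in KEYS)
    a = s / (p["N"] - sg - eg - ig)  # A_t

    # Contacts of S that become infected (S -> E).
    infected = a * (p["r1"] * p["beta1"] * i + p["r2"] * p["beta2"] * e)
    # Uninfected contacts of S that are isolated (S -> Sg).
    isolated = a * p["qS"] * (
        p["r1"] * i * (1 - p["beta1"]) + p["r2"] * e * (1 - p["beta2"])
    )
    # ASSUMPTION: the delayed Sg term is the cohort isolated mu steps earlier.
    released = pending.popleft()
    pending.append(isolated)

    onset_free = p["alpha"] * e * (1 - p["qE"])  # E -> I (not isolated)
    removed_free = i * p["gamma"] * (1 - p["qI"])  # I -> R
    removed_isolated = ig * p["gamma1"]            # Ig -> R

    return {
        "S": s - infected - isolated + released,
        "Sg": sg + isolated - released,
        "E": e + infected - onset_free - e * p["qE"],
        "Eg": eg + e * p["qE"] - p["alpha"] * eg,
        "I": i + onset_free - i * p["qI"] - removed_free,
        "Ig": ig + p["alpha"] * eg + i * p["qI"] - removed_isolated,
        "R": r + removed_free + removed_isolated,
    }


def simulate_seir_b(state, steps, p):
    """Return the list of states [state_0, ..., state_steps]."""
    pending = deque([0.0] * p["mu"])  # nobody is isolated before step 0
    history = [state]
    for _ in range(steps):
        state = seir_b_step(state, pending, p)
        history.append(state)
    return history


params = {
    "N": 1_000_000,  # hypothetical population size
    # Values stated in the text:
    "alpha": 1 / 7, "mu": 14, "qE": 0.7, "qI": 1.0,
    # Hypothetical placeholders (estimated per region in the text):
    "r1": 10, "beta1": 0.05, "r2": 10, "beta2": 0.02,
    "qS": 0.3, "gamma": 0.1, "gamma1": 0.1,
}
initial = dict(zip(KEYS, (params["N"] - 110.0, 0.0, 100.0, 0.0, 10.0, 0.0, 0.0)))
history = simulate_seir_b(initial, steps=60, p=params)
```

Keeping the isolated cohorts in a fixed-length queue means that each group of susceptible isolators leaves `Sg` exactly `mu` steps after entering it, and the seven compartments continue to sum to `N` at every step.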