Analysis of Medication Patterns for Traditional Chinese Medicine in Antithrombotic Therapy Based on Data Mining and Network Pharmacology
Release Date: 2021-02-27
Abstract: Objective: Deep vein thrombosis is a major contributor to mortality and poor prognosis in critically ill patients with coronavirus disease 2019 (COVID-19). This study aims to elucidate the formulation and medication patterns of traditional Chinese medicine (TCM) for the prevention and treatment of thrombosis, thereby providing a reference for the application of TCM in managing deep vein thrombosis in critically ill COVID-19 patients. Methods: Relevant literature indexed in the China National Knowledge Infrastructure and Wanfang Database was used as the data source. Excel 2010 and Clementine 12.0 were used to conduct association rule analysis on the included TCM herbs. Network pharmacology was applied to perform gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses on the core herb combination to investigate its mechanisms of action, and the findings were validated by molecular docking. Results: A total of 356 antithrombotic formulas were screened from the China National Knowledge Infrastructure and Wanfang Database, revealing 25 high-frequency TCM herbs. Association analysis identified seven high-frequency herb combinations. Among them, the core combination “Carthamus tinctorius–Prunus persica seed–Paeonia lactiflora–Ligusticum chuanxiong” contains 23 active chemical constituents and shares 41 common target proteins with thrombosis. The key target proteins in the protein–protein interaction (PPI) network include IL-6, VEGFA, CASP3, ALB, EGFR, and MAPK8. GO functional enrichment analysis yielded 971 biological process terms (P < 0.05), and KEGG enrichment identified 105 pathways (P < 0.05), primarily involving pathways such as Kaposi’s sarcoma-associated herpesvirus infection, human cytomegalovirus infection, fluid shear stress and atherosclerosis, and hepatitis B. Molecular docking identified luteolin, quercetin, and baicalein as active compounds with strong binding affinity to IL-6. Conclusion: The key active constituents in the core drug combination exert antithrombotic effects by modulating critical targets such as IL-6 and by influencing pathways including Kaposi’s sarcoma-associated herpesvirus infection, human cytomegalovirus infection, fluid shear stress and atherosclerosis, and hepatitis B, thereby providing a reference for the prevention and treatment of deep vein thrombosis in critically ill patients with COVID-19.
1 Materials and Methods
1.1 Data Source
1.2 Inclusion criteria
① The literature must involve clinical cases (regardless of whether they are randomized controlled trials); ② The literature must be related to the antithrombotic effects of traditional Chinese medicine compound formulas; ③ The compound formula described in the literature must include the complete drug composition; ④ The literature must report biochemical indicators—such as D-dimer, fibrinogen (Fbg), prothrombin time (PT), thrombin time (TT), and activated partial thromboplastin time (APTT)—that reflect antithrombotic efficacy, or results from color Doppler ultrasound examinations; ⑤ The sample size must be ≥10.
1.3 Exclusion criteria
1.4 Data Analysis
Network-based visual analysis of the included traditional Chinese medicines was conducted in Clementine 12.0, followed by association rule analysis using the Apriori model [10]. R (version 3.6.2) with the Bioconductor package was used for information extraction, and gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed on the potential antithrombotic targets of the core drug combination. Cytoscape was used to construct a “disease–drug–active ingredient–target” network diagram and a protein–protein interaction (PPI) network of the targets shared by the active ingredients and the disease. For molecular docking, crystal structures of key target proteins were downloaded from the RCSB Protein Data Bank; AutoDockTools 1.5.6 was used to remove water molecules, add hydrogens, and calculate charges, and the processed proteins were saved in PDBQT format. The 3D structures of the key active compounds were downloaded from the PubChem database in SDF format; Chem3D 17.0 and PyMOL were used to process the protein and small-molecule structures, and the compound files were energy-minimized. Finally, molecular docking was conducted using AutoDock Vina, and the results were visualized with PyMOL [11].
2 Results and Analysis
2.1 Data Filtering
Based on the aforementioned inclusion and exclusion criteria, 337 articles meeting the standards were identified. The core formulas contained in these articles were entered into the database, with any added or subtracted herbs excluded, resulting in a total of 356 formulas after consolidation. The specific procedures for database retrieval and literature and formula screening are illustrated in Figure 1 [12].

2.2 Usage of Single-Ingredient Traditional Chinese Medicines
Drug names in the 356 prescriptions were standardized with reference to the 2015 edition of the Chinese Pharmacopoeia [13] and the Chinese Materia Medica [14], yielding 198 traditional Chinese medicines with a cumulative frequency of 3,482 occurrences. The top 10% by usage frequency were designated as high-frequency drugs [15]. A total of 25 herbs were included (frequency ≥ 38), with a cumulative frequency of 2,437 occurrences (69.99% of the total). The six most frequently used herbs were Angelica sinensis (frequency: 224, 62.92%), Carthamus tinctorius (frequency: 205, 57.58%), Ligusticum chuanxiong (frequency: 201, 56.46%), Paeonia lactiflora (frequency: 188, 52.81%), Prunus persica seed (frequency: 173, 48.60%), and Achyranthes bidentata (frequency: 165, 46.35%), where each percentage is the share of the 356 formulas containing the herb, as shown in Table 1.
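The frequency statistics above can be sketched in a few lines of Python. This is only an illustration of the counting step, not the authors' Excel workflow; the function name `herb_frequencies` and the toy formulas are assumptions introduced here, and each herb is counted at most once per formula.

```python
from collections import Counter

def herb_frequencies(formulas):
    """Return (herb, count, percent of formulas), sorted by descending count."""
    # set(formula) ensures a herb is counted once per formula.
    counts = Counter(herb for formula in formulas for herb in set(formula))
    n_formulas = len(formulas)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(herb, c, round(100 * c / n_formulas, 2)) for herb, c in ranked]

# Hypothetical toy formulas, NOT the study data.
toy_formulas = [
    ["Danggui", "Honghua", "Chuanxiong"],
    ["Danggui", "Taoren", "Honghua"],
    ["Danggui", "Chishao"],
    ["Niuxi", "Chuanxiong"],
]
ranking = herb_frequencies(toy_formulas)

# The same percentage formula applied to a figure reported in the text:
# Angelica sinensis appears in 224 of 356 formulas.
danggui_share = round(100 * 224 / 356, 2)
```

Applied to the reported count of 224 out of 356 formulas, the same formula gives the 62.92% stated for Angelica sinensis.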

2.3 High-Frequency Drug Association Rule Analysis
The Apriori algorithm is a classic method for association rule mining. It follows a two-stage frequent-itemset approach: frequent itemsets are first found by level-wise iteration, and the hidden association rules are then derived from them [16-17]. In Clementine 12.0, association rule analysis was conducted on the high-frequency drugs (usage frequency ≥ 38) using the Apriori algorithm, with the following parameter settings: minimum support of 20%, minimum confidence of 90%, maximum number of antecedents of 5, and minimum lift of 1 [18]. Seven potential drug combinations were identified among the 356 formulas, each with a lift greater than 1, indicating that the herbs in these combinations are positively associated [17]. The core herbal combination with the highest confidence and lift is safflower–peach kernel–red peony root–chuanxiong; specific parameters are provided in Table 2. The association network diagram among high-frequency herbs is shown in Figure 2.
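As a rough illustration of the support, confidence, and lift settings described above, the following Python sketch implements a small level-wise frequent-itemset search and the three rule metrics. It is not Clementine's implementation: the function names and toy transactions are assumptions, and support is defined here as the share of formulas containing all items of an itemset, which may differ from how a given tool reports it.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Level-wise search: grow frequent itemsets one item at a time."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if support([i], transactions) >= min_support]
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s, transactions)
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must already be frequent.
        level = [c for c in candidates
                 if all(frozenset(sub) in frequent
                        for sub in combinations(c, k - 1))
                 and support(c, transactions) >= min_support]
    return frequent

def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for antecedent -> consequent."""
    s_both = support(set(antecedent) | set(consequent), transactions)
    confidence = s_both / support(antecedent, transactions)
    lift = confidence / support(consequent, transactions)
    return s_both, confidence, lift

# Hypothetical toy formulas, NOT the study data.
toy = [
    frozenset({"Honghua", "Taoren", "Chishao", "Chuanxiong"}),
    frozenset({"Honghua", "Taoren", "Chuanxiong"}),
    frozenset({"Honghua", "Taoren", "Chishao", "Chuanxiong"}),
    frozenset({"Danggui", "Niuxi"}),
    frozenset({"Danggui", "Chuanxiong"}),
]
found = frequent_itemsets(toy, min_support=0.6)
s, conf, lift = rule_metrics({"Honghua", "Taoren"}, {"Chuanxiong"}, toy)
```

A lift above 1 means the consequent appears more often alongside the antecedent than it does across all formulas, which is why the minimum lift of 1 acts as a filter for positive associations.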


2.4 Associated drug dosage situation
To support the accurate and effective clinical use of the associated drug combinations for DVT in COVID-19 patients, this study statistically analyzed the dosages of the traditional Chinese medicines in the identified combinations. The primary dosage range for Danggui (Angelica sinensis), Taoren (peach kernel), Honghua (safflower), Chishao (red peony root), and Chuanxiong was 10–15 g in each case, while the primary dosage range for Niuxi (Achyranthes bidentata) was 10–18 g (Table 3).

2.5 Mechanistic Analysis of the Core Drug Combination for Antithrombotic Effects Based on Network Pharmacology
2.5.1 Collection of information on the active ingredients, molecular targets, and thrombotic targets of the core drug combination. The core drug combination of safflower, peach kernel, red peony root, and chuanxiong had the highest confidence and lift. The active constituents and molecular targets of these four herbs were retrieved from the Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform (TCMSP, http://tcmspw.com/tcmsp.php), with screening criteria of oral bioavailability (OB) ≥ 30% and drug-likeness (DL) ≥ 0.18 [19]. After removing duplicate entries and entries without a valid mapping relationship, 46 chemical constituents and 109 putative targets were identified. Using the GeneCards database (https://www.genecards.org/), 798 targets associated with thrombosis were retrieved.
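The OB and DL screening step can be sketched as a simple filter. The record layout, field names, and compound values below are hypothetical placeholders, not TCMSP's export format or the study's data; only the two thresholds come from the text.

```python
def screen_compounds(records, min_ob=30.0, min_dl=0.18):
    """Keep compound names meeting both thresholds, dropping duplicates."""
    seen = set()
    kept = []
    for rec in records:
        passes = rec["ob"] >= min_ob and rec["dl"] >= min_dl
        if passes and rec["name"] not in seen:
            seen.add(rec["name"])
            kept.append(rec["name"])
    return kept

# Hypothetical toy records, NOT values from TCMSP.
toy_records = [
    {"herb": "herb_1", "name": "compound_A", "ob": 45.0, "dl": 0.25},
    {"herb": "herb_2", "name": "compound_A", "ob": 45.0, "dl": 0.25},  # duplicate across herbs
    {"herb": "herb_1", "name": "compound_B", "ob": 12.0, "dl": 0.40},  # fails OB
    {"herb": "herb_3", "name": "compound_C", "ob": 30.0, "dl": 0.18},  # passes at the boundary
    {"herb": "herb_3", "name": "compound_D", "ob": 60.0, "dl": 0.05},  # fails DL
]
selected = screen_compounds(toy_records)
```

Both criteria are inclusive (≥), so a compound exactly at OB 30% and DL 0.18 is retained.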
2.5.2 Disease–Drug–Active Ingredient–Target Network and PPI Network Analysis. A Venn diagram was used to identify 41 target proteins shared between the active ingredients of the four traditional Chinese medicines and the disease. Protein–protein interaction data were obtained from the STRING database (https://string-db.org/), and Cytoscape was then used to construct the disease–drug–active ingredient–target network (Figure 3) and the PPI network (Figure 4). A bar plot of the target proteins was generated using an R package (Figure 5). The disease–drug–active ingredient–target network comprises 69 nodes: 41 target genes, 23 active ingredients, 4 traditional Chinese medicine names, and 1 disease name. Among these, quercetin, β-sitosterol, baicalein, luteolin, kaempferol, and stigmasterol are the chemical constituents with the largest number of associated targets, as shown in Table 3. In the PPI network, larger and darker circles and thicker and darker connecting lines indicate stronger interactions among target proteins. The bar plot shows the number of nodes connected to each target protein in the PPI network; the more connected nodes a protein has, the stronger its interaction with other target proteins. The target proteins with the strongest interactions (more than 25 connected nodes) were interleukin-6 (IL-6), vascular endothelial growth factor A (VEGFA), caspase-3 (CASP3), albumin (ALB), epidermal growth factor receptor (EGFR), and mitogen-activated protein kinase 8 (MAPK8).
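The two operations behind this subsection, intersecting the compound and disease target sets and ranking PPI nodes by their number of connections, can be sketched as follows. The function names and the toy gene lists and edges are assumptions for illustration; they are not STRING or GeneCards data.

```python
def shared_targets(compound_targets, disease_targets):
    """Targets present in both lists (the overlap region of a Venn diagram)."""
    return sorted(set(compound_targets) & set(disease_targets))

def node_degrees(edges):
    """Count distinct neighbours per node in an undirected network."""
    # frozenset removes direction, so (a, b) and (b, a) count once.
    unique = {frozenset(e) for e in edges if e[0] != e[1]}
    degree = {}
    for pair in unique:
        for node in pair:
            degree[node] = degree.get(node, 0) + 1
    return degree

def hub_nodes(degree, more_than):
    """Nodes with more than `more_than` connections, highest degree first."""
    hubs = [n for n, d in degree.items() if d > more_than]
    return sorted(hubs, key=lambda n: (-degree[n], n))

# Hypothetical toy inputs, NOT the study data.
toy_compound_targets = ["IL6", "VEGFA", "CASP3", "GENE_X"]
toy_disease_targets = ["IL6", "VEGFA", "CASP3", "GENE_Y"]
overlap = shared_targets(toy_compound_targets, toy_disease_targets)

toy_edges = [
    ("IL6", "VEGFA"), ("IL6", "CASP3"), ("IL6", "ALB"),
    ("VEGFA", "CASP3"), ("VEGFA", "IL6"),  # last edge repeats the first
]
deg = node_degrees(toy_edges)
hubs = hub_nodes(deg, more_than=2)
```

In the study the cutoff was more than 25 connected nodes; the toy cutoff of 2 is chosen only so the small example returns a hub.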



2.5.3 GO and KEGG pathway enrichment analyses. GO and KEGG enrichment analyses were performed on the antithrombotic targets of the core drug combination using the org.Hs.eg.db annotation database. The clusterProfiler package in R was used to generate bubble plots for the biological function and pathway enrichment analyses, focusing on the top 20 pathways. In these plots, larger bubbles indicate a greater number of enriched genes, and the lower the P value, the redder the color, indicating a closer association with the core drug combination for thrombosis prevention and treatment [11]. GO analysis identified 971 biological processes (BP), primarily involving response to oxidative stress, anatomical structure homeostasis, cellular response to oxidative stress, response to lipopolysaccharide, response to acidic chemicals, and response to molecules of bacterial origin, among others. KEGG pathway enrichment analysis revealed 105 signaling pathways (P < 0.05), primarily involving pathways such as Kaposi’s sarcoma-associated herpesvirus infection, human cytomegalovirus infection, fluid shear stress and atherosclerosis, and hepatitis B (Figures 6 and 7).
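Enrichment P values of this kind are commonly computed with a hypergeometric (over-representation) test. The Python sketch below shows that calculation for a single term under this assumption; it is not the authors' R script, the function and parameter names are introduced here, and it omits the multiple-testing correction that enrichment tools usually apply.

```python
from math import comb

def enrichment_p_value(n_background, n_term, n_query, n_overlap):
    """P(X >= n_overlap) when drawing n_query genes from n_background genes,
    of which n_term are annotated to the term (hypergeometric upper tail)."""
    upper = min(n_term, n_query)
    favorable = sum(
        comb(n_term, i) * comb(n_background - n_term, n_query - i)
        for i in range(n_overlap, upper + 1)
    )
    return favorable / comb(n_background, n_query)

# Hypothetical toy numbers: 10 background genes, 3 annotated to the term,
# a query list of 3 genes, all 3 of which hit the term.
p_all_hit = enrichment_p_value(10, 3, 3, 3)

# With zero required overlap the tail covers every outcome.
p_any = enrichment_p_value(10, 3, 3, 0)
```

A term is reported as enriched when this probability falls below the chosen cutoff, here P < 0.05.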

2.6 Molecular docking validation of the interactions between the common active ingredients of the core antithrombotic drug combination and the core target
To validate the network pharmacology results, molecular docking was performed between the active constituents of the core drug combination with the greatest number of associated targets and the core target protein IL-6, which has the highest number of connected nodes in the PPI network (Table 4). In molecular docking, a binding energy below zero indicates that the ligand and receptor can bind spontaneously [20]. The lower the binding energy, the more stable the conformation and the greater the likelihood that the small-molecule compound interacts with the protein [21]. In this study, a binding energy ≤ −20 kJ/mol was selected as the screening criterion [12]. The docking results indicate that all six active compounds can bind to IL-6; the docking poses are shown in Figure 8, with luteolin, quercetin, and baicalein exhibiting particularly strong binding affinities.
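The screening criterion can be sketched as a unit conversion followed by a threshold check. AutoDock Vina reports binding affinities in kcal/mol, so the sketch assumes the scores are converted to kJ/mol (1 kcal = 4.184 kJ) before applying the −20 kJ/mol cutoff; the source does not describe this step, and the ligand names and scores below are hypothetical.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie

def kcal_to_kj(energy_kcal):
    """Convert a binding energy from kcal/mol to kJ/mol."""
    return energy_kcal * KJ_PER_KCAL

def passes_threshold(energy_kcal, threshold_kj=-20.0):
    """True if the binding energy is at or below the kJ/mol cutoff."""
    return kcal_to_kj(energy_kcal) <= threshold_kj

# Hypothetical docking scores in kcal/mol, NOT the study's results.
toy_scores_kcal = {"ligand_A": -7.2, "ligand_B": -4.1, "ligand_C": -5.0}
hits = sorted(name for name, e in toy_scores_kcal.items()
              if passes_threshold(e))
```

Under this conversion, −20 kJ/mol corresponds to roughly −4.78 kcal/mol, so a score of −5.0 kcal/mol passes and −4.1 kcal/mol does not.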

3 Discussion
In summary, this study employed data mining and network pharmacology to investigate the formulation patterns and mechanisms of action underlying the anti-thrombotic effects of traditional Chinese medicine. It was found that the key active constituents in the core herbal combination—safflower, peach kernel, red peony root, and chuanxiong—can modulate critical targets such as IL-6 and engage pathways associated with viral infection and inflammatory vascular diseases, thereby counteracting oxidative stress, repairing endothelial injury, inhibiting platelet aggregation, and exerting anti-thrombotic effects. These findings provide both empirical evidence and theoretical guidance for the prevention and treatment of deep vein thrombosis in critically ill patients with COVID-19.
Source: Cui Linlin, Miao Mingsan. Analysis of Medication Patterns for Traditional Chinese Medicine in Anti-thrombotic Therapy Based on Data Mining and Network Pharmacology [J]. Chinese Herbal Medicines, 2021, 52(4):1063–1072.