TY - JOUR
AU - Li, Deming
AB - Abstract

To comprehensively evaluate the achievements of the ‘Belt and Road’ in integrated transportation, researchers need to optimize the method of generating evaluation indices and construct the framework of a ‘Belt and Road’ transportation index system. This paper uses the GDELT database as its data source and obtains the full text of English-language news from 25 countries along the ‘Belt and Road’. It introduces topic models, combining an unsupervised method (latent Dirichlet allocation, LDA) with a supervised method (labeled LDA), to mine the topics contained in the news data. It then constructs a transportation development model and analyzes the development trend of transportation in each country. The study finds that transportation development in the countries along the route is unbalanced and can be divided into four types: rapid development, stable development, slow development and lagging development. The method can effectively extract the temporal and spatial variation of news events, discover potential risks in individual countries, support real-time and dynamic monitoring of social development in the countries along the route and provide auxiliary decision support for the implementation of the ‘Belt and Road’ initiative, which gives it important application value.

1. Introduction

The Belt and Road (B&R) is short for the ‘Silk Road Economic Belt’ and the ‘21st Century Maritime Silk Road’. In September and October 2013, Chinese President Xi Jinping proposed the ‘Silk Road Economic Belt’ and the ‘21st Century Maritime Silk Road’ as new cooperative initiatives.
Relying on the existing bilateral and multilateral mechanisms between China and the countries concerned, and on existing and effective regional cooperation platforms, the B&R initiative borrows the historical symbol of the ancient Silk Road, holds high the banner of peaceful development and actively develops economic partnerships, with the aim of jointly building a community of interests, a community of destiny and a community of responsibility characterized by mutual political trust, economic integration and cultural tolerance. The B&R runs through Asia, Europe and Africa, connecting East Asia, ASEAN, South Asia, Central Asia, West Asia, Eastern Europe, Southern Europe, Central Europe and North Africa. In addition to China, there are more than 60 countries along the B&R. These countries contribute 30% of the world economy, their goods and services account for more than 20% of the world’s exports and their population makes up more than 60% of the world total [1]. The B&R initiative includes the most economically active East Asian economic circle and the well-established European economic circle, as well as the countries with great potential in the hinterland between them. Once the vision of the B&R is realized, it is expected to become a large joint economic engine that promotes the early recovery of the world economy and strong global economic growth [2, 3].

As a basic industry of the national economy, transportation plays a pivotal role in economic and social development. In advancing the B&R initiative, transportation interconnection is both the foundation and the priority area. It includes not only the hardware of transportation infrastructure but also the software of converging systems, rules and standards and of building a shared discourse system.
The transportation index along the B&R is an important basis for assessing transportation interconnection and an important part of demonstrating its results [4]. In its first stage, the B&R transportation index should highlight its characteristic functions and establish its authority. In the short and medium terms, it mainly serves the following functions.

It can be used to establish a market vane [5]. An index reflects the change in a value in one period relative to its value in another period that serves as the standard for comparison. The core value of the index is that, as a dynamic indicator, it can faithfully reflect the current situation and changing trend of each market under the B&R and act as a barometer with which market participants can make effective micro or macro judgments. It helps enterprises understand market conditions and prosperity, improve the efficiency of resource allocation, reduce operating costs and enhance their competitiveness.

It can be used to set a freight price benchmark. In recent years, competition in the transportation market has intensified and the price system has been under constant pressure, which has tested the income stability of transportation enterprises and freight forwarders and made logistics costs more uncertain for shippers [4, 6]. A B&R transportation index that reflects market prices can provide a fair, objective and transparent price benchmark for both parties to a transaction and guide the steady development of the industry.

This paper focuses on the method of compiling the B&R index, while also taking the interpretation of the index results into account.
According to the results of existing B&R indexes, a common finding is that developed countries with mature infrastructure, such as Singapore, the low-risk wealthy countries of West Asia and the better-established countries of Eastern Europe, often appear in the top positions. At the same time, each index starts from its own field and focuses on how the B&R initiative, as a national top-level strategy, affects the future development of that field. Through an index-based evaluation scheme, an index can also provide supporting evidence and directions for improvement for a particular aspect of B&R implementation and generate policy recommendations.

In this paper, the GDELT database is used as the data source to extract the webpage links of news events [7] and then obtain the full text of each news webpage. Based on information such as the time and place of each event, a spatio-temporal database of B&R transportation development news is constructed. The paper then introduces the latent Dirichlet allocation (LDA) topic model [8] to mine the topics contained in the news data, evaluate the development of transportation in the countries along the route, analyze the development of each country and identify hot spots. On this basis, the labeled LDA topic model is introduced and, combined with manually annotated information, used to explore the transportation news of the hot spots in depth. The transportation development trend is analyzed, and a national transportation development index along the B&R based on news big data is constructed.

2. Literature review

At present, the B&R has become a research hotspot in economic geography, geopolitics and other related disciplines [9, 10], and many authors have analyzed the significance of the B&R [2] from the perspectives of geopolitical risks [10], challenges [10] and others [6]. Song et al.
[3] analyzed the geographical connotation of the B&R in the temporal and spatial dimensions, predicted four major risks and proposed five major geographic pivots; Kolosov et al. [10] analyzed the spatial connotation of the B&R and regarded it as a national policy, with multiple spatial connotations and cross-scale features, for coordinating China’s comprehensive opening to the outside world; Li [4] pointed out that the countries along the B&R should be classified into four categories: general small and medium-sized countries, countries with sovereignty disputes, fulcrum countries and the strongest countries of each sub-region. Different strategies and cooperation approaches should be adopted for different types of countries to avoid potential political and economic risks.

Fichtner et al. [5] argued that an ordinary index differs considerably from a big data index. The difference lies partly in the method of index calculation and partly in the frequency of sample adjustment and the screening of sample stocks. The compilation method also differs slightly from that of a traditional index, and a big data index is adjusted more frequently. It is also a specific application of financial economics that can intuitively reflect the changing trend of the market. Compared with a traditional index, a big data index brings the various transaction and interaction data on the Internet into the sample selection scheme; it is compiled on the basis of the traditional statistical survey index, combined with the characteristics of Internet data. Mei et al. [11] held that the big data index still belongs to the index system and can complement traditional statistical indexes: it is an index compiled on the basis of big data, a relative number that can be used to intuitively reflect the overall change of certain characteristics.
The Internet has become the main channel for obtaining information and provides the most direct and comprehensive source of information for B&R research. GDELT (Global Database of Events, Language, and Tone) [12] is a real-time, open social event news database that provides convenient access to global Internet news coverage for this research. Many scholars have already used GDELT data. For example, Bodas-Sagi et al. [13] used GDELT data to analyze Spanish public opinion on energy policy and the correlation between public sentiment and energy prices; Keertipati et al. [7] obtained real-time news data through GDELT and used change point analysis to identify two conflicts, the Sri Lankan civil war and the Fiji coup; Yonamine et al. studied the relationship between the level of violence and geographic space based on GDELT data for Afghanistan from 2001 to 2012 and constructed a model to predict future levels of violence in Afghanistan. These studies use GDELT data for single-factor analysis and do not conduct comprehensive research on social development based on political, economic, military and other factors [14].

3. Method of compiling the B&R transportation indexes

Measures of the macro-social and economic situation cover the overall state of economic, social, political, cultural and other development in countries and regions. Their scope is very broad, and it is difficult to fully reflect the level of economic construction and development with a single indicator [15, 16]. A comprehensive index is therefore needed to assess and analyze the development of, and changes in, the macro-social and economic situation more comprehensively and accurately.
Indexes that reflect the macroeconomic situation touch all aspects of social life; examples include the consumer price index, which reflects inflation levels, the Dow Jones indexes, which use arithmetic averages to reflect stock price movements, and the purchasing managers’ index, which reflects economic trends. This section summarizes existing work on sample frame selection, index system construction, weight design and index adjustment.

3.1. Sample selection

Index studies are divided into sampling methods and full sample methods according to the amount of data used. According to its newsletter of December 2012, the Big Data Expert Committee of the China Computer Society had not yet concluded whether the big data approach is equivalent to the full data approach [17]. Government agencies such as the National Bureau of Statistics hold the largest amounts of data in China. In practice, considering the cost of data acquisition, such agencies generally conduct a full survey of units above a certain size threshold on the basis of the overall census and sample the units below that threshold. As another example, when calculating the real effective exchange rate (REER), the International Monetary Fund (IMF) uses the full sample of 185 countries, while other institutions such as the Bank for International Settlements use a sample of 52 economies.

Sampling methods are used more often in index compilation. The sample size depends on (i) the degree of variation of the subject, (ii) error and accuracy requirements, (iii) confidence requirements, (iv) the size and distribution of the population and (v) the sampling method and other factors. Common sampling methods are simple random sampling, stratified sampling and multi-level sampling.
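To make the dependence of sample size on variability, error tolerance, confidence and population size (factors (i) to (iv) above) concrete, the following is a minimal sketch for simple random sampling. It assumes the standard textbook sample-size formulas for estimating a mean; the function name `sample_size_mean` and the example numbers are illustrative assumptions, not values from the paper.

```python
import math
from statistics import NormalDist


def sample_size_mean(sigma, margin, confidence=0.95, population=None):
    """Sample size needed to estimate a mean under simple random sampling.

    sigma:      assumed population standard deviation
    margin:     tolerated limit error (half-width of the interval)
    confidence: two-sided confidence level
    population: population size N; None means sampling with replacement
    """
    # Two-sided standard normal quantile for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z * sigma / margin) ** 2  # sampling with replacement
    if population is None:
        return math.ceil(n0)
    # Sampling without replacement: finite population correction.
    return math.ceil(n0 * population / (population + n0))


# Hypothetical inputs: sigma = 10, limit error = 2, 95% confidence.
n_with_replacement = sample_size_mean(sigma=10, margin=2)
n_without_replacement = sample_size_mean(sigma=10, margin=2, population=1000)
```

A larger assumed variance or a tighter limit error increases the required sample, while a small finite population reduces it.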
In sampling inference, the sampling error is controlled through the sample size. Simple random sampling is taken as an example to illustrate how the sample size is determined. Given the population variance |$\sigma^2$|, the limit error |$\Delta$|, the two-sided quantile |$Z$| of the standard normal distribution corresponding to the confidence level, the population ratio |$P$| and the population size |$N$|, the sample size |$n$| required to estimate the population mean is the following:

$$\begin{align} &n = \frac{{{Z^2}{\sigma ^2}}}{{{\Delta ^2}}} \quad \textrm{(sampling with replacement)} \end{align}$$(1)

$$\begin{align}& n = \frac{{{Z^2}{\sigma ^2}N}}{{N{\Delta ^2} + {Z^2}{\sigma ^2}}} \quad \textrm{(sampling without replacement)}\end{align}$$(2)

The sample size required to estimate the population ratio is the following:

$$\begin{align}& n = \frac{{{Z^2}P(1 - P)}}{{{\Delta ^2}}} \quad \textrm{(sampling with replacement)}\end{align}$$(3)

$$\begin{align}& n = \frac{{N{Z^2}P(1 - P)}}{{N{\Delta ^2} + {Z^2}P(1 - P)}} \quad \textrm{(sampling without replacement)}\end{align}$$(4)

When the population variance |${\sigma ^2}$| or the population ratio |$P$| is unknown, the sample variance or sample ratio, or the variance (or ratio) of a similar historical population, may be substituted. When calculating the sample size required for the population ratio, |$P(1 - P)$| can also be replaced by its maximum value of 0.25.

3.1.1. Selection of samples

Samples are selected according to the following principles: (i) representativeness. Samples that sufficiently represent the influences on the index are usually selected, for example by ranking according to a specific metric and then screening; (ii) reasonable structure. The samples should complement one another and together cover the different areas under study; (iii) availability of sample data. The more samples that meet the requirements, the better the calculation accuracy.
However, researchers must balance this against the availability of data and the complexity of calculation; (iv) granularity. Choosing the right granularity is necessary. For example, in the Fed’s calculation of the REER, the euro zone is treated as a single unit and the bilateral exchange rate between the euro and the US dollar is used, whereas the IMF and other institutions treat the euro zone member countries separately; this shows how the Fed makes use of granularity.

3.1.2. Sample rotation

Repeated surveys should ensure that the samples from each period are highly representative. In practice, rotating samples on a regular basis is a common method. If the survey frequency is high, such as monthly or quarterly, the sample may be rotated several times; if the frequency is relatively low, such as every half year or longer, the sample can be used continuously and the rotation interval lengthened according to the actual effect.

3.2. Construction of indicator system

When selecting indicators, it is necessary to fully consider their systematic nature, comparability and comprehensiveness. It is also necessary to consider the use of both direct and indirect indicators to resolve the tension between the scientific soundness and the availability of the indicator system. The construction of the indicator system is guided by the following principles: (i) scientific. Starting from the basic theory of macroeconomics, the selected indicators should accurately reflect development laws, levels, status and characteristics; (ii) integrity. The selected indicators should reflect the overall level as well as the status and impact of each individual element; (iii) comprehensive. The selected series of indicators should be general and comprehensive.
It should be possible to reflect the macroeconomic level comprehensively with as few and as precise indicators as possible; (iv) operational. On a scientific basis, the selected indicators should objectively reflect the development process and be obtainable as reasonably accurate data, so that quantitative evaluation and monitoring can be carried out; (v) comparable. The indicator system must conform to national conditions, reflect the actual macroeconomic level and allow for horizontal comparison [18].

3.3. Standardized data processing

Before each index is calculated, the data of each indicator must be standardized. Standardization eliminates the dimension of each indicator and quantifies discrete indicators so that they can be added up. The specific steps are as follows: (i) determine the data thresholds of each indicator, namely its maximum threshold and minimum threshold, and (ii) carry out the standardizing calculation [19], which converts indicators of different dimensions into values that can be added directly. If the indicator values are evenly distributed, the following general standardization formula is used, where |$X_i$| is the indicator value, |$X_{\min}$| is the minimum threshold and |$X_{\max}$| is the maximum threshold:

$$\begin{align}& {Z_i} = \frac{{{X_i} - {X_{\min}}}}{{{X_{\max}} - {X_{\min}}}}\end{align}$$(5)

If the values of an indicator differ greatly, logarithmic standardization is adopted to eliminate the adverse effects of the large differences in the data. The calculation formula is as follows:

$$\begin{align}& {Z_i} = \frac{{\lg {X_i} - \lg {X_{\min}}}}{{\lg {X_{\max}} - \lg {X_{\min}}}}.\end{align}$$(6)

3.4. Weight selection

Weight refers to the relative importance of a factor or indicator. It differs from a simple proportion in that it reflects not just the percentage share of a factor or indicator but its relative importance or contribution. Weights are mostly selected by average empowerment (equal weighting), proportional empowerment, the Delphi method (expert opinion method) [20], the analytic hierarchy process (AHP), regression analysis, etc. Some studies also use dynamic weighting, in which the weights change as the index constituents change over time. Average empowerment and proportional empowerment are the most direct methods. For example, in recent years, the main international organizations have used the average empowerment method to calculate the comprehensive statistical index of information technology. Proportional empowerment is mostly used where there is a clear relationship between the values of the industries or regions concerned, such as the share of each industry, product or business service in GDP.

The theoretical basis of the expert opinion method is as follows. Suppose there are M evaluation subjects and N evaluation sub-indexes. It has been proved that, if each evaluation subject provides weight data for each sub-index, the weight design that minimizes the difference in evaluation results among the evaluation subjects is the arithmetic mean of the M weight vectors. If such weighting information is not available, and given that a large number of evaluation subjects may disagree with each other, the approximately optimal solution that minimizes the difference in evaluation results is to assign the same weight to each of the N sub-indicators.
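The standardization formulas (5) and (6) and the two simplest weighting schemes described above (equal weights, and the arithmetic mean of several experts' weight vectors) can be sketched as follows. All function names, thresholds, indicator values and expert weights are hypothetical and serve only to show how the pieces fit together.

```python
import math


def minmax(x, x_min, x_max):
    """Formula (5): linear standardization against fixed thresholds."""
    return (x - x_min) / (x_max - x_min)


def log_minmax(x, x_min, x_max):
    """Formula (6): logarithmic standardization (all values must be positive)."""
    return (math.log10(x) - math.log10(x_min)) / (
        math.log10(x_max) - math.log10(x_min)
    )


def mean_weights(expert_weights):
    """Arithmetic mean of M expert weight vectors, one weight per sub-index."""
    m = len(expert_weights)
    return [sum(column) / m for column in zip(*expert_weights)]


def composite(scores, weights):
    """Weighted sum of standardized indicator scores."""
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical example: three indicators for one country.
scores = [
    minmax(40.0, 0.0, 100.0),              # evenly distributed indicator
    minmax(3.0, 1.0, 5.0),
    log_minmax(1_000.0, 10.0, 100_000.0),  # indicator spanning several magnitudes
]

equal = [1 / len(scores)] * len(scores)       # average empowerment
experts = [[0.5, 0.3, 0.2], [0.3, 0.3, 0.4]]  # two hypothetical experts
averaged = mean_weights(experts)              # expert opinion method

index_equal = composite(scores, equal)
index_expert = composite(scores, averaged)
```

Because every standardized score lies in [0, 1] and each weight vector sums to 1, the composite value also lies in [0, 1], which keeps indexes built from different indicators comparable.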
Where there is no proportional relationship between the real value of an industry or region and the indicators under investigation, as in public service fields such as medical care, education, public security and administration, quantified values such as residents’ consumption or the economic growth rate do not fully correspond to the importance of the field, so they are not suitable as a basis for weighting [21]. In such cases, the weights must be calculated according to the importance that residents attach to these indicators. Generally, the Delphi method and AHP are combined to determine the weights. Other studies use questionnaires to obtain a score for each indicator and an overall evaluation and then use least squares and regression models to determine the weights. For a multi-level index system, a combination of subjective and objective weighting is usually used. For example, the first several levels of indicators can be weighted subjectively, such as by AHP combined with the Delphi method, while the last level can be weighted objectively, such as by the coefficient of variation method, which links the weights of the evaluation indicators to the variation of the indicator values and gives larger weights to indicators with larger variation.

3.5. Index adjustment

Since index research and design is a comprehensive evaluation of complex socio-economic phenomena, adjustments for changes in indices, samples and weights must be linked scientifically so that the results calculated under the old and new systems remain comparable. The divisor adjustment method is a method of calculating average stock prices created by the Dow Jones Company in 1928.
The core of the correction method is to obtain a constant divisor, which ensures that the index result is not altered by changes in the calculation method. The calculation steps are as follows: (i) calculate the index for a common period using the two systems separately; (ii) divide the value calculated by the original method by the value calculated by the new method to obtain the ratio for each period and each categorical index; (iii) choose the period over which the coefficient is to be calculated; (iv) take the arithmetic mean of the ratios for each year in the chosen period; the result is the final adjustment factor.

4. Data source and processing

4.1. Data source

GDELT is a free global news database supported by Google and released in 2013. It monitors global broadcast, print and online news in more than 100 languages in real time, uses machine translation to convert non-English news into English and extracts news events from it. It is updated every 15 minutes. As of 2016, the total number of events had reached 350 million [22]. Each GDELT event record contains 58 fields in five categories: event ID and date attributes, actor attributes, event action attributes, event geography and data management fields. The event content is not included, but some events provide a link to the corresponding webpage. GDELT uses pattern recognition and simple syntactic rules to encode events and assign their event type. As shown in Table 1, the events are divided into 20 categories, each of which includes many sub-categories, giving more than 300 categories in total. The GDELT event categories are mainly based on economic construction. For other topics (such as economic development and infrastructure construction), this article uses natural language processing to mine news event categories autonomously for subsequent research.
The full text of each event webpage, combined with the time and location of the event and other information, is used to build a spatio-temporal database of B&R transportation development news.

4.2. Data pre-processing

In the real world, data are often incomplete (missing some attribute values of interest), inconsistent (containing discrepancies in codes or names) and highly susceptible to noise (errors or outliers). Because the database is so large and the datasets often come from multiple heterogeneous sources, low-quality data will lead to low-quality mining results, just as a chef cannot make delicious steamed fish if the fish has not been scaled. Data pre-processing is a reliable way to solve these problems.

Data pre-processing generally includes gathering, segmentation, cleaning and normalization, feature extraction and modelling. Through pre-processing, the sequence of sentences written by humans is transformed into a vector form that the computer can recognize, and good sentences are selected as the training set [23]. Gathering routines combine and unify data from multiple sources; building a data warehouse is in effect a process of data integration. Segmentation generates a sequence of words according to a certain standard; in English, spaces can be used as natural separators between words. Cleaning means filling in missing values, smoothing noisy data, identifying or removing outliers and resolving inconsistencies. Its main objectives are format standardization, removal of abnormal data, error correction and removal of duplicate data. Feature extraction and modelling are not described in detail here; in every case, the aim of pre-processing is to make the data to be used more standardized.
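The cleaning and segmentation steps described above can be illustrated with a small sketch. It assumes English text, regular-expression tokenization, exact-match de-duplication and a tiny hypothetical stop-word list; the paper does not specify these details, so every name and rule below is an assumption.

```python
import re

# Illustrative stop-word list only; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is"}


def clean(text):
    """Strip HTML tags and collapse runs of whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def segment(text):
    """Lower-case the text, split it into alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess(documents):
    """Clean, de-duplicate and tokenize documents, skipping empty ones."""
    seen, result = set(), []
    for doc in documents:
        cleaned = clean(doc)
        if not cleaned or cleaned in seen:
            continue  # exception data (empty) or duplicate
        seen.add(cleaned)
        result.append(segment(cleaned))
    return result


# Hypothetical raw crawl output: one page, its duplicate and an empty page.
raw = [
    "<p>The port of Gwadar   is expanding.</p>",
    "The port of Gwadar is expanding.",
    "   ",
]
tokens = preprocess(raw)
```

The resulting token lists are the form in which documents would be passed on to feature extraction and topic modelling.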
This paper uses GDELT to obtain 7.32 million news event records from 25 countries along the B&R from 2013 to 2017. Following the webpage link of each event, a crawler retrieves the original webpage, and the body text is extracted with a cumulative contribution model based on DOM node correlation. The full text of the news is thus obtained and cleaned, and a spatio-temporally aligned news database is constructed.

4.3. The topic extraction algorithm of LDA

News reports often contain several central ideas (i.e. topics) covering natural, social, political, military, economic, cultural and other fields, so their semantic structure is complex. This paper therefore introduces topic models to mine the topic information contained in news reports and explores social development trends by studying the temporal and spatial evolution of news topics. LDA [8] is a commonly used probabilistic topic model with a three-layer structure of words, topics and documents. Under the basic assumption of LDA, each word of a news document is generated by ‘selecting a topic with a certain probability and then selecting a word from that topic with a certain probability’.
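The generative assumption quoted above can be sketched as a small simulation: for every word position, draw a topic from the document's topic distribution, then draw a word from that topic's word distribution. The two topics, their vocabularies and all probabilities below are invented purely for illustration.

```python
import random

# Hypothetical topic-word distributions P(word | topic).
TOPIC_WORDS = {
    "shipping": {"port": 0.5, "container": 0.3, "freight": 0.2},
    "aviation": {"flight": 0.6, "airport": 0.3, "freight": 0.1},
}


def generate_document(doc_topics, length, rng):
    """Draw `length` words: topic ~ P(topic | doc), then word ~ P(word | topic)."""
    topics = list(doc_topics)
    topic_probs = [doc_topics[t] for t in topics]
    words = []
    for _ in range(length):
        topic = rng.choices(topics, weights=topic_probs)[0]
        vocab = TOPIC_WORDS[topic]
        word = rng.choices(list(vocab), weights=list(vocab.values()))[0]
        words.append(word)
    return words


def word_probability(word, doc_topics):
    """Marginal probability of a word in a document under this process:
    sum over topics of P(word | topic) * P(topic | doc)."""
    return sum(
        TOPIC_WORDS[t].get(word, 0.0) * p for t, p in doc_topics.items()
    )


# Hypothetical document that is 70% about shipping and 30% about aviation.
doc_topics = {"shipping": 0.7, "aviation": 0.3}
sample = generate_document(doc_topics, 8, random.Random(0))
p_freight = word_probability("freight", doc_topics)  # 0.2 * 0.7 + 0.1 * 0.3
```

Topic modelling runs this process in reverse: given only the observed words, it infers the topic-word and document-topic distributions that most plausibly generated them.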
The probability of occurrence of a word |$w$| in a news document is as follows:

$$\begin{align}&P(w|doc) = \sum_{theme} {P(w|theme)P(theme|doc).}\end{align}$$(7)

The topic vector |$\overrightarrow{{\theta _m}}$| and the word vector |$\overrightarrow{{\varphi _k}}$| obey the Dirichlet prior distributions |$\Delta (\overrightarrow \alpha )$| and |$\Delta (\overrightarrow \beta )$| with parameters |$\overrightarrow \alpha $| and |$\overrightarrow \beta $|, respectively. Let |$w$| denote the words and |$z$| denote the topics; the joint probability distribution can then be expressed as follows:

$$\begin{align} &P(\overrightarrow w,\overrightarrow z |\overrightarrow \alpha,\overrightarrow \beta ) = P(\overrightarrow w |\overrightarrow z,\overrightarrow \beta )P(\overrightarrow z |\overrightarrow \alpha ).\end{align}$$(8)

The conditional probability that the |$i$|-th feature word, which is word |$t$| in news document |$m$|, is assigned to topic |$k$|, given the assignments of all other words, is

$$\begin{align}&P({z_i} = k|{\overrightarrow z _{\neg i}},\overrightarrow w ) \propto \left( {\frac{{n_{m,\neg i}^{(k)} + {\alpha _k}}}{{\sum\limits_{k = 1}^K {\left( {n_{m,\neg i}^{(k)} + {\alpha _k}} \right)} }}} \right) \times \left( {\frac{{n_{k,\neg i}^{(t)} + {\beta _t}}}{{\sum\limits_{t = 1}^V {\left( {n_{k,\neg i}^{(t)} + {\beta _t}} \right)} }}} \right)\end{align}$$(9)

where |$n_{m,\neg i}^{(k)}$| is the number of feature words in news document |$m$| assigned to topic |$k$|, excluding the current word |$i$|; |$n_{k,\neg i}^{(t)}$| is the number of times feature word |$t$| is assigned to topic |$k$|, excluding the current word |$i$|; |$\sum \limits _{k = 1}^K {n_{m,\neg i}^{(k)}}$| is the total number of feature words in news document |$m$|, excluding the current word; and |$\sum \limits _{t = 1}^V {n_{k,\neg i}^{(t)}}$| is the total number of feature words assigned to topic |$k$|, excluding the current word.

Using the known data and the document generation rules, the values of the hyper-parameters |$\overrightarrow \alpha $| and |$\overrightarrow \beta $| that maximize Formula (8) can be found, thus mining the hidden topics in the news spatio-temporal database.
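A minimal collapsed Gibbs sampler built on the conditional in Formula (9) is sketched below. It assumes symmetric scalar priors `alpha` and `beta`, documents given as lists of integer word ids, and a fixed number of sweeps; the function name, defaults and toy corpus are hypothetical, and the sketch illustrates the sampling rule rather than the authors' implementation.

```python
import random


def gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, sweeps=50, seed=0):
    """Collapsed Gibbs sampling for LDA; docs is a list of lists of word ids."""
    rng = random.Random(seed)
    n_mk = [[0] * n_topics for _ in docs]               # topic counts per document
    n_kt = [[0] * vocab_size for _ in range(n_topics)]  # word counts per topic
    n_k = [0] * n_topics                                # total words per topic
    z = []                                              # topic of every token

    # Random initial assignment of a topic to every token.
    for m, doc in enumerate(docs):
        z_m = []
        for t in doc:
            k = rng.randrange(n_topics)
            z_m.append(k)
            n_mk[m][k] += 1
            n_kt[k][t] += 1
            n_k[k] += 1
        z.append(z_m)

    for _ in range(sweeps):
        for m, doc in enumerate(docs):
            for i, t in enumerate(doc):
                # Remove token i so the counts become the "not i" counts.
                k = z[m][i]
                n_mk[m][k] -= 1
                n_kt[k][t] -= 1
                n_k[k] -= 1
                # Unnormalized Formula (9); the document-side denominator
                # does not depend on the topic, so it is omitted.
                weights = [
                    (n_mk[m][j] + alpha)
                    * (n_kt[j][t] + beta)
                    / (n_k[j] + vocab_size * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[m][i] = k
                n_mk[m][k] += 1
                n_kt[k][t] += 1
                n_k[k] += 1

    # Point estimates of the document-topic and topic-word distributions.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(n_mk, docs)
    ]
    phi = [
        [(c + beta) / (n_k[k] + vocab_size * beta) for c in n_kt[k]]
        for k in range(n_topics)
    ]
    return theta, phi


# Toy corpus of word ids (hypothetical): ids 0-1 and 3-4 tend to co-occur.
docs = [[0, 1, 0, 1, 2], [3, 4, 3, 4, 2], [0, 1, 1, 0], [4, 3, 3, 4]]
theta, phi = gibbs_lda(docs, n_topics=2, vocab_size=5)
```

Each row of `theta` is the estimated topic probability vector of one document, and each row of `phi` is the estimated word distribution of one topic; ranking a row of `phi` gives that topic's highest-probability words.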
LDA is an unsupervised model that is simple to use and efficient, but it has some problems: the topic vectors are difficult to interpret, parameter adjustment is inconvenient and accuracy is limited. The supervised topic model labeled LDA is therefore introduced on the basis of annotated data. Compared with LDA, it can make effective use of expert knowledge, establish a mapping between labels and topics and avoid the problem that topic vectors are difficult to interpret, which significantly improves classification accuracy. However, it requires a lot of manpower to annotate data, which makes it difficult to use in situations with wide coverage and large scale. For this reason, this paper adopts a coarse-to-fine approach to topic mining that combines the unsupervised method with the supervised method [24]. First, the LDA model is used to mine topics from the news data of the different countries without supervision, make a preliminary study of the temporal and spatial variation of news events, obtain a general picture of the transportation situation and find hot spots. Second, the news data of the hot spots are labeled manually and, based on the labeled LDA model, the transportation situation of the hot spots is mined in depth to obtain fine-grained portraits of them. The topic extraction algorithm of LDA is shown in Figure 1.

Figure 1. LDA flow chart.

4.4. Quantitative assessment of national transportation along the B&R

Because the social and economic development of the countries along the route differs, the degree of transportation construction also differs, and so the types of transportation news topics are not the same [25].
Therefore, this paper uses the LDA topic model to mine the topics of the transportation news data of the countries along the line and obtains the topic probability vector of each news document, and then introduces the $K$-means clustering algorithm to partition the topic probability vector. If the vector values are not sufficient to be divided into three categories, if the maximum probability value is less than the threshold (the threshold depends on the number of topics set during topic mining) or if the difference between the cluster centres is too small, the news document is considered not to correspond to any topic and is discarded. Finally, the topic in the cluster corresponding to the maximum probability value is taken as the topic of the news document, and the probability vector of the topic words is used for manual interpretation of the topic. A total of 9 types of news topics of interest are obtained, and the topic words with the highest probability for each topic are shown in Table 1.

Table 1. Mining of the theme words of the B&R transportation news.

Event | Theme words 1 to 9 (in rank order)
Highway | highway service truck gas mileage road surfaces highway port freight volume service area
Shipping | port container ship freight volume coal fuel ocean weather discharge
Aviation | flight aviation fuel freight volume transport airport business goods time order
Energy | fuel enterprise economic natural gas policy energy supply pipeline reserve
Economic | market company budget government transaction policy safety growing up development
Religion | religion Muslim community free army safety faith offend conflict
Army | attack army war arms defense government attack force action
Horror | terror molecule Muslim economic attack force kill damage obstruct explosion
Tourism | tourist travel hostel development festival culture business service increase

Table 2. The news subject frequency of the B&R events (subject frequency, %).

Nation | Highway | Ocean shipping | Aviation | Energy | Economic | Religion | Military | Horror | Tourism
Afghanistan | 0.5 | - | 0.1 | 0.2 | 0.4 | 1 | 75.2 | 86.1 | -
Pakistan | 16.2 | 15.1 | 9.6 | 2.7 | 21.2 | 30.3 | 3.8 | 56.2 | -
Iran | 18.3 | 25.2 | 13.8 | 21.3 | 19.9 | 43.6 | 68.4 | 6.3 | 4.2
United Arab Emirates | 24.3 | 34.1 | 27.8 | 24.1 | 71.1 | 27.7 | - | - | 12.8
Qatar | 11.2 | 23.3 | 26.1 | 25.9 | 44.3 | 20.1 | - | 2.3 | 23.3
Syria | 1.3 | 4.6 | 2.2 | 4 | 3.1 | 16.6 | 65.8 | 55.5 | -
Saudi Arabia | 33.1 | 68.9 | 54.2 | 87.2 | 37.2 | 45.5 | 65.2 | 33.2 | -
Oman | 6.5 | 12.2 | 16.1 | 13.2 | 87.6 | 2.1 | - | - | 23.6
Yemen | - | - | - | - | 2 | - | 63.3 | - | -
Iraq | 2.3 | 1.1 | 2.1 | 23.5 | 1 | 8.3 | 43.3 | 32.2 | -
Jordan | 13.3 | 21.2 | 16.7 | 14.5 | 33.2 | 2.1 | - | - | 12.1
Indonesia | 15.2 | 45.3 | 24.2 | 16.3 | 40.1 | 3.2 | 3.7 | 1.2 | 33.2
Laos | 0.6 | 0.2 | - | - | 23.1 | - | - | - | 12.1
Malaysia | - | 23.2 | 12.1 | 1.4 | 57.2 | 7.3 | - | - | 19.8
Thailand | 1.3 | 21.2 | 0.4 | 2.8 | 61.2 | - | 0.1 | - | 48.8
Myanmar | 0.2 | 0.1 | - | - | 10.2 | 8.7 | 23.1 | - | 6.7
Philippines | - | 2.4 | - | - | 33.3 | - | 18.7 | 1.3 | 5.5
Kazakhstan | 10.2 | - | 1.8 | 16.6 | 40.2 | 10 | - | - | -
Kyrgyzstan | 8.1 | - | 1.1 | 15.2 | 33.3 | 14 | 1.2 | 8.8 | -
Uzbekistan | 0.6 | - | - | 3.4 | 16 | 1.2 | - | - | -

- indicates no data.

Based on the statistics of the transportation news topics in the countries along the line, the percentage of each type of news topic in the total news of each country is calculated (Table 2). Taking the provincial-level administrative units of the 25 countries along the line as the unit of analysis, the frequency of each news topic is calculated to obtain the frequency vector of news topics in each province, and the news topics with the highest proportion are regarded as the key social elements of the region. The research discussed above relies mainly on statistical data, which are difficult to obtain and have low temporal resolution. Moreover, because such data lack micro-scale, full-element detail, they can hardly reflect the internal relationships between things or complex social realities.
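The topic assignment and frequency counting described above could be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-document topic probability vectors have already been produced by a topic model, it replaces the K-means check with the maximum-probability threshold alone, and it computes percentages over the retained documents of each country. The function names, the threshold value and the sample vectors are hypothetical.

```python
from collections import Counter, defaultdict


def assign_topic(probs, threshold):
    """Return the index of the most probable topic, or None to discard the document.

    Simplification (assumption): only the maximum-probability threshold is
    checked; the K-means step on the probability vector is not reproduced.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None


def topic_frequencies(docs, topic_names, threshold):
    """Percentage of each topic among the retained documents of each country.

    docs: iterable of (country, topic_probability_vector) pairs.
    """
    counts = defaultdict(Counter)
    for country, probs in docs:
        topic = assign_topic(probs, threshold)
        if topic is not None:
            counts[country][topic_names[topic]] += 1
    return {
        country: {name: 100.0 * n / sum(c.values()) for name, n in c.items()}
        for country, c in counts.items()
    }


# Hypothetical topic names and document-topic vectors, for illustration only.
TOPICS = ["highway", "shipping", "aviation"]
DOCS = [
    ("Country A", [0.70, 0.20, 0.10]),
    ("Country A", [0.10, 0.80, 0.10]),
    ("Country A", [0.40, 0.30, 0.30]),  # below the threshold, so discarded
    ("Country A", [0.60, 0.30, 0.10]),
    ("Country A", [0.90, 0.05, 0.05]),
]
print(topic_frequencies(DOCS, TOPICS, threshold=0.5))
```

With these sample vectors, three of the four retained documents fall under the highway topic and one under shipping. Whether discarded documents should count towards the denominator is not stated in the text, so the choice made here is an assumption.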
News data, by contrast, cover a wide range of subjects, carry a large amount of information, are timely and are easy to acquire, and they can reflect both macro elements and micro details. Therefore, this paper mines theme events from the news data, takes the news theme events as evaluation indices and establishes an evaluation model of transportation development. The formula is as follows:

$$st_c=\sum_{i} w_i e_i. \tag{10}$$

Here $st_{c}$ is the degree of transportation development in country $c$; $e_{i}$ is the proportion of spatio-temporal events of type $i$ in the country’s news relative to the total number of events in the country; and $w_{i}$ is the weight expressing how events of type $i$ reflect the capacity for transportation development. The weights take values in the range [-10, 10], where positive and negative values represent positive and negative impacts of such events on transportation development, respectively.

Following [19] to determine the weights of the indicators, this paper designed a questionnaire and invited five relevant experts from the Peking University School of International Relations to quantify the value of each type of event. The mean value for each type of event in the questionnaire is taken as the final weight. The results are shown in Table 3. The data of Tables 2 and 3 are substituted into the formula to obtain the transportation development index of the countries along the line (Table 4).

Table 3. Event weights for the National Transportation Development Index along the B&R.

Influence | Event | Weight
Positive | highway | 6.8
Positive | ocean shipping | 6.8
Positive | economic | 6.9
Positive | aviation | 6.6
Positive | energy | 6.4
Positive | tourism | 5.4
Negative | religion | -4.3
Negative | army | -5.6
Negative | horror | -8

Table 4. National Transportation Development Index along the B&R.

Nation | Construction
Afghanistan | -8.02
Syria | -9.3
Jordan | 5.76
Myanmar | 2.13
Pakistan | -4.12
Saudi Arabia | 4.51
Indonesia | 4.31
Philippines | 1.02
Iran | 1.24
Oman | 6.26
Laos | 2.31
Kazakhstan | 5.8
United Arab Emirates | 6.05
Yemen | -8.32
Malaysia | 6.02
Kyrgyzstan | 3.8
Qatar | 5.62
Iraq | -7.88
Thailand | 6.63
Uzbekistan | 3.45

It can be seen from Table 4 that social stability differs considerably among the countries along the line, and the social development situation is unbalanced.
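As a worked illustration of equation (10), the weighted sum can be written in a few lines of Python. The sketch below is one possible reading rather than the authors' code: it takes the expert weights from the weights table, converts percentage frequencies to proportions and treats topics with no data as zero. The function name and the sample frequencies are hypothetical, and because the paper does not state how the frequencies were normalised, the output is not meant to reproduce the published index values.

```python
# Expert weights w_i as listed in the weights table of this section.
WEIGHTS = {
    "highway": 6.8, "ocean shipping": 6.8, "economic": 6.9,
    "aviation": 6.6, "energy": 6.4, "tourism": 5.4,
    "religion": -4.3, "army": -5.6, "horror": -8.0,
}


def transport_index(freq_percent, weights=WEIGHTS):
    """Compute st_c = sum_i w_i * e_i for one country.

    freq_percent maps an event type to its frequency in percent; event
    types that are absent contribute nothing (assumption: no data = 0).
    """
    return sum(weights[event] * pct / 100.0 for event, pct in freq_percent.items())


# Hypothetical country: half of its news is about highways, a quarter about terror.
sample = {"highway": 50.0, "horror": 25.0}
print(round(transport_index(sample), 2))  # 6.8 * 0.50 - 8.0 * 0.25
```

Keeping the weights in a single dictionary makes it easy to substitute a different expert survey without touching the summation itself.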
According to the degree of social stability, the countries along the line are divided into four categories. (i) Rapid development, including Oman, the United Arab Emirates, Malaysia, Kazakhstan and Thailand. These countries have relatively high economic activity and good economic development. Among them, the Caspian Sea coast of Kazakhstan is rich in oil resources and energy activities hold a clear advantage, which strongly drives the development of transportation within the region. (ii) Stable development, including Kuwait, Jordan, Indonesia and other countries. Among them, Kuwait’s economic activities and energy activities accounted for 28.8% and 15.6%, respectively, indicating that Kuwait is rich in energy and has relatively frequent economic activities, which promotes transportation development to a certain extent. (iii) Slower development, including Kyrgyzstan, Uzbekistan, the Philippines, Iran, Pakistan, Myanmar and other countries. Among them, Kyrgyzstan’s religious activities and military conflicts accounted for 32.7% and 11.3%, respectively, and its domestic political situation was found to be unstable. Uzbekistan’s Andijan area, where human rights violations are more frequent, is a hot spot in Central Asia. It is located in the Ferghana Basin at the junction of Uzbekistan, Tajikistan and Kyrgyzstan, where national borders are intertwined, ethnic and religious issues are very complicated and conflicts are constant, making it the ‘powder keg’ of Central Asia. The proportion of terrorist attacks in the Philippines reached 12.8%, mainly distributed in Mindanao and Greater Manila. Terrorist attacks occur frequently in Iran’s border areas with Pakistan and Iraq, and naval incidents in the Strait of Hormuz and the Caspian Sea were frequent, which coincided with the deployment of naval bases in the region.
Due to frequent military conflicts and terrorist attacks, domestic economic and transportation development in these countries has been seriously affected. (iv) Lagging development, including Yemen, Syria, Iraq and Afghanistan. The proportion of military conflicts in the four countries reached 53.8%, 50.9%, 50.4% and 75.4%, respectively. Yemen and Syria are trapped in the quagmire of civil war, with displaced populations and strong terrorist forces; the Taliban forces in Afghanistan are still active; and in Iraq, Kurdish armed forces and the Islamic State (ISIS) are more active. Military conflicts and terrorist attacks are frequent, making these the most unstable countries along the line and fatally impeding the development of transportation in them. Overall, Southeast Asia and Central Asia are relatively stable, and Myanmar is the only unstable country in Southeast Asia. The political, religious and ethnic issues in West Asia and South Asia are complicated and there are many factors of social instability, although Oman and the United Arab Emirates are relatively stable. In implementing the B&R initiative, stable and relatively stable countries can be selected for transportation infrastructure investment and cooperation to reduce risks and economic losses.

4.5. Conclusions

This paper uses the GDELT database as a data source to obtain online news data from countries along the B&R and combines the LDA and labeled LDA topic models for news topic event mining. The temporal and spatial changes in news topics, the development of transportation modes (road, sea, air), social stability and security risk factors were studied, and the following conclusions were drawn. The social development of the countries along the route is uneven. Based on the degree of social stability, the countries along the route can be divided into four categories.
(i) Rapidly developing types, such as Oman, the UAE, Malaysia, Thailand and Kazakhstan; (ii) steadily developing types, such as Kuwait, Jordan and Indonesia; (iii) slower developing types, such as Kyrgyzstan, Uzbekistan, the Philippines, Iran, Pakistan and Myanmar; (iv) lagging types, such as Yemen, Syria, Iraq and Afghanistan. In developing transport cooperation under the B&R, stable countries can be prioritized to reduce the risk of investment and cooperation and to reduce economic losses. In this paper, some news data are selected for manual annotation, and a validation data set is generated to compare the mining precision of LDA and labeled LDA. Because the topics of the two types of models differ greatly, the data must be labeled separately for each model. On the validation data, the number of correctly inferred topics is counted and the accuracy is calculated for the LDA and labeled LDA models, respectively. The accuracy of labeled LDA is found to be higher than that of LDA and its categories are more refined, indicating that adding manual labeling information can significantly improve the mining effect of topic models. This paper also constructs a coarse-to-fine method for mining social development over time that combines unsupervised and supervised methods. First, macro-social trends are explored with LDA to identify hot spots, and then labeled LDA is used to analyze the hot spots in depth. The method can effectively explore their economic and industrial structures, identify major social events and discover their social security risks and trends. It can achieve dynamic monitoring of the social development of the countries along the route, provide early warning of risks and provide decision support for the advancement and implementation of the B&R initiative, which is of great application value.
Funding

Program of the Co-Construction with Beijing Municipal Commission of Education of China (Grant No. B18H100040).

REFERENCES

1 Debin, D. and Yahua, M. (2015) One belt and one road: the grand geo-strategy of China’s rise. Geogr. Res., 34, 1005–1014.
2 Yabo, Z., Xiaofeng, L. and Yuejing, G. (2017) Analysis of the oil and gas resource distribution pattern along the belt and road and the interdependence relationship with China. Geogr. Res., 36, 2305–2320.
3 Song, C. et al. (2018) Undertaking research on belt and road initiative from the geo-relation perspective. Geogr. Res., 1, 3–19.
4 Li, X. (2015) Diplomatic risks facing China’s belt and road strategy. Internat. Econom. Rev., 2, 68–79.
5 Fichtner, J., Heemskerk, E.M. and Garcia-Bernardo, J. (2017) Hidden power of the big three? Passive index funds, re-concentration of corporate ownership, and new financial risk. Bus. Polit., 19, 298–326.
6 Siling, Y. (2015) The management of China’s relations with its neighbors and its challenges under the initiative of one belt one road. South Asian Stud., 2, 15–34.
7 Keertipati, S., Savarimuthu, B.T.R., Purvis, M. and Purvis, M. (2014) Multi-level Analysis of Peace and Conflict Data in GDELT. In Proc. MLSDA 2014 2nd Workshop on Machine Learning for Sensory Data Analysis, pp. 33–40. ACM.
8 Hoffman, M., Bach, F.R. and Blei, D.M. (2010) Online Learning for Latent Dirichlet Allocation. In Advances in Neural Information Processing Systems, pp. 856–864. ACM.
9 Haiquan, L. (2017) The security challenges of the “one belt, one road” initiative and China’s choices. Croat. Int. Relat. Rev., 23, 129–147.
10 Kolosov, V.A., Suocheng, D., Portyakov, V.Y., Chubarov, I.G., Tarkhov, S.A. and Shuper, V.A. (2017) The Chinese initiative “the belt and road”: a geographical perspective. Geogr. Environ. Sustain., 10, 5–20.
11 Mei, Y., Jing, Z. and Liu, J. (2016) Research on index compilation method of integrating big data. Manage. Eng., 22, 7–11.
12 Leetaru, K. and Schrodt, P.A. (2013) GDELT: Global Data on Events, Location, and Tone, 1979–2012. In ISA Annual Convention, pp. 1–49. Citeseer.
13 Sagi, D.J.B. and Labeaga, J.M. (2016) Using GDELT data to evaluate the confidence on the spanish government energy policy. IJIMAI, 3, 38–43.
14 Yuan, Y., Liu, Y. and Wei, G. (2017) Exploring inter-country connection in mass media: a case study of China. Comput. Environ. Urban Syst., 62, 86–96.
15 Wang, F., Li, M., Mei, Y. and Li, W. (2020) Time series data mining: a case study with big data analytics approach. IEEE Access, 8, 14322–14328.
16 Liang, Y., Quan, D., Wang, F., Jia, X., Li, M. and Li, T. (2020) Financial big data analysis and early warning platform: a case study. IEEE Access, 8, 36515–36526.
17 Zhang, D. (2017) High-speed train control system big data analysis based on the fuzzy RDF model and uncertain reasoning. Int. J. Comput. Commun. Control, 12, 577–591.
18 Claude, U. (2020) Predicting tourism demands by google trends: a hidden markov models based study. J. Syst. Manag. Sci., 10, 106–120.
19 Wang, W. (2015) A Social Stability Analysis System Based on Web Sensitive Information Mining. In National Conf. Big Data Technology and Applications, pp. 46–58. Springer.
20 Zhang, D., Sui, J. and Gong, Y. (2017) Large scale software test data generation based on collective constraint and weighted combination method. Teh. Vjesn., 24, 1041–1050.
21 Ongus, R. and Nyamboga, C. (2019) Collecting development practices in using information technology: a comparative study. J. Logist. Inf. Serv. Sci., 6, 1–22.
22 Lee, S. (2019) UML sequence diagram for reusability of data components. J. Syst. Manage. Sci., 9, 65–77.
23 Kim, B. and Kim, T.-G. (2019) Cooperation of simulation and data model for performance analysis of complex systems. Int. J. Simul. Model., 18, 608–619.
24 Wu, D., Liu, Y., Li, K., Li, J. et al. (2019) A multi-objective particle swarm optimization algorithm based on human social behavior for environmental economics dispatch problems. Environ. Eng. Manage. J., 18, 1599–1607.
25 Bouzon, M., Govindan, K. and Taboada Rodriguez, C.M. (2020) Grey DEMATEL technique for evaluating product return drivers: a multiple stakeholders’ perspective. Environ. Eng. Manage. J., 19, 19–38.
26 Li, M., Li, T., Quan, D. and Li, W. (2020) Economic system simulation with big data analytics approach. IEEE Access, 8, 35572–35582.
27 Stanley, C.R., Mettke-Hofmann, C., Hager, R. and Shultz, S. (2018) Social stability in semiferal ponies: networks show interannual stability alongside seasonal flexibility. Anim. Behav., 136, 175–184.

© The British Computer Society 2020. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model) TI - Transportation Index Computation: A Development Theme Mining-Based Approach JF - The Computer Journal DO - 10.1093/comjnl/bxaa102 DA - 2020-09-15 UR - https://www.deepdyve.com/lp/oxford-university-press/transportation-index-computation-a-development-theme-mining-based-OQkxXtzbCw SP - 1 EP - 1 VL - Advance Article IS - DP - DeepDyve ER -