Clusterization of airport cities and cluster dynamics for an air passenger demand network topology forecast based on socio-economic development scenario

1. Introduction

Forecasting air passenger demand is an important basis for planning in the constantly changing air transportation system. Both the aircraft industry and researchers study air passenger demand and develop forecast models for it using various techniques and levels of aggregation. Existing studies present methods for calculating demand at particular airports (Erma Suryani et al) or on particular routes (Dr. Md. Jobair Bin Alam et al, Tobias Grosche et al). However, these studies do not present a method of forecasting the evolution of air passenger demand between cities at a global level, and they do not take into account possible changes in the number of airport-connected cities when forecasting demand within an air transport system.

The study ‘Forecast of origin-destination air passenger demand between global city pairs using future socio-economic development scenarios’ proposes a method of forecasting the evolution of air passenger demand between cities that takes into account the probability of changes in the number of airport-connected cities within an air transport system. In other words, the proposed method forecasts both passenger demand and changes to the topology of an ‘air passenger demand network’ within a forecast period, and it computes air passenger demand at any given point in time within that period.

The method has two steps: forecasting the potential for demand between city pairs, and calculating demand on new and existing connections. The first step determines whether the potential for demand between a given city pair exists. It does this by defining a utility as an ‘attractive force’ between cities. This attractive force is in turn expressed through a gravity model based on the socio-economic indicators of the two cities and the distance between them.
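As an illustration only, a gravity model of this kind could be sketched as follows. The abstract does not state the functional form, so the multiplicative form, the constant k, the exponents alpha, beta and gamma, and the function names are all assumptions introduced here, not the study's calibrated model.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def attractive_force(pop_i, gdp_i, pop_j, gdp_j, dist_km,
                     k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Hypothetical gravity-type utility between cities i and j.

    Assumed form: k * (P_i * P_j)^alpha * (G_i * G_j)^beta / d^gamma.
    The parameter values are placeholders, not fitted values.
    """
    if dist_km <= 0:
        raise ValueError("distance must be positive")
    return k * (pop_i * pop_j) ** alpha * (gdp_i * gdp_j) ** beta / dist_km ** gamma
```

Under these assumptions the force is symmetric in the two cities and decreases with distance, which is the qualitative behaviour a gravity model is meant to capture.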
To define air passenger demand in the demand network between cities in a discrete "slice" of a socio-economic scenario, the probability of air passenger demand appearing has to be assessed. Because socio-economic conditions vary between cities, the cities must first be clustered by socio-economic factors.

2. Clusterization

The basic idea of clusterization is to divide a set of cities into several groups (clusters), where each cluster is a subset of cities united by similar characteristics. In other words, if clustering is done by socio-economic indicators, cities within one cluster have similar socio-economic indicators compared with cities in other clusters.

In this study, the normal mixture approach is used to divide cities into groups by socio-economic indicators. This approach estimates the probability that an element (city) belongs to each cluster. The normal mixture approach is chosen because it works well where clusters overlap, that is, where cities from several clusters share the same region of the indicator space. Using normal mixtures rather than other clustering methods (e.g. k-means clustering) is especially important if one wants an accurate estimate of the total population in each group.

This study introduces ‘cluster dynamics’: a method of calculating the probability that a given element (city) will appear within a given cluster at a given point in time. This is how cities are allocated to the clusters, and it reveals how the distribution of cities across clusters changes over time. During the forecast period, the cluster centres remain fixed at their base-year values. The socio-economic indicators of the cities, however, change over the forecast period, and these changes affect the probability of membership of a given city in a given cluster.

3. Preliminary results

The starting point of the forecast is the air passenger demand network of the base year 2012.
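The cluster dynamics idea from Section 2 can be sketched in a few lines, purely as an illustration. The sketch assumes that the mixture parameters (weights, centres, variances) have already been estimated for the base year and are then held fixed, and it recomputes only the membership probabilities as a city's indicators change. The two-cluster setup, the diagonal covariances, the log10 feature transform, and all numbers and names below are hypothetical choices made for brevity, not the study's nine-cluster model or its data.

```python
import math


def log_normal_density(x, mean, var):
    """Log density of a normal distribution with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def membership_probabilities(x, weights, means, variances):
    """Posterior probability that feature vector x belongs to each cluster."""
    log_terms = [
        math.log(w) + log_normal_density(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(log_terms)  # subtract the maximum for numerical stability
    unnormalised = [math.exp(t - top) for t in log_terms]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def cluster_dynamics(scenario, weights, means, variances):
    """Membership probabilities per year, with cluster parameters held fixed.

    scenario maps year -> (population, GDP per capita) for one city.
    """
    return {
        year: membership_probabilities(
            (math.log10(pop), math.log10(gdp_pc)), weights, means, variances
        )
        for year, (pop, gdp_pc) in scenario.items()
    }


# Hypothetical two-cluster mixture in (log10 population, log10 GDP per capita).
WEIGHTS = [0.7, 0.3]
MEANS = [(5.0, 3.5), (6.5, 4.5)]  # made-up centres: 'small poor', 'big rich'
VARIANCES = [(0.25, 0.25), (0.25, 0.25)]

# Hypothetical scenario for one city, not taken from the study's data.
city_scenario = {2012: (150_000, 4_000.0), 2050: (2_500_000, 30_000.0)}
dynamics = cluster_dynamics(city_scenario, WEIGHTS, MEANS, VARIANCES)
```

In this toy scenario the city is most likely a member of the first cluster in 2012 and of the second in 2050, although the cluster centres themselves never move. Only the city's indicators change.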
From the Sabre Airport Data Intelligence (ADI) database, 4435 settlements were obtained, each with at least one airport from or to which at least one passenger flew in 2012. City population and city GDP are the socio-economic characteristics used in the study. City population data were obtained from the UN and the MaxMind database. GDP data were compiled from the World Bank and the UN. The city-level socio-economic development scenario was developed using Randers' “2052” scenario and contains time series of city GDP and city population for 2012-2050 for every city obtained from the ADI database for 2012.

For the base year, the cities were clustered into nine clusters by city population, city GDP and GDP per capita. The nine clusters cover ‘small’, ‘middle’ and ‘big’ cities by population and ‘poor’, ‘middle class’ and ‘rich’ cities by GDP. The number of cities in each cluster and the cluster means (cluster centres) are shown in Fig.1. Cluster dynamics is shown in Fig.2. For the purposes of the study, cluster names derived from the cluster means (of population and per capita GDP) were adopted.

4. Outline

The final paper will present a detailed description of the clusterization process. The paper will include: a detailed justification for choosing the normal mixture approach for clusterization, the clustering parameters and the number of clusters; a description of the cluster dynamics approach; changes in city clusters based on the socio-economic forecast; and the implementation of cluster dynamics for an air passenger demand network topology forecast.