Prediction of CO and PM10 in Cold and Warm Seasons and Survey of the Effect of Instability Indices on Contaminants Using Artificial Neural Network: A Case Study in Tehran City

Air pollution in urban areas is a major issue affecting human health and the environment. Over the years, artificial neural network methods have been used to predict pollutant concentrations in many metropolitan areas. In the present study, data were obtained from the Department of Environment and air quality control stations in the city of Tehran from March 2012 to October 2013. CO and PM10 concentrations during cold and warm seasons were predicted with an artificial neural network under the influence of instability indices and meteorological parameters. The modeling results showed that the highest correlation coefficient, 0.84, was obtained for PM10 in the warm season, whereas the highest correlation coefficient for CO, 0.78, was obtained in the cold season. The effect of instability indices on air pollution was also investigated. The highest CO concentration occurred during cold seasons (R² = 0.81), while the lowest occurred in the warm season (R² = 0.72). For PM10, the highest concentration occurred during warm seasons (R² = 0.84), while the lowest occurred in the cold season (R² = 0.75). doi: 10.5829/ijee.2022.13.01.08


INTRODUCTION
Clean air is a basic requirement for human health, yet air pollutants impose wide-ranging adverse impacts on biological, physical and natural systems, and especially on human health [1]. Air pollutants such as CO and PM10 increase mortality and hospital admissions [2]. These two pollutants have diverse impacts on human health, ranging from skin problems to cancer and asthma. They also include birth defects and serious developmental delays in children, leading to a number of diseases [3]. Vehicles are the main source of air pollution in megacities [4] such as Tehran, which has a population of about 12 million people. As air pollution increases, predicting air quality becomes important for planning appropriate actions and control strategies. Several approaches have been used for modeling. The first comprises deterministic approaches, which involve the numerical solution of differential equations, and empirical methodologies that use various functions to estimate pollutant concentrations depending on the ambient conditions [5]. This approach first requires identifying the pollution emitters, the chemical composition of the emitted gases and the physical processes occurring in the boundaries of the atmospheric layers. A recent instance of this approach was investigated by Finardi et al. [6]. A second approach, by contrast, typically needs a large quantity of measured data gathered under different atmospheric conditions. Generally, the second approach, which includes the artificial neural network (ANN), is better suited to explaining the complex relations between air pollutant concentrations and predictors [7]. In the field of air quality, several studies have used artificial neural networks [1,8,9], relating to NO2 [10], PM10 [11], O3 [12] and SO2 [13].
*Corresponding Author Email: m.hadavifar@hsu.ac.ir (M. Hadavifar)
The artificial neural network is a suitable model for uncertain and nonlinear relationships and can be trained to predict accurately under general conditions [14]. Neural networks are analytical learning tools that try to mimic the patterns of information processing in the human brain. They are highly flexible and can examine nonlinear relationships between parameters. They predict pollutant concentrations without the need for a specialist. Ghazali and Hakim-Ismail [15] employed a neural network to predict air quality. Several pollutants and meteorological parameters, namely SO2, CO, NO2, NO, temperature, relative humidity and air velocity, were considered in their study. The performance of the models was evaluated using the mean square error (MSE) and R² values. Among the constructed networks, the best prediction performance was obtained with a 7-20-4 network structure, with R² and MSE values of 0.57 and 0.062, respectively [15].
A new method for estimating hourly concentrations of air pollutants was presented by Hrust et al. [16]. The method was developed for a site in an urban residential area in Zagreb, Croatia, for four typical air pollutants (NO2, O3, CO and PM10). A novel approach based on families of univariate regression models was applied to choose the averaging intervals of the input variables. A comparison of the proposed method with older methods, which use hourly-averaged predictions as input variables, showed similar or improved performance. The longer-term averages used as inputs in the proposed method were expected to give smaller input errors and greater accuracy [16].
Cai et al. [17] applied an artificial neural network to forecast hourly air pollutant concentrations adjacent to an arterial road in Guangzhou, China. Factors affecting pollutant concentrations were categorized into four types: meteorological parameters, geographical parameters, traffic-related parameters and background concentration. Models were trained, validated and tested with a backpropagation neural network using the gathered data. The models were able to produce reliable forecasts of hourly pollutant concentrations more than 10 h in advance. The comparative study showed that the neural network model describes the data more accurately than the multiple linear regression-based models and the California line source dispersion model [17].
Brunelli et al. [11] applied an Elman neural network model to predict daily maximum concentrations of SO2, O3, PM10, NO2 and CO as typical pollutants. Barometric pressure, wind direction and intensity, and ambient temperature were used as input data. The experimental outputs showed that the model was suitable, giving correlation coefficients ranging from 0.72 to 0.97 between predicted and measured data. Their experiments also showed better agreement between measured and predicted daily maxima for Elman networks than for multilayer perceptrons (MLP). In the present study, an MLP neural network was employed to predict CO and PM10 concentrations. In addition, for the first time, instability indices were used alongside meteorological parameters to predict pollutant concentrations. First, pollutant concentrations were predicted from meteorological parameters; second, the effect of instability indices on air pollution was investigated.

Area of the study
The city of Tehran covers 730 square kilometers and has a population of about 8.5 million, 2,200,000 residential units [18], 3.6 million cars and 2.5 million motorcycles. It is the largest urban area in Iran (Figure 1). Iran currently faces serious air quality problems owing to overpopulation and various industrial activities. In Tehran, PM10, SO2, NO2, HC, O3 and CO are the major air pollutants, and about 80 to 85% of these pollutants are released by mobile sources [19]. Moreover, Tehran is surrounded by high mountains that prevent the natural movement of air. Because of this accumulation of pollution, Tehran is considered one of the most polluted cities. The pollution results from the large number of motor vehicles, factories, and industrial, residential and governmental units, and from the use of fossil fuels for industry and transportation. Carbon monoxide is the most abundant air pollutant in Tehran, and sulfur compounds are the least abundant [20].

Data collecting
For the present study, daily average PM10 and CO concentrations were obtained from the Sharif Air Quality Station, one of the 22 stations monitored by the government throughout the city of Tehran. Meteorological data were obtained from Mehrabad Station. The data used in the study cover the period from 1 March 2012 to 31 July 2013 (Table 1).

Instability indices
The most important factor in determining the accumulation of pollutants, their distribution and the probability of dust events is the stability or instability of the atmosphere. Among the data available for predicting atmospheric stability or instability are radiosonde data. Using radiosonde data, special indices were used to quantify convection and the movement of air masses. Stability and instability indices were calculated using thermodynamic diagrams called Skew-T diagrams. In general, analysis of Skew-T data provides the following information: an assessment of the stability or instability of the atmosphere, the climatic elements in different layers, and the identification of storms and thunderstorms [21]. In this study, the SI, LI and TT indices were selected from the common indices (CAPE, SI, KI, LI, CIN and FCL) to determine atmospheric stability and instability.

LI (Lifted Index)
The Lifted Index (LI) represents the stability of an air parcel [22] that is raised adiabatically to the 500 hPa pressure level. The temperature of the parcel and the environmental temperature at that level are then compared according to Equation (1):
$$LI = T_{E500} - T_{P500} \qquad (1)$$
where $T_{E500}$ is the environmental temperature at 500 hPa (°C) and $T_{P500}$ is the temperature of the lifted parcel at 500 hPa (°C).

TTI (Total Totals Index)
The Total Totals Index (TTI) assesses instability on the basis of the temperature difference between 850 and 500 hPa and the difference between the dew point at 850 hPa and the temperature at 500 hPa [23]. Values of TTI > 50 indicate the development of thunderstorms (Equation (2)):
$$TTI = (T_{850} - T_{500}) + (T_{d850} - T_{500}) \qquad (2)$$
where $T$ and $T_d$ are the ambient temperature and the dew-point temperature, respectively, at the given level.
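As a minimal sketch of Equation (2), the index can be computed directly from three sounding temperatures. The function name and the sample values are hypothetical and serve only as an illustration; they are not data from this study.

```python
def total_totals(t850, td850, t500):
    """Total Totals Index from temperatures in degrees Celsius.

    t850  : ambient temperature at 850 hPa
    td850 : dew-point temperature at 850 hPa
    t500  : ambient temperature at 500 hPa
    """
    temperature_term = t850 - t500   # first bracket of Equation (2)
    dew_point_term = td850 - t500    # second bracket of Equation (2)
    return temperature_term + dew_point_term


# Hypothetical sounding values, chosen only for illustration.
tti = total_totals(t850=18.0, td850=12.0, t500=-12.0)
print(tti)        # 54.0
print(tti > 50)   # True: above the thunderstorm threshold quoted in the text
```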

SI (Showalter Index)
The Showalter Index (SI) was proposed by Showalter [24], as stated in Equation (3):
$$SI = T_{E500} - T_{P500} \qquad (3)$$
where SI is the Showalter Index (°C), $T_{E500}$ is the ambient temperature at 500 hPa (°C), and $T_{P500}$ is the parcel air temperature at 500 hPa (°C), on the condition that the parcel rises from the 850 hPa level. Values more negative than −8 °C indicate extreme instability.
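The definition above reduces to a single temperature difference, which the following sketch makes explicit. It assumes the parcel temperature at 500 hPa has already been obtained by lifting the parcel from 850 hPa (that lifting calculation is not shown). The function name and sample values are hypothetical.

```python
def showalter_index(te500, tp500):
    """Showalter Index in degrees Celsius.

    te500 : ambient (environmental) temperature at 500 hPa
    tp500 : temperature at 500 hPa of a parcel lifted from 850 hPa
    """
    return te500 - tp500


# Hypothetical values: the lifted parcel is 3 degrees warmer than its environment.
si = showalter_index(te500=-15.0, tp500=-12.0)
print(si)       # -3.0
print(si < 0)   # True: a negative index means the parcel is warmer than the environment
```

A negative value means the lifted parcel is warmer, and therefore more buoyant, than the surrounding air, which is why more negative values correspond to greater instability.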

Neural networks structure
With rapid technological advancement and the ability of neural networks to recognize, classify, generalize and learn, interest in using them has increased. However, they are only inspired by the brain [25]. A neural network is composed of weights that can be adjusted to capture the relationship between the data. The system begins to learn by receiving data and changing the initial state of the network [26]. Climatic systems are commonly non-linear, irregular and highly complex. Examples include the Indian monsoon [27], El Niño events [28] and tropospheric ozone in the United States [29], which should be predicted with a non-linear system such as a neural network. Neural networks are of different types, including the perceptron neural network [30], Hopfield neural network [31], Hamming neural network [32], Kohonen neural network [33], time delay neural network and backpropagation neural network. An artificial neural network has one input layer, a number of hidden layers and an output layer (Figure 2). In this study, the trained network was an MLP with an input layer, three hidden layers with 10 neurons and an output layer for the prediction of PM10 and CO (Figure 3).
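The layered structure described above can be sketched as a plain forward pass. This is an illustration under stated assumptions, not the implementation used in the study: the sketch reads "three hidden layers with 10 neurons" as 10 neurons per hidden layer, assumes four inputs and one output, uses a sigmoid in the hidden layers and a linear output, and draws untrained random weights. All names are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, layers):
    """Propagate one input vector through the network.

    Hidden layers apply the sigmoid; the last layer is left linear so the
    output can take any real value (an assumption for regression).
    """
    activation = list(x)
    for index, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activation = z if is_output else [sigmoid(v) for v in z]
    return activation


rng = random.Random(0)
sizes = [4, 10, 10, 10, 1]   # assumed: 4 inputs, three hidden layers of 10, 1 output
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

# Hypothetical, already-scaled input vector (for example four meteorological variables).
prediction = forward([0.2, 0.5, 0.1, 0.9], layers)
print(len(prediction))   # 1
```

Training would then adjust the weights so that the output matches the measured concentrations; the sketch only shows how an input vector flows through the layers.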
By choosing an appropriate set of weights and transfer functions, a multi-layer perceptron can approximate any smooth, computable function between the input and output vectors. In the present study, a sigmoid function was used as the transfer function (Equation (4)) [34]:
$$f(x) = \frac{1}{1 + e^{-x}} \qquad (4)$$
Multi-layer perceptrons learn through network training, which usually requires a set of training data. The valid data set contained 500 samples, which were divided into three subsets: 350 samples (70% of all samples) were used as the training set, 75 samples (15%) as the validation set and 75 samples (15%) as the testing set. In the present study, the Levenberg-Marquardt algorithm (LMA) was used for the training and learning processes. LMA was selected because of its faster convergence when training medium-sized networks. LMA can be expressed by Equation (5):
$$\Delta w = -\left(J^{T} J + \mu I\right)^{-1} J^{T} e \qquad (5)$$
where $\Delta w$ is the weight update, $J$ is the Jacobian matrix of the errors with respect to each weight, $\mu$ is a scalar, $e$ is the error vector and $I$ is the identity matrix. The parameter $\mu$ determines the weight given to gradient descent [35]. To evaluate the ANN performance, the root mean square error (RMSE) (Equation (6)) and the correlation coefficient (R) (Equation (7)) were used:
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(O_i - P_i\right)^2} \qquad (6)$$
$$R = \frac{\sum_{i=1}^{n} \left(O_i - \bar{O}\right)\left(P_i - \bar{P}\right)}{\sqrt{\sum_{i=1}^{n} \left(O_i - \bar{O}\right)^2 \sum_{i=1}^{n} \left(P_i - \bar{P}\right)^2}} \qquad (7)$$
In these equations, $O_i$ is the observed data, $P_i$ is the predicted data, $n$ is the number of observed data, $\bar{P}$ is the average of the predicted data and $\bar{O}$ is the average of the observed data [36].
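The data partition and the two evaluation statistics described above can be sketched in a few lines. This is a minimal illustration, assuming a random shuffle for the split (the study does not state how samples were assigned) and the standard Pearson form for R. The function names and the small sample arrays are hypothetical.

```python
import math
import random


def split_indices(n_samples, train=0.70, validation=0.15, seed=0):
    """Shuffle sample indices and cut them into training, validation and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train)
    n_val = round(n_samples * validation)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def pearson_r(observed, predicted):
    """Correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    covariance = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    spread_o = sum((o - mean_o) ** 2 for o in observed)
    spread_p = sum((p - mean_p) ** 2 for p in predicted)
    return covariance / math.sqrt(spread_o * spread_p)


train_idx, val_idx, test_idx = split_indices(500)
print(len(train_idx), len(val_idx), len(test_idx))   # 350 75 75

# Hypothetical values: predictions offset from observations by a constant 0.5.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
print(rmse(observed, predicted))        # 0.5
print(pearson_r(observed, predicted))   # 1.0
```

The example also shows why both statistics are reported: a constant offset leaves R at 1.0 while RMSE still registers the error.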

Predicting pollutant concentration in terms of season
Investigation of the variation of pollutant concentrations across seasons showed that the pollution level in different parts of an urbanized area depends on many factors. One of these factors is the presence of numerous line sources (vehicles) and area sources (industrial enterprises) of atmospheric pollution [37]. Another factor is the meteorological parameters. Therefore, this study also examined the effect of meteorological parameters (temperature, wind speed, wind direction and pressure) on air pollution (CO, PM10). CO and PM10 are two important air pollutants in Tehran. They are also reported to contribute to the greenhouse effect that warms the atmosphere [38]. The main anthropogenic sources include the combustion of fossil fuel and biomass, motor vehicles and industrial activities [39]. Figures 4 and 5 compare pollutant concentrations in different seasons. The results showed that the highest CO concentration occurred in cold seasons (R² = 0.78) and the lowest in the warm season (R² = 0.66). The greatest carbon monoxide pollution was observed at the time schools open. The reasons were increased traffic and fossil fuel consumption, as well as poorly functioning vehicles that produce more gas in cold conditions. Moreover, the probability of air inversion in the cold seasons, especially in winter, could be another reason for the relatively steady concentration of carbon monoxide.
The peak in the cold season can be attributed to the use of heating appliances and vehicles, together with meteorological conditions such as cold weather, increasing pressure, stability and temperature inversion. Low temperatures, especially in winter, lead to greater use of heating appliances and more fossil fuel consumption, which in turn increases carbon monoxide concentrations [40].
According to the literature, seasonal variations have been observed in Kuwait for pollutants such as NMHC (non-methane hydrocarbons), CO and NOx, with maximum concentrations in winter and minimum concentrations in summer [41]. A similar situation can be observed in a number of cities in India and China [42,43].
An annual variation in the concentrations of sulphur dioxide and CO with a winter maximum was observed in Venice, Los Angeles, Delhi and some cities in England, Canada and Eastern Europe [44-46]. These observations agree with the present findings.
For PM10, the highest concentration was observed in warm seasons (R² = 0.84), while the lowest was seen in the cold season (R² = 0.68). The PM10 concentration is largely affected by a number of meteorological factors, including wind velocity, sea level pressure, relative humidity and temperature. Possible reasons for the warm-season peak are dust events, reduced rainfall and the resulting lack of particle washout, humidity, and inversion [47].
Increasing air dryness might contribute to an increase in pollutant emissions. Additionally, relative humidity, if accompanied by precipitation and washout, can reduce the amount of air pollutants [48]. Previous studies on Tehran similarly concluded that the highest PM10 concentration occurred in summer, exceeded the standard level and was attributable to a lack of wind [49].
In general, increases in air pollutant concentrations have been reported to be more pronounced in summer, which has been attributed to the low wind speed, high temperature and subsidence inversion that dominate in this season.

Investigating the impact of instability indices on air pollution
Instability is an important factor in the development of severe weather, and instability indices can be a useful tool when applied to a given convective weather situation. Instability indices can therefore be used to predict thunderstorms. A lifted index of 0 °C is sufficient for severe weather development only if the dynamics are very strong [50]. As seen in Figures 6 and 7, the highest CO concentration was observed in cold seasons (R² = 0.81), while the lowest was in the warm season (R² = 0.72). For PM10, the highest concentration was found in warm seasons (R² = 0.84), while the lowest was in the cold season (R² = 0.75). A comparison of the results shown in Figures 4-7 therefore suggests that instability indices do not have a great effect on air pollution. One possible reason is an inappropriate choice of indices, and another is the inadequacy of the data. Some of the most common indices applied in the study of thunderstorms are the K Index [51], Lifted Index [22], Showalter Stability Index [24], Total Totals [23], Severe Weather Threat Index, Convective Available Potential Energy (CAPE) and Convective Inhibition (CIN). In 1960, George [51] surveyed the relationship between KI and thunderstorm probability. According to his research, thunderstorms occur with 50% probability if the KI value is 26 °C or more. In 1957, Galway [22] showed that a negative LI indicates that the boundary layer is unstable. This instability represents an environment in which convection can occur. Galway stated that LI values between −3 and −5 °C represent marginal instability. Showalter [24] presented threshold values of SI for thunderstorms, indicating that SI values from 0 to −4 °C are suitable for moderate thunderstorms. In the present study, instability indices were used along with meteorological parameters to assess the effect of such indices on air pollution.

CONCLUSION
In the present study, seasonal variations of PM10 and CO were investigated. A three-layer perceptron neural network with 10 neurons was used, with a sigmoid transfer function and the Levenberg-Marquardt algorithm, taking meteorological parameters and instability indices during 2012-2013 as the input data. The highest seasonal CO concentration was observed in cold seasons, with a correlation coefficient of R = 0.76 and an RMSE of 0.65, which can be attributed to inversion, the use of heating equipment and increased fossil fuel consumption. The highest PM10 concentration, with a correlation coefficient of R = 0.84 and an RMSE of 21.59, occurred in warm seasons owing to reduced wind speed and rainfall. In the second step, the effect of instability indices on air pollutants was investigated. The highest CO concentration was observed in cold seasons (R² = 0.81), and the lowest in the warm season (R² = 0.72). For PM10, the highest concentration occurred in warm seasons (R² = 0.84) and the lowest in the cold season (R² = 0.75). It was therefore concluded that instability indices do not have a great effect on air pollution.