Examining Air Pollution Continuity in Tehran Province using Markov Chain Model

The Air Quality Index is a numerical tool used to quantify air pollution levels and classify pollution severity. Understanding the dynamic behavior of air pollution and shifts in pollution intensity plays a vital role in maintaining healthcare system stability. To analyze the probabilistic transitions between pollution severity levels, a Markov chain model was used. This study examined six air pollution states (Clean, Healthy, Unhealthy for Sensitive Groups, Unhealthy, Very Unhealthy, Hazardous) at 12 stations in Tehran's northern, southern, eastern, western, and central regions from 2018 to 2022. Results revealed that the western and southern areas exhibited the highest pollution levels, with over 44% and 34% of instances, respectively, corresponding to unhealthy indices. In contrast, northern Tehran consistently maintained cleaner air quality. Transition probability matrices showed that healthy air quality was the most persistent state across all regions. Transitions from clean or healthy air to very unhealthy or hazardous air were minimal, with hazardous air quality almost absent at most stations, except in the west and south (25% stability). The probability of continued unhealthy air quality in these areas reached 60%, indicating heightened pollution. Overall, the western and southern regions had the highest likelihood of sustained pollution. The probability of continuous clean air in the west was below 30%, while the probability of transitioning from very unhealthy or hazardous air to unhealthy air reached 50%. Conversely, the northern Tehran region exhibited over 40% stability for unhealthy air quality and over 50% for clean and healthy air.

INTRODUCTION
Air pollution is a concerning issue, particularly for individuals residing in large industrialized cities, as it can have significant impacts on both human health and the environment. Rapid urbanization and industrialization worldwide have led to worsening air quality. Efforts to meet the housing, consumption, healthcare, and energy needs of the growing urban population have resulted in numerous challenges, contributing to air pollution in modern cities. A comprehensive study conducted by the World Health Organization (WHO) revealed that nearly 90% of the global population is continuously exposed to air pollution, resulting in more than two million premature deaths annually [1]. Air pollution affects the body, including the respiratory system and the heart. Asthma, bronchiolitis, lung disease, chronic obstructive pulmonary disease (COPD), cardiovascular problems, central nervous system malfunction, and skin conditions are among the most common associated disorders. The challenges posed by outdoor air pollution include public health risks such as cardiovascular disease, respiratory ailments, COPD, and world-class asymmetry [2]. Air quality in residential areas adjacent to industrial regions is of great concern because of its association with particulate matter and toxic gases, which have adverse effects on human health [3]. Given the importance of air to human life and the fact that it is breathed continuously, poor air quality is considered one of the most important environmental problems in many of the world's large cities [4]. The rapid expansion of cities and urban populations, coupled with industrial growth, especially in proximity to large and densely populated cities, has not only escalated urban air pollution but has also triggered significant climatic and environmental changes, gaining substantial attention from researchers and planners [5].
*Corresponding Author Email: mehdi.nadi@gmail.com (M. Nadi)
Urbanization in recent decades has led to a significant increase in the number of cities with a population exceeding one million, rising from one city in 1976 to eight cities in 2017 in Iran alone. This population growth, coupled with inadequate urban planning, has brought about various environmental and human-related challenges, with air pollution being a prominent issue. In Tehran, the capital of Iran, and several other Iranian cities, air pollution has become one of the most tangible environmental challenges [6]. Statistics show that during periods of increased air pollution in Tehran, the number of respiratory patients increases by up to 60%. The main factors associated with the rise in cardiovascular, respiratory, and pulmonary diseases are increased levels of sulfur dioxide, particulate matter, and carbon monoxide [7]. Today, the various consequences of air pollution have necessitated continuous monitoring and control of air quality, making it an undeniable concern in societies. Solving the complex issue of air pollution requires collective efforts and a deep understanding of pollution sources and events [8]. Pollution events, although not precisely predictable, can provide valuable insights when observed continuously. Probability laws play a crucial role in understanding many pollution events, contingent on knowledge of air quality indices and the probability of such events occurring [9]. The Air Quality Index, a vital tool for assessing and quantifying air quality based on five major air pollutants (particulate matter, nitrogen dioxide, ground-level ozone, carbon monoxide, and sulfur dioxide), plays a pivotal role in understanding air quality and the likelihood of pollution events.
The Air Quality Index (AQI) divides air quality into six categories: good, moderate, unhealthy for sensitive groups, unhealthy, very unhealthy, and hazardous. The likelihood of each category depends on the levels of pollutant concentrations [10]. According to probability laws, some random phenomena are more likely to occur when the probability of others is lower. In addition, among n possible states, only one state can occur at a time, and no state is necessarily more likely than the others. Such processes can be modeled using Markov chains [11].
Using Markov chains to understand the continuity of pollution events and the probability of each state can significantly aid in planning for air pollution management and reduction. Several studies have been conducted in this regard. Hoyos et al. [12] investigated ozone and sulfur dioxide pollution over seven years in Mexico City using a Markov chain-based model. The results showed that ozone had a steady distribution, while sulfur dioxide exhibited oscillatory behavior. Jarquin et al. [13] used Markov chain analysis to study the behavior of five pollutants (CO2, NOx, SO2, O3, PM10) for five possible states, ranging from good to very bad, in Mexico City. The results indicated that state changes over time exhibit nonlinear behavior, and even when pollutant concentrations increase progressively, the probability of worsening air quality is very low because it remains in an orderly state. Holmes and Hassini [14] examined the random behavior of AQHI risk categories at 34 air monitoring stations in Ontario over five years from 2015 to 2019. They fitted discrete-time Markov chains using three AQHI risk categories (low risk, moderate risk, high risk) as states for the transition probabilities. Sarvi et al. [15] assessed the number of days on which PM2.5 exceeded the limit in Tehran from October 2010 to December 2015. They used a Poisson hidden Markov model, considering various hidden states, to predict the number of occurrences. The results revealed that the number of days exceeding the limit (predicted average from 6.87 to 11.39 days) is relatively high for sensitive individuals based on PM2.5 air quality. Alyousifi et al. [16] applied a Markov Chain-based Fuzzy State (MCFS) model using triangular fuzzy numbers to analyze uncertainty in air pollution events and describe the transitional behavior of air pollution. Their results indicated that MCFS can effectively model air quality indices and may be a superior option for predicting air pollution. Zakaria et al. [17] conducted research on predicting future air quality using a Markov chain model, introducing it as a model for long-term pollution distribution assessment. Suhaimi et al. [18] used Multiple Imputation by Markov Chain Monte Carlo (MCMC) to impute air quality data and found that the technique provides a suitable fit and fills in missing data effectively.
Given the significance of urban air quality in Tehran and the substantial impact of air pollution on human health and the environment, and given that no prior statistical studies have assessed the continuity of air pollution across air quality indices or determined the transition probabilities between pollution severity classes for Tehran or other regions of Iran, this research aims to fill this gap. It does so by analyzing the frequency of the air pollution severity classes and applying a Markov chain model to examine the persistence of these severity levels and the likelihood of consecutive days with differing air quality in Tehran. The results can support efforts to improve air quality, safeguard human health, and protect the environment.

Study area
Tehran, the capital of Iran, is situated in the heart of the country with a population of approximately 8.5 million residents. This population can swell to over 12.5 million during the day, as people from surrounding cities commute to Tehran for work. There are over 17 million daily car trips in Tehran, many of them made in vehicles with outdated technology. Consequently, Tehran ranks among the most polluted cities in the world. The city's topography and climate exacerbate pollution issues.
Tehran is located at a high altitude and is surrounded by the Alborz mountain range, which traps polluted air. Temperature inversions, especially in winter months, further hinder pollutant dispersion. Recent studies indicate that achieving clean air in Tehran is a complex challenge, as factors like rapid population growth, industrial development, urbanization, and increased fuel consumption exert significant pressure on air quality [19].

Required data
The ground-based air quality index (AQI) data were obtained from the Tehran Air Quality Control Company's website. There are 26 active monitoring stations in Tehran, measuring concentrations of air pollutants including ozone, particulate matter with diameters less than 10 and 2.5 micrometers, carbon monoxide, nitrogen dioxide, and sulfur dioxide. These monitoring stations determine the air quality index based on pollutant levels [20].
In this research, because Markov analysis requires daily time series of the AQI, data from 12 pollution monitoring stations in different areas of Tehran, including the north, south, east, west, and central regions, were used. The data were collected over a 5-year period (2018-2022) from stations with minimal missing data. The coordinates of these monitoring stations are presented in Table 1, and the distribution of these stations is illustrated in Figure 1.

Air quality index (AQI)
The air quality index is a numerical scale used to report the daily air quality in terms of its impact on human health and the environment. The daily results of this index provide an estimate of air pollution levels that people rely on. An increase in the air pollution index signifies an increase in air pollution, posing a serious threat to human health. In most cases, the air quality index indicates how clean or polluted the air around us is. The primary focus of the air pollution index is on the health effects that may follow from breathing polluted air.

Markov chain model
Markov chain models can be used for simulating discrete time series. The method is based on the temporal correlation between the values of a variable, meaning that the occurrence of a specific state of a random variable, such as very unhealthy air quality, depends on its state in previous time steps. Each Markov chain model is determined by the number of states of the variable in question and its order [22]. For example, in the analysis of air quality with a Markov chain, the number of model states can be set equal to the number of pollution levels. The order of the model is the number of immediately preceding time steps on which the current state depends, which can be 1, 2, or more. A first-order Markov chain has the following form:

$$P(X_t = j \mid X_{t-1} = i, X_{t-2}, \ldots, X_1) = P(X_t = j \mid X_{t-1} = i)$$

The above equation indicates that the probability of the variable X taking a given state at time t depends only on the state at time t-1 (in a second-order chain, on the states at times t-1 and t-2), not on the path that the time series has taken to reach time t. In the following, some characteristics of first-order Markov chains, including transition probability matrices and the calculation of their steady-state probabilities, will be discussed [23]. Each element of the transition probability matrix in a first-order Markov chain model is denoted as Pij, representing the probability of transitioning from state i to state j. This probability is determined using the following equation, where n_ij is the number of observed transitions from state i to state j:

$$P_{ij} = \frac{n_{ij}}{\sum_{j} n_{ij}}$$

In this study, six states (equivalent to six air quality categories) were chosen for both i and j, as presented in Table 2. The transition probability matrix for a first-order six-state Markov chain has the following form:

$$P = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{16} \\ P_{21} & P_{22} & \cdots & P_{26} \\ \vdots & \vdots & \ddots & \vdots \\ P_{61} & P_{62} & \cdots & P_{66} \end{bmatrix}$$

In the above matrix, for example, P14 is the probability that unhealthy air occurs after clean air. The sum of probabilities in each row equals 1, meaning:

$$\sum_{j=1}^{6} P_{ij} = 1, \qquad i = 1, \ldots, 6$$

To determine the time dependence (first and second orders) or the lack of time dependence (zero order) in the discrete time series of AQI, it is necessary to use an appropriate criterion. For this purpose, the Bayesian Information Criterion (BIC) is commonly used [24]. In its standard form for Markov chains, the criterion is:

$$\mathrm{BIC}(m) = -2 L_m + S^m (S - 1) \ln n$$

In which S represents the number of states, m is the order of the model, n is the sample size, and the parameter L_m is determined from the following relationships [25]:

$$L_0 = \sum_{j} n_j \ln P_j, \qquad L_1 = \sum_{i} \sum_{j} n_{ij} \ln P_{ij}, \qquad L_2 = \sum_{h} \sum_{i} \sum_{j} n_{hij} \ln P_{hij}$$

In which L_m represents the log-likelihood of the model of order m. The model with the lowest BIC value is selected as the optimal model. If the BIC values are equal for several different models, the model with the lowest order is selected [26].
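The counting estimate of the transition probabilities described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `transition_matrix` and the short sequence `days` are hypothetical, and the sequence is made-up data rather than observations from the study.

```python
from collections import Counter

def transition_matrix(states, n_states=6):
    """Estimate first-order transition probabilities p_ij = n_ij / sum_j n_ij.

    `states` is a list of daily class numbers in 1..n_states.
    Rows for classes that are never left (no observed departures) stay all zero.
    """
    # n_ij: number of times a day in class i is followed by a day in class j
    counts = Counter(zip(states[:-1], states[1:]))
    matrix = []
    for i in range(1, n_states + 1):
        row_total = sum(counts[(i, j)] for j in range(1, n_states + 1))
        row = [counts[(i, j)] / row_total if row_total else 0.0
               for j in range(1, n_states + 1)]
        matrix.append(row)
    return matrix

# Hypothetical 10-day sequence of AQI classes (not data from the study).
days = [2, 2, 3, 2, 2, 2, 3, 3, 4, 3]
P = transition_matrix(days)
print(P[1])  # row for class 2: probabilities of moving from class 2 to classes 1..6
```

Each non-empty row of the result sums to 1, matching the row-sum property of the transition probability matrix.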

Description of air quality | Range of AQI | Levels of concern | Class number
Air quality is satisfactory, and air pollution poses little or no risk. | 0-50 | Good | 1
Air quality is acceptable. However, there may be a risk for some people, particularly those who are unusually sensitive to air pollution. | 51-100 | Moderate | 2
Members of sensitive groups may experience health effects. The general public is less likely to be affected. | 101-150 | Unhealthy for Sensitive Groups | 3
Some members of the general public may experience health effects; members of sensitive groups may experience more serious health effects. | 151-200 | Unhealthy | 4
Health alert: The risk of health effects is increased for everyone. | 201-300 | Very Unhealthy | 5
Health warning of emergency conditions: everyone is more likely to be affected. | 301 and higher | Hazardous | 6
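The class boundaries in the table above map directly onto a lookup. The following Python sketch shows one way to convert a daily AQI value into its class number and level of concern; the names `aqi_class`, `UPPER_BOUNDS`, and `LEVELS` are hypothetical, and assigning fractional values to the next class up is an assumption made here for illustration.

```python
import bisect

# Upper AQI bound of classes 1-5, taken from the table above; values over 300 are class 6.
UPPER_BOUNDS = [50, 100, 150, 200, 300]
LEVELS = ["Good", "Moderate", "Unhealthy for Sensitive Groups",
          "Unhealthy", "Very Unhealthy", "Hazardous"]

def aqi_class(aqi):
    """Return (class_number, level_of_concern) for a non-negative AQI value."""
    if aqi < 0:
        raise ValueError("AQI must be non-negative")
    # Index of the first upper bound that is >= aqi (5 if aqi exceeds all bounds)
    index = bisect.bisect_left(UPPER_BOUNDS, aqi)
    return index + 1, LEVELS[index]

print(aqi_class(135))
```

Applying this function to each day of a station's record yields the sequence of class numbers that a Markov chain analysis takes as input.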

Selection of the appropriate order of the Markov chain
In this research, daily data for five years (2018-2022) from 12 air quality monitoring stations in five regions (East, West, North, South, and Central Tehran) were used to select the appropriate order for the Markov chain model. The comparison of orders 0, 1, and 2 using the BIC index in Table 3 showed that the first-order model had the lowest BIC value. Therefore, the best Markov model for all the pollution monitoring stations in Tehran is a first-order Markov chain model with six states.
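The order comparison can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: it assumes the widely used form BIC(m) = -2 L_m + S^m (S - 1) ln(n), where L_m is the log-likelihood of the fitted order-m chain, S the number of states, and n the number of days. The function names and the test sequences are hypothetical.

```python
import math
from collections import Counter

def log_likelihood(states, order):
    """Log-likelihood of an order-`order` Markov chain fitted by relative frequencies."""
    # Count each (history, next state) pair; history is the previous `order` states.
    pair_counts = Counter(
        (tuple(states[t - order:t]), states[t]) for t in range(order, len(states))
    )
    history_counts = Counter()
    for (history, _), n in pair_counts.items():
        history_counts[history] += n
    # Sum of n * ln(p) over observed pairs, with p = n / (count of that history)
    return sum(n * math.log(n / history_counts[history])
               for (history, _), n in pair_counts.items())

def bic(states, order, n_states=6):
    """Assumed form: BIC(m) = -2 L_m + S^m (S - 1) ln(n), n = number of days."""
    penalty = n_states ** order * (n_states - 1) * math.log(len(states))
    return -2.0 * log_likelihood(states, order) + penalty

def best_order(states, orders=(0, 1, 2), n_states=6):
    """Return the order with the lowest BIC; ties go to the lower order."""
    return min(orders, key=lambda m: (bic(states, m, n_states), m))

# Hypothetical two-state sequence that alternates strictly, so yesterday fully
# determines today and a first-order model should win.
alternating = [1, 2] * 100
print(best_order(alternating, n_states=2))
```

The penalty term grows quickly with the order (S^m), which is why a higher-order model is selected only when it improves the likelihood substantially.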

Frequency analysis of air quality index
Based on the frequency analysis of the air quality index in Tehran, as provided in Table 4, the western regions are the most heavily polluted areas, displaying the highest percentages of both healthy and very unhealthy air quality categories. At the Shad Abad and Shahrdari 21 Municipality stations, the clean and healthy categories account for 55.9% and 51.4% of days, respectively, and more than 44% of days are associated with unhealthy air pollution conditions in these areas.
The second most polluted area in Tehran is the southern region, where the unhealthy air quality index percentages at the Shahr Rey and Shahrdari 19 stations are 34.91% and 31.34%, respectively.
In terms of the pollution index percentage, the central region of Tehran, particularly the Tarbiat Modares station, surpasses even the southern region, with the pollution index percentage exceeding 40%. However, at the Setad Bohran station, this figure is 31.5%.
The eastern areas of Tehran exhibit better air quality compared to the western and southern areas. The combined percentages of unhealthy air quality categories at the Golbarg and Masoudie stations are 22.2% and 16.81%, respectively. Conversely, the Pirouzi region is more polluted than the southern regions, with the pollution index percentage at the Pirouzi station exceeding 39%, indicating higher pollution levels compared to other eastern regions of Tehran. Clean and healthy air conditions, as indicated by the pollution index frequency, are prevalent in the northern region of Tehran. Here, less than 29% of days in the year experience polluted conditions, and on more than 70% of days in the year, the air quality is considered clean and healthy.
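The percentages discussed above are simple relative frequencies of the daily classes. A minimal Python sketch follows, with hypothetical function names and a made-up ten-day sample. Treating classes 3 to 6 together as "polluted" is an assumption made here to mirror the combined percentages quoted in the text.

```python
from collections import Counter

def class_frequencies(states, n_states=6):
    """Return the percentage of days spent in each class 1..n_states."""
    counts = Counter(states)
    total = len(states)
    return {k: 100.0 * counts[k] / total for k in range(1, n_states + 1)}

def polluted_share(frequencies, first_polluted_class=3):
    """Sum the percentages of classes 3-6 (assumed here to define 'polluted' days)."""
    return sum(pct for k, pct in frequencies.items() if k >= first_polluted_class)

# Hypothetical sample of daily AQI classes (not data from the study).
days = [1, 2, 2, 3, 4, 2, 2, 3, 2, 1]
freq = class_frequencies(days)
print(freq[2], polluted_share(freq))
```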

Probabilistic analysis of daily transitions with different AQI
Given that the first-order Markov model provides the best fit for the Air Quality Index (AQI) at the majority of stations and time intervals, we selected the first-order Markov chain model with six states for analyzing air quality classes. The transition probability matrix for this model is a square 6 × 6 matrix, with each element representing the probability of transitioning from one specific air quality class to another.
To gain a deeper understanding of air quality in various regions of Tehran, including the east, west, north, south, and central areas, we conducted probabilistic analyses at multiple monitoring stations.

Central Region of Tehran:
The results presented in Table 5 reveal that the central regions of Tehran exhibit the highest level of stability and consistency for AQI class 2, which signifies healthy air. The probability of transitioning from one day with healthy air to the next day with healthy air stands at 75%. Additionally, there is a probability exceeding 40% of the air quality remaining unhealthy for sensitive groups (AQI class 3).
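One way to read the 75% figure above is offered here as an illustration, not as a result reported in the study. Under the first-order assumption, the length of an unbroken spell in state i follows a geometric distribution governed by the diagonal element P_ii. The symbols D_i (spell length in days) and k are introduced only for this example.

```latex
\begin{align*}
P(D_i = k) &= P_{ii}^{\,k-1}\,(1 - P_{ii}), \qquad k = 1, 2, \ldots \\
E[D_i] &= \frac{1}{1 - P_{ii}} \\
E[D_2] &= \frac{1}{1 - 0.75} = 4 \text{ days} \quad \text{for } P_{22} = 0.75
\end{align*}
```

Read this way, a 75% persistence probability for healthy air corresponds to spells of healthy air lasting four days on average.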
Table 5 results indicate that there is no possibility of transitioning from clean air to unhealthy, very unhealthy, or hazardous air, and vice versa. Nevertheless, days with a high pollution index are likely to be followed by unhealthy air or air that is unhealthy for sensitive groups. Furthermore, the stability of the classes denoting unhealthy air for sensitive groups and unhealthy air for all groups surpasses the 50% mark. Conversely, no stability is observed for the very unhealthy class, while the stability of the hazardous class stands at 50%.
Table 5 further illustrates that after days with unhealthy, very unhealthy, or hazardous air, there is a probability that the air quality will remain unhealthy or shift to unhealthy for sensitive groups on the following day.

Eastern Region of Tehran:
The main diagonal of the transition probability matrix represents the continuity and stability of each air pollution class. Examining the main diagonal elements of the transition matrices for stations in the eastern part of Tehran (as presented in Table 6) reveals noteworthy findings.
Firstly, it is evident that the Golbarg and Masoudie stations exhibit high probabilities of maintaining clean and healthy air, with probabilities exceeding 55% and 75%, respectively. The Pirouzi station also displays substantial continuity in clean and particularly healthy air, with probabilities of 29% and 77%. These observations underscore the persistence of clean and, notably, healthy air quality at these locations.
Golbarg station indicates a probability of more than 45% for days characterized by an unhealthy air quality index. In contrast, the Masoudie station shows no instances of very unhealthy or hazardous air pollution persisting. However, there is a higher probability of air quality transitioning from these categories to unhealthy conditions and to conditions unhealthy for sensitive groups. Consequently, the likelihood of a day with severe air pollution transitioning to a day with clean air is low, with a probability of less than 50%. Conversely, the reverse scenario, where healthy conditions lead to unhealthy conditions, has not been observed.
Furthermore, it appears that the Masoudie station consistently experiences healthier air conditions compared to the Golbarg and Pirouzi stations.
These insights provide valuable information for understanding the air quality dynamics in the eastern region of Tehran, contributing to informed decision-making in environmental management and public health.

Northern Tehran Region:
The outcomes derived from the probability matrices detailing air pollution level transitions in the northern regions of Tehran are presented in Table 7. A comprehensive analysis of these matrices reveals several significant patterns and trends.
Firstly, it is apparent that the probability of air quality remaining unhealthy in northern Tehran exceeds 40%, while the clean and healthy air quality classes have a higher probability of persisting, surpassing the 50% mark. This highlights the region's capacity for sustaining clean and healthy air quality conditions.
The probability of transitioning from days with hazardous pollution levels to air that is unhealthy for sensitive groups is notably higher than that of other transitions. Conversely, there is no possibility of transitioning to a day with clean or healthy air following a hazardous pollution day. In essence, once hazardous air pollution conditions are experienced on a particular day, there is no possibility of their recurrence on the subsequent day, and clean and healthy air conditions will not prevail either.
These findings provide valuable insights into the dynamics of air quality in the northern region of Tehran, emphasizing the challenges associated with addressing and mitigating air pollution, especially when pollution levels reach hazardous thresholds. This knowledge contributes to the development of targeted environmental and public health strategies.

Southern Tehran Region:
In the southern Tehran region, as in other areas, the probability of transitioning from the first four levels of air pollution (clean, healthy, unhealthy for sensitive groups, and unhealthy) to very unhealthy and hazardous conditions is nearly negligible. However, a notable distinction emerges at the Shahrdari 19 station, where the probabilities of transitioning to hazardous conditions, unhealthy conditions, and conditions unhealthy for sensitive groups are 25%, 60%, and 53%, respectively. These probabilities exceed those observed at the central, eastern, and northern stations. Moreover, there is approximately a 50% likelihood of transitioning from hazardous and very unhealthy air pollution conditions to unhealthy conditions. This observation underscores the persistence of air pollution over consecutive days in the southern Tehran region. Comparing the air pollution transition probability matrices in the southern Tehran region, as presented in Table 8, reveals that the stability of clean air is lower in the Shahr Rey area than at the Shahrdari 19 station.
This analysis provides valuable insights into the unique dynamics of air quality in the southern region of Tehran, emphasizing the challenges posed by the persistence of air pollution, especially in comparison to other areas of the city.

Western Tehran Region:
The probability matrices depicting air pollution transitions in the western region of Tehran are detailed in Table 9, revealing patterns that distinguish this area from others in the city.
The persistence of air pollution in the western region of Tehran, particularly unhealthy air, surpasses that of other areas in the city. Moreover, there is a 50% probability that after very unhealthy or hazardous air pollution conditions on a given day, unhealthy air conditions will prevail on the subsequent day. This observation underscores a concerning trend of consecutive days with air pollution in the western area of Tehran.
In contrast, the probability of sustained clean air in this region is considerably low, measuring below 30%. Additionally, there is no possibility of transitioning to clean air from conditions that are unhealthy, whether for the general population or for sensitive groups. The probabilities of transitioning from AQI classes 3 and 4 to healthy air conditions stand at less than 40% and 10%, respectively. These findings shed light on the distinctive air quality dynamics in the western region of Tehran, emphasizing the challenges associated with mitigating air pollution and the need for targeted interventions to improve air quality and public health in this area.
Comparing the results with previous research, Rahimi et al. [27] noted that CO pollutant levels were highest at the Fatemi, Bazar, and Aghdasie stations throughout most months of the year, with the Fatemi station showing the highest probability of CO pollution occurrence. Conversely, the Shahr Rey station exhibited the lowest likelihood of CO pollutant persistence.
Building upon these insights, Holmes and Hassini [14] employed the Air Quality Health Index (AQHI) in eastern Canada, highlighting that urban areas spent more time in high-risk categories and tended to remain in these categories for extended durations before transitioning.
The examination of various orders of the Markov chain model revealed that pollution intensity over two consecutive days was not independent, suggesting a connection between air quality on consecutive days. The first-order Markov chain was selected, implying that air quality on one day is influenced by the preceding day, while the air quality two days earlier does not significantly affect the pollution intensity of the current day.
The general examination of transition probability matrices demonstrated that, overall, as pollution intensity increased, the stability of air quality decreased. In most stations, no stability probability was observed for the hazardous air quality category. Notably, the probability of two consecutive days of healthy air quality was higher than that of consecutive days with clean air quality.
To further understand the AQI patterns, transition probability matrices for the air pollution monitoring stations were estimated. The results revealed that unhealthy air quality categories at each monitoring station exhibited stability, with gradual and low-probability transitions between unhealthy and healthy categories. Sudden and significant transitions from very unhealthy or hazardous air quality to clean and healthy conditions, and vice versa, were not anticipated.
In a similar study, Zakaria et al. [17] employed the Markov chain to predict air quality in Sarawak, Malaysia. Their findings indicated a higher continuity of good and hazardous air quality conditions compared to the other four conditions. Additionally, the probability of good air quality conditions in the long term exceeded that of moderate and unhealthy states. Very unhealthy and hazardous conditions showed the lowest probability of continuation.
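Long-term probabilities of the kind mentioned above correspond to the steady-state (stationary) distribution of a transition matrix, the vector pi satisfying pi = pi P. The Python sketch below approximates it by repeatedly applying P to a uniform starting vector. It assumes an irreducible, aperiodic chain; the function name and the three-state matrix are hypothetical and are not taken from any of the cited studies.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi P by repeatedly applying P to a uniform start."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        # One step of the chain: nxt[j] = sum_i pi[i] * P[i][j]
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("did not converge; the chain may be periodic")

# Hypothetical three-state transition matrix (each row sums to 1).
P_example = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi = stationary_distribution(P_example)
print([round(p, 4) for p in pi])
```

Each entry of the result can be read as the long-run share of days spent in the corresponding state, independent of the starting state.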

CONCLUSION
In this study, an extensive analysis of the Air Quality Index (AQI) across six categories (clean, healthy, unhealthy for sensitive groups, unhealthy, very unhealthy, and hazardous) was conducted. The analysis focused on the probability of consecutive-day transitions for 12 monitoring stations situated across different regions of Tehran, encompassing the north, south, east, west, and central areas. The study spanned a five-year period from 2018 to 2022. The findings revealed that the northern region of Tehran consistently exhibited the best air quality, while the western and southern regions consistently ranked as having the poorest air quality.
The probability of consecutive days with clean air quality in western Tehran was found to be less than 30%, whereas in northern Tehran, there was a probability of more than 60% for consecutive days with clean and healthy air quality. Importantly, hazardous air pollution conditions, which are harmful to all living beings and associated with severe consequences, did not persist for more than two consecutive days in the north, east, and central regions of Tehran. However, there was a possibility of such conditions persisting in the southern and western regions, particularly in the Shahrdari 19 and 21 areas.
These findings provide valuable insights into the dynamics of air quality in Tehran, underlining the importance of region-specific air quality management strategies and interventions.

Table 1. Geographical coordinates of ground pollution measuring stations

Calculations of the air pollution index are based on major air pollutants, including particulate matter, ground-level ozone, sulfur dioxide, nitrogen dioxide, and carbon monoxide. Particulate matter and ozone are the most hazardous to human health and the environment. For each of these categories of air pollutants, different countries define their own air quality index according to their national air quality standards for public health protection, as summarized in Table 2 [21]. Health effects may be experienced hours or days after breathing polluted air.

Figure 1. Distribution of air pollution measurement stations in Tehran

Table 3. BIC index values of Markov chain models of different orders in Tehran air quality monitoring stations

Table 4. Frequency (%) analysis of air quality of studied stations in Tehran city