The work presented here is divided into two major parts. The first part gives an abbreviated and informal discussion of the theoretical framework underpinning the methodology applied in the second part. It is based on material put together in Roqué (1996), which in turn was based on material incidentally derived in Roqué (1982). The topic is time-varying Markov chains, for which the literature is not extensive. The reader is referred to the classical and very important work of Hajnal (1956, 1958). There is also a very good treatment of the subject matter in Howard (1971). Non-time-varying Markov chains, on the other hand, have generated an overwhelming amount of literature and are treated by almost every major contemporary author. Among these, some of the best are Ross (1983, 1985), Çinlar (1975), and Karlin and Taylor (1975).
All Markov chains, time-varying or not, have in common that they satisfy the Markov property. A Stochastic Process is a collection of random variables indexed by time, and a discrete time Stochastic Process {Xn : n = 0, 1, 2, …} defined on a discrete state space E is a Markov chain if:
$$P(X_{n+1} = j \mid X_0, X_1, \ldots, X_{n-1}, X_n = i) = P(X_{n+1} = j \mid X_n = i) \qquad (1)$$
for all i and j in E and all times n ≥ 0.
This property simply states that whatever state the process will move into depends only on where the process currently is and not on any prior history. If the state space E is finite, the Markov chain is also called finite. Our attention in this paper is restricted to finite Markov chains. If the cardinality of the set E is the integer v, and if the transition probabilities (as in equation (1)) do not vary with time, then the finite Markov chain has a well defined stochastic square matrix of dimension (v x v) which is usually labeled P and is called the matrix of transition probabilities. This matrix contains an orderly arrangement of the transition probabilities and the entries in each row must add up to one.
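The two requirements just stated (nonnegative entries, rows adding up to one) and the Markov property itself can be illustrated with a minimal Python sketch. The function names and the example matrix are hypothetical, chosen only for illustration:

```python
import random

def is_stochastic(P, tol=1e-12):
    """True if P is square, has nonnegative entries, and every row sums to one."""
    v = len(P)
    return all(
        len(row) == v
        and all(p >= 0 for p in row)
        and abs(sum(row) - 1.0) <= tol
        for row in P
    )

def step(state, P, rng):
    """Draw the next state using only the current state (the Markov property)."""
    return rng.choices(range(len(P)), weights=P[state])[0]

def simulate(P, start, n_steps, seed=0):
    """Generate one realization X0, X1, ..., Xn of the chain."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(step(path[-1], P, rng))
    return path

# Hypothetical 2 x 2 transition matrix (not taken from the data in this paper).
P = [[0.9, 0.1],
     [0.4, 0.6]]
path = simulate(P, start=0, n_steps=50)
```

Note that `step` never looks at anything but the current state, which is exactly what equation (1) demands.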
The second part uses the theory described in the first part to update the very successful work of Gabriel and Neumann (1957, 1962) in predicting rain probabilities and analyzing weather cycles in Israel. Their work was amply documented in Cox and Miller (1968) under the title “Rainfall in Tel Aviv.” They used the state space E = {0, 1}, where the two states represented “wet day” or “dry day.” The 2 x 2 transition matrix was estimated using 27 years of accumulated data on daily rainfall for the months of the rainy season. Their Markov chain was not time varying. It was used to predict the probability of wet and dry days and to analyze weather cycles during the months of the rainy season. They obtained probability distributions for the duration of both dry and wet spells. Here their effort is emulated using instead a time-varying Markov chain that permits estimating year round probabilities for both the Wet and the Dry seasons, for both the short and the long terms.
ERGODICITY
Given the Stochastic Process {Xn : n = 0, 1, 2, …} (here a finite Markov chain), its ergodic properties are those that relate information derived from one realization of the process (usually a time average) to information derived from the entire ensemble of realizations (usually in the form of an expected value). One may borrow and adapt the discussion in Gross and Harris (1974) to illustrate the concept of ergodicity. Assume the following limits exist for every integer k ≥ 1:
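One reading of these limits that is consistent with the discussion that follows is sketched here. The assignment of the labels is an assumption (a time average along one realization, a limiting moment, and a Cesaro average of moments, respectively), not a quotation from Gross and Harris (1974):

```latex
a(k) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} X_i^{k}, \qquad
b(k) = \lim_{n \to \infty} E\!\left[X_n^{k}\right], \qquad
c(k) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} E\!\left[X_i^{k}\right]
```

Under this reading, a(k) is computed from a single sample path, while b(k) and c(k) are ensemble quantities; b(k) can fail to exist for a process whose distribution keeps cycling, even when c(k) exists.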
Then the process is ergodic with respect to all its moments (i.e., ergodic in distribution function) if a(k) = b(k) = c(k) for every integer k ≥ 1.
These processes, when ergodic, become independent of time and possess a stationary or steady state probability distribution. It can be shown that they also become independent of any initial distribution. In short, any one realization of the process yields all relevant information about the process behavior in the long run. The time varying process discussed here, however, will only exhibit weak ergodicity in distribution; that is, it is only the case that a(k) = c(k) for every integer k ≥ 1.
The reason the limit b(k) does not figure in the definition of weak ergodicity in distribution function is that the time varying process will not always possess a steady state distribution. Instead, there will exist a unique finite length stationary cycle of probability distributions (vectors) that becomes independent of time and independent of any initial distribution. Any one realization will still yield all relevant information about the long-run behavior of the process. The process becomes independent of any initial distribution and, except for the repetitive nature of the cycle, independent of time as well.
Let $\Pi^{(k)}$, k = 1, 2, …, N, be such a stationary cycle of probability distributions (vectors). Then the following proposition becomes relevant:
Proposition: For the repeating stationary cycle, the Cesaro sums converge (coordinate-wise) to a vector, $\Pi''$, which itself forms a probability distribution.
The proof of this proposition is very simple and is omitted here.
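For reference, the argument can be sketched as follows, assuming the distributions are extended periodically so that $\Pi^{(t+N)} = \Pi^{(t)}$. The symbols q and r are introduced only for this sketch: any horizon n is written as n = qN + r with 0 ≤ r < N.

```latex
\frac{1}{n} \sum_{t=1}^{n} \Pi^{(t)}
  = \frac{q}{n} \sum_{k=1}^{N} \Pi^{(k)} + \frac{1}{n} \sum_{k=1}^{r} \Pi^{(k)}
  \;\longrightarrow\; \frac{1}{N} \sum_{k=1}^{N} \Pi^{(k)} = \Pi''
  \qquad \text{as } n \to \infty
```

The limit holds because q/n tends to 1/N and the second term has fewer than N bounded summands divided by n. Being an equal-weight average of N probability vectors, $\Pi''$ has nonnegative components that add up to one.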
THE PROCESS
The time varying Markov chain considered here is defined in the state space E = {0, 1}. According to Cox & Miller (1968), this is the smallest non-trivial state space. Any pair of ergodic (i.e., regular) non time varying Markov chains defined on this state space will suffice to construct the time varying process. The two Markov chains will be represented by the two 2 x 2 transition matrices P(1) and P(2). Matrix P(1) will have eigenvalues 1 and α, and steady state distribution (vector) Π(1). The second matrix P(2) will have eigenvalues 1 and β, and steady state distribution (vector) Π(2). Regularity requires both |α| < 1 and |β| < 1. One may then define a cyclic process with each repeating cycle of constant length (n + m) time epochs. In every cycle, as it repeats itself, transitions are governed by the chain P(1) during the first n epochs, immediately followed by m time epochs where transitions are governed by the chain P(2). Hence transitions vary in time as a consequence of alternating the two transition probability matrices.
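Fortunately, for 2 x 2 stochastic matrices, the property of regularity is closed under matrix multiplication. The second eigenvalue of a 2 x 2 stochastic matrix equals its determinant, so the product of the matrices over one full cycle has eigenvalues 1 and $\alpha^{n}\beta^{m}$, with $|\alpha^{n}\beta^{m}| < 1$. One has a process here that easily overcomes the objections outlined in Hajnal (1958). The time varying process will also continue to satisfy the Markov property, and the Markov chain process may be denoted {Yn : n = 0, 1, 2, …}.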
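The construction can be made concrete with a short Python sketch. The two matrices, the cycle lengths, and all function names below are hypothetical and serve only as an illustration. The sketch forms the one-cycle product, checks that its second eigenvalue is the product of the individual second eigenvalues raised to n and m, and then walks through the stationary cycle of distributions.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def vecmat(x, P):
    """Row vector times matrix: advance a distribution by one epoch."""
    return [x[0] * P[0][j] + x[1] * P[1][j] for j in range(2)]

def second_eigenvalue(P):
    """A 2 x 2 stochastic matrix has eigenvalues 1 and det(P)."""
    return P[0][0] * P[1][1] - P[0][1] * P[1][0]

def steady_state(P):
    """Solve pi = pi P for a regular 2 x 2 stochastic matrix."""
    a, b = P[0][1], P[1][0]
    return [b / (a + b), a / (a + b)]

def cycle_matrix(P1, P2, n, m):
    """Transition matrix over one full cycle: n epochs of P1, then m of P2."""
    C = [[1.0, 0.0], [0.0, 1.0]]
    for P in [P1] * n + [P2] * m:
        C = matmul(C, P)
    return C

def stationary_cycle(P1, P2, n, m):
    """Distributions at epochs 1..n+m once cyclic stationarity has set in."""
    x = steady_state(cycle_matrix(P1, P2, n, m))  # distribution at the cycle boundary
    cycle = []
    for P in [P1] * n + [P2] * m:
        x = vecmat(x, P)
        cycle.append(x)
    return cycle

# Hypothetical chains, chosen only for illustration.
P1 = [[0.9, 0.1], [0.4, 0.6]]   # second eigenvalue alpha = 0.5
P2 = [[0.5, 0.5], [0.2, 0.8]]   # second eigenvalue beta = 0.3
n, m = 3, 2
C = cycle_matrix(P1, P2, n, m)
cycle = stationary_cycle(P1, P2, n, m)
```

The last entry of `cycle` coincides with the boundary distribution the cycle started from, which is what makes the cycle repeat.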
Showing weak ergodicity in distribution function for this process in the binary state space becomes even easier with the following observations. First, all moments for this process are identical, since Y raised to any power k ≥ 1 equals Y when Y takes only the values 0 and 1. Second, all sums of the indexed Y random variables count the number of transitions into state 1, and hence Cesaro sums represent the proportion of time (or of transitions) in (or into) state 1. Finally, all expected values of the indexed Y random variables simply recover the probability of being in state 1.
The key to showing weak ergodicity in distribution requires borrowing and adapting some concepts from the theory of Simulation (see Law & Kelton (1982)). One must assume the time index has gotten very large and stationarity has set in. At some point after stationarity has set in, one begins looking at the sample paths of the chain. Alternatively, one may conceive of a process that simply starts at cyclic stationarity. From the theory of Simulation then, the end of each cycle is a Regeneration Point (actually all epochs in the cycle may be viewed as Regeneration Points). This simply means the process probabilistically restarts itself at the end of each cycle for the one realization being observed.
A simple thought experiment should convince the reader of this. Assume one has a countable infinity of probabilists. Each probabilist arrives at the process sequentially one at a time just before the next cycle begins. Each probabilist is given information on the state probability distribution at the beginning of the cycle, the two chains, and the rules of the process. Each probabilist, aided by the Markov property, is asked to derive the probability distributions of all finite trajectories of the sample paths of the chain from that moment on. Clearly, all probabilists, possessing identical information, will derive identical results.
One may then define for the jth cycle
It is then possible to treat $S_1, S_2, \ldots$ as independent, identically distributed (i.i.d.) random variables.
Since the cycle length is completely deterministic and given by the constant (n + m), one may then let for the jth cycle
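One way to make the two per-cycle quantities concrete, offered as a sketch under assumptions rather than as the original definitions, is to take $S_j$ as the number of epochs spent in state 1 during the jth cycle and $\bar{Y}_j$ (a name introduced here) as the corresponding proportion. Under the i.i.d. treatment stated above, the strong law of large numbers would then link the time average along one realization to the Cesaro average of the stationary cycle:

```latex
\begin{aligned}
S_j &= \sum_{i=1}^{n+m} Y_{(j-1)(n+m)+i}, \qquad
\bar{Y}_j = \frac{S_j}{n+m}, \qquad j = 1, 2, \ldots \\
\frac{1}{J} \sum_{j=1}^{J} \bar{Y}_j
  &\;\longrightarrow\; E\!\left[\bar{Y}_1\right]
  = \frac{1}{n+m} \sum_{k=1}^{n+m} \Pi_1^{(k)} = \Pi''_1
  \qquad \text{as } J \to \infty
\end{aligned}
```

Here $\Pi_1^{(k)}$ denotes the stationary probability of state 1 at the kth epoch of the cycle. Since all moments of a binary variable coincide, this single statement covers every k in the definition of weak ergodicity in distribution.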
EXTENSIONS
In order to extend these results to larger state spaces, some repeating sequence of factors in an infinite product of matrices is a requirement. Even if all matrices are regular, if they go on forever in the infinite product with no discernible pattern and with different steady state vectors, the sequence of transient responses will go on indefinitely and the process will not cycle. The work in Howard (1971) strongly supports this conjecture. A repeating sequence under the right conditions, as evidenced in the cases considered here, ensures cyclic behavior.
The case of 2 x 2 regular matrices is the only exception to the objections outlined in Hajnal (1958) to the approach using characteristic roots. The objection is simply that the product of two regular matrices may not be regular. For larger state spaces some kind of restriction may therefore be necessary. One that comes to mind right away is due to Hajnal (1958) and may be called “The Hajnal Qualification.” This qualification would require that one consider only repeating sequences of factor matrices that always begin with the same “scrambling” matrix (see Hajnal (1958)) to ensure regularity throughout the infinite product of matrices. A scrambling matrix is any regular stochastic matrix such that for any two of its rows there is always at least one column with nonzero entries for both rows. If the repeating pattern of factors in the infinite product of matrices is set for almost all matrices except, say, the first one, then it may be possible to start the infinite product with an arbitrary starting scrambling matrix. The transient effect due to this starting matrix should vanish in the long run and regularity would be ensured throughout the entire product.
To show that scrambling matrices are not necessary for ergodicity and hence weak ergodicity, simply alternate any non-scrambling regular matrix with the identity matrix.
It should be noted that the two state processes considered here meet the criterion for weak ergodicity given by Hajnal (1956, 1958), but Hajnal’s criterion does not imply Cesaro sums convergence (see Hajnal (1956)). This convergence, on the other hand, is a requirement for weak ergodicity in distribution function. So the requirements for weak ergodicity in distribution are in fact more restrictive. One of the restrictions may very well be cyclic structure and behavior.
Another conjecture one may make is that the cycling processes, that in the limit may be exhibiting a new kind of stationarity, have autocovariance functions that are strictly a function of within cycle position and between cycle lags.
It is worth mentioning that the processes described here exhibit all kinds of peculiar new behavior. For example, there may be stochastically induced completely deterministic oscillation between two states (0 and 1). If one begins to shrink the time interval, at some point the oscillating entity will appear to be in both states simultaneously. Would a particle oscillating in such a fashion through time exhibit wave-like characteristics? The same phenomenon shows there is a way in and out of absorption into a state, and in fact this is what generates the oscillation. All of this is generated by alternating a pair of distinct regular Markov chains without any absorbing state in them. Their behavior, and the oscillation, become independent of time and independent of initial conditions.
In more specific terms, this oscillation occurs when one lets n = 1 and m = 1 in the cycle, and the product of the two distinct alternating 2 x 2 regular matrices, in whichever order multiplied, yields a regular 2 x 2 matrix with one absorbing state in it.
The oscillation or periodic phenomenon can easily be extended to larger state spaces. It requires a product of three matrices for the 3 x 3 case, of four matrices for the 4 x 4 case, etc. The columns of the respective identity matrices alternate as the stationary cycle of probability distributions. As such, this constitutes a special class of time varying Markov chains. To show this, all that is needed is the work in Howard (1971). An example of the 2 x 2 case (period = 2) is:
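A hypothetical pair of matrices that exhibits this behavior is sketched below in Python. It is an illustrative assumption, not necessarily the example originally displayed; the parameters p and q and the function names are introduced only for this sketch. Both matrices are regular and have no absorbing state, yet each product has one, and the alternating process ends up switching deterministically between the two states.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def vecmat(x, P):
    """Row vector times matrix: advance a distribution by one epoch."""
    return [x[0] * P[0][j] + x[1] * P[1][j] for j in range(2)]

# Hypothetical parameters with 0 < p, q < 1 (second eigenvalues are -p and -q).
p, q = 0.5, 0.5
P1 = [[0.0, 1.0], [p, 1.0 - p]]      # from state 0 the chain always moves to 1
P2 = [[1.0 - q, q], [1.0, 0.0]]      # from state 1 the chain always moves to 0

def alternate(x, cycles):
    """Apply P1 then P2 repeatedly (n = m = 1), recording every distribution."""
    history = []
    for _ in range(cycles):
        x = vecmat(x, P1)
        history.append(x)
        x = vecmat(x, P2)
        history.append(x)
    return history

# Start from the uniform distribution; any other start leads to the same cycle.
history = alternate([0.5, 0.5], cycles=60)
after_P1, after_P2 = history[-2], history[-1]
```

In the long run `after_P1` approaches (0, 1) and `after_P2` approaches (1, 0), the two columns of the 2 x 2 identity matrix. The first row of `matmul(P1, P2)` is (1, 0) and the second row of `matmul(P2, P1)` is (0, 1), which are the absorbing states of the two products.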
Although one can learn and infer from these oscillatory finite Markov chains, no matter how small the transition interval selected, it is possible to select a smaller observation interval and observe the actual state. The Quantum problem is just the opposite: no matter how small one makes the observation interval, the transition interval can and will become even smaller, making it impossible to observe the actual state. Continuous time analogs to these models, and perhaps even continuous state spaces, are therefore necessary.
For this oscillatory class of processes, if the periodicity is v, then the 1 x v frequency vector $\Pi''$ has every component equal to (1/v). As v gets larger, this vector approaches an infinite size vector whose every component is equal to zero.
This large class of finite time varying Markov chains also has the property that the processes can start with the highest levels of entropy as initial conditions and terminate at cyclic stationarity with absolute zero entropy. One wonders if neural pathways, for example, process information this way.
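The entropy claim can be checked numerically on a small case. The sketch below assumes Shannon entropy in bits as the measure and reuses a hypothetical oscillating pair of 2 x 2 matrices; neither choice comes from the text. Starting from the uniform distribution (maximum entropy for two states), the entropy of the state distribution decays toward zero as the deterministic oscillation sets in.

```python
from math import log2

def entropy_bits(dist):
    """Shannon entropy of a probability vector, in bits (0 log 0 taken as 0)."""
    return -sum(p * log2(p) for p in dist if p > 0)

def vecmat(x, P):
    """Row vector times matrix: advance a distribution by one epoch."""
    return [x[0] * P[0][j] + x[1] * P[1][j] for j in range(2)]

# Hypothetical oscillating pair (regular, no absorbing state in either matrix).
P1 = [[0.0, 1.0], [0.5, 0.5]]
P2 = [[0.5, 0.5], [1.0, 0.0]]

x = [0.5, 0.5]                       # maximum-entropy initial distribution
entropies = [entropy_bits(x)]
for _ in range(60):                  # 60 cycles of P1 followed by P2
    for P in (P1, P2):
        x = vecmat(x, P)
        entropies.append(entropy_bits(x))
```

The first entry of `entropies` is one bit and the last is numerically indistinguishable from zero.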
THE APPLICATION
The theory thus far described can now be put to work in reviving and bringing up to date the work of Gabriel and Neumann (1957, 1962). These authors were very successful in studying weather cycles and generating rain probabilities during the rainy (or wet) season in Tel Aviv. Here the analysis will be conducted for the City of Miami, which encompasses Little Havana. The author is indebted to William R. E. Locke of the National Weather Service and to Father Pedro Cartaya, S. J., of the Belen Jesuit Prep. School Observatory, who were instrumental in obtaining the data for this study.
The states of nature for the purpose of this analysis are only two. State 0 represents a dry day, that is, a day for which no rain (or only a trace amount) was recorded and state 1 represents a wet day, that is, a day for which at least .01 inches of rain was recorded. The recording station was the weather station at Miami International Airport. Amounts of rain on any given day less than .01 inches are considered a trace amount and are labeled as such by the recording station. Such days are considered dry days in this study.
The data consisted of 10 years of daily rainfall records comprising the years 1984 through 1993, a total of 3653 days. The year was divided into a Dry season comprising the 181 days from November 1 to April 30 (182 in leap years) and a Wet season comprising the 184 days from May 1 to October 31. The data consisted then of 1813 Dry season days and 1840 Wet season days.
There were 429 wet days and 1384 dry days in the Dry season, and there were 879 wet days and 961 dry days in the Wet season. These data yielded the following immediate information: it rains an average of 42.9 days in the Dry season, with an average rainfall per season of 15.01 inches. The average rainfall per day of precipitation in the Dry season was .35 inches with a standard deviation of .69 inches. For the Wet season, it rains an average of 87.9 days per season, with an average rainfall of 41.31 inches per season. The average rainfall per day of precipitation in the Wet season was .47 inches with a standard deviation of .69 inches. The average total yearly precipitation was then 56.32 inches of rain.
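To construct the time varying Stochastic Process, two Markov chain transition probability matrices were estimated using relative frequencies (Maximum Likelihood Estimators). The data results for the Dry season, with rows and columns ordered as state 0 (dry) and state 1 (wet), were the following:
$$P_{\text{Dry}} = \begin{pmatrix} .813 & .187 \\ .608 & .392 \end{pmatrix}$$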
The steady state or long term probability of rain on any given day of the Dry season was .235, that is, it tends to rain on slightly less than one fourth of the days. Similarly, the data results for the Wet season, with the same ordering of states, were the following:
$$P_{\text{Wet}} = \begin{pmatrix} .647 & .353 \\ .383 & .617 \end{pmatrix}$$
The steady state or long term probability of rain on any given day of the Wet season was .480, that is, it tends to rain on approximately one half of the days.
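The Markov chains can be used to generate probabilities outright. For example, if on any given day of the Dry season there is an estimate of a 50% chance of rain, then the probability that it rains the following day can be obtained from the following product:
$$\begin{pmatrix} .5 & .5 \end{pmatrix} \begin{pmatrix} .813 & .187 \\ .608 & .392 \end{pmatrix} = \begin{pmatrix} .7105 & .2895 \end{pmatrix}$$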
It yields a subsequent probability of rain of .29 (and .71 of a dry day).
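The one-step forecast and the two steady state values can be reproduced with a few lines of Python. In this sketch the matrix entries are taken to be the stay and switch probabilities implied by the spell-length distributions reported in this section (.813 and .392 for the Dry season, .647 and .617 for the Wet season); the constant and function names are hypothetical.

```python
# Rows and columns ordered as state 0 (dry), state 1 (wet).
P_DRY = [[0.813, 0.187],
         [0.608, 0.392]]
P_WET = [[0.647, 0.353],
         [0.383, 0.617]]

def vecmat(x, P):
    """Row vector times matrix: advance a distribution by one day."""
    return [x[0] * P[0][j] + x[1] * P[1][j] for j in range(2)]

def steady_state(P):
    """Long term distribution of a regular 2 x 2 chain."""
    a, b = P[0][1], P[1][0]          # dry-to-wet and wet-to-dry probabilities
    return [b / (a + b), a / (a + b)]

# A 50% chance of rain today, during the Dry season.
tomorrow = vecmat([0.5, 0.5], P_DRY)

rain_dry_season = steady_state(P_DRY)[1]
rain_wet_season = steady_state(P_WET)[1]
```

Rounded to the precision used in the text, `tomorrow[1]` is .29, `rain_dry_season` is .235, and `rain_wet_season` is .480.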
One can use the Markov chains to analyze the weather cycles within each season (see Cox & Miller (1968)). Let the duration of a Dry season dry spell (in days) be denoted by the random variable DSDS. Then the probability of a dry spell during the Dry season lasting j days is given by:
$$P(\mathrm{DSDS} = j) = (.813)^{j-1}(.187), \qquad j = 1, 2, \ldots$$
The mean of this geometric distribution is 1/(.187), yielding an average duration of a Dry season dry spell of 5.35 days. The variance is $(.813)/(.187)^2$, which yields a standard deviation of 4.82 days. Similarly, the duration of a Dry season wet spell in days, as a random variable denoted DSWS, has the following probability distribution:
$$P(\mathrm{DSWS} = j) = (.392)^{j-1}(.608), \qquad j = 1, 2, \ldots$$
As in Gabriel & Neumann (1957), one can define a Dry season weather cycle (DSWC) as a dry spell followed by a wet spell. They showed the pertinent random variables may be assumed statistically independent. Then,
DSWC = DSDS + DSWS
The weather cycle random variable is the convolution of two distinct and independent geometric random variables.
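The spell statistics and the convolution can be computed directly. The Python sketch below uses the Dry season parameters quoted above (.187 for ending a dry spell, .608 for ending a wet spell); the function names are hypothetical, and the independence of the two spells is taken as given.

```python
from math import sqrt

def geom_pmf(j, p):
    """P(spell lasts exactly j days) when each day ends the spell with probability p."""
    return (1.0 - p) ** (j - 1) * p

def geom_mean(p):
    return 1.0 / p

def geom_sd(p):
    return sqrt(1.0 - p) / p

def cycle_pmf(c, p_dry_ends, p_wet_ends):
    """P(dry spell + wet spell = c days), c >= 2, by direct convolution."""
    return sum(geom_pmf(j, p_dry_ends) * geom_pmf(c - j, p_wet_ends)
               for j in range(1, c))

P_DRY_ENDS, P_WET_ENDS = 0.187, 0.608   # Dry season values from the text

dsds_mean, dsds_sd = geom_mean(P_DRY_ENDS), geom_sd(P_DRY_ENDS)

# Means add; variances add under the independence assumption.
dswc_mean = geom_mean(P_DRY_ENDS) + geom_mean(P_WET_ENDS)
dswc_sd = sqrt(geom_sd(P_DRY_ENDS) ** 2 + geom_sd(P_WET_ENDS) ** 2)

# Probability that a Dry season weather cycle lasts at most one week.
within_week = sum(cycle_pmf(c, P_DRY_ENDS, P_WET_ENDS) for c in range(2, 8))
```

The same functions apply to the Wet season by substituting its two parameters.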
One can repeat this analysis for the Wet season by defining the random variables Wet season dry spell (WSDS), Wet season wet spell (WSWS), and Wet season weather cycle (WSWC), in which case:
$$P(\mathrm{WSDS} = j) = (.647)^{j-1}(.353), \qquad j = 1, 2, \ldots$$
$$P(\mathrm{WSWS} = j) = (.617)^{j-1}(.383), \qquad j = 1, 2, \ldots$$
and
WSWC = WSDS + WSWS.
Tables 1 and 2 summarize basic information concerning all the weather cycle random variables. They give the mean and standard deviation of all pertinent random variables.
Finally, one can obtain the results for the time varying Stochastic Process. The stationary or long term probability of rain for the jth day of the Dry season is given by:
$$\Pi_1^{(j)} = .235 + (.205)^{j}(.245), \qquad j = 1, 2, \ldots, 181$$
The stationary or long term probability of rain for the kth day of the Wet season is given by:
$$\Pi_1^{(k)} = .480 - (.264)^{k}(.245), \qquad k = 1, 2, \ldots, 184$$
Both probabilities settle down very quickly to the respective steady state values for each season but one can identify the boundary interaction between the alternating seasons as a stationary residual transient response that somewhat smooths out the change of seasons. During the first few days of the Dry season, the probabilities of rain are slightly higher than the normal tendency for the season and during the first few days of the Wet season the probabilities of rain are slightly lower than the normal tendency for the season.
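How quickly the boundary transient dies out can be seen by evaluating the two stationary formulas for the first few days of each season. The Python sketch below hard-codes the rounded constants quoted in the text (.235, .480, .205, .264, and the gap .245 between the two steady state values); the function names are hypothetical.

```python
def dry_season_rain_prob(j):
    """Stationary probability of rain on day j of the Dry season (j = 1..181)."""
    return 0.235 + (0.205 ** j) * 0.245

def wet_season_rain_prob(k):
    """Stationary probability of rain on day k of the Wet season (k = 1..184)."""
    return 0.480 - (0.264 ** k) * 0.245

first_dry_days = [round(dry_season_rain_prob(j), 3) for j in range(1, 6)]
first_wet_days = [round(wet_season_rain_prob(k), 3) for k in range(1, 6)]
```

The Dry season values decrease toward .235 and the Wet season values increase toward .480; the gap shrinks by a factor of .205 or .264 per day, so it is negligible after about a week.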
Tables 3 and 4 summarize the stationary probabilities for the respective seasons.
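Since it tends to rain slightly less than one fourth of the days during the Dry season and approximately one half of the days during the Wet season, and since both seasons last approximately the same number of days, it stands to reason that it should rain slightly over one third of the days throughout the year. This can be confirmed by calculating $\Pi''_1$, the Cesaro average of the stationary daily rain probabilities over the 365-day cycle:
$$\Pi''_1 = \frac{1}{365}\left[\sum_{j=1}^{181} \Pi_1^{(j)} + \sum_{k=1}^{184} \Pi_1^{(k)}\right] \approx .358$$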
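As a cross-check, the year-round figure can also be obtained by iterating the two seasonal chains directly rather than using closed forms. The Python sketch below assumes the seasonal matrices implied by the spell-length distributions in this section and a 365-day cycle of 181 Dry days followed by 184 Wet days; the names and the choice of five warm-up years are hypothetical.

```python
# Rows and columns ordered as state 0 (dry), state 1 (wet).
P_DRY = [[0.813, 0.187], [0.608, 0.392]]
P_WET = [[0.647, 0.353], [0.383, 0.617]]
N_DRY, N_WET = 181, 184

def vecmat(x, P):
    """Row vector times matrix: advance a distribution by one day."""
    return [x[0] * P[0][j] + x[1] * P[1][j] for j in range(2)]

def stationary_year(warmup_years=5):
    """Daily rain probabilities over one year after the initial condition washes out."""
    x = [0.5, 0.5]                   # arbitrary initial distribution
    year = []
    for _ in range(warmup_years):
        year = []
        for P in [P_DRY] * N_DRY + [P_WET] * N_WET:
            x = vecmat(x, P)
            year.append(x[1])        # probability of a wet day
    return year

year = stationary_year()
cesaro_rain = sum(year) / len(year)  # year-round long term frequency of wet days
observed_rain = (429 + 879) / 3653   # wet days over all days in the record
```

Both `cesaro_rain` and `observed_rain` come out near .358, slightly over one third, so the stationary cycle average agrees with the raw relative frequency in the ten years of data.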
CONCLUSION
The methodology of Gabriel and Neumann yields a wealth of information on year round climatic conditions. This information should prove of value to planners in both the agricultural and tourism sectors of the local economy. It should prove useful to the researchers of the Mobile Irrigation Laboratory and to the agronomists of the Homestead Agricultural Center. Even though the data reflect values calculated for the City of Miami, results should not be too dissimilar for regions throughout Miami Dade County. At the very worst, the methodology may be re-implemented for other regions.
The methodology implemented here is applicable in any country of the world that is subject to a subtropical climate with well defined Dry and Wet seasons. Cuba and Puerto Rico, for example, are two such nations.