www.tlab.it
Markov chains
A Markov chain (named after the Russian
mathematician Andrei Andreevich Markov) consists of a succession (or sequence) of events, generally
referred to as states, characterized by
two properties:
· the set of events and their
possible outcomes is finite;
· the outcome of each event depends only (or at most) on the
immediately preceding event.
As a consequence, a
probability value corresponds to every transition from one event to the next.
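In the usual notation, the second property and the resulting transition probabilities can be written as follows. The symbols are not part of the original text and are introduced only for illustration: X_t is the event at position t, S is the finite set of states, and p_ij is the probability of moving from state i to state j, assumed here not to change along the sequence.

```latex
P(X_{t+1} = j \mid X_t = i, X_{t-1}, \dots, X_1) = P(X_{t+1} = j \mid X_t = i) = p_{ij},
\qquad \sum_{j \in S} p_{ij} = 1 \quad \text{for every } i \in S.
```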
In the sciences, the Markov chain model is used to analyse
sequences of economic, biological, physical and other events. In
linguistic studies it is applied to the possible
combinations of the various analysis units along the syntagmatic axis
(one item after the other).
In TLAB, Markov chain analysis
is applied to two types of sequences:
· those concerning the relationships
between lexical units (words, lemmas or categories) present in the
corpus under analysis;
· those present in external files prepared by the user.
In both cases, square tables are first
constructed in which the transition occurrences are recorded,
that is, the number of times one analysis unit precedes (or follows)
another. The transition occurrences are then transformed
into probability values (see the following images):
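The two steps described above (counting transitions in a square table, then converting the counts into probabilities) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TLAB's actual procedure: the function names and the toy sequence are hypothetical, and each row is assumed to be divided by its own total so that the probabilities of leaving a given unit sum to 1.

```python
from collections import Counter


def transition_counts(sequence):
    """Build a square table counting how often each unit is immediately followed by another."""
    states = sorted(set(sequence))
    # Count every pair of adjacent units (predecessor, successor).
    pairs = Counter(zip(sequence, sequence[1:]))
    table = [[pairs[(a, b)] for b in states] for a in states]
    return states, table


def transition_probabilities(counts):
    """Divide each row by its total (assumed normalisation); empty rows stay at 0."""
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs


# Hypothetical toy sequence of analysis units.
sequence = ["a", "b", "a", "c", "a", "b"]
states, counts = transition_counts(sequence)
probs = transition_probabilities(counts)

for state, row in zip(states, probs):
    print(state, [round(p, 2) for p in row])
```

In the toy sequence, "a" is followed twice by "b" and once by "c", so the row for "a" holds the counts 0, 2, 1, which become the probabilities 0, 2/3 and 1/3.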
For further information, see Sequence Analysis.
