Cluster Analysis
Cluster analysis is a set of statistical techniques that
aim to detect groups of objects with two complementary
characteristics:
A - High internal (within cluster) homogeneity;
B - High external (between cluster) heterogeneity.
In statistical terms, characteristics "A" and "B"
correspond, respectively, to low within-cluster variance and
high between-cluster variance.
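As an illustration of characteristics "A" and "B", the following minimal sketch computes the within-cluster and between-cluster sums of squares for a tiny, made-up data set. The function names (mean, sq_dist, within_between) and the sample points are assumptions introduced for this example only; they are not part of T-LAB.

```python
def mean(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def within_between(clusters):
    """Return (within, between) sums of squares for a list of clusters."""
    all_points = [p for c in clusters for p in c]
    grand = mean(all_points)
    # Within: squared distance of every point from its own cluster centroid.
    within = sum(sq_dist(p, mean(c)) for c in clusters for p in c)
    # Between: squared distance of each centroid from the grand mean,
    # weighted by the cluster size.
    between = sum(len(c) * sq_dist(mean(c), grand) for c in clusters)
    return within, between


# Hypothetical data: two tight groups that lie far apart.
clusters = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
within, between = within_between(clusters)

# The total sum of squares splits into the two components.
all_points = [p for c in clusters for p in c]
total = sum(sq_dist(p, mean(all_points)) for p in all_points)
print(within, between, total)
```

A good partition keeps the first value small relative to the second, since the two always add up to the total.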
In general, there are two kinds of cluster analysis techniques:
- Hierarchical methods, whose
algorithms rebuild the whole hierarchy of the objects under
analysis (the so-called "tree"), either in ascending order or
in descending order;
- Partitioning methods, where
the user defines beforehand the number of clusters into which the
set of objects under analysis is divided.
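The ascending (agglomerative) case can be sketched as follows. This is a minimal illustration that assumes one-dimensional values and single linkage (distance between the closest members of two clusters); the function name agglomerate and the sample values are hypothetical, and the code is not T-LAB's implementation.

```python
def agglomerate(values):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [[v] for v in values]  # start with one cluster per object
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = sorted(clusters[i] + clusters[j])
        history.append((d, merged))  # one level of the "tree"
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return history


history = agglomerate([1.0, 2.0, 6.0, 7.0, 20.0])
for distance, members in history:
    print(distance, members)
```

Each entry of the history is one level of the tree. Cutting the tree at a chosen level yields a partition, whereas a partitioning method requires the number of clusters up front.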
T-LAB uses both types of algorithms. In particular:
· the Co-Word Analysis option uses a hierarchical method;
· the Cluster Analysis option allows the
use of three different methods: two hierarchical and one
· the Thematic Analysis of Elementary Contexts and Thematic Document
Classification options use a bisecting K-means algorithm.
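The general idea of bisecting K-means can be sketched as follows: start with all objects in one cluster, then repeatedly split one cluster in two with a plain K-means (k = 2) until the requested number of clusters is reached. The sketch rests on several assumptions that the text above does not specify: the cluster to split is the one with the largest sum of squared errors, seeding is deterministic (the first point and the point farthest from it), and all function names and sample data are hypothetical. It is not T-LAB's implementation.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points):
    """Sum of squared distances of the points from their centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)


def two_means(points, iterations=20):
    """Split points in two with plain K-means (k = 2)."""
    # Assumed seeding: the first point and the point farthest from it.
    seeds = [points[0], max(points, key=lambda p: sq_dist(p, points[0]))]
    for _ in range(iterations):
        halves = ([], [])
        for p in points:
            nearest = 0 if sq_dist(p, seeds[0]) <= sq_dist(p, seeds[1]) else 1
            halves[nearest].append(p)
        if not halves[0] or not halves[1]:
            break  # degenerate split, stop refining
        new_seeds = [centroid(halves[0]), centroid(halves[1])]
        if new_seeds == seeds:
            break  # assignments are stable
        seeds = new_seeds
    return halves


def bisecting_kmeans(points, k):
    """Start from one cluster and bisect until k clusters exist."""
    clusters = [list(points)]
    while len(clusters) < k:
        # Assumed rule: split the cluster with the largest SSE.
        worst = max(clusters, key=sse)
        if sse(worst) == 0:
            break  # remaining clusters cannot be split further
        clusters.remove(worst)
        clusters.extend(two_means(worst))
    return clusters


# Hypothetical data: three pairs of nearby points.
data = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0), (20.0, 0.0), (20.0, 1.0)]
clusters = bisecting_kmeans(data, 3)
for c in clusters:
    print(c)
```

Other rules for choosing which cluster to split (for example, the largest cluster) are equally common; the choice here is only for illustration.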
Some of the publications listed in the Bibliography provide further
information on the general aspects of the various methods (Bolasco
S., 1999; Lebart L., A. Morineau, M. Piron, 1995), on the specific
aspects of the Hdbscan method (Campello R. J. G. B., Moulavi D.,
Zimek A. & Sander J., 2015), and on the bisecting K-means method
(Steinbach, M., G. Karypis, V. Kumar, 2000; Savaresi S.M., D.L.
Boley, 2001).