www.tlab.it
Cluster Analysis
Cluster analysis is a set of statistical techniques that
aim to detect groups of objects with two complementary
features:
A  High internal (within-cluster) homogeneity;
B  High external (between-cluster) heterogeneity.
In statistical terms, feature "A" corresponds to a low
within-cluster variance and feature "B" to a high between-cluster
variance.
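As an illustration of features "A" and "B", the following minimal sketch splits the total sum of squares of a small set of one-dimensional values into its within-cluster and between-cluster parts. The function name `variance_decomposition` and the sample data are assumptions made for this example only; they are not part of T-LAB.

```python
from statistics import fmean


def variance_decomposition(clusters):
    """Return (within, between, total) sums of squares for 1-D clusters."""
    all_points = [x for cluster in clusters for x in cluster]
    grand_mean = fmean(all_points)
    # Feature "A": spread of each object around its own cluster mean.
    within = sum((x - fmean(c)) ** 2 for c in clusters for x in c)
    # Feature "B": spread of the cluster means around the grand mean,
    # weighted by cluster size.
    between = sum(len(c) * (fmean(c) - grand_mean) ** 2 for c in clusters)
    total = sum((x - grand_mean) ** 2 for x in all_points)
    return within, between, total


# Two groupings of the same six (hypothetical) values.
good = [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
bad = [[1.0, 2.0, 12.0], [3.0, 10.0, 11.0]]

for name, grouping in (("good", good), ("bad", bad)):
    within, between, total = variance_decomposition(grouping)
    print(name, within, between, total)
```

The total is the same for both groupings, since the data are the same. Only the share assigned to the within and between parts changes: the "good" grouping has a small within part and a large between part, which is what a clustering algorithm looks for.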
In general, there are two kinds of cluster analysis
techniques:
· Hierarchical methods, whose
algorithms reconstruct the whole hierarchy of the objects under
analysis (the so-called "tree"), either in ascending order
(agglomerative, bottom-up) or in descending order (divisive,
top-down);
· Partitioning methods, where
the user defines beforehand the number of clusters into which the
set of objects under analysis is divided.
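The difference between the two families can be sketched on a handful of one-dimensional values. The code below is a simplified illustration, not the algorithms used by T-LAB: `agglomerative` builds an ascending hierarchy with single linkage, while `kmeans_1d` is a plain K-means that needs the number of clusters `k` in advance. Both function names, the linkage rule, the naive initialisation and the sample data are assumptions of this example.

```python
from statistics import fmean


def agglomerative(points):
    """Ascending hierarchy: merge the two closest clusters until one is left.

    Returns the list of merges as (distance, merged_cluster) pairs.
    """
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = sorted(clusters[i] + clusters[j])
        merges.append((d, merged))
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return merges


def kmeans_1d(points, k, max_iter=100):
    """Partitioning: the number of clusters k is fixed beforehand."""
    centroids = list(points[:k])  # naive initialisation, for illustration
    groups = []
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            groups[nearest].append(p)
        # An empty group keeps its previous centroid.
        new_centroids = [fmean(g) if g else centroids[i]
                         for i, g in enumerate(groups)]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return groups


data = [1.0, 2.0, 6.0, 7.0, 20.0]  # hypothetical values
for distance, cluster in agglomerative(data):
    print(distance, cluster)
print(kmeans_1d(data, 3))
```

The hierarchical function returns every level of the tree, so the number of clusters can be chosen afterwards by deciding where to cut it. The partitioning function returns a single partition for the requested `k`.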
T-LAB uses both
types of algorithms.
In particular:
· the CoWord
Analysis option uses a hierarchical method;
· the Cluster Analysis option allows the
use of three different methods: two hierarchical and one
partitioning;
· the Thematic Analysis of Elementary
Contexts and Thematic Document
Classification options use a bisecting K-means algorithm.
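For readers unfamiliar with the bisecting K-means idea, here is a generic sketch of it: start with one cluster containing all the objects, then repeatedly split one cluster in two with a 2-means step until the requested number of clusters is reached. This is not T-LAB's implementation. The use of one-dimensional values, the choice of splitting the cluster with the largest sum of squared errors, the seeding at the extreme values, and the names `sse`, `two_means` and `bisecting_kmeans` are all assumptions of this example.

```python
from statistics import fmean


def sse(cluster):
    """Sum of squared errors of a cluster around its mean."""
    m = fmean(cluster)
    return sum((x - m) ** 2 for x in cluster)


def two_means(cluster, max_iter=100):
    """Split one 1-D cluster in two with plain 2-means.

    Assumes the cluster holds at least two distinct values.
    """
    centroids = [min(cluster), max(cluster)]  # seed at the extremes
    left, right = [], []
    for _ in range(max_iter):
        left = [x for x in cluster
                if abs(x - centroids[0]) <= abs(x - centroids[1])]
        right = [x for x in cluster
                 if abs(x - centroids[0]) > abs(x - centroids[1])]
        new_centroids = [fmean(left), fmean(right)]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return left, right


def bisecting_kmeans(points, k):
    """Descending procedure: bisect clusters until k are obtained."""
    clusters = [list(points)]
    while len(clusters) < k:
        # Assumed selection rule: split the least homogeneous cluster.
        worst = max(clusters, key=sse)
        if sse(worst) == 0:
            break  # nothing left that can be split further
        clusters.remove(worst)
        clusters.extend(two_means(worst))
    return sorted(sorted(c) for c in clusters)


data = [1.0, 2.0, 6.0, 7.0, 20.0]  # hypothetical values
print(bisecting_kmeans(data, 2))
print(bisecting_kmeans(data, 3))
```

Because each step splits an existing cluster, the procedure combines a partitioning step (2-means) with a descending, tree-like sequence of divisions.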
Some of the publications listed in
the Bibliography provide further
information on the general aspects of the various methods (Bolasco
S., 1999; Lebart L., Morineau A., Piron M., 1995), on the specific
aspects of HDBSCAN (Campello R.J.G.B., Moulavi D.,
Zimek A., Sander J., 2015) and on the bisecting K-means method
(Steinbach M., Karypis G., Kumar V., 2000; Savaresi S.M.,
Boley D.L., 2001).
