
Cluster Analysis

N.B.: The pictures shown in this section were obtained with a previous version of T-LAB and look slightly different in T-LAB Plus. In addition: a) there is a new button (TREE MAP PREVIEW) that allows the user to create dynamic charts in HTML format; b) the DENDROGRAM button has been replaced by the Graph Maker tool.

This T-LAB tool uses the results of a previous Correspondence Analysis; in particular, the computation uses the coordinates of the objects (lexical units or context units) on the first factorial axes (up to a maximum of 10).

The user can choose one of three clustering techniques:

a) hierarchical (Ward method);
b) K-means (MacQueen method);
c) Kohonen (neuron grid).

The first two (a, b) allow the user to explore, through tables and graphs, solutions with 3 to 20 clusters, while the third (c) groups the analysis units (lexical units only) within grids of various sizes (min 3 x 3, max 9 x 9).
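As an illustration of how a hierarchical procedure can yield every partition from 3 to 20 clusters in a single run, here is a minimal sketch of Ward agglomeration in plain Python. It is not T-LAB's code: the function names (`ward_partitions`, `ward_cost`) and the toy coordinates are hypothetical, and the input is assumed to be a list of object coordinates on the retained factorial axes.

```python
from itertools import combinations

def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2

def ward_partitions(points, k_min=3, k_max=20):
    """Return {k: clusters as lists of point indices} for k_min <= k <= k_max."""
    clusters = [[i] for i in range(len(points))]  # start from singletons
    partitions = {}
    while True:
        k = len(clusters)
        if k_min <= k <= k_max:
            partitions[k] = [sorted(c) for c in clusters]
        if k <= max(k_min, 1):
            break

        def cost(pair):
            a, b = pair
            return ward_cost([points[i] for i in clusters[a]],
                             [points[i] for i in clusters[b]])

        # Merge the pair of clusters whose fusion costs the least.
        a, b = min(combinations(range(k), 2), key=cost)
        clusters[a] += clusters[b]
        del clusters[b]
    return partitions

# Toy coordinates on two axes (invented for illustration only).
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
parts = ward_partitions(points)
print(parts[3])
```

The exhaustive pair search is only suitable for small examples; a production implementation would update the merge costs incrementally.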

N.B.: When the hierarchical method is selected, T-LAB enables an option (see the 'Refine' button below) that allows the user to combine the Ward and K-Means methods.
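One generic way to combine the two methods is to use a hierarchical partition as the starting point for K-means reassignment. The sketch below assumes that approach; how the 'Refine' option is actually implemented is not described in this section. It uses batch (Lloyd-style) reassignment rather than MacQueen's incremental update, and `refine_with_kmeans` is a hypothetical name.

```python
def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def refine_with_kmeans(points, initial_clusters, max_iter=100):
    """Refine a partition (non-empty lists of point indices) by K-means steps."""
    clusters = [sorted(c) for c in initial_clusters]
    for _ in range(max_iter):
        centers = [centroid([points[i] for i in c]) for c in clusters]
        new_clusters = [[] for _ in centers]
        for i, p in enumerate(points):
            nearest = min(range(len(centers)),
                          key=lambda k: sq_dist(p, centers[k]))
            new_clusters[nearest].append(i)
        # Drop clusters that lost all their members.
        new_clusters = [c for c in new_clusters if c]
        if new_clusters == clusters:  # assignments are stable
            break
        clusters = new_clusters
    return clusters

# Toy data: point 3 starts in the wrong cluster and is moved by the refinement.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
start = [[0, 1, 2, 3], [4, 5]]
print(refine_with_kmeans(points, start))
```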

A brief description of the three techniques is available in the glossary of this manual.

When processing ends, T-LAB displays graphs and tables.

The graphs represent the clusters in the space detected by the correspondence analysis (see below).


To explore the various combinations of the factorial axes, select them in the appropriate boxes ("X Axis", "Y Axis").

In the case of hierarchical clustering, the user can easily explore the different partitions through graphs and tables.

Dendrograms, pie charts and bar charts allow us to check the characteristics of each partition.

Bar charts allow us to check the relationships between clusters and variables.

Two kinds of tables are available:

(A) if the clustered objects are lexical units, for each of them (and for each cluster) the respective occurrences ('OCC') and distances ('DIST') from the centroids are displayed; moreover, for each variable which is significantly associated with the cluster examined, the respective Test Value is displayed.
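To make the 'OCC' and 'DIST' columns of table (A) concrete, the following sketch builds such rows for one cluster. It rests on several assumptions not stated in this section: the distance is Euclidean, it is measured from the centroid of the unit's own cluster, and the centroid is an unweighted mean. The function name `cluster_table` and the words, counts and coordinates are invented for illustration.

```python
import math

def cluster_table(members):
    """members: {word: (occurrences, coordinates)} -> rows (word, OCC, DIST)."""
    coords = [xy for _, xy in members.values()]
    # Unweighted centroid of the cluster (an assumption, see the text above).
    center = tuple(sum(col) / len(coords) for col in zip(*coords))
    rows = [(word, occ, round(math.dist(xy, center), 3))
            for word, (occ, xy) in members.items()]
    # Units closest to the centroid come first.
    return sorted(rows, key=lambda row: row[2])

# Invented cluster of three lexical units on two factorial axes.
cluster = {
    "home": (12, (0.0, 0.0)),
    "family": (8, (0.0, 2.0)),
    "work": (5, (3.0, 1.0)),
}
for word, occ, dist in cluster_table(cluster):
    print(f"{word:10s} OCC={occ:3d} DIST={dist:.3f}")
```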

(B) if the clustered objects are elementary contexts, the characteristics of each cluster (lexical units and variables) are described by means of the same method used in Thematic Analysis of Elementary Contexts (see below).

In the case of analyses performed using the hierarchical or K-means methods, T-LAB allows the user to view and to export a file (see "HTML Output" key) in which the characteristics of the clusters and some measures relating to the quality of the partition are reported.
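A common measure of partition quality is the share of total variance (inertia) explained by the clusters, that is, between-cluster variance divided by total variance. The sketch below computes that ratio; whether T-LAB's HTML report contains exactly this measure is an assumption, and `partition_quality` is a hypothetical name.

```python
def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def partition_quality(points, clusters):
    """Between-cluster sum of squares divided by total sum of squares."""
    grand = centroid(points)
    total = sum(sq_dist(p, grand) for p in points)
    within = 0.0
    for members in clusters:
        center = centroid([points[i] for i in members])
        within += sum(sq_dist(points[i], center) for i in members)
    # total = within + between, so the ratio lies between 0 and 1.
    return (total - within) / total

# Toy data: two well-separated clusters give a ratio close to 1.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(partition_quality(points, [[0, 1], [2, 3]]))
```

Values near 1 indicate compact, well-separated clusters; comparing the ratio across solutions with different numbers of clusters is one way to choose among partitions.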



In the case of the Kohonen maps, T-LAB produces only one type of output: an HTML file showing the neuron grid and the lexical units assigned to each neuron.
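For readers unfamiliar with Kohonen maps, the following minimal sketch trains a small neuron grid and lists the lexical units that fall on each neuron. It is a generic self-organizing map, not T-LAB's implementation: the learning-rate and neighbourhood schedules, the function names and the sample words and coordinates are all assumptions chosen for illustration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Grid position of the neuron whose weight vector is closest to x."""
    return min(weights,
               key=lambda node: sum((a - b) ** 2
                                    for a, b in zip(weights[node], x)))

def train_som(data, rows=3, cols=3, epochs=200, seed=0):
    """Train a rows x cols Kohonen grid on a list of coordinate tuples."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.uniform(-1, 1) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for t in range(epochs):
        frac = t / epochs
        lr = 0.5 * (1 - frac)                             # decaying learning rate
        radius = max(rows, cols) / 2 * (1 - frac) + 0.5   # shrinking neighbourhood
        x = rng.choice(data)
        bmu = best_matching_unit(weights, x)
        for node, w in weights.items():
            grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
            h = math.exp(-grid_d2 / (2 * radius ** 2))    # Gaussian neighbourhood
            for k in range(dim):
                w[k] += lr * h * (x[k] - w[k])
    return weights

def map_items(weights, labelled):
    """Group labels by the neuron that best matches their coordinates."""
    grid = {}
    for label, x in labelled.items():
        grid.setdefault(best_matching_unit(weights, x), []).append(label)
    return grid

# Invented lexical units with coordinates on two factorial axes.
units = {"home": (0.9, 0.8), "family": (0.8, 0.9),
         "work": (-0.9, 0.7), "job": (-0.8, 0.8),
         "sea": (0.1, -0.9), "beach": (0.0, -0.8)}
weights = train_som(list(units.values()))
for node, words in sorted(map_items(weights, units).items()):
    print(node, words)
```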