T-LAB 10.2 - ON-LINE HELP

Cluster Analysis


N.B.: The pictures shown in this section were obtained with a previous version of T-LAB and look slightly different in T-LAB 10. In addition: a) a new button (TREE MAP PREVIEW) allows the user to create dynamic charts in HTML format; b) the DENDROGRAM button has been replaced by the Graph Maker tool; c) a quick-access picture gallery, which works as an additional menu, lets the user switch between the various outputs with a single click (see the image below).

This T-LAB tool uses the results of a previous Correspondence Analysis; in particular, the computation uses the coordinates of the objects (lexical units or context units) on the first factorial axes (up to a maximum of 10).

Accordingly, the user can select from three clustering techniques:

a) hierarchical (Ward method);
b) K-means (MacQueen method);
c) hdbscan (hierarchical DBSCAN).

The first two (a, b) allow the user to explore, through tables and graphs, solutions from 3 to 20 clusters, whereas the third (c), which requires an additional parameter (i.e. the minimum number of words within a cluster), produces a single solution.
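As an illustration of how a hierarchical method yields a whole range of partitions, the following is a minimal sketch of Ward agglomeration on a few made-up factorial coordinates. It is not T-LAB's own code: the function names (`centroid`, `ward_cost`, `ward_partitions`), the sample data and the 3-to-4 range are assumptions chosen to keep the example small.

```python
from itertools import combinations

def centroid(points, members):
    """Mean coordinates of the points whose indices are in `members`."""
    return [sum(points[i][d] for i in members) / len(members)
            for d in range(len(points[0]))]

def ward_cost(points, a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(points, a), centroid(points, b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_partitions(points, k_min, k_max):
    """Return {k: clusters} for every k in k_min..k_max reached while merging."""
    k_min = max(k_min, 1)  # a partition needs at least one cluster
    clusters = [[i] for i in range(len(points))]
    partitions = {}
    while True:
        k = len(clusters)
        if k_min <= k <= k_max:
            partitions[k] = [sorted(c) for c in clusters]
        if k <= k_min:
            break
        # Merge the pair of clusters with the smallest Ward cost.
        a, b = min(combinations(range(k), 2),
                   key=lambda p: ward_cost(points, clusters[p[0]], clusters[p[1]]))
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return partitions

# Hypothetical coordinates of six objects on two factorial axes.
coords = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.1, 0.0)]
partitions = ward_partitions(coords, 3, 4)
for k in sorted(partitions):
    print(k, partitions[k])
```

Because every merge is recorded, a single run provides all the partitions in the requested range, which is why several solutions can be compared without recomputing the hierarchy.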

N.B.: When the hierarchical method is selected, T-LAB enables an option (see the 'Refine' button below) that allows the user to combine the Ward and K-Means methods.
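One common way to combine the two methods, assumed here purely for illustration (this page does not describe how 'Refine' works internally), is to take a partition obtained from the hierarchy and let K-means reassign the objects starting from that partition's centroids. The function name `refine_with_kmeans`, the sample points and the initial labels are hypothetical.

```python
def centroid(points):
    """Mean coordinates of a list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def refine_with_kmeans(points, labels, max_iter=100):
    """Reassign each point to its nearest centroid until labels stop changing."""
    labels = list(labels)
    for _ in range(max_iter):
        ids = sorted(set(labels))
        cents = {c: centroid([p for p, l in zip(points, labels) if l == c])
                 for c in ids}
        new_labels = [min(ids, key=lambda c: sq_dist(p, cents[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Hypothetical coordinates and a hypothetical initial partition
# (standing in for a cut of a hierarchical tree).
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (9.0, 0.0), (10.0, 0.0)]
initial = [0, 0, 1, 1, 1]
refined = refine_with_kmeans(points, initial)
print(refined)
```

In this toy case the third point starts in the second cluster but lies slightly closer to the centroid of the first, so the refinement step moves it.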

A brief description of the three techniques is available in the glossary of this manual.

When processing is complete, T-LAB shows graphs and tables.

The graphs represent clusters in the space detected by the correspondence analysis (see below).

To explore the various combinations of factorial axes, select them in the appropriate boxes ("X Axis", "Y Axis").

In the case of hierarchical clustering, the user can easily explore (graphs and tables) the different partitions.

Dendrograms, pie charts and bar charts allow us to check the characteristics of each partition.

Bar charts allow us to check the relationships between clusters and variables.

Two kinds of tables are available:

(A) if the clustered objects are lexical units, for each of them (and for each cluster) the respective occurrences ('OCC') and distances ('DIST') from the centroids are displayed; moreover, for each variable which is significantly associated with the cluster examined, the respective Test Value is displayed.
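The OCC and DIST columns of a table of type (A) can be sketched as follows. This is only an illustration under stated assumptions: the words, occurrence counts and coordinates are invented, DIST is taken to be the Euclidean distance in the factorial space, and `cluster_table` is a hypothetical helper, not a T-LAB function. The Test Value column is not reproduced.

```python
import math
from collections import defaultdict

# Hypothetical lexical units: (word, occurrences, factorial coordinates, cluster).
units = [
    ("market", 120, (0.0, 0.0), 1),
    ("price", 80, (2.0, 0.0), 1),
    ("trade", 40, (4.0, 0.0), 1),
    ("family", 95, (5.0, 4.0), 2),
    ("home", 60, (5.0, 6.0), 2),
]

def cluster_table(units):
    """Return rows (cluster, word, OCC, DIST), ordered by cluster and distance."""
    groups = defaultdict(list)
    for _word, _occ, coords, cluster in units:
        groups[cluster].append(coords)
    # Centroid of each cluster: the mean of its members' coordinates.
    cents = {c: tuple(sum(x) / len(pts) for x in zip(*pts))
             for c, pts in groups.items()}
    rows = [(cluster, word, occ, math.dist(coords, cents[cluster]))
            for word, occ, coords, cluster in units]
    return sorted(rows, key=lambda r: (r[0], r[3]))

for cluster, word, occ, dist in cluster_table(units):
    print(f"cluster {cluster}  {word:<8} OCC={occ:<4} DIST={dist:.3f}")
```

Sorting by distance puts the units closest to each centroid first, which are usually the most typical members of the cluster.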

(B) if the clustered objects are elementary contexts, the characteristics of each cluster (lexical units and variables) are described by means of the same method used in Thematic Analysis of Elementary Contexts (see below).

In the case of analyses performed using the hierarchical or K-means methods, T-LAB allows the user to view and export a file (see the "HTML Output" button) that reports the characteristics of the clusters and some measures of the quality of the partition.
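This page does not say which quality measures are reported. As one plausible example, and only as an assumption, the sketch below computes the ratio of between-cluster inertia to total inertia, a measure often used to judge a partition (values close to 1 indicate compact, well-separated clusters). The function name `inertia_ratio` and the sample data are hypothetical.

```python
def mean_point(points):
    """Mean coordinates of a list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def inertia_ratio(points, labels):
    """Between-cluster inertia divided by total inertia (0 to 1)."""
    overall = mean_point(points)
    total = sum(sq_dist(p, overall) for p in points)
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    # Each cluster contributes its size times the squared distance
    # of its centroid from the overall centroid.
    between = sum(len(pts) * sq_dist(mean_point(pts), overall)
                  for pts in clusters.values())
    return between / total

# Hypothetical coordinates and cluster labels.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
labels = [0, 0, 1, 1]
print(round(inertia_ratio(points, labels), 4))
```

Comparing this ratio across partitions with different numbers of clusters is one way to decide which solution to keep, bearing in mind that the ratio never decreases as clusters are added.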