T-LAB 10.2 - ON-LINE HELP
www.tlab.it

Dictionary building


The Dictionary building option opens a window in which the user can carry out a number of operations on the corpus dictionary.

The user can rename or group the available lemmas (see option '3' below), export the dictionary (see option '4' below), or import a customized dictionary (see option '5' below).

The starting point is a table (the Corpus Dictionary) that reports the following information:

- word/lemma correspondences;
- word occurrences;
- labels that refer to the automatic lemmatization (see the "INF" column).


Before making any changes, the user can select a word and right-click it to check its concordances (Key-Word-in-Context) (see option '2' above). In any case, after clicking the "keyword selection" tab, the customized settings must be selected (see option '1' above).

The possible operations differ in their goals (revising the lemmatization and/or applying grids for content analysis), but they all reorganize the T-LAB database and so change the tables used for data analysis. They should therefore be carried out on all the words (lemmas or categories) that are of interest for the subsequent analyses. T-LAB also provides a further option, Key Words Selection, with which users can decide which lemmas to "keep" and which to "discard".

The two functions (Dictionary Building and Key Words Selection) are closely interconnected, and the user can easily move from one to the other, including to revise earlier choices.

In Dictionary building there are two operating modes:

- one lets you move the selected words (click) to the box on the right and then rename them, either by using the "replace" option (N.B.: in this case the new label can be chosen from among the selected lemmas; see option '3' above) or by typing a new label in the appropriate box;

- the other uses the "import a dictionary" option, for when the user intends to apply their own list for classifying the words (see option '5' above).

N.B.: Right-clicking in the Rename / Group box opens a context menu offering three operations: a) verify the concordances (Key-Word-in-Context) of the selected item; b) remove the selected item from the box; c) remove all selected items from the box.

To import a customized dictionary, the user must first prepare a Dictio.diz or Dictionary.diz file.
These files can contain "n" lines, each consisting of a pair of strings separated by the character ";".
The maximum length of a string (word, lemma or category) is 50 characters; strings must not contain blank spaces or apostrophes.

For each pair, the first string (on the left) is the label (lemma or category) defined by the user; the second is the corresponding word (Dictio.diz case) or lemma (Dictionary.diz case) already present in the T-LAB dictionary.

These are some examples:

(File Dictio.diz)

ACCEPT;accept
ACCEPT;accepted
ACCEPT;accepting
ACCEPT;accepts
------
CHILD;child
CHILD;children
WOMAN;woman
WOMAN;women

(File Dictionary.diz)

BIOTECH;biotech
BIOTECH;biotechnology
---
ABSTRACT_TOUGHT;distinctness
ABSTRACT_TOUGHT;distinguish
ABSTRACT_TOUGHT;diversification
ABSTRACT_TOUGHT;diversif
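
The format rules stated above (one ";" per line, strings of at most 50 characters, no blank spaces or apostrophes) can be checked before importing a file. The following Python sketch is only an illustration and is not part of T-LAB: the function name parse_diz_lines is hypothetical, and skipping blank lines is an assumption of this sketch, not documented T-LAB behaviour.

```python
from collections import defaultdict

MAX_LEN = 50  # maximum string length stated in the help text


def parse_diz_lines(lines):
    """Parse 'LABEL;item' lines and group the items by label.

    Raises ValueError for lines that break the stated format rules.
    """
    groups = defaultdict(list)
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: this sketch simply skips blank lines
        parts = line.split(";")
        if len(parts) != 2:
            raise ValueError(f"line {number}: expected exactly one ';'")
        label, item = parts
        for text in (label, item):
            if not text:
                raise ValueError(f"line {number}: empty string")
            if len(text) > MAX_LEN:
                raise ValueError(f"line {number}: longer than {MAX_LEN} characters")
            if " " in text or "'" in text:
                raise ValueError(f"line {number}: blank space or apostrophe found")
        groups[label].append(item)
    return dict(groups)


sample = ["ACCEPT;accept", "ACCEPT;accepted", "CHILD;child", "CHILD;children"]
grouped = parse_diz_lines(sample)
print(grouped)
```

Grouping the items by label makes it easy to review which words (or lemmas) will end up under each user-defined label before the file is imported.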

According to the type of file you import, the changes will be as follows:

N.B.:

- With the Lemmatized Corpus option it is possible to export a copy of the corpus (.txt file) in which every word is replaced by the corresponding lemma or category.
- Once the dictionary has been modified, subsequent analyses on the same corpus are available only with customized settings.
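
The idea behind the Lemmatized Corpus export (each word replaced by its lemma or category) can be pictured with a short Python sketch. This is an illustration only, not T-LAB's implementation: the mapping word_to_label, the function lemmatize_text, the letter-based tokenization and the case-insensitive lookup are all assumptions made for this example.

```python
import re

# Hypothetical word -> label mapping, in the spirit of a Dictio.diz file.
word_to_label = {
    "accepted": "ACCEPT",
    "accepts": "ACCEPT",
    "children": "CHILD",
    "women": "WOMAN",
}


def lemmatize_text(text, mapping):
    """Replace each mapped word with its label; leave other tokens unchanged."""

    def swap(match):
        word = match.group(0)
        # Assumption: lookup ignores case; unmapped words are kept as they are.
        return mapping.get(word.lower(), word)

    # Match runs of letters only (no digits, underscores or punctuation).
    return re.sub(r"[^\W\d_]+", swap, text)


result = lemmatize_text("Women and children accepted the plan.", word_to_label)
print(result)  # WOMAN and CHILD ACCEPT the plan.
```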