T-LAB PLUS 2019 - ON-LINE HELP
www.tlab.it

Elementary Contexts


During the import phase, T-LAB segments the corpus into elementary contexts, both to make exploration easier for the user and, above all, to enable the analyses that require co-occurrence computation.


Depending on the user's choice, the elementary contexts can be:

1 - Sentences

Elementary contexts ending with punctuation marks (. ? !), whose length ranges from 50 to 1,000 characters.

 

2 - Chunks

Elementary contexts of comparable length made up of one or more sentences.

More precisely:

- T-LAB treats as an elementary context every sequence of words that ends with a full stop and a carriage return and is shorter than 400 characters;

- if no full stop is found within the maximum length, T-LAB searches for other punctuation marks in the following order: ? ! ; : , and, if none is found, it segments the text on the basis of a statistical criterion, without splitting lexical units.
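The chunking rules above can be illustrated with a minimal Python sketch. It is not T-LAB's implementation: the function names are hypothetical, the cut is placed at the last matching mark inside the window (an assumption), and, because the help text does not describe the statistical criterion, the sketch falls back to cutting at the last blank space so that no word is split.

```python
import re

MAX_LEN = 400  # chunks must be shorter than this (value from the help text)
# Marks tried in priority order: full stop first, then ? ! ; : ,
PRIORITY = [".", "?", "!", ";", ":", ","]


def find_cut(window: str) -> int:
    """Return the index just past the preferred cut point in `window`."""
    for mark in PRIORITY:
        pos = window.rfind(mark)
        if pos != -1:
            return pos + 1
    # ASSUMPTION: stand-in for the undocumented statistical criterion.
    # Cut at the last blank so that no lexical unit is split.
    pos = window.rfind(" ")
    return pos if pos > 0 else len(window)


def chunk(text: str, max_len: int = MAX_LEN) -> list[str]:
    """Split `text` into chunks shorter than `max_len` characters."""
    chunks = []
    # First split where a full stop is followed by a line break.
    for piece in re.split(r"(?<=\.)\r?\n", text):
        rest = piece.strip()
        while len(rest) >= max_len:
            cut = find_cut(rest[: max_len - 1])
            chunks.append(rest[:cut].strip())
            rest = rest[cut:].strip()
        if rest:
            chunks.append(rest)
    return chunks
```

Calling `chunk(text)` with the default limit returns a list of strings, each shorter than 400 characters. A smaller `max_len` makes the priority order easy to observe on short test strings.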



3 - Paragraphs

Elementary contexts ending with punctuation marks (. ? !) followed by a carriage return, whose maximum length is 2,000 characters.
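As a rough illustration of the paragraph rule, the following Python sketch splits a text wherever one of the three punctuation marks is immediately followed by a line break. The function names are hypothetical, and since the help text does not say how T-LAB handles paragraphs longer than 2,000 characters, the sketch only reports them.

```python
import re

MAX_PARAGRAPH_LEN = 2000  # maximum length from the help text


def split_paragraphs(text: str) -> list[str]:
    """Split after . ? or ! when directly followed by one or more line breaks."""
    parts = re.split(r"(?<=[.?!])(?:\r?\n)+", text)
    return [p.strip() for p in parts if p.strip()]


def oversized(paragraphs: list[str], max_len: int = MAX_PARAGRAPH_LEN) -> list[str]:
    """Return the paragraphs that exceed the maximum length.

    ASSUMPTION: the help text does not describe how such paragraphs are
    treated, so they are only reported here, not split further.
    """
    return [p for p in paragraphs if len(p) > max_len]
```

A line break that is not preceded by one of the three marks (for example after a title line) does not end a paragraph in this sketch.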

4 - Short Texts

This option is enabled only when no text in the corpus exceeds 2,000 characters (e.g. responses to open-ended questions).
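Before importing, a corpus can be checked against this condition with a few lines of Python. The sketch below is only an illustration: the function name is hypothetical, it assumes that each text is available as a separate string, and it assumes that length is counted in characters exactly as `len` does.

```python
SHORT_TEXT_LIMIT = 2000  # maximum text length from the help text


def short_texts_allowed(texts: list[str], limit: int = SHORT_TEXT_LIMIT) -> bool:
    """Return True when every text is at most `limit` characters long."""
    return all(len(t) <= limit for t in texts)
```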

N.B.:

- the corpus_segments.dat file contains the result of corpus segmentation;

- the Concordances option in T-LAB lets the user check the elementary contexts in which each word (or lemma) occurs.