www.tlab.it
Corpus Preparation
In the case of a single document (or a corpus considered as a single
text), T-LAB needs no further work: just select the Import a single
file... option and proceed as explained in the corresponding section
of this manual.
When, on the other hand, the corpus is made up of several texts
and/or categorical variables are used, the Corpus Builder tool must
be used. It automatically transforms any textual material and various
types of files (up to eleven different formats) into a corpus file
ready to be imported by T-LAB.
N.B.:
- we advise an orthographic review of the material to be
analysed. Moreover, if some important acronyms are written with
punctuation marks (e.g. "U.N."), converting them into a single string
(e.g. "U_N") is recommended; this is because, in the normalization
phase, T-LAB treats punctuation marks as separators;
- at the end of the corpus preparation phase we recommend creating
a new folder that contains only the corpus to be imported.
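
The acronym conversion recommended in the first note can be done by hand, but for larger collections a small script run before import may help. The following is a minimal sketch in Python, not part of T-LAB itself: the function name join_acronyms and the pattern are illustrative assumptions, and the pattern only covers acronyms written as capital letters each followed by a period (e.g. "U.N.", "U.S.A.").

```python
import re

# Assumption: an acronym is two or more capital letters, each followed by a period.
# Single initials such as "J." in "J. Smith" are deliberately left untouched.
ACRONYM = re.compile(r"\b(?:[A-Z]\.){2,}")


def join_acronyms(text: str) -> str:
    """Rewrite dotted acronyms as single strings, e.g. "U.N." -> "U_N"."""

    def replace(match: re.Match) -> str:
        # Drop the periods and join the remaining letters with underscores.
        letters = match.group(0).replace(".", " ").split()
        return "_".join(letters)

    return ACRONYM.sub(replace, text)


if __name__ == "__main__":
    print(join_acronyms("The U.N. and the U.S.A. signed the agreement."))
```

Note that when an acronym ends a sentence, its final period also marks the end of that sentence and is removed by this sketch, so the output is worth checking during the orthographic review.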