www.tlab.it
Structural Criteria
There are two structural criteria
which must be observed: the corpus
size and its subdivision into
parts.
As for size, all T-LAB 10 tools
have been tested with a 90 MB corpus, approximately equivalent to
55,000 pages in .txt format.
The minimum size
calls for different evaluation criteria because, below a certain
threshold, a small corpus can compromise the reliability of many
statistical analyses. As a rule of thumb, use
corpora with at least 5,000 occurrences (approximately 30 KB)
or, in the case of open-ended questions, at least 50
answers.
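The size guidelines above can be checked before importing a corpus. The following is a minimal sketch, not part of T-LAB: the function names are hypothetical, occurrences are approximated by counting word tokens with a regular expression (T-LAB's own count may differ), 1 MB is assumed to be 1024 x 1024 bytes, and the two minimum thresholds are read as alternatives.

```python
import re

# Thresholds taken from the guidelines above.
MIN_OCCURRENCES = 5_000          # minimum number of word occurrences
MIN_ANSWERS = 50                 # minimum number of open-ended answers
MAX_BYTES = 90 * 1024 * 1024     # 90 MB tested size (assumes 1 MB = 1024 * 1024 bytes)


def count_occurrences(text):
    """Approximate the number of occurrences as the number of word tokens."""
    return len(re.findall(r"\w+", text))


def check_corpus_size(text, answers=None):
    """Return a list of warnings; an empty list means the size looks acceptable.

    `answers` is an optional list of open-ended responses (an assumption of
    this sketch about how such data would be held in memory).
    """
    warnings = []
    if len(text.encode("utf-8")) > MAX_BYTES:
        warnings.append("corpus is larger than the 90 MB tested size")
    enough_words = count_occurrences(text) >= MIN_OCCURRENCES
    enough_answers = answers is not None and len(answers) >= MIN_ANSWERS
    if not (enough_words or enough_answers):
        warnings.append("corpus is below the recommended minimum size")
    return warnings
```

For example, `check_corpus_size(open("corpus.txt", encoding="utf-8").read())` would return an empty list for a file of suitable size; `corpus.txt` is a placeholder file name.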
In order to be processed, a corpus can be made up of: a
single text without further partitions; a single text subdivided
according to criteria established by the user (for example, a book
divided into chapters); or a number of texts (for example, different
interviews or documents) classified by means of labels
that correspond to variables or to an IDnumber. In every case, the parts
of the corpus must be defined by precise formal criteria.
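To illustrate what a precise formal criterion can look like, the sketch below splits a single text into parts wherever a line matches a fixed heading pattern. It is only an illustration: the pattern `Chapter <number>` and the function name are assumptions, and the output is not T-LAB's own corpus encoding.

```python
import re

# Hypothetical formal criterion: a part starts at a line such as "Chapter 3".
CHAPTER_HEADING = re.compile(r"^Chapter[ \t]+\d+[ \t]*$", re.MULTILINE)


def split_into_parts(text):
    """Return a list of (heading, body) pairs, one per heading found."""
    matches = list(CHAPTER_HEADING.finditer(text))
    parts = []
    for i, match in enumerate(matches):
        # A part runs from the end of its heading to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        parts.append((match.group().strip(), text[match.end():end].strip()))
    return parts
```

Because the boundary is defined by a pattern rather than by judgment, the same text always yields the same parts. Text that comes before the first heading is ignored by this sketch.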