www.tlab.it
Dictionary building
The Dictionary Building option opens a window in which the user can carry out several
operations on the corpus dictionary.
The user can rename or group the available lemmas (see option '3' below),
export the dictionary (see option '4' below),
or import a customized dictionary (see option '5' below).
The starting point is a table (the Corpus Dictionary) that reports the following
information:
- word/lemma correspondences;
- word occurrences;
- some labels which refer to the automatic
lemmatization (see the "INF" column).
Before making any changes, the user can select a specific word and
right-click it to check the concordances (Key-Word-in-Context) of
interest (see option '2' above). In any case, after
clicking the "keyword selection" tab, the customized settings must
be selected (see option '1' above).
The available operations, although they differ in their goals
(revision of the lemmatization and/or application of grids for content analysis),
all reorganize the T-LAB database and so change the tables used to analyse data.
The operations should therefore be applied to the words (lemmas or
categories) considered relevant for the subsequent analyses.
T-LAB also provides a further option, Key Words Selection,
with which users can decide which lemmas to
"keep" and which to "discard".
The two functions (Dictionary Building and Key Words Selection)
are strongly interconnected, and
the user can easily move from one to the other, including in order to
revise earlier choices.
In Dictionary Building there are two operating modes:
- the first lets you move the selected words (click)
to the box on the right and then rename them, either by
using the "replace" option (N.B.: in this case the new label can
be chosen from the selected lemmas; see option '3' above) or by
typing a new label in the appropriate box;
- the second uses the "import a dictionary" option,
for when the user intends to apply his or her own list to classify the words
(see option '5' above).
N.B.: Right-clicking in the Rename / Group box opens a
context menu with three operations: a) check the
concordances (Key-Word-in-Context) of the selected item; b) remove
the selected item from the box; c) remove all selected items from
the box.
In order to import a customized dictionary, the user must first
prepare a Dictio.diz or Dictionary.diz file.
These files can contain any number of lines, each with a pair of
strings separated by the character ";".
The maximum length of a string (word, lemma or category) is 50
characters; neither blank spaces nor apostrophes may be
included.
For each pair, the first string (on the left) is the
label (lemma or category) defined by the user; the second is
the corresponding word (Dictio.diz case)
or lemma (Dictionary.diz case) already present in the
T-LAB dictionary.
These are some examples:
(File Dictio.diz)
ACCEPT;accept
ACCEPT;accepted
ACCEPT;accepting
ACCEPT;accepts
------
CHILD;child
CHILD;children
WOMAN;woman
WOMAN;women

(File Dictionary.diz)
BIOTECH;biotech
BIOTECH;biotechnology
------
ABSTRACT_TOUGHT;distinctness
ABSTRACT_TOUGHT;distinguish
ABSTRACT_TOUGHT;diversification
ABSTRACT_TOUGHT;diversif
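The file format described above (one "LABEL;item" pair per line, strings of at most 50 characters, no blank spaces or apostrophes) can be checked before importing. The following Python sketch is only an illustration: the function names parse_diz_line and group_by_label are hypothetical and are not part of T-LAB, and the checks reflect only the rules stated in this page, not T-LAB's own import logic.

```python
MAX_LEN = 50  # maximum string length stated in the text


def parse_diz_line(line):
    """Split one 'LABEL;item' line and check the constraints stated above.

    Hypothetical helper, not a T-LAB function.
    """
    parts = line.strip().split(";")
    if len(parts) != 2:
        raise ValueError(f"expected exactly one ';' in {line!r}")
    label, item = parts
    for s in (label, item):
        if not s:
            raise ValueError(f"empty string in {line!r}")
        if len(s) > MAX_LEN:
            raise ValueError(f"{s!r} is longer than {MAX_LEN} characters")
        if " " in s or "'" in s:
            raise ValueError(f"{s!r} contains a blank space or an apostrophe")
    return label, item


def group_by_label(lines):
    """Collect the items (words or lemmas) assigned to each user-defined label."""
    groups = {}
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        label, item = parse_diz_line(line)
        groups.setdefault(label, []).append(item)
    return groups


# Lines taken from the Dictio.diz example above.
sample = ["ACCEPT;accept", "ACCEPT;accepted", "CHILD;child", "CHILD;children"]
groups = group_by_label(sample)
print(groups)
```

Grouping the pairs by label gives a quick preview of which words or lemmas each label would absorb, which may help to spot a misplaced entry before the import.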
According to the type of file you import, the changes will be as follows:
N.B.:
- Using the Lemmatized Corpus option, it is possible to export a copy
of the corpus (.txt file) in which every word is replaced by the
corresponding lemma or category.
- Once the dictionary has been modified, subsequent analyses of the
same corpus are available only with customized settings.
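The idea behind the Lemmatized Corpus export mentioned in the note above (each word replaced by its lemma or category) can be sketched as follows. This is an illustration under simplifying assumptions, not T-LAB's actual procedure: the function lemmatize_text is hypothetical, words are matched as plain runs of ASCII letters, matching is case-insensitive, and words without an entry in the mapping are left unchanged.

```python
import re


def lemmatize_text(text, word_to_label):
    """Replace each word found in the mapping with its label.

    Hypothetical helper; unmapped words are kept as they are (an assumption).
    """
    def swap(match):
        word = match.group(0)
        return word_to_label.get(word.lower(), word)

    # Assumption: a "word" is a run of ASCII letters.
    return re.sub(r"[A-Za-z]+", swap, text)


# Word -> label pairs, in the spirit of the Dictio.diz example.
mapping = {
    "child": "CHILD",
    "children": "CHILD",
    "woman": "WOMAN",
    "women": "WOMAN",
}
result = lemmatize_text("The women accepted the children.", mapping)
print(result)  # The WOMAN accepted the CHILD.
```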