File: Гинзбург - Лексикология.pdf


Last but not least, contrastive analysis deals with the meaning and use of situational verbal units, i.e. words, word-groups, sentences which are commonly used by native speakers in certain situations.

For instance, when we answer a telephone call and hear somebody asking for a person whose name we have never heard, the usual answer for the Russian speaker would be Вы ошиблись (номером) (literally 'You have made a mistake (with the number)') or Вы не туда попали (literally 'You have got to the wrong place'). The Englishman in an identical situation is likely to say Wrong number.

When somebody apologises for inadvertently pushing you or treading on your foot and says Простите! (I beg your pardon. Excuse me.), the Russian speaker in reply to the apology would probably say Ничего, пожалуйста (literally 'Nothing, please'), whereas the verbal reaction of an Englishman would be different: It’s all right. It does not matter. *Nothing or *please in this case cannot be viewed as words correlated with Ничего, Пожалуйста.

To sum up, the importance of contrastive analysis cannot be overestimated: it is an indispensable stage in the preparation of teaching material, in selecting lexical items to be extensively practised and in predicting typical errors. It is also of great value for an efficient teacher who knows that to have a native-like command of a foreign language, to be able to speak what we call idiomatic English, words, word-groups and whole sentences must be learned within the lexical, grammatical and situational restrictions of the English language.

§ 2. Statistical Analysis

An important and promising trend in modern linguistics which has been making progress during the last few decades is the quantitative study of language phenomena and the application of statistical methods in linguistic analysis.

Statistical linguistics is nowadays generally recognised as one of the major branches of linguistics. Statistical inquiries have considerable importance not only because of their precision but also because of their relevance to certain problems of communication engineering and information theory.

Probably one of the most important things for modern linguistics was the realisation of the fact that non-formalised statements are as a matter of fact unverifiable, whereas any scientific method of cognition presupposes verification of the data obtained. The value of statistical methods as a means of verification is beyond dispute.

Though statistical linguistics has a wide field of application, here we shall discuss mainly the statistical approach to vocabulary.

The statistical approach has proved essential in the selection of vocabulary items of a foreign language for teaching purposes.

It is common knowledge that very few people know more than 10% of the words of their mother tongue. It follows that if we do not wish to waste time on committing to memory vocabulary items which are never likely to be useful to the learner, we have to select only lexical units that are commonly used by native speakers. Out of about 500,000 words listed in the OED the “passive” vocabulary of an educated Englishman comprises no more than 30,000 words, and of these 4,000 to 5,000 are presumed to be amply sufficient for the daily needs of an average member of the English speech community. Thus it is evident that the problem of selection of teaching vocabulary is of vital importance.

It is also evident that by far the most reliable single criterion is that of frequency, as presumably the most useful items are those that occur most frequently in our language use.

As far back as 1927, recognising the need for information on word frequency for sound teaching materials, Edward L. Thorndike brought out a list of the 10,000 words occurring most frequently in a corpus of five million running words from forty-one different sources. In 1944 the list was extended to 30,000 words.
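The idea of a frequency list drawn from several sources can be sketched in a few lines of Python. This is only an illustration, not Thorndike's actual procedure: the function name `frequency_list`, the tokenising rule and the two toy sentences are all assumptions made for the example.

```python
from collections import Counter
import re

def frequency_list(texts, top_n):
    """Return the top_n most frequent word forms across several texts."""
    counts = Counter()
    for text in texts:
        # Assumption: a "word" is a run of letters, optionally with internal apostrophes.
        counts.update(re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower()))
    return counts.most_common(top_n)

# Two invented "sources" standing in for a real corpus.
sources = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
]
print(frequency_list(sources, 3))
```

A real count would also have to decide what is to be treated as one word (inflected forms, homonyms, word-groups), which is a qualitative decision and not a matter of counting.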

Statistical techniques have been successfully applied in the analysis of various linguistic phenomena: different structural types of words, affixes, the vocabularies of great writers and poets and even in the study of some problems of historical lexicology.

Statistical regularities, however, can be observed only if the phenomena under analysis are sufficiently numerous and their occurrence very frequent. Thus the first requirement of any statistical investigation is the evaluation of the size of the sample necessary for the analysis.

To illustrate this statement we may consider the frequency of word occurrences.

It is common knowledge that a comparatively small group of words makes up the bulk of any text. It was found that approximately the 1,300 to 1,500 most frequent words make up 85% of all words occurring in the text. If, however, we analyse a sample of 60 words, it is hard to predict the number of occurrences of the most frequent words. As the sample is so small, it may contain comparatively very few or very many of such words. The size of the sample sufficient for reliable information as to the frequency of the items under analysis is determined by mathematical statistics by means of certain formulas.
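Both points can be made concrete with a short Python sketch. The text does not say which formulas are meant, so `sample_size_for_proportion` below uses the common textbook estimate n = z²p(1 − p)/e² purely as an assumed stand-in; it presupposes simple random sampling, which running words in a text do not strictly satisfy. The function names and the toy sentence are likewise hypothetical.

```python
import math
from collections import Counter

def coverage(tokens, k):
    """Share of all running words accounted for by the k most frequent word forms."""
    counts = Counter(tokens)
    covered = sum(n for _, n in counts.most_common(k))
    return covered / len(tokens)

def sample_size_for_proportion(p, margin, z=1.96):
    """Sample size needed to estimate a proportion p to within +/- margin.

    Assumption: normal approximation under simple random sampling;
    z = 1.96 corresponds to roughly 95% confidence.
    """
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

tokens = "the cat sat on the mat and the dog sat on the log".split()
print(coverage(tokens, 2))                     # share covered by the two commonest forms
print(sample_size_for_proportion(0.85, 0.05))  # running words needed under the assumptions
```

Under these assumptions, pinning down a share of about 85% to within five percentage points takes a couple of hundred running words, which suggests why a sample of 60 words gives unstable results.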

See ‘Various Aspects ...’, § 14, p. 197; ‘Fundamentals of English Lexicography’, § 6, p. 216.

The Teacher’s Word Book of 30,000 Words by Edward L. Thorndike and Irving Lorge. N. Y., 1963. See also M. West. A General Service List of English Words. L., 1959, pp. V-VI.

See ‘Various Aspects ...’, § 14, p. 197.

It goes without saying that to be useful in teaching, statistics should deal with meanings as well as sound-forms, as not all word-meanings are equally frequent. Besides, the number of meanings exceeds by far the number of words. The total number of different meanings recorded and illustrated in the OED for the first 500 words of the Thorndike Word List is 14,070; for the first thousand it is nearly 25,000. Naturally not all the meanings should be included in the list of the first two thousand most commonly used words. Statistical analysis of meaning frequencies resulted in the compilation of A General Service List of English Words with Semantic Frequencies. The semantic count is a count of the frequency of the occurrence of the various senses of the 2,000 most frequent words as found in a study of five million running words. The semantic count is based on the differentiation of the meanings in the OED, and the frequencies are expressed as percentages, so that the teacher and textbook writer may find it easier to understand and use the list. An example will make the procedure clear.

room   (’space’) takes less room, not enough room to turn round (in),
       make room for; (figurative) room for improvement                12%

       come to my room, bedroom, sitting room; drawing room,
       bathroom                                                        83%

       (plural = suite, lodgings) my room in college, to let rooms

It can be easily observed from the semantic count above that the meaning ‘part of a house’ (sitting room, drawing room, etc.) makes up 83% of all occurrences of the word room and should be included in the list of meanings to be learned by beginners, whereas the meaning ’suite, lodgings’ is not essential and makes up only [...] of all occurrences of this word.
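The arithmetic behind such a semantic count is simple: the occurrences of each sense are tallied and expressed as a share of all occurrences of the word. The sketch below assumes a sense-tagged sample already exists; the tallies are invented so that two of the shares match the 12% and 83% quoted above, and the label "all other senses" is a placeholder, not a figure from the list.

```python
def sense_percentages(sense_counts):
    """Convert raw occurrence counts per sense into percentages of all occurrences."""
    total = sum(sense_counts.values())
    return {sense: round(100 * n / total, 1) for sense, n in sense_counts.items()}

# Hypothetical tallies for one word in a sense-tagged sample (invented numbers).
room_counts = {"space": 24, "part of a house": 166, "all other senses": 10}
print(sense_percentages(room_counts))
```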

Statistical methods have also been applied to various theoretical problems of meaning. An interesting attempt was made by G. K. Zipf to study the relation between polysemy and word frequency by statistical methods. Having discovered that there is a direct relationship between the number of different meanings of a word and its relative frequency of occurrence, Zipf proceeded to find a mathematical formula for this correlation. He came to the conclusion that the number of different meanings of a word will tend to be equal to the square root of its relative frequency (with the possible exception of the few dozen most frequent words). This was summed up in the following formula, where $m$ stands for the number of meanings and $F$ for relative frequency:

$m = F^{1/2}$

This formula is known as Zipf’s law.

Though numerous corrections to this law have been suggested, still there is no reason to doubt the principle itself, namely, that the more frequent a word is, the more meanings it is likely to have.
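A small worked example may help to show what the square-root relation implies. Since the text does not specify the units in which relative frequency is measured, the sketch below compares two words only by ratio; the function name and the frequencies 400 and 100 are assumptions chosen for illustration.

```python
import math

def expected_meaning_ratio(freq_a, freq_b):
    """Ratio of meanings m_a / m_b implied by the square-root relation
    for two words whose relative frequencies are freq_a and freq_b."""
    return math.sqrt(freq_a / freq_b)

# A word four times as frequent as another is expected to have twice as many meanings.
print(expected_meaning_ratio(400, 100))  # 2.0
```

The relation is a statistical tendency, so individual words may depart from it considerably.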

One of the most promising trends in statistical enquiries is the analysis of collocability of words. It is observed that words are joined together according to certain rules. The linguistic structure of any string of words may be described as a network of grammatical and lexical restrictions.

The set of lexical restrictions is very complex. On the standard probability scale the set of (im)possibilities of combination of lexical units ranges from zero (impossibility) to one (certainty).

Of considerable significance in this respect is the fact that a high frequency value of individual lexical items does not forecast a high frequency of the word-group formed by these items. Thus, e.g., the adjective able and the noun man are both included in the list of the 2,000 most frequent words; the word-group an able man, however, is very rarely used.

See ‘Word-Groups and Phraseological Units’, §§ 1, 2, pp. 64, 66.

The importance of frequency analysis of word-groups is indisputable, as in speech we actually deal not with isolated words but with word-groups. Recently attempts have been made to elucidate this problem in different languages on the level of both theoretical and applied lexicology and lexicography.
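The difference between the frequency of single words and that of the word-groups they form can be illustrated by counting adjacent word pairs alongside single words. The sketch below is a minimal illustration under simple assumptions: a word-group is taken to be two neighbouring tokens, and the sample sentence is invented.

```python
from collections import Counter

def unigram_and_bigram_counts(tokens):
    """Count single words and adjacent word pairs in a token sequence."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

# Invented sample: both words recur, the combination occurs only once.
tokens = "an able man is a man who is able to help a man".split()
unigrams, bigrams = unigram_and_bigram_counts(tokens)
print(unigrams["able"], unigrams["man"], bigrams[("able", "man")])
```

Counts of this kind show that the frequency of a combination cannot be read off from the frequencies of its members and has to be established separately.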

It should be pointed out, however, that the statistical study of vocabulary has some inherent limitations.

Firstly, the statistical approach is purely quantitative, whereas most linguistic problems are essentially qualitative. To put it in simpler terms, quantitative research implies that one knows what to count, and this knowledge is reached only through a long period of qualitative research carried on upon the basis of certain theoretical assumptions.

For example, even simple numerical word counts presuppose a qualitative definition of the lexical items to be counted. In connection with this different questions may arise, e.g. is the orthographical unit work to be considered as one word or two different words: the noun work and the verb (to) work? Are all word-groups to be viewed as consisting of so many words or are some of them to be counted as single, self-contained lexical units? We know that in some dictionaries word-groups of the type by chance, at large, in the long run, etc. are counted as one item though they consist of at least two words; in others they are not counted at all but viewed as peculiar cases of usage of the notional words chance, large, run, etc. Naturally the results of the word counts largely depend on the basic theoretical assumption, i.e. on the definition of the lexical item.
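How far the result of a count depends on the definition of the lexical item can be seen from a small sketch. It counts the same string twice: once treating every orthographic word as an item, and once treating listed word-groups as single items. The function name, the greedy left-to-right matching and the sample string are assumptions made for illustration only.

```python
def count_lexical_items(tokens, multiword_units=()):
    """Count items left to right, treating listed word sequences as single items."""
    # Try longer word-groups first so that they are not split by shorter ones.
    units = sorted((tuple(u.split()) for u in multiword_units), key=len, reverse=True)
    count, i = 0, 0
    while i < len(tokens):
        for unit in units:
            if tuple(tokens[i:i + len(unit)]) == unit:
                i += len(unit)  # the whole word-group is one item
                break
        else:
            i += 1  # an ordinary single word
        count += 1
    return count

tokens = "in the long run we met by chance".split()
print(count_lexical_items(tokens))                                  # 8
print(count_lexical_items(tokens, ["in the long run", "by chance"]))  # 4
```

The same eight orthographic words yield four items once in the long run and by chance are treated as self-contained units.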

We also need to use qualitative description of the language in deciding whether we deal with one item or more than one, e.g. in sorting out two homonymous words and different meanings of one word. It follows that before counting homonyms one must have a clear idea of what difference in meaning is indicative of homonymy. From the discussion of the linguistic problems above we may conclude that an exact and exhaustive definition of the linguistic qualitative aspects of the items under consideration must precede the statistical analysis.

Secondly, we must admit that not all linguists have the mathematical equipment necessary for applying statistical methods. In fact, what is often referred to as statistical analysis is purely numerical counts of this or that linguistic phenomenon not involving the use of any mathematical formula, which in some cases may be misleading.

Thus, statistical analysis is applied in different branches of linguistics, including lexicology, as a means of verification and as a reliable criterion for the selection of language data, provided a qualitative description of lexical items is available.

See also ‘Various Aspects ...’, § 12, p. 195.

See ‘Semasiology’, §§ 37, 38, pp. 43, 44.

§ 3. Immediate Constituents Analysis

The theory of Immediate Constituents (IC) was originally elaborated as an attempt to determine the ways in which lexical units are relevantly related to one another. It was discovered that combinations of such units are usually structured into hierarchically arranged sets of binary constructions. For example, in the word-group a black dress in severe style we do not relate a to black, black to dress, dress to in, etc. but set up a structure which may be represented as a black dress / in severe style. Thus the fundamental aim of IC analysis is to segment a set of lexical units into two maximally independent sequences or ICs, thus revealing the hierarchical structure of this set. Successive segmentation results in Ultimate Constituents (UC), i.e. two-facet units that cannot be segmented into smaller units having both sound-form and meaning. The Ultimate Constituents of the word-group analysed above are: a | black | dress | in | severe | style.
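The relation between successive binary segmentation and Ultimate Constituents can be pictured as a binary tree whose leaves are the UCs. In the sketch below only the first cut (a black dress / in severe style) comes from the text; the lower-level cuts and the representation as nested Python tuples are assumptions made for illustration.

```python
def ultimate_constituents(node):
    """Flatten a binary IC tree (nested 2-tuples of strings) into its leaves."""
    if isinstance(node, str):
        return [node]
    left, right = node  # every non-leaf node has exactly two ICs
    return ultimate_constituents(left) + ultimate_constituents(right)

# First cut as in the text; the lower-level cuts are assumed for this sketch.
tree = (("a", ("black", "dress")), ("in", ("severe", "style")))
print(" | ".join(ultimate_constituents(tree)))
```

The tree records the hierarchy revealed by IC analysis, whereas the flattened list keeps only the linear sequence of UCs.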

The meaning of the sentence, word-group, etc. and the IC binary segmentation are interdependent. For example, fat major’s wife may mean that either ‘the major is fat’ or ‘his wife is fat’. The former semantic interpretation presupposes the IC analysis into fat major’s | wife, whereas the latter reflects a different segmentation into ICs, namely fat | major’s wife.

It must be admitted that this kind of analysis is arrived at by reference to intuition and it should be regarded as an attempt to formalise one’s semantic intuition.
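That a three-word string admits exactly two binary segmentations can be checked mechanically. The sketch below enumerates every way of grouping a word sequence into nested binary constituents; it only lists the formal possibilities, and choosing between them still rests on the semantic interpretation. The function name is hypothetical.

```python
def binary_bracketings(words):
    """Return every way of grouping a word sequence into nested binary ICs."""
    if len(words) == 1:
        return [words[0]]
    trees = []
    for cut in range(1, len(words)):
        for left in binary_bracketings(words[:cut]):
            for right in binary_bracketings(words[cut:]):
                trees.append((left, right))
    return trees

for tree in binary_bracketings(["fat", "major's", "wife"]):
    print(tree)
```

The two trees printed correspond to the readings ‘his wife is fat’ and ‘the major is fat’ respectively.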

It is mainly to discover the derivational structure of words that IC analysis is used in lexicological investigations. For example, the verb denationalise has both a prefix de- and a suffix -ise (-ize). To decide whether this word is a prefixal or a suffixal derivative we must apply IC analysis. The binary segmentation of the string of morphemes making up the word shows that *denation or *denational cannot be considered independent sequences, as there is no direct link between the prefix de- and nation or national. In fact no such sound-forms function as independent units in modern English. The only possible binary segmentation is de | nationalise; therefore we may conclude that the word is a prefixal derivative.
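The reasoning applied to denationalise can be sketched as a simple check: try every binary cut of the morpheme string and keep only those cuts whose two parts both function as units in the language. The inventory `attested` below is a hypothetical mini-list standing in for the linguist's knowledge of modern English, and the function name is likewise an assumption.

```python
def possible_binary_cuts(morphemes, attested):
    """Return binary cuts of a morpheme string whose two parts are both attested units."""
    cuts = []
    for i in range(1, len(morphemes)):
        left, right = "".join(morphemes[:i]), "".join(morphemes[i:])
        if left in attested and right in attested:
            cuts.append((left, right))
    return cuts

# Hypothetical mini-inventory of units taken to exist in modern English.
attested = {"de", "nation", "national", "nationalise", "al", "ise"}
print(possible_binary_cuts(["de", "nation", "al", "ise"], attested))
```

Only the cut de | nationalise survives, since neither *denation nor *denational is in the inventory.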

There are also numerous cases when identical morphemic structure of different words is insufficient proof of the identical pattern of their derivative structure, which can be revealed only by IC analysis. Thus, comparing, e.g., snow-covered and blue-eyed we observe that both words contain two root-morphemes and one derivational morpheme. IC analysis, however, shows that whereas snow-covered may be treated as a compound consisting of two stems snow + covered, blue-eyed is a suffixal derivative, as the underlying structure as shown by IC analysis is different, i.e. (blue + eye) + -ed.

It may be inferred from the examples discussed above that ICs represent the word-formation structure while the UCs show the morphemic structure of polymorphic words.

See ‘Word-Structure’, §§ 4, 6, pp. 94, 95.

§ 4. Distributional Analysis and Co-occurrence

Distributional analysis in its various forms is commonly used nowadays by lexicologists of different schools of thought. By the term distribution we understand the occurrence of a lexical unit relative to other lexical units of the same level (words relative to words, morphemes relative to morphemes, etc.). In other
