    In: Computational Linguistics, MIT Press, Vol. 26, No. 3 (2000-09), p. 301-317
    Abstract: In a medical information extraction system, we use common word association techniques to extract side-effect-related terms. Many of these terms have a frequency of less than five. Standard word-association-based applications disregard the lowest-frequency words, and hence disregard useful information. We therefore devised an extraction system for the full word frequency range. This system computes the significance of association by the log-likelihood ratio and Fisher's exact test. The output of the system shows a recurrent, corpus-independent pattern in both recall and the number of significant words. We will explain these patterns by the statistical behavior of the lowest-frequency words. We used Dutch verb-particle combinations as a second and independent collocation extraction application to illustrate the generality of the observed phenomena. We will conclude that a) word-association-based extraction systems can be enhanced by also considering the lowest-frequency words, b) significance levels should not be fixed but adjusted for the optimal window size, c) hapax legomena, words occurring only once, should be disregarded a priori in the statistical analysis, and d) the distribution of the targets to extract should be considered in combination with the extraction method.
    Type of Medium: Online Resource
    ISSN: 0891-2017, 1530-9312
    Language: English
    Publisher: MIT Press
    Publication Date: 2000
    ZDB ID: 602577-8, 2025069-1
    SSG: 7,11