CCL

2 results in total.

1.

BibID (field 001):BIBFORM107049
First author:Abdelzaher, Esra (linguist)
Title:Defining Crime: A multifaceted approach based on Lexicographic Relevance and Distributional Semantics / Esra Abdelzaher, Ágoston Tóth
Date:2020
ISSN:1787-3606
Notes:This paper demonstrates how the parallel examination of distributional data and frame semantic information can expose word senses that are not documented in FrameNet. In our case study, we compare the distributional features of the word crime to its properties stored in the FrameNet database also considering dictionary data that we find in three online monolingual dictionaries. Our analysis indicates that crime has senses that are absent from FrameNet. The five senses that we identify can be separated on the basis of (a) frame hierarchies, (b) frame elements, (c) syntactic and semantic data extracted from corpora using lexicographical tools and (d) distributional similarity. Annotated examples are provided to demonstrate each sense.
Subject headings:Humanities; Linguistics; foreign-language journal publication in a Hungarian journal
journal article
crime
FrameNet
distributional semantics
lexicographic relevance
Published in:Argumentum. - 16 (2020), p. 44-63. -
Additional authors:Tóth Ágoston (1974-) (linguist)
Internet address:URL provided by the author
DOI
Version stored in the institutional repository (DEA)

2.

BibID (field 001):BIBFORM113094
BibID (field 035):(Scopus)85171342691
First author:Tóth Ágoston (linguist)
Title:Probing visualizations of neural word embeddings for lexicographic use / Ágoston Tóth, Esra Abdelzaher
Date:2023
ISSN:2533-5626
Notes:Our study explores the possibility of using the distributional characteristics of headwords as exemplified in the online Oxford Learner's Dictionaries, captured by contextualized word embeddings and displayed in two dimensions to help lexicographers find sense categories, detect variations across senses and select potential example sentences. In addition to the dictionary examples, we added British National Corpus data that contained the headwords. BERT word embeddings were extracted for all occurrences of the headword, then two-dimensional representations of the resulting high-dimensional BERT embedding vectors were created using 4 algorithms: MDS, Isomap, Spectral and t-SNE. Clustering was assisted by k-means clustering and Silhouette scoring for different k values. Our investigation showed that Silhouette scores for k-means increased after dimension reduction; furthermore, spectral and t-SNE visualizations were associated with the most cohesive clusters. The highest Silhouette scores recommended a number of clusters different from the number of dictionary senses, but semantic and syntactic patterns were detectable across the recommended clusters.
Subject headings:Humanities; Linguistics; conference abstract
book chapter
sense delineation
word embedding visualization
BERT
Published in:Electronic lexicography in the 21st century: Proceedings of the eLex 2023 conference / edited by Marek Medved, Michal Mechura, Iztok Kosem, Jelena Kallas, Carole Tiberius, Milos Jakubícek. - p. 545-566. -
Additional authors:Abdelzaher, Esra (1992-) (linguist)
Internet address:URL provided by the author
Version stored in the institutional repository (DEA)