Date

3-10-2026

Department

School of Behavioral Sciences

Degree

Doctor of Philosophy in Psychology (PhD)

Chair

Patrick Slowinski

Keywords

vector semantics, word embeddings, second language acquisition

Disciplines

Linguistics | Psychology

Abstract

This quantitative study examined the feasibility of categorizing individuals by English-language competency using word embeddings derived from their language samples, treating word embeddings as a psycholinguistic construct. Word embeddings have been shown to be a versatile tool for many natural language processing (NLP) tasks, as well as a plausible model for some processes in the human mind, including processes within the domain of psycholinguistics. This research builds on previous work by showing that word embeddings may shed light on the underlying neural representation of semantic entities being acquired by novices. To test this, word embeddings generated from writing samples of non-native speakers of English were categorized using a k-cluster algorithm, and English language proficiency, represented by reaction time and accuracy on a vocabulary test, was compared across groups. Results showed that language competency, as measured by reaction time and accuracy scores, varied significantly with the entropy of document embeddings, but not by cluster group. These results have implications for the psycholinguistics of semantics, for psychometric evaluations of learning, and for textual interpretation of sources including the Bible.
