Catherine H. Decker, Ph.D. in 18th-century British Literature and the novel, at the Department of Psychology, University of California, Riverside

Last update 20 April 2004

Say what? Despite its unusual nature, my connection with Dr. Burgess's Psycholinguistics and Computational Cognition Lab has produced some useful work and sure has been a lot of fun.

In Spring of 2004, I am sitting in on Dr. Burgess's class focusing on skepticism and pseudoscience. I stepped in to substitute teach for two class sessions--although showing a video and leading ten to twenty minutes of discussion is hardly a great pedagogical achievement.

In the fall of 2003, I was happy to be a co-author on two presentations and a poster/abstract. I was a co-presenter of the talk "PlagiarismTest.org--Building Knowledge and Eliminating the Ignorance Excuse to Increase Integrity" on Saturday, Oct. 18, 2003, 1:30, at the Center for Academic Integrity International Conference in San Diego. I was also a co-author on a paper that was presented at the Society for Computers in Psychology 2003 conference, "PlagiarismTest.org: Online Teaching, Learning, and Testing about Plagiarism." The poster/abstract of which I am a co-author is "Using a High-Dimensional Model to Capture Semantic Distinctiveness and Consolidation in Bilingual Memory," which was presented at the 44th Annual Meeting of the Psychonomic Society, on Friday, Nov. 7, 2003, 12:00, on the Conference Level of the Fairmont Hotel, Vancouver.

So what have I done around the lab in the past? I wrote some of the early versions of the lab webpages and some now long-gone websites for the Psychology Department and Neuroscience Department at UCR. In 1995, 1996, and 1997, I guest taught on gender and conversation for five class sessions of the three Psycholinguistics classes Dr. Burgess taught in those years. Some of the teaching material I created for my five class sessions is still available on this site. Although a lot of my web work has been replaced over the years, there are still some images I scanned in or created here and there, link lists I developed, and some HTML code I wrote on various pages on hal.ucr.edu (or locutus.ucr.edu). I have been involved extensively in two major websites, PsychGrad.Org and PlagiarismTest.org. I have co-authored a number of conference presentations and abstracts with Dr. Burgess and various graduate and undergraduate students working at the lab, which led to one publication.

Publication

Burgess, C., Conley, P., Decker, C., & Devitto, Z. (2001). The Psychology Graduate Applicant's Portal. Behavior Research Methods, Instruments, & Computers, 33, 263-266. Available in PDF format (1.0MB)

I have also helped write stimuli for racial-bias experiments and develop "JaneHAL" and "BurnettHAL," all of which, as far as I know, have yet to debut to the world. "JaneHAL" and "BurnettHAL" are HAL matrixes built from the e-texts of Jane Austen and Frances Hodgson Burnett, respectively. What's a HAL matrix? Actually, I've developed a bit of a reputation for being able to explain HAL in "normal" or "non-expert" language to people--sort of HAL for laymen or the common man (excuse the traditional sexism, please; women are included in both groups).

HAL is a huge set of numerical data taken from large texts, a matrix that records the entire history of a word's use relative to all the other words in a set of text. The text for the main HAL comes from the internet and is huge--160 million words a few years ago--I'm not sure of the current size. The text is not processed, just raw from the internet. A HAL model can be generated from any textbase, and there are all sorts of interesting variations--old HAL, child HAL, and various types of mentally-ill or challenged HAL. We often discuss trying to generate matrixes from other languages or images or even data from space, such as SETI. At any rate, once you have a HAL matrix, the matrix produces a vector for every word that is the historical use of that word in the texts in relationship to every other word. This is not just simple or local co-occurrence, like how cat and dog often appear in the same sentence, for example. HAL weights the words that occur near a target word in proportion to their frequency and their distance from it. The vector for a word like street will most resemble the vector for a word that is used in the same context--the vector for a word like road, for example. Because cat and dog do not always occur in the same context, their vectors will differ more than the vectors of cat and cats, for example. However, the vectors of cat and dog will be more similar to each other than either is to the vectors of words like "the," "of," or "playwriting." This relationship between street and road, which rarely occur in the same sentence, is called "global co-occurrence." This is what HAL gets at by precise numerical values based on actual, real use in texts. The math precisely calculates what words are used in the most similar ways to other words--we call this the "semantic neighborhood" of a word.
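For readers who think better in code, here is a minimal sketch of the idea, not the lab's actual implementation. It assumes a five-word window, a weight that falls off linearly with distance (window - distance + 1), windows that stop at sentence boundaries, and cosine similarity for comparing vectors; the function names and the four-sentence toy corpus are made up for illustration.

```python
import math
from collections import defaultdict


def build_hal(sentences, window=5):
    """Build a toy HAL-style matrix as {word: {(direction, neighbor): weight}}.

    Assumptions (not from the original page): the window size, the linear
    distance weighting, and not letting windows cross sentence boundaries.
    """
    vectors = defaultdict(lambda: defaultdict(float))
    for tokens in sentences:
        for i, target in enumerate(tokens):
            for distance in range(1, window + 1):
                j = i - distance
                if j < 0:
                    break
                neighbor = tokens[j]
                weight = window - distance + 1  # closer words count more
                # Record the neighbor as preceding the target...
                vectors[target][("before", neighbor)] += weight
                # ...and the target as following the neighbor.
                vectors[neighbor][("after", target)] += weight
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * v[key] for key, value in u.items() if key in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus: "street" and "road" never share a sentence,
# but they turn up in the same kind of context.
corpus = [
    "the car drove down the street".split(),
    "the car drove down the road".split(),
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
]

hal = build_hal(corpus)
print("street vs. road:", cosine(hal["street"], hal["road"]))
print("street vs. cat: ", cosine(hal["street"], hal["cat"]))
```

In this toy corpus street and road never appear together, yet their vectors come out more alike than those of street and cat, which is the "global" part of global co-occurrence.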

These semantic neighborhoods occur in the high-dimensional space involving millions of vectors. A math formula, however, enables you to see the neighbors in two-dimensional charts or in lists with set units indicating distance between various vectors. The vectors can also be viewed as a series of grayscale boxes, which enables you to "eyeball" similarities between vectors. One of the cool things you can do with a HAL matrix, for example, is measure the semantic differences between various words--I was fascinated that the neighborhood for "women" in regular HAL would be words like gun or bitch, but in JaneHAL, Emma or approbation--i.e., something much less negative or violent. HAL can actually pass reading tests and has been used to duplicate findings from human research in all sorts of areas--see (or request) some of the HAL research on the Lab Reprints page.
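A neighbor list of the kind described above can be sketched in a few lines. This is only an illustration: it assumes Euclidean distance as the measure (the formula the lab uses is not named here), and the four-number vectors are invented toy values, not real HAL data.

```python
import math


def euclidean(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def neighborhood(target, vectors, n=3):
    """Return the n words whose vectors lie closest to the target's vector."""
    distances = [
        (euclidean(vectors[target], vector), word)
        for word, vector in vectors.items()
        if word != target
    ]
    return [word for _, word in sorted(distances)[:n]]


# Invented toy vectors for illustration only; real HAL vectors are far longer.
toy_vectors = {
    "street": [6.0, 4.0, 3.0, 0.0],
    "road": [6.0, 4.0, 2.0, 0.0],
    "avenue": [5.0, 4.0, 3.0, 1.0],
    "cat": [1.0, 0.0, 5.0, 6.0],
    "dog": [1.0, 1.0, 4.0, 6.0],
}

print(neighborhood("street", toy_vectors, n=2))
print(neighborhood("cat", toy_vectors, n=1))
```

Running the same function against two different matrixes (regular HAL and JaneHAL, say) is how you would compare what a word's neighborhood looks like in each set of texts.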

