Comparing Comparisons

Evaluating Measures of Keyness or Distinctiveness in Computational Literary Studies

Christof Schöch
(Trier University, Germany)

Kyungpook National University
Daegu, South Korea

22 May 2024

Science fiction, profile category "Space / Setting": Solitary settings are typical, e.g. space, desert or the arctic. Vast and imaginative array of settings, e.g. space, alternate universes, hallucinatory landscapes, the moon, Mars.

Measure	Matching keywords	Matches
Zeta	planet, space, surface, universe, ground, star, zone, outside, spatial	9/50
LLR	planet, space, universe	3/50
Welch	planet, ground, space, surface, outside, universe, zone	7/50
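
The match counts in the table above can be recomputed with a few lines of Python. This is only a minimal sketch: the word lists are copied from the table, the denominator of 50 is read off the "n/50" cells and assumed to be the number of top keywords inspected per measure, and the names `matches`, `TOP_N` and `match_ratio` are hypothetical.

```python
# Keywords that match the "Space / Setting" profile, copied from the table.
matches = {
    "Zeta": ["planet", "space", "surface", "universe", "ground",
             "star", "zone", "outside", "spatial"],
    "LLR": ["planet", "space", "universe"],
    "Welch": ["planet", "ground", "space", "surface", "outside",
              "universe", "zone"],
}

# Assumption: 50 top keywords were inspected per measure (the "/50" in the table).
TOP_N = 50


def match_ratio(words, top_n=TOP_N):
    """Share of the inspected keywords that match the profile category."""
    return len(set(words)) / top_n


ratios = {measure: match_ratio(words) for measure, words in matches.items()}

# Print the measures from most to fewest matches.
for measure, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{measure}: {len(matches[measure])}/{TOP_N} = {ratio:.2f}")
```

In this example, every keyword matched by LLR and by Welch is also matched by Zeta, which additionally contributes "star" and "spatial".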

Introduction

Thanks

Overview

What is keyness all about?

Some recent findings in my work in CLS

Sentence length in English-language novels

Topics in Arthur Conan Doyle

Word frequency in Maurice Leblanc

Definitions: Keyness / Distinctiveness

Traditional definition of keyness

What is Distinctiveness? (Schröter et al. 2021)

Various keyness measures (Du et al. 2022)
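
For orientation, two of the measures compared in this talk (LLR and Welch) can be sketched in plain Python. This is a hedged illustration, not the implementation used in Du et al. 2022: the log-likelihood ratio is written in its common two-term form over raw counts, Welch's t is computed over per-segment relative frequencies, and all function names and toy numbers are assumptions made for the example.

```python
import math
import statistics


def log_likelihood_ratio(freq_target, freq_comp, size_target, size_comp):
    """Two-term log-likelihood ratio (G2) for one word.

    freq_*: raw count of the word in each corpus; size_*: corpus size in tokens.
    """
    total_freq = freq_target + freq_comp
    total_size = size_target + size_comp
    # Expected counts if the word were equally frequent in both corpora.
    expected_target = size_target * total_freq / total_size
    expected_comp = size_comp * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_target, expected_target),
                               (freq_comp, expected_comp)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def welch_t(rel_freqs_target, rel_freqs_comp):
    """Welch's t statistic on per-segment relative frequencies of one word."""
    mean_t = statistics.mean(rel_freqs_target)
    mean_c = statistics.mean(rel_freqs_comp)
    var_t = statistics.variance(rel_freqs_target)  # sample variance
    var_c = statistics.variance(rel_freqs_comp)
    standard_error = math.sqrt(var_t / len(rel_freqs_target)
                               + var_c / len(rel_freqs_comp))
    return (mean_t - mean_c) / standard_error


# Toy numbers, invented for illustration only.
g2 = log_likelihood_ratio(30, 10, 10_000, 10_000)
t = welch_t([0.004, 0.006, 0.005], [0.001, 0.002, 0.003])
print(f"G2 = {g2:.2f}, Welch t = {t:.2f}")
```

The sketch makes one difference visible: the log-likelihood ratio only sees pooled counts per corpus, whereas Welch's t takes the variation across segments into account.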

History: Keyness in CLS

Burrows’ Zeta in Authorship Attribution
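
As a reminder of how Zeta works, here is a minimal sketch. It assumes the dispersion-based variant in which a word's score is the share of target segments containing the word minus the share of comparison segments containing it, so that scores range from -1 to 1; other formulations shift or rescale this value. The function names and the toy segments are invented for illustration.

```python
def document_proportion(word, segments):
    """Share of segments (each a set of word types) that contain the word."""
    return sum(1 for segment in segments if word in segment) / len(segments)


def zeta(word, target_segments, comparison_segments):
    """Zeta as a difference of document proportions (assumed variant)."""
    return (document_proportion(word, target_segments)
            - document_proportion(word, comparison_segments))


# Toy corpora: each segment is reduced to the set of word types it contains.
target = [
    {"planet", "space", "ship"},
    {"planet", "star"},
    {"space", "crew"},
    {"planet", "door"},
]
comparison = [
    {"door", "street"},
    {"street", "police"},
    {"door", "planet"},
    {"house", "door"},
]

vocabulary = sorted(set().union(*target, *comparison))
scores = {word: zeta(word, target, comparison) for word in vocabulary}

# Words with the highest scores are distinctive of the target corpus,
# words with the lowest scores are distinctive of the comparison corpus.
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word:8s} {score:+.2f}")
```

Because only presence or absence per segment counts, a word that occurs very often in a single segment does not dominate the ranking.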

Zeta for Authorship Attribution: Shakespeare (Craig and Kinney 2009)

Further uses and discussion of Keyness

Keyness for gender (Weidman and O’Sullivan 2018)

Keyness for Genre (Schöch 2018)

Zeta and Company / Beyond words

Quantitative Evaluation: Classification Task

Data: Corpus of French contemporary fiction

Evaluation Task: Genre Classification

Results #1

Results #2

Qualitative Evaluation: Subgenre Profiles

Why qualitative evaluation?

Our approach: match keywords with subgenre profiles

Example: ‘setting’ in science fiction

Results: Interpretability of measures

Conclusion: Findings and Outlook

What have we found out so far?

What are the next steps?

Thank you for your kind attention!

References

Bonus slides

Measures with references (Du, Dudar, and Schöch 2022)

All corpora (Du, Dudar, and Schöch 2022)

Correlation between measures (Du, Dudar, and Schöch 2022)

Keyness in stylo: genre (A.C. Doyle)