Topic Modeling

How does the number of topics affect the topics the tool gives you?

Changing the number of topics changes the granularity of the topics produced: fewer topics give broader themes, while more topics give narrower, more specific ones.

How does the number of iterations affect the topics the tool gives you?

In general, the higher the number of iterations, the higher the topic coherence, although the improvement tends to level off after a point.

What settings do you recommend for use with the Topic Modeling Tool?

10-20 topics

>100 iterations

Remove stopwords
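
To make these three settings concrete, here is a minimal sketch of a topic model in plain Python. It is not the Topic Modeling Tool's own code: it uses collapsed Gibbs sampling, a common way to fit LDA topic models, and whether that matches the tool's internals is an assumption. The names tokenize, fit_lda and top_words, the tiny stopword list, the alpha and beta values, and the four example sentences are all made up for illustration.

```python
import random
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "was", "is", "it", "that"}


def tokenize(text, stopwords=STOPWORDS):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w.isalpha() and w not in stopwords]


def fit_lda(docs, num_topics=10, num_iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; returns (topic_word, doc_topic) counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                        # token count per topic
    doc_topic = [[0] * num_topics for _ in docs]          # topic counts per document
    assignments = []

    # Start by giving every token a random topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            topic_word[z][w] += 1
            topic_total[z] += 1
            doc_topic[d][z] += 1
        assignments.append(z_doc)

    # Each iteration re-samples the topic of every token once.
    for _ in range(num_iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                # Take the token out of the counts before re-sampling it.
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                doc_topic[d][z] -= 1
                # A topic is likelier if this document and this word already use it.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                topic_word[z][w] += 1
                topic_total[z] += 1
                doc_topic[d][z] += 1
    return topic_word, doc_topic


def top_words(topic_word, n=5):
    """Return the n most frequent words of each topic (skipping zero counts)."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Made-up example sentences, not quotations from the stories.
raw_docs = [
    "The police found the murderer at the scene of the crime.",
    "A simple theory and a simple solution to the mystery.",
    "The police arrest followed the murder.",
    "The facts of the case point to an obvious solution.",
]
docs = [tokenize(text) for text in raw_docs]
topic_word, doc_topic = fit_lda(docs, num_topics=2, num_iterations=100)
for k, words in enumerate(top_words(topic_word)):
    print(k, words)
```

Changing num_topics changes how many word lists come back, and changing num_iterations changes how many times each word's topic is reconsidered. The doc_topic counts show how much each document uses each topic, which is the kind of information needed to see which story uses a topic the most.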

Solving: case point facts points fact obvious interest explanation investigation mystery simple confess theory present admit solution formed true problem connection

  1. What story uses that topic the most? The Dancing Men
  2. Which stories use it less? The Disappearance of Lady Frances Carfax
  3. What is the most common word from this topic in the story?
  4. Why are some words repeated?

Crime: man dead poor strong body death life brought terrible dangerous sort words creature real deep notice wild turn devil lies

  1. What story uses that topic the most? The Veiled Lodger
  2. Which stories use it less? The Reigate Squires
  3. Why do you think this topic is used more in this story?
  4. What is the relation between the words in the topic and the stories that use it?

Murder Case: crime police found murder death night scene arrest reason attention remained trace instantly murderer attempt suspicion discovered charge caused search

  1. What story uses that topic the most? The Second Stain
  2. Which stories use it less? The Devil’s Foot
  3. Why are there some words that are not related to the topic?
  4. How does the topic modeling tool help us understand the story?

By Alessandra Oestreich and Isabelle Berta