Lowering the number of topics and iterations in MALLET can make the words in each topic broader and more general, and therefore much harder to categorize. We found that using a higher number of topics made the terms in each topic less vague. Running more iterations also removed some of the ambiguity we noticed while looking through the posts, and made it much easier to see how the words in each group related to one another and to the Holmes stories themselves.
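For anyone who wants to try the same comparison, here is a minimal sketch of how the runs could be scripted. It only builds the command lines and does not run MALLET. The flags are standard `train-topics` options, but the input file name `holmes.mallet`, the `bin/mallet` path, and the output file names are assumptions for illustration, not the files we actually used.

```python
import shlex

# (number of topics, number of iterations) pairs reported in this post.
SETTINGS = [(50, 2000), (70, 1000), (100, 1000), (100, 5000)]


def build_train_command(input_file, num_topics, num_iterations, mallet_bin="bin/mallet"):
    """Return the argument list for one `mallet train-topics` run.

    The input file, MALLET path, and output names are hypothetical placeholders.
    """
    tag = f"{num_topics}t_{num_iterations}i"
    return [
        mallet_bin, "train-topics",
        "--input", input_file,
        "--num-topics", str(num_topics),
        "--num-iterations", str(num_iterations),
        "--output-topic-keys", f"keys_{tag}.txt",
        "--output-doc-topics", f"composition_{tag}.txt",
    ]


# Print one shell-ready command per setting so the runs can be compared side by side.
for topics, iterations in SETTINGS:
    print(shlex.join(build_train_command("holmes.mallet", topics, iterations)))
```

Giving each run its own output files makes it easier to put the topic keys next to each other and see how the word lists change with the settings.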
Three topics we had in common were:
1. Murder:
Lauren: found dead man body blood blow struck knife lay stick head weapon finally wound unfortunate bullet handle lying acted fainted (70 topics, 1000 iterations)
Caity: found man body blood dead knife lay stick blow head carried weapon heavy finally unfortunate neck wound lying drawn struck (100 topics, 1000 iterations)
2. Time/Time Measures:
Lauren: hour half past clock time cab ten waiting quarter work wait late minutes back drive catch eleven immediately presently church (70 topics, 1000 iterations)
Caity: time years week ago year country months days age twenty (50 topics, 2000 iterations)
3. Writing:
Lauren: paper note letter read handed wrote written writing sheet write book post page began pen pencil slip ran printed torn (100 topics, 5000 iterations)
Caity: paper note read letter table book papers pocket letters written (50 topics, 2000 iterations)
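One rough way to check how closely our versions of each topic match is to count the words they share. The sketch below uses the word lists exactly as given above and computes the Jaccard overlap (shared words divided by all distinct words). The choice of Jaccard and the names `LAUREN`, `CAITY`, and `compare` are our own illustration, not something MALLET produces.

```python
# Topic key words copied from the lists above.
LAUREN = {
    "Murder": "found dead man body blood blow struck knife lay stick head weapon "
              "finally wound unfortunate bullet handle lying acted fainted",
    "Time/Time Measures": "hour half past clock time cab ten waiting quarter work wait "
                          "late minutes back drive catch eleven immediately presently church",
    "Writing": "paper note letter read handed wrote written writing sheet write book "
               "post page began pen pencil slip ran printed torn",
}
CAITY = {
    "Murder": "found man body blood dead knife lay stick blow head carried weapon "
              "heavy finally unfortunate neck wound lying drawn struck",
    "Time/Time Measures": "time years week ago year country months days age twenty",
    "Writing": "paper note read letter table book papers pocket letters written",
}


def compare(words_a, words_b):
    """Return (shared words, Jaccard overlap) for two space-separated word lists."""
    set_a, set_b = set(words_a.split()), set(words_b.split())
    shared = set_a & set_b
    return shared, len(shared) / len(set_a | set_b)


for topic in LAUREN:
    shared, score = compare(LAUREN[topic], CAITY[topic])
    print(f"{topic}: {len(shared)} shared words, Jaccard = {score:.2f}")
    print("  shared:", " ".join(sorted(shared)))
```

On these lists the two Murder topics overlap heavily, while the two Time topics share only the word "time", which suggests they capture different senses of time (clock time versus spans of years).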
The Murder topic appears most in “The Abbey Grange” and least in “The Red-Headed League.” The Time/Time Measures topic appears most in “The Five Orange Pips” and least in “The Illustrious Client.” The Writing topic also appears most in “The Five Orange Pips” and least in “A Scandal in Bohemia.”
These are two questions we posed about the data:
1. Is this information important or useful to historians who are studying the time period in which the Sherlock Holmes stories were written?
2. How does the changing context of these stories change our interpretations of the data?