Topic Modeling Part 2

The ten topics I initially chose were: crime, case solving, observation, economy, body, morning/night, appearance, passing of time, written documents, and setting.

sh topic model chart 1

First, I compared the topics of crime and case solving. The crime topic appeared to rise dramatically from 1894 to 1904. Looking back at the topic index, I found that crime was most prevalent in The Adventure of the Second Stain, which was published in late 1904. Indeed, a decade after the original Adventures appeared in The Strand, a further series of stories was published as The Return of Sherlock Holmes. The prevalence of both crime and case solving varied throughout 1904, and although the two lines crossed above and below each other, they remained close until 1927.
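For anyone who wants to repeat this kind of lookup by hand, here is a minimal sketch of it in Python. It assumes the topic index can be exported as rows of (story, year, topic, weight). That row layout, the function names, and every number below are made up for illustration; they are not values from the actual model or the output format of any particular tool.

```python
from collections import defaultdict

# Hypothetical rows: (story title, publication year, topic, weight).
# The titles and weights are invented placeholders, not real model output.
rows = [
    ("Story A", 1893, "crime", 0.10),
    ("Story B", 1904, "crime", 0.40),
    ("Story C", 1904, "crime", 0.20),
    ("Story A", 1893, "setting", 0.30),
    ("Story B", 1904, "setting", 0.05),
    ("Story C", 1904, "setting", 0.15),
]


def top_story(rows, topic):
    """Return the title of the story with the highest weight for a topic."""
    candidates = [r for r in rows if r[2] == topic]
    return max(candidates, key=lambda r: r[3])[0]


def yearly_average(rows, topic):
    """Average a topic's weight across all stories published in each year."""
    by_year = defaultdict(list)
    for _title, year, row_topic, weight in rows:
        if row_topic == topic:
            by_year[year].append(weight)
    return {year: sum(ws) / len(ws) for year, ws in by_year.items()}


def peak_year(rows, topic):
    """Return the year in which a topic's average weight is highest."""
    averages = yearly_average(rows, topic)
    return max(averages, key=averages.get)


print(top_story(rows, "crime"), peak_year(rows, "crime"))
```

Averaging by year is only one way to draw the line on a chart like the ones here; summing the weights, or plotting each story separately, would give a different shape.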

sh topic model chart 2

Next, I compared observation, body, and appearance. As in the previous graph, the body topic saw a dramatic spike in 1904. Its greatest prevalence was in that year’s Adventure of Charles Augustus Milverton. Overall, however, appearance spent the most time above the other two topics. That topic combined the adjectives of “observation” with the body parts of “body” to describe the way those bodies looked.

sh topic model chart 3

I examined the topic of economy next and found that it peaked sharply in 1893, when The Adventure of the Stock-Broker’s Clerk was published. Otherwise, mentions of the economy stayed fairly moderate compared to the previous topics, and I couldn’t find any significant real-life economic event that would have triggered a spike.

sh topic model chart 4

Written documents shared the same 1893 and 1903 spikes as several other topics and, much like economy, remained relatively modest throughout the stories until 1927. Its highest prevalence was in The Adventure of the Dancing Men from 1903.

sh topic model chart 5

When comparing the topics of passing of time and morning/night, I found that the appearance of morning- or night-related words peaked earlier than most of the other topics, in 1892. According to the topic index, the peak was in The Adventure of the Yellow Face. The passing of time followed a reasonably similar pattern.

sh topic model chart 6

I found that setting peaked several times over the years, the greatest spike being in December of 1893, when The Adventure of the Final Problem was published. It spiked again in 1904, 1908, 1911, 1921, 1924, 1925, and 1927. Unlike several of the other topics I looked at, setting actually had variety in the top-ranked documents in its topic index; many of the other topics had the same story or stories listed multiple times in the first ten. This isn’t too surprising, considering the importance of setting in the telling of any story.

In general, I unfortunately wasn’t able to pinpoint many specific historical events that connected to the topics I’d selected, which was pretty disappointing.

(… Although, according to one of the timelines, elementary education was made free in 1891. Heh.)