Topic Modeling with Sherlock Holmes: Analysis

When I was first introduced to this project, I was not sure how to go about it. Although it sounds like an interesting and cool representation of a group of works, it is a bit confusing. Figuring out how to work the program also got a bit tricky. The process is relatively simple, but each time you change something you have to remember to adjust where you are saving, as well as other settings. Overall, though, it was an interesting experience and an intriguing idea.

For my word groups, I found it quite fun to come up with topic titles. The word groups themselves were thought-provoking, especially when trying to figure out which story some of the words could have come from. Part of the program showed where the words appeared most often in the stories, so it was fun to see whether our guesses were right and whether we had read the story before.

The easiest word groups to name were the ones whose words were similar and consistent in subject matter. For example, one of the easiest to identify was the articles of clothing/garments word group. Many of its words (black red glass large coat dress centre top brown observe glasses faced dressed boots colour broad impression pair hat mark) clearly pointed to clothing, so it was simple to come up with a topic title. Others, such as death (found dead body lay man blood death blow knife unfortunate terrible person lying finally cut weapon evidence constable remained wound), body parts and expressions (eyes face hands voice cried lips shoulders sat turned air amazement sprang companion stared sunk raised sank eager instant shrugged cheeks staring astonishment angry breast), and house (house night room master bell attention bed asked servants alarm servant ring remained phelps walked butler drawing kitchen finally stay save rope thief scent coffee state joseph rang suspect smell dragged cover cellar burglar ill harrison instantly sounds scene french bound county form rest wished partly pull chamber mr ventilator), were easy to name because the words had obvious relations to one another and similar subject matter.

Some topic titles were harder to come up with. Some groups had no clear subject matter, or they contained smaller groups within the larger one, making it hard to pin down a clear overall theme. For example, crime investigation (holmes mr inspector lestrade sherlock case yard detective opinion scotland arrest evidence prisoner official ready practical force quietly bank gregson remarked oldacre mcfarlane final joke absence credit finished rubbed warrant pleasure hands gentlemen norwood fail express suspicious bound wiser chuckled profession afford lucky attempted finds jonas rolled sense martin bradstreet) and evidence (small box examined large papers floor carefully examination inside cut top square iron carpet showed wooden furnished evidently lower contents central removed mantelpiece careful examining) were difficult to name because many of the words did not connect with one another. Some words were similar, while others were completely different. With groups such as these, you had to look closely at all the words and try to come up with a general title that would bring them all under one subject.

– Allyson Macci