Topic Modeling

Topics from 50 topics and 1000 iterations:

1) Money: money business hundred men asked company american pounds answered worth thousand red gold paid city pay headed answer fifty price

2) Murder: found dead man death crime body evidence terrible unfortunate attempt violence words occurred instantly action save murderer committed escape murdered

3) Sherlock’s study: table papers small box examined study showed left carefully marks signs examination wood books round carpet mark traces fire examine

4) Women: woman wife lady husband love married child girl life mother loved daughter maid beautiful ferguson mrs mistress ways character women

Topics from 25 topics and 2500 iterations:

5) Crime: man police found inspector dead crime death body evidence reason murder blood night shot person

6) Letter/message: paper table hand papers note read letter pocket box book put letters drew short handed

7) Sherlock: holmes hand chair sat back looked fire air sherlock visitor pipe rose laid pray companion

Topics from 50 topics and 2000 iterations:

8) Journey/travelling: place train station carriage found town return started long line reached drove cross miles journey

9) Appearance: face eyes man dark tall looked features expression thin lips pale mouth figure companion appearance

10) Case: case interest points curious remarkable fact attention singular account investigation matter clue events incident problem
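For anyone wondering what the "topics" and "iterations" settings control, here is a rough sketch of a toy topic model (LDA fitted by collapsed Gibbs sampling). The post does not say which tool produced the lists above, so this is not that tool's code. The function names, the alpha and beta values, and the four tiny sample documents are all assumptions made up for illustration.

```python
import random
from collections import Counter

def lda_gibbs(docs, num_topics, iterations, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per topic in each document
    topic_word = [Counter() for _ in range(num_topics)]  # word counts inside each topic
    topic_total = [0] * num_topics                       # total tokens assigned to each topic
    assignments = []
    # Start by giving every token a random topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    # One iteration = one pass that resamples the topic of every token.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Weight of topic k: how much this document uses k,
                # times how strongly k is associated with this word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word

def top_words(topic_word, n=5):
    """List up to n of the most frequent words in each topic."""
    return [[w for w, c in counts.most_common() if c > 0][:n] for counts in topic_word]

# Made-up stand-in documents, not text from the stories.
docs = [
    "train station carriage journey train".split(),
    "murder body death crime murder".split(),
    "station carriage train drove".split(),
    "crime death body evidence".split(),
]
for number, words in enumerate(top_words(lda_gibbs(docs, num_topics=2, iterations=200)), 1):
    print(number, " ".join(words))
```

The number of topics fixes how many word lists come out, and more iterations give the sampler more passes to settle, which is one reason the 25-topic and 50-topic runs above group the same vocabulary differently.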

Topic Modeling Sherlock Holmes Stories

All categories chosen from 50 topics with 1000 iterations:

1. morning night back clock waiting past early morrow quarter arrived
Title: time

2. paper note read letter table book handed letters written wrote
Title: writing

3. face eyes looked thin features lips figure tall dark expression
Title: physical features

4. woman lady wife husband life love girl child married maid
Title: household

5. black hair red hat heavy round broad centre coat dress
Title: clothing/accessories descriptions

6. found man dead lay body blood death knife lying round
Title: death/crime

7. give matter idea reason question impossible occurred absolutely explanation true
Title: interrogation/crime solving

8. face turned back instant hand sprang forward moment side head
Title: physical reactions

9. station train road carriage passed side drive reached drove hour
Title: transportation

10. light suddenly dark long caught sat lamp spoke silence silent
Title: darkness/mystery

The Adventure of the Dancing Men

For a change of pace, I read the Sherlock Holmes story “The Adventure of the Dancing Men,” from Sir Arthur Conan Doyle’s collection The Return of Sherlock Holmes. I used the Voyant word cloud tool to visualize the story. The first version featured the words “holmes,” “mr,” and “mrs,” which I deemed unimportant to the story. I realize now that, though “holmes” is a bit obvious and takes up a lot of room, “mr” and “mrs” indicate relationships that may be important to the overall story. Please bear this in mind as I continue without “mr” and “mrs”; I took this screenshot on a different computer that allows screenshots, so this is the visualization I will be using:

Screenshot (1)

The loss of “mr” and “mrs” may not be so terrible after all. In the cloud, these “words” are not attached to the names they belong to, which I believe is a fault of the word cloud. The character names are scattered throughout the visualization, so it is impossible to determine which first name goes with which last name, or whether the names are even related. However, one can see that “husband” and “wife” are featured in the word cloud, so a marriage is implied. “Norfolk” also shows the setting of the story. While some related words point to a plot line (for example, “bullet,” “shot,” and “fired” suggest that a character has been or will be shot in the story), other related words like “man,” “men,” and “face” fail to create an image of just what the “Dancing Men” part of the title means. In the story itself, the dancing men are drawn stick figures used as a code to send messages to one of the characters.
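One way to keep a title attached to its name is to count pairs of neighboring words instead of single words. The following is only a minimal sketch: the function name is hypothetical and the sample sentences are invented stand-ins, not quotations from the story.

```python
import re
from collections import Counter

def honorific_pairs(text, honorifics=("mr", "mrs")):
    """Count each honorific together with the word that follows it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pairs = Counter()
    # zip(tokens, tokens[1:]) walks over every pair of adjacent words.
    for first, second in zip(tokens, tokens[1:]):
        if first in honorifics:
            pairs[(first, second)] += 1
    return pairs

# Invented sample text for illustration only.
sample = (
    "Mr. Hilton Cubitt wrote to Holmes. Mrs. Cubitt said nothing. "
    "Mr. Hilton Cubitt went home to Norfolk."
)
for (title, name), count in honorific_pairs(sample).most_common():
    print(title, name, count)
```

A cloud built from pairs like these would show “mr hilton” as one unit, which is exactly the link a single-word cloud throws away.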

The word cloud has the potential to give the viewer a lot of information about a document, but in some cases it may fall short of its purpose. The words used in this word cloud do not truly represent what occurs in the story. Looking at the picture, I can see that there is a husband and wife, a few named characters, a setting, a letter, and that someone gets shot. This does not answer the question of who the dancing men are and, since the story is named after them, that is a crucial piece of information left out.

Though I like the aesthetics and intention of the word cloud, I can understand why some people, as Jacob Harris seems to be in the “Harmful” article we read, would be opposed to its existence. A narrative may be impossible to find in a cluster of frequently used words with no specific meaning; therefore, the word cloud may not properly convey the meaning of the document it represents.

A Case of Identity Word Cloud

I used the visualization tool Voyant to create a word cloud for the Sherlock Holmes short story “A Case of Identity.”

word cloud

 

Since it is known that the word cloud is a visualization of a Sherlock Holmes story, I added “Holmes” to the list of stop words, as well as “said.” These words, though used the most often, were irrelevant to the real analysis of the word cloud.
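Behind the scenes, a stop word list is just a set of words to skip while counting. Here is a small sketch of that idea. It is not Voyant's own code; the tiny base list, the function name, and the sample sentence are all made up for illustration.

```python
import re
from collections import Counter

# A deliberately tiny stand-in for a real stop word list.
BASE_STOP_WORDS = {"the", "a", "and", "of", "to", "in", "was", "is", "it"}

def top_terms(text, extra_stop_words=(), n=10):
    """Return the n most frequent words, skipping base and extra stop words."""
    stop = BASE_STOP_WORDS | {w.lower() for w in extra_stop_words}
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop).most_common(n)

# Invented sample sentence, not a quotation from the story.
sample_text = "Holmes said the case was a little problem. Holmes said it was a little matter."
print(top_terms(sample_text))                                       # "holmes" and "said" crowd the top
print(top_terms(sample_text, extra_stop_words=["Holmes", "said"]))  # other words move up
```

Adding a word to the extra list changes only what is displayed, not the text itself, so it is easy to try several lists and compare the results.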

Though little can be told about the plot of “A Case of Identity” from this visualization alone, it helps point out whom the story mainly revolves around. The words “Hosmer,” “Windibank,” and “Angel” appear 23, 20, and 19 times respectively throughout the text. Readers could infer that these are the main characters and, upon reading the full text, would discover that “Hosmer Angel” and “Windibank” are actually the same person.

After “Holmes,” which appeared 28 times in the text and was deleted from the word cloud, the next most frequent word was “little.” This was surprising because, having read the text before creating this visualization, I thought the word “little” had nothing at all to do with the plot of the story. Upon further analysis, however, it can be seen that “little,” though not dealing much with the plot, is always used for a particular reason. Oftentimes, it is used to describe Miss Mary Sutherland. Since she is a woman, she is portrayed as being more dainty, and therefore things about her are little, from her “little problem” to her “little handkerchief.” Watson also uses this term to describe Miss Sutherland’s appearance when Holmes asks him to, pointing out the “little black jet ornaments” on her jacket and the “little purple plush” on her dress. Holmes even goes so far as to comment on her “little income.” This use of the word “little” to describe Miss Mary Sutherland can be interpreted as a way to show readers that, though it is Miss Sutherland’s case that needs solving, Sherlock sees her as just another woman with a “little” and “trite” problem, and therefore readers should see her this way as well.
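Checking how a word like “little” is used means looking at the words around each occurrence, sometimes called a keyword-in-context listing. A minimal sketch of that lookup follows; the function name is hypothetical, and the sample sentence is invented and only borrows the phrases quoted above.

```python
import re

def keyword_in_context(text, keyword, window=3):
    """Return (left, match, right) for each occurrence of keyword."""
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])    # up to `window` words before
            right = " ".join(tokens[i + 1:i + 1 + window])   # up to `window` words after
            hits.append((left, tok, right))
    return hits

# Invented sample sentence for illustration only.
sample_sentence = "She brought her little problem to Holmes and held a little handkerchief."
for left, word, right in keyword_in_context(sample_sentence, "little", window=2):
    print(f"{left} [{word}] {right}")
```

Reading the hits one after another makes it easier to notice when a word keeps attaching itself to one character.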

While the comments Sherlock sometimes makes can undoubtedly be viewed as sexist, I also think it’s important to look at the context of these stories. When Sir Arthur Conan Doyle wrote them, this generalization and view of women was the norm. It is only now, reading these stories in the 21st century, that we can point out what is wrong with some of the comments made. Back then, this kind of description of women was not seen as an issue. It’s interesting to wonder whether, if word clouds had been used long ago, the same amount of analysis would have been put into the word “little,” or into how women were depicted in these Sherlock Holmes adventure stories at all.

The Adventure of the Speckled Band

screen-shot-2014-09-21-at-2-09-57-pm

Generated by http://www.jasondavies.com/wordcloud/

“The Adventure of the Speckled Band” is my favorite Sherlock Holmes story of all time, and seeing its key words visualized in a word cloud helped me focus on some key points that Sir Arthur Conan Doyle was trying to make. Many words in this word cloud are very grim (e.g., “death”), and it is often easy to forget the grim nature of this story. Word clouds are an incredible tool for understanding the way a work is written.

The Adventure of the Creeping Man

word cloud sherlock

 

sherlock wordle

 

I wanted to experiment with both tools, so above you can see the Voyant and Wordle visuals for the Sherlock Holmes story “The Adventure of the Creeping Man.” Each tool has its pros and cons. With Voyant, I like that you can see all the statistics and data behind each word on the actual webpage, and that you can add more stop words to customize which words appear. I wish it gave you the option to change the color and font. Wordle is just about the opposite: you have the option to change colors and font (though it’s very limited), but it doesn’t show much data behind the words. I did figure out that if you right-click on a word, it gives you the option to delete that word; for example, I right-clicked on “Sherlock.” So for my visualization I removed “Sherlock” and “Watson” so they wouldn’t take away from other key words in the story.

The visualization each tool provides is helpful, but I find it really doesn’t offer much information about the plot of the story. If someone were to look at these word clouds without having read the story beforehand, they would only understand that the larger words are the ones that come up most often, and maybe gather some information about who the characters are and where the story takes place. I think Voyant is definitely more helpful and pays more attention to the details and statistics of the words. It is also helpful that you can edit the stop words and create your own stop word list. There is a lot to take in with Voyant, but it’s nice that everything is all in one place: the word cloud, the actual text of the story, and the graphs for each word. The fact that it shows the words in context within the text is also very helpful.

 

http://sherlockholmes_cases.tripod.com/creeping.htm