Working with Word Clouds: Most Prominent Words in The Blue Carbuncle


I created a word cloud of Arthur Conan Doyle’s The Blue Carbuncle using Voyant Tools. The site is easy to use and navigate; however, I do wish that Voyant offered the option to change the colors, font, and orientation of the text as Wordle does.

In order to get an accurate depiction of The Blue Carbuncle, I removed stop words, mostly prepositions and conjunctions. Even after this, though, Holmes was still the most prominent word in the cloud, appearing a total of 39 times. Holmes is the detective, but his name consumed so much space in the word cloud that it was difficult to grasp any of the main points of The Blue Carbuncle. I added his name to the list of stop words, and the word hat then became the most salient in the cloud, appearing 30 times! This was not shocking, considering that the hat helps Sherlock determine who ended up with the goose. Not surprisingly, the words goose and bird both appear over 20 times, and stone appears 21 times. All of these words appear frequently because, in the story, Holmes spends plenty of time trying to figure out how the stone ended up inside the goose.
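For anyone curious what the stop-word step amounts to, here is a minimal Python sketch of counting words while skipping a stop list. It is not how Voyant works internally; the function name `word_frequencies`, the sample sentence, and the stop list are all made up for illustration, so the counts below have nothing to do with the real story.

```python
import re
from collections import Counter


def word_frequencies(text, stop_words):
    """Count lowercase word tokens, skipping anything in stop_words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)


# Toy sample, NOT the text of the story; the counts are illustrative only.
sample = "Holmes looked at the hat. The hat was old, and Holmes said the goose was gone."
stop_words = {"the", "at", "was", "and", "said"}

counts = word_frequencies(sample, stop_words)
print(counts.most_common(3))

# Adding a dominant name to the stop list lets other words surface.
counts_without_holmes = word_frequencies(sample, stop_words | {"holmes"})
print(counts_without_holmes.most_common(3))
```

The second call mirrors what I did by hand: once the detective's name is filtered out, the next most frequent word moves to the top.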

What I liked most about my word cloud is that the characters’ names were quite visible. After I added all of the prepositions and conjunctions to the list of stop words, the names Baker (18 times) and Peterson (11 times) became more prominent. This is important because it reminds us of Mr. Baker, who purchased the goose with the stone, and Mr. Peterson, who presented the case of the blue carbuncle to Holmes.

Overall, Voyant Tools made it simple to create a word cloud, but not one that is unique or visually appealing. In line with the Nieman Lab article we read this week, I do not think that word clouds have much purpose unless you are trying only to observe word usage in a text. My word cloud displays the main characters and ideas of The Blue Carbuncle, but you can’t make sense of what actually went on in the story unless you read it.

Scandal in Bohemia Word Cloud – Where’s Watson?

Using the Voyant tool, I made a word cloud of Sherlock Holmes’s adventure, “Scandal in Bohemia.” It is unfortunate that the tool does not use color to distinguish word frequency or other significant word trends, because that would have allowed for some interesting insights. In any case, the word size was telling enough to extract some Sherlockian observations about the context of the story. In my generated word cloud (http://voyant-tools.org/tool/Cirrus/?corpus=1411164560979.4726&query=&stopList=1411165833459tu&docIndex=0&docId=d1411099477875.99b8b096-b231-7094-d527-8b986fefb364), the elements of the story run from most to least significant as the words go from larger to smaller. ‘Holmes’ appears 47 times, ‘photograph’ appears 21 times, ‘king’ (17), ‘majesty’ (16), ‘irene’ and ‘adler’ (13), and ‘woman’ (12). It’s no surprise that these four aspects of the story surface most frequently, and the photograph can be considered a tertiary character of the story because it is crucial to the reveal and the idea of ‘the woman.’

What I took most note of, however, was the lack of Watson’s name. Holmes is obviously the largest word, front and center, but Watson is notably smaller and on the outskirts of the cloud. His name appears 6 times, about half as often as Irene Adler’s. This does make sense, because he is the narrator and is therefore mentioned primarily in the first person, but I expected to see more of his name when Sherlock addresses him in conversation. One conclusion to draw from this ‘where’s Watson’ question is that it is a subtle show of Sherlock’s narcissism. Holmes’s heightened perception and memory are arguably the biggest parts of the story, but his failure to address Watson, our narrator and his right-hand man, by name lets us notice Sherlock’s ego from a quantitative perspective.

TheWoman_WordFrequency ScandalBohemia_WordCloud2

Blog 1: Word Cloud

Word Clouds are graphic visualizations of the most frequent words used in a text. This tool allows fresh interpretations to be made about any text. It provides a unique way of looking at a cluster of frequently used words that may elicit a different understanding of what is being presented.

I chose to closely read The Adventures of Sherlock Holmes: The Blue Carbuncle and create a Word Cloud to develop a new understanding and a fresh perspective on the story.  I used Voyant as a tool to generate a Word Cloud for the text.

Below is a Word Cloud for the entire story:
The Blue Carbuncle

After editing the stop words and removing irrelevant words, the most common words to appear in the text are: man, holmes, hat, goose, little, know, stone, bird, and geese (beginning with the most frequent). These words make sense considering that the premise of the story involves an investigation of the missing blue carbuncle found in a goose. However, I feel that this visualization, and my understanding of what is most important and valuable in the text, would benefit from excluding near-duplicates of “goose” such as “geese” and “bird.” I will edit the stop words to take away “geese” and “bird” and a few other less frequently occurring words that seem to be duplicates in one way or another, to see how it strengthens my observation. In the new Word Cloud, there is a stronger sampling of frequent and presumably important words. The most frequent words are: man, holmes, hat, goose, little, know, stone, just, sir, baker, and tell. These words are a little more precise and reveal a lot about the plot of the story.
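An alternative to hiding “geese” and “bird” with the stop list would be to fold them into “goose” so that their counts are combined instead of discarded. This is not something I did in Voyant; the sketch below only illustrates the idea in Python. The helper `merge_variants`, the mapping, and the numbers are all hypothetical and are not the frequencies Voyant reported.

```python
from collections import Counter


def merge_variants(counts, variant_map):
    """Return new counts where each variant is added to its canonical word."""
    merged = Counter()
    for word, n in counts.items():
        merged[variant_map.get(word, word)] += n
    return merged


# Illustrative counts only, not taken from Voyant.
counts = Counter({"goose": 5, "geese": 2, "bird": 3, "hat": 4})

# Assumption: treat "geese" and "bird" as duplicates of "goose".
variant_map = {"geese": "goose", "bird": "goose"}

merged = merge_variants(counts, variant_map)
print(merged.most_common())
```

Merging keeps the evidence that the goose matters a great deal, while stop-listing the variants simply makes them disappear, so the two choices would produce different clouds.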

wordcloud

A quick browse through the word trends shows where particular words appear most often in the text. The most frequent word, “man,” is scattered evenly throughout the text. This makes sense since it is such a generic word. The three words most common at the beginning of the text are hat, goose, and stone. The three words most common towards the end are holmes, little, and know. If I had no prior knowledge of the story, I would read this as a story that starts out confidently and ends up a mystery. The beginning is dominated by strong nouns that lay out the premise and identify the main points and symbols of the story; they highlight what is most important. As the story progresses, mysterious things happen and Holmes investigates the case. The words that appear more often towards the end are Holmes (noun), little (adjective), and know (verb). Holmes is solving the mystery of the Blue Carbuncle, so naturally one would expect his name to appear at the end, or resolution, of the story. Reading the words together as “Holmes knowing little” is another possible angle. Even though he does solve the case, most of the story is about clues and small bits of information that are used collectively to solve a mystery. These words and their placement in the text may be valuable in understanding the key points and themes of the story.
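The idea behind a word trend can be sketched in a few lines of Python: split the text into equal slices and count a word in each slice. I am assuming equal-sized segments here, which may not match how Voyant divides a document, and the function `word_trend` and the sample string are invented for illustration.

```python
import re


def word_trend(text, word, segments=4):
    """Count occurrences of word in each of `segments` equal slices of the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n = len(tokens)
    trend = []
    for k in range(segments):
        start = k * n // segments
        end = (k + 1) * n // segments
        trend.append(tokens[start:end].count(word))
    return trend


# Toy sample, NOT the story: object words early, "holmes" and "know" late.
sample = "hat goose stone hat goose hat holmes know little holmes know holmes"

print(word_trend(sample, "hat", segments=2))
print(word_trend(sample, "holmes", segments=2))
print(word_trend(sample, "hat", segments=3))
```

A word that is spread evenly, like “man” in my results, would give roughly the same number in every slice, while a word tied to one part of the plot would pile up at one end.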

Using the Word Cloud platform and Voyant tool to analyze a Sherlock Holmes story is a fun and interactive way to read and understand the text. Being able to control which words appear in the Word Cloud, viewing the word trends, and seeing the frequencies are all helpful and can be used to extract important symbols or themes in the text that might otherwise have gone unnoticed.

 

Erica Gedney

Welcome to Digital Tools for the 21st Century: Sherlock Holmes’s London (DHM 293)

Course Description:

Do you want to learn how to read 10,000 books at a time? Create maps of crimes in Sherlock Holmes’s London? This course provides an introduction to digital humanities (DH), the practice of using digital tools for scholarly purposes in all majors, including its different uses, methodologies, tools, and projects. You will learn different DH techniques, study existing DH projects, and try these techniques yourself in weekly labs. We’ll use DH techniques to examine Sherlock Holmes short stories alongside Victorian court records, coroners’ reports, and maps of crimes in London. While the in-class material will focus on 19th-century London, your final group projects can be more immediately applicable to your own major or academic interests. In lieu of taking exams and writing traditional papers, we will create digital exhibits, write blog posts, share our work through social media, and collaborate with students and scholars from around the world. All majors are welcome. Computer literacy is helpful, but no programming experience is required.

Student Learning Outcomes:

By the end of the semester, students should be able to:

  • Demonstrate an interdisciplinary understanding of 19th-century London
  • Identify, use, and discuss different DH methodologies and tools
  • Explain the pros and cons of the different methodologies and tools
  • Identify and explain key DH terms
  • Create projects using the tools covered in lab
  • Articulate what makes a DH project successful or unsuccessful
  • Use social media to engage with the larger DH community of scholars
  • Formulate research questions that can be answered with DH tools and methodologies, and propose tools and methodologies for answering research questions
  • Work collaboratively in groups to create a project that relates to their own research interests

Course Materials:

Each week will focus on a different DH methodology and will include a discussion-based class and a hands-on lab in which you will practice the tools we have been discussing. All readings with URLs can be found online and through the course website.