Lauren Gao’s Word Cloud Project: The Adventure of Silver Blaze

Word clouds are an interesting way to get broad insight into the trends of a text or the data that could arise from it. Usually applied to interviews, documents, or other forms of text, the diagram showcases the most frequently recurring words, sized from the largest (most frequent) to the smallest (least frequent). How accurately word clouds, or tag clouds, truly represent the data remains to be seen. To see this for myself, I put Arthur Conan Doyle’s “The Adventure of Silver Blaze” through two word cloud programs, Voyant and Tagxedo. Before I lay out the general premise of the story, it might be more interesting to see how good a job word clouds do of “giving the whole picture” of the story.

Here’s the first, from Voyant:

Screenshot (26)

“There is 1 document in this corpus with a total of 9,626 words and 1,959 unique words.

Most frequent words in the corpus: holmes (51), horse (46), colonel (40), straker (33), stables (20)”

I chose to add the word “said” to the stop word list, as I felt that it would not add anything of significant value to the overall representation of themes in the story.
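To make the idea of a stop word list concrete, here is a minimal sketch in Python of how word frequencies like the ones above could be counted. This is not Voyant’s actual implementation; the function name `top_words`, the tiny stop word list, and the sample sentence are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop word list for illustration only; a real tool's list is much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "that", "it", "was", "i", "said"}

def top_words(text, stop_words=STOP_WORDS, n=5):
    """Return the n most frequent words in text, ignoring case and stop words."""
    # Lowercase the text and pull out runs of letters (and apostrophes) as words.
    words = re.findall(r"[a-z']+", text.lower())
    # Count every word that is not on the stop word list.
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)

# Made-up sample sentence, not a quotation from the story.
sample = "Holmes said the horse was in the stables. The horse, said Holmes, was gone."
print(top_words(sample, n=3))
```

Because “said” is on the stop word list, it never shows up in the counts, no matter how often it appears. A word cloud then simply draws each remaining word at a size proportional to its count.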

Looking at this word cloud, I could imagine a person using it as a summary of “The Adventure of Silver Blaze” and coming up with a rough, though fragmented, generalization of the story. Of course, “holmes” is the biggest word, which most likely signals that the reader is dealing with a Sherlock Holmes story. Next, the reader could surmise that Silver Blaze is a horse, or has something to do with horses, just from the size of the word “horse” and the other horse-related words around it, such as “stable”.

Here is the other word cloud, from Tagxedo:

Screenshot (25)

With Tagxedo, I was able to figure out how to choose different layouts for the word cloud to imply more overtly that “The Adventure of Silver Blaze” is, in fact, about a racehorse. The highlighted words, however, remain fairly consistent with Voyant’s word cloud: Holmes, horse, stables, Colonel, and Straker all come up as “important” entities in the story. One thing I could not do in Tagxedo, however, was take an in-depth look at the actual statistical data that went into forming the cloud. For example,

Screenshot (26)

clicking on a word in Voyant gives you a graphic showing where and how often the word comes up in the text, as well as the phrasing around the word in question.


 

To clear up what “The Adventure of Silver Blaze” is actually about, an adequate summary can be found here.

Much as the mystery most likely appeared to Holmes and Watson (and everyone else around them), the word cloud shows that these names all have to do with the story, but it is unclear about how the names and objects relate to each other. A person could incorrectly deduce that the Colonel was the culprit of the crime, when in fact it was the horse that killed John Straker.