Word Cloud: A Case of Identity

For this project, I decided to make some word clouds based on the Sherlock Holmes story A Case of Identity. I thought that this project was a fun experience, and it’s definitely something that I’ll try using again. After listening to our guest speaker talk last class about graphic design and how to make our digital projects more visually pleasing, I was excited to jump into this project. I first decided to make a word cloud using Voyant (as shown below).

Screen shot 2015-03-02 at 10.27.04 PM

I was hoping that when I submitted my story, it would automatically turn out like this, but I had to make a few tweaks to the stop words before I got to this final result, which I think turned out pretty well. I think that it emphasizes many of the core themes and major words that you need to know in order to get at least some idea of what the story is about. I don’t think this type of chart in general is the easiest to understand, but it’s visually appealing with the fun colors and the simple font. I also really liked how user-friendly the site was, and it wasn’t hard to make edits to it.

Unfortunately, my ancient computer couldn’t handle any Java or Silverlight updates, so I wasn’t able to use Wordle or Tagxedo. I’m still kinda bummed because I looked at the word clouds that other people posted and they look really cool. But I was informed of another word cloud website called JasonDavies.com (as shown below).

Screen shot 2015-03-02 at 10.48.22 PM

This word cloud is a lot more cluttered than the one from Voyant, and I didn’t really enjoy using it. It only lets you change the font, which is a bit restricting when trying to make something look more visually appealing. And after using Voyant, it was a bit disappointing not being able to change anything other than the font. Most of the words are around the same size, so you can’t really tell which words are the most important. If someone who had never read the story before looked at this word cloud, I don’t think they would have a clue what the story was about.

Word Cloud Project: A Scandal in Bohemia

Screen Shot 2015-03-01 at 8.45.33 PM    Screen Shot 2015-03-02 at 2.15.57 PM

Word clouds, or tag clouds, are visual depictions of word occurrence that give greater prominence to words that appear more frequently in a piece of text. In other words, the larger a word is in the cloud, the more common it was in the source text.
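To make the "bigger means more frequent" idea concrete, here is a minimal Python sketch. It is only an illustration: the function names, the sample sentence, and the linear scaling between a smallest and largest font size are my own assumptions, and tools like Voyant or Wordle may well size their words differently.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count how often each lowercase word appears in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

def font_sizes(counts, min_size=10, max_size=60):
    """Scale each word's count linearly to a font size (assumed scheme)."""
    top = max(counts.values())
    low = min(counts.values())
    spread = (top - low) or 1  # avoid dividing by zero when all counts match
    return {
        word: min_size + (count - low) * (max_size - min_size) / spread
        for word, count in counts.items()
    }

# Made-up sample sentence, not taken from the story.
sample = "the photograph the king the photograph holmes"
counts = word_frequencies(sample)
sizes = font_sizes(counts)
```

In this sample, the most common word gets the largest font size and the words that appear only once get the smallest, which is all a word cloud really encodes.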

For the word cloud project, I chose the Voyant word cloud generator and the Sherlock Holmes story called A Scandal in Bohemia. The first word cloud that I created shows what was happening at the beginning of the story, when Sherlock Holmes receives a letter in the mail from the King of Bohemia, who is asking Sherlock Holmes to do him a favor. During this part of the story, the words that show up the most often are paper, german, bohemia, stands, note, and peculiar. These key words help the reader understand how the story got its title, because the word “peculiar” suggests that something unusual, like a scandal, is going on, and the word “bohemia” names the location that is involved in the peculiarity. Moreover, based on this particular word cloud, I learned that something peculiar is happening and that an individual from Bohemia is somehow involved, based on a note that Sherlock Holmes has received.

The second word cloud that I developed shows what happened at the end of the story, when Sherlock Holmes receives a letter from Irene Adler regarding the much-wanted photograph. During this part of the story, some of the words that show up most often are Sherlock Holmes, photograph, know, dear, and really. These fundamental words help the reader understand that Sherlock Holmes did in fact find the photograph but failed to realize how stealthy Irene Adler really is.

Even though using a word cloud can be an interesting and creative way to portray information, there are also some negative aspects of this tool. According to the Better Evaluation website, one of the pros of this tool is that there are various word and tag cloud generators freely available on the internet, and creating a cloud is really straightforward. However, based on the Nieman Lab article, word clouds can be considered a negative tool because they support only the crudest sorts of textual analysis. In addition, word clouds focus only on the occurrence of specific words instead of the concepts and ideas that are important and will help you understand what is going on. Lastly, word clouds leave readers to figure out the context of the data by themselves, because they have to translate what the jumble of words is trying to depict and explain.

Word Cloud: A Scandal in Bohemia

I have used Wordle in the past to create a word cloud for a school project, so I had no trouble with that site, but I did have issues using Voyant and could not get it to remove additional stop words beyond those in the dictionary for the English option.

I think the Wordle word cloud makes it easier to understand the story because fewer stop words appear in it. You can clearly tell from both clouds that Holmes is the key component in the story. Additionally, “photograph” is one of the largest words in both clouds; it actually appears to be the second largest in the Wordle word cloud. This tells you that the main focus of the story is a photograph, which is important because that is exactly what Holmes is trying to retrieve throughout the tale. The words King, Majesty, and woman/Irene are also clearly visible. Irene is very important in the story, not only because she is originally who Holmes is searching for but also because she changes Holmes’ view on women in general. King and Majesty appearing larger tells one that the story is about a case that involves some sort of royalty. Some words that are larger but do not play main roles in the plot are door, one, must, said, minutes, house, face, and hand. These words, though not stop words per se, do not pertain to the plot in a way that one can understand through the word cloud. This is a huge con of using a word cloud: unimportant words are shown and can obscure the plot of the story. Words like that can make it more difficult to understand what the tale is about. A huge pro, though, is the complete opposite: being able to get the general idea of a story at just a glance. One can view a word cloud and get an overview of the important parts, or at least the important words, of a story. Overall, I think word clouds are visually appealing but not that helpful in this instance.
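For anyone curious how extra stop words change the picture, here is a rough Python sketch. The tiny base stop list, the function name, and the sample sentence are all made up for illustration; the extra stop words are the ones I listed above. This is not how Wordle or Voyant actually work internally, just the general idea of filtering words before counting.

```python
import re
from collections import Counter

# A tiny illustrative stop list; real tools ship much longer ones.
BASE_STOP_WORDS = {"the", "a", "and", "of", "to", "in", "is", "was"}
# Words that showed up large in my clouds without helping explain the plot.
EXTRA_STOP_WORDS = {"door", "one", "must", "said", "minutes", "house", "face", "hand"}

def top_words(text, stop_words, n=5):
    """Return the n most frequent words that are not in stop_words."""
    words = re.findall(r"[a-z']+", text.lower())
    kept = [w for w in words if w not in stop_words]
    return Counter(kept).most_common(n)

# Made-up sample text, not a quotation from the story.
sample = (
    "Holmes said the photograph was in the house. "
    "The King said one must find the photograph."
)

before = top_words(sample, BASE_STOP_WORDS, n=2)
after = top_words(sample, BASE_STOP_WORDS | EXTRA_STOP_WORDS, n=1)
```

With only the base list, “said” ties with “photograph” for the top spot in this sample; once the extra words are filtered out, “photograph” stands alone.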

Untitled

wordle

Head In The Word Clouds, Feet On The Ground

 

Story: Scandal In Bohemia

Overall, I thought that the word cloud project was an interesting experience for sure. After listening to our guest speaker go over how the aesthetics of a data platform can truly affect how it’s absorbed by the audience, I was eager to delve into it. However, once I got into it, I came to the conclusion that word clouds aren’t really that informative. If you’ve already read the story, then of course it’s going to reinforce the common themes present in the readings! Despite my thoughts on this, I did like the design factor of the project and hope that we can do more of that this semester!

Here is my first word cloud, from Voyant. Overall, I thought that it turned out well. But the program was not very user-friendly. It appeared that it needed to be updated. I wasn’t sure if there was a way to change the colors, shapes, or alignment of the words because it was confusing to use. Also, the jargon used, like “corpus,” was outdated, and I wasn’t sure what it even meant.

 Screen Shot 2015-03-02 at 10.14.52 PM

In terms of the words present, I wasn’t really surprised at all by which were the most frequent. I feel as if the largest word in all of our projects is going to be Holmes.

The same can be said for the following word cloud, from Wordle.

 

Screen Shot 2015-03-02 at 9.56.45 PM

 

I really liked using this tool a lot more because it was very straightforward. User-friendly programs are a huge plus in a project like this! As I just said, the words that appeared didn’t really surprise me all that much, either. Both of my word clouds are pretty similar, I think. King and photograph are two of the larger words up there, and as we know from reading the story, those are two important themes.

As part of this assignment, we also had to read two articles. I thought that the second one was pretty comical and, at the same time, easy to agree with. Because I’m a journalism student, I saw exactly where the author was coming from in terms of how a word cloud sometimes doesn’t illustrate the substance of a story at all. In journalistic writing, we actually avoid repeating the same words if we possibly can, unless it is the proper title of a document, a study, a group, the name of a person, etc. We try to use synonyms as much as we can to nip any monotony in the bud and make our writing more colorful and appealing. In short, there are probably close to 10 different words that can be swapped out throughout a story that essentially mean the same thing, which spreads the count across all of them in a word cloud.

“Prettiness is a bonus; if it obliterates the ability to read the story of the visualization, it’s not worth adding some wild new visualization style or strange interface.”

In works of fiction, such as a Holmes story, I noticed that word clouds are totally useful in illustrating some of the main concepts that are present. I think that analyzing word usage this way definitely has its benefits, but the analysis can’t end there. One of the larger duties in analyzing text is being able to identify what these themes actually mean. I do not think, however, that word clouds can substitute for the true analytical thinking needed to process what these themes mean in the grand scheme of the plot. The initial idea of the word cloud is cool, but I think that there may be better tools out there to successfully show the themes of the story, ones that look more into substance and not just the word count.

Word Clouds: Form over Function

After reading an editor from the New York Times disparage the use of word clouds, I had similar thoughts running through my mind as I embarked on this assignment. What deep insight can be gained from tallying Arthur Conan Doyle’s choice of words? I chose his story A Scandal in Bohemia to investigate the importance of word clouds.

Screen Shot 2015-03-02 at 2.08.44 PM
The “photograph” was thematically important to the story, as the word clouds would establish

I began with Wordle, and then planned to also use Tagxedo for some word cloud fun. I chose these two based on the fact that Voyant’s learning curve may have been longer because stop words would have to be manually removed. I preferred to try to understand the first two applications thoroughly.

I found Wordle very easy to use, and yet complex enough to change the word clouds’ appearance fairly significantly. It removes the common words automatically, although you can adjust that, as well as the font, the colors (both background and the letters) and the layout. This final option dictates how many words are included, which direction they face, and if the cloud is round or jaggedly shaped.

I began with gray and black words with a white background, remembering the design principles from class last week. This was called the Ghostly color setting. I sought to add a touch of color, and chose the Heat setting. I found this to be the most pleasing combination I had found. Finally, I wanted to make a kaleidoscope of color to test the outer bounds of the application.

Movie Poster?
Potential first draft movie poster for A Scandal in Bohemia (click to enlarge) created with Tagxedo

After learning a bit about the basics of word clouds, I hoped to create something more unique and memorable. The above word cloud was made with Tagxedo, using the sunset color scheme and aligned in the shape of Great Britain. One of the best features Tagxedo offers (which Wordle doesn’t) is the variety of shapes in which the words can be arranged. There were several geographic options, including Australia, South America and Great Britain, the last of which was perfect considering Holmes’s London address. In addition to the options I did change, there were even more in the word/layout options menu on the left-hand side, which I hope to investigate in the future.

Overall, this visualization tool can help to illuminate potential themes in a literary work. Photograph and Adler are two of many words that appear in the word cloud, making it clear that each is vital to the story. The confusion regarding the photo leads Doyle to increase the suspense, and Irene Adler’s name is used frequently because Holmes calls her the woman. The word cloud is simply another tool at the disposal of a digital humanities scholar.

I plan to try this with a yet-to-be-decided text for exploration beyond Sherlock Holmes. Underlying themes can reveal themselves, or at the very least an artistic graphic can be created for a favorite piece of literature. Both design and literature interest me, so this was an intriguing assignment that I enjoyed thoroughly.

The Adventure of the Dancing Men

For a change of pace, I read the Sherlock Holmes story “The Adventure of the Dancing Men,” from Sir Arthur Conan Doyle’s The Return of Sherlock Holmes collection. I used the Voyant word cloud tool to visualize the story. The first version featured the words “holmes,” “mr,” and “mrs,” which I deemed unimportant to the story. I realize now that, though “holmes” is a bit obvious and takes up a lot of room, the other words I listed indicate relationships which may be important to the overall story. Please bear this in mind as I continue without “mr” and “mrs” (I took this screenshot on a different computer that allows screenshots, so this is the visualization I will be using):

Screenshot (1)

The loss of “mr” and “mrs” may not be so terrible, after all. These “words” don’t attach to any others to indicate which name they belong to, which I believe is a fault of the word cloud. If I were looking at this visualization and trying to find character names, I would find them scattered throughout, and it would be impossible to determine which first name goes with which last name, or whether the names are even related. However, one can see that “husband” and “wife” are featured in the word cloud, so a marriage is implied. “Norfolk” also shows the setting of the story. While some related words point to a plot line (for example, “bullet,” “shot,” and “fired” suggest that a character has been or will be shot in the story), other related words like “man,” “men,” and “face” fail to create an image of just what the “Dancing Men” part of the title means. In reality, the dancing men are drawn stick figures used as a code to send messages to one of the characters.
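One way a tool could keep a title attached to its name is to count pairs of neighboring words instead of single words. The Python sketch below is only my own illustration of that idea, not a feature I found in Voyant; the function name is hypothetical and the sample sentences use invented names rather than text from the story.

```python
import re
from collections import Counter

TITLES = {"mr", "mrs"}

def titled_names(text):
    """Count each title together with the word that follows it."""
    words = re.findall(r"[a-z]+", text.lower())
    pairs = [
        f"{first} {second}"
        for first, second in zip(words, words[1:])
        if first in TITLES
    ]
    return Counter(pairs)

# Invented sample sentences, not quoted from the story.
sample = "Mr. Smith called. Mrs. Smith was silent. Mr. Smith left."
pair_counts = titled_names(sample)
```

Counted this way, “mr smith” and “mrs smith” show up as separate entries, so a viewer could tell that there are two people who share a last name.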

The word cloud has the potential to give the viewer a lot of information about a document, but in some cases it may fall short of its purpose.  The words used in this word cloud are no true representation of what occurs in the story.  Looking at the picture, I can see that there is a husband and wife, a few named characters, a setting, a letter, and someone gets shot.  This does not answer the question of who the dancing men are and, since the story is named after them, this is a lot of crucial information not being relayed.

Though I like the aesthetics and intention of the word cloud, I can understand why some people would be opposed to its existence, as Jacob Harris, author of the “Harmful” article we read, seems to be. A narrative may be impossible to find in a cluster of frequently used words with no specific meaning; therefore, the word cloud may not be effective or properly convey the meaning of the document it represents.

Scandal in Bohemia Word Cloud – Where’s Watson?

Using the Voyant tool, I made a word cloud of Sherlock Holmes’s adventure, “Scandal in Bohemia.” It is unfortunate that the tool does not use color to distinguish word frequency or other significant word trends, because that would have allowed for some interesting insights. In any case, the word size was telling enough to extract some Sherlockian observations about the context of the story. In my generated word cloud (http://voyant-tools.org/tool/Cirrus/?corpus=1411164560979.4726&query=&stopList=1411165833459tu&docIndex=0&docId=d1411099477875.99b8b096-b231-7094-d527-8b986fefb364), the most to least significant elements of the story are apparent from larger to smaller size. ‘Holmes’ appears 47 times, ‘photograph’ appears 21 times, ‘king’ (17), ‘majesty’ (16), ‘irene’ and ‘adler’ (13), and ‘woman’ (12). It’s no surprise that these four aspects of the story surface most frequently, and the photograph can be considered a tertiary character of the story because it is crucial to the reveal and the idea of ‘the woman.’

What I took most note of, however, was the scarcity of Watson’s name. Holmes is obviously the largest word, front and center, but Watson is notably smaller and on the outskirts of the cloud. His name appears 6 times, roughly half as many times as Irene Adler’s name. Though this does make sense because he is the narrator and is therefore primarily mentioned in the first person in the text, I expected to see more of his name when Sherlock addresses him in conversation. A conclusion from this ‘where’s Watson’ is that it is a subtle show of Sherlock’s narcissism. Holmes’s heightened perception and memory are arguably the biggest parts of the story, but his failure to address Watson, our narrator and right-hand man, by name is a way of noticing Sherlock’s ego from a quantitative perspective.

TheWoman_WordFrequency ScandalBohemia_WordCloud2