Topic Modeling: So Many Words!

The concept of topic modeling, even at a surface glance through the reading of an article, seemed like a pretty complicated endeavor. The program itself was very interesting, aside from the difficulties it presented. It allows large volumes of text to be analyzed through “topics,” the words from a text that are most relevant to understanding it without actually reading through it.

My topic modeling experimentation included two different runs:

1. 60 topics, 1500 iterations, 25 topic words
2. 50 topics, 2000 iterations, 50 topic words
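To make "topics, iterations, and topic words" concrete, below is a minimal from-scratch sketch of the algorithm behind most topic-modeling tools: latent Dirichlet allocation fit with collapsed Gibbs sampling. This is an illustrative toy, not the program from our lab; the tiny Sherlock-flavored corpus, the priors, and all function names here are my own assumptions.

```python
import random
from collections import defaultdict

def lda_gibbs(docs, num_topics, iterations, seed=0):
    """Minimal collapsed Gibbs sampler for LDA.
    docs: list of token lists. Returns per-topic word counts."""
    rng = random.Random(seed)
    alpha, beta = 0.1, 0.01                 # symmetric priors (assumed values)
    V = len({w for d in docs for w in d})   # vocabulary size
    # z[d][i] = topic currently assigned to word i of document d (random start)
    z = [[rng.randrange(num_topics) for _ in d] for d in docs]
    doc_topic = [defaultdict(int) for _ in docs]                # n(doc, topic)
    topic_word = [defaultdict(int) for _ in range(num_topics)]  # n(topic, word)
    topic_total = [0] * num_topics
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
    # Each iteration resamples every word's topic given all the others.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                 # remove the current assignment
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # weight each topic by how much this doc and this word like it
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word

def top_words(topic_word, n):
    """The n most frequent words per topic -- the 'topic words'."""
    return [
        [w for w, _ in sorted(tw.items(), key=lambda x: -x[1])[:n]]
        for tw in topic_word
    ]

docs = [
    "holmes watson case clue london".split(),
    "holmes pipe violin watson case".split(),
    "moor hound baskerville moor night".split(),
    "hound night moor legend baskerville".split(),
]
topics = lda_gibbs(docs, num_topics=2, iterations=200)
for i, words in enumerate(top_words(topics, 3)):
    print(i, words)
```

Scaled up, this is what a run like "60 topics, 1500 iterations, 25 topic words" means: 60 word-count tables like `topic_word`, refined over 1500 sweeps, each summarized by its 25 most frequent words.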

My first run gave me the most varied results because of the larger number of topic words. Although it used fewer iterations than my second run, the number of topic words proved to be the true variable that determined more diverse results. The topics allowed me to understand the main points of the stories I chose to look at (through their topic words) without even reading any of them. This is extremely helpful considering how many Sherlock Holmes stories there are to read. And like most other digital humanities tools, this would be very helpful in creating an archive, or any other project that requires the reading of many texts.

Most of us were pressed for time, as this program is only installed on the computers in the lab where we have class. However, it was one of those endeavors in life where, once you understand it, it becomes quite easy to pick up efficiently.

As shown in the very crowded Wordle of all of my topic words, there were many results.