Sherlock Holmes Topic Modeling (10)

No. of Iterations: 1000

No. of topic words printed: 20



Number of Topics (40)

1. Deliberation: case, fact, reason, facts, explanation, mystery, obvious, idea, simple, shown, great, effect, prove, evident, impossible, solution, theory, observed, probable, story

Number of Topics (50)

2. Investigation: crime, police, evidence, murder, case, attention, account, death, tragedy, arrest, mark, occurred, inquiry, missing, unfortunate, discovered, charge, complete, naturally, committed

3. Attributes: man, face, eyes, dark, figure, looked, tall, head, drawn, black, features, mouth, thin, middle, appearance, deep, huge, beard, nose, lines

4. Text: paper, note, read, letter, letters, book, handed, table, papers, written, message, writing, wrote, address, short, sheet, post, write, importance, document

5. Expression: face, eyes, turned, lips, spoke, appeared, light, suddenly, pale, manner, sat, staring, sank, expression, nervous, excitement, silent, eager, breath, fixed

Number of Topics (60)

6. Homicide: found, dead, body, left, dreadful, finally, carried, terrible, blow, lying, round, knife, stick, fell, brought, horrible, single, strong, weapon, person

7. Frontyard: road, house, carriage, side, drive, hall, front, direction, drove, back, garden, place, walked, station, yards, pulled, passed, stopped, gate, grounds

8. Setting: room, door, open, window, entered, opened, key, rushed, closed, bedroom, passage, instant, locked, floor, stair, pushed, lock, stairs, led, safe

9. Path: path, passed, showed, foot, round, water, led, track, leaving, ran, walked, edge, traces, feet, hard, grass, marks, fall, lay, ground

10. Mycroft: london, office, brother, suppose, papers, west, mycroft, young, company, evening, monday, club, card, foreign, fog, clerk, pycroft, pocket, daily, government

I kept the number of iterations at 1000 and the number of topic words at 20, and experimented only with the number of topics. I found that the lower the number of topics (ten, for example), the more general the words were, which made the meaning of the word combinations difficult to pinpoint. I ended up using 40, 50, and 60.
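For readers curious how those three settings fit together, here is a rough sketch in Python. It is not the tool I used for the results above; it is a toy collapsed Gibbs sampler for LDA written only for illustration. The function names (`fit_lda`, `top_words`), the `alpha` and `beta` smoothing values, and the tiny made-up `toy_docs` corpus are all assumptions of the sketch, not part of the original experiment.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=1000, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA. `docs` is a list of token lists.

    alpha and beta are assumed smoothing values, chosen only for illustration.
    Returns one Counter of word counts per topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # topic counts per document
    topic_word = [Counter() for _ in range(num_topics)]   # word counts per topic
    topic_total = [0] * num_topics                        # total tokens per topic
    assignments = []

    # Start by giving every token a random topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    # Each iteration resamples the topic of every token in turn.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word


def top_words(topic_word, num_words=20):
    """Return up to `num_words` of the most frequent words for each topic."""
    return [
        [w for w, count in counts.most_common(num_words) if count > 0]
        for counts in topic_word
    ]


# Made-up toy corpus (NOT the Sherlock Holmes texts), just to exercise the code.
toy_docs = [
    "crime police evidence murder arrest".split(),
    "police murder inquiry crime charge".split(),
    "room door window key locked".split(),
    "door window room stairs lock".split(),
]

model = fit_lda(toy_docs, num_topics=2, iterations=1000)
topics = top_words(model, num_words=5)
for number, words in enumerate(topics, start=1):
    print(f"{number}. {', '.join(words)}")
```

In this sketch, `num_topics` is the setting I varied (40, 50, and 60 in my runs), `iterations` is how many passes the sampler makes over the text, and `num_words` is how many words get printed for each topic. With fewer topics, the same words have to be shared among fewer groups, which is one way to think about why the low-number runs looked so general.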

At times, I found it difficult to understand why some words appeared alongside others. I think this is because I haven’t read many Sherlock Holmes stories, so I don’t recognize some of the associations. The topics I ended up using are the ones whose terms I strongly associate with Sherlock Holmes. Deliberation, Investigation, and Homicide relate closely to the overall Sherlock Holmes story line, and I think these topics are more general and broad. The other topics (Frontyard, Attributes, Text, Setting, Path, and Expression) are more specific. They cover the kinds of things Sherlock would use during an investigation, as well as the things he does to carry one out. These terms would mostly appear in the middle part of the stories, during the investigation.

Mycroft, of course, is Sherlock’s brother.