Google Ngrams: The Rise of Communism, Fascism, and Elizabeth Bennet.

For my first Google Ngram chart I used the terms “communism” and “fascism” and searched between the years 1800 and 2000.

The definitions of Communism and Fascism are as follows:

• Communism is a socioeconomic system structured upon common ownership of the means of production and characterized by the absence of social classes, money, and the state; as well as a social, political and economic ideology and movement that aims to establish this social order.
• Fascism is a form of radical authoritarian nationalism that came to prominence in early 20th-century Europe. Historians cannot agree on a set definition but have noticed certain trends when it comes to fascist movements: the veneration of the state, a devotion to a strong leader, and an emphasis on ultranationalism and militarism.

ngram chart 1

It was not surprising to me that Communism started to rise on the chart beginning in the mid-1800s. In 1848, Karl Marx and his associate Friedrich Engels published their famous pamphlet, The Communist Manifesto. It provided a new definition for communism amidst the advancement of the Industrial Revolution. From there, Communism seems to steadily increase in publications. Fascism does not begin to rise on the chart until over 70 years later, in 1919. This was when Fascism was founded, in post-WWI Italy by nationalist syndicalists.
Again, I was not surprised to see Communism and Fascism both increase in publications around the early 1930s. Though World War II did not begin until 1939, related conflicts began earlier. Around the 1940s, Fascism was much more prevalent in publications, as Germany and Italy were under Fascist regimes. Communism, however, did not see a great decrease, as the Soviet Union was Communist, and countries such as China and North Korea became Communist by the end of the decade.

When World War II ended in 1945, the use of Fascism declined, as these movements and regimes were mostly disbanded, and not many nations nowadays are openly Fascist. However, Fascism is still fairly popular in publications, which I can only assume are publications on history and the war. Though the use of Communism declined as the use of Fascism increased during World War II, it steadily increased afterward. This is because of the Cold War, which was a period of political and military tension following the war between the United States (and its allies) and the Communist Soviet Union. Communism reaches its peak on the chart between 1961 and 1963, probably because of the Bay of Pigs Invasion, a U.S.-backed attempt to overthrow the Communist regime of Fidel Castro in Cuba, and the Cuban Missile Crisis, the standoff between the United States and the Soviet Union over missiles in Cuba. Both events were very important, so it makes sense that the use of Communism would increase during this time.

As Cold War tensions died down, so did the use of Communism. Though, like Fascism, it is still popular in publications, one can assume it appears mostly in history publications.

For my second Ngram chart, I decided to do something a little lighter than Communism and Fascism. I searched the characters “Marianne Dashwood,” “Elinor Dashwood,” and “Elizabeth Bennet.” Marianne and Elinor are characters from Jane Austen’s Sense and Sensibility, while Elizabeth Bennet is the main character of Austen’s Pride and Prejudice.

ngram chart 2

It seems that the Dashwood sisters do not enjoy the same popularity as Elizabeth. Though Sense and Sensibility was the first work Jane Austen ever published, Pride and Prejudice could be seen as one of her most popular works, if not the most popular. Accordingly, Elizabeth Bennet increases in publications as the years go on, but does seem to decline quite suddenly at random intervals, though that may just be the only way the chart is able to show it. Though I tried “smoothing it out” using larger smoothing values, these seemingly random drops still occurred.
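To see why a larger smoothing value softens a sudden drop without removing it, here is a minimal sketch of a centered moving average, which is how the Ngram Viewer describes its smoothing (each year is averaged with the given number of years on either side). The function name and the yearly numbers are made up for illustration; this is not Google's code.

```python
def smooth(values, smoothing):
    """Centered moving average: each point is averaged with up to
    `smoothing` neighbours on each side (fewer at the edges)."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - smoothing)
        hi = min(len(values), i + smoothing + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Made-up yearly frequencies with one sudden drop in the middle.
raw = [4.0, 4.0, 4.0, 0.0, 4.0, 4.0, 4.0]

for s in (0, 1, 3):
    print(s, [round(v, 2) for v in smooth(raw, s)])
```

With a smoothing of 0 the drop shows at full depth; with larger values it gets shallower and wider, but the dip is still there, which matches what I saw on the chart.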

Google Ngrams is a helpful tool that can help locate trends in a wide array of digitized works. However, lack of context is a big drawback of the tool. Without context, some of the trends and information could be highly misinterpreted. Overall though, I find Google Ngrams to be a fun and easy tool to use.

Discovering Ngrams

I took two similar forms of data and made two separate charts, one involving the titles of my absolute favorite books, and one involving the names of my favorite literary ladies.

In my first chart, we see three novels, The Great Gatsby, Lolita, and The Things They Carried.

Screen Shot 2014-10-19 at 8.18.38 PM

While Lolita and The Great Gatsby are classic novels, we can see clearly that The Things They Carried is a relatively new book compared to the other two, which have been around for much longer. On the chart, Lolita appears earliest, though that is probably because the word was already in use as a name before the novel came out in 1955, thirty years after The Great Gatsby (1925). We see a large, drastic spike in the popularity of The Great Gatsby, while Lolita’s popularity grew at a steady incline.

In my second chart, we see Jordan Baker of The Great Gatsby against Dolores Haze of Lolita.

Screen Shot 2014-10-19 at 8.36.28 PM

Both women, as seen in the graph, are characters that have reappeared heavily throughout the twentieth and twenty-first centuries. Jordan Baker’s popularity trumps that of Dolores Haze. However, I have concluded that this occurs only because Dolores is usually referred to as Lolita or Lo rather than by her full name, and in mentioning her, people often go for one of her recognizable nicknames rather than her given name.

Google Ngram Viewer

Screen Shot 2014-10-19 at 8.05.42 PM

These graphs show the popularity of certain words that we know today. I looked up three words that exemplify music, specifically genres of music: rock, metal, and pop. It turns out that pop is the lowest in popularity, while metal and rock were the most popular. Between 1800 and 1900 it makes sense that metal and rock would be a lot more common than pop, because pop wasn’t really a genre of music then, while “rock” and “metal” are also everyday words for materials, so the chart is counting those uses too. These graphs are really cool because being able to visually see the difference between certain words makes the idea of music genres so much more “real.” I mean, as someone who loves music, I never really thought about the logistics and statistics behind how popular a certain genre typically is. Also, my favorite genre of music is pop, so to see that it really was not at all popular was quite interesting. Between the years 1860 and 1900 pop made a small incline in popularity, while metal and rock both fluctuated dramatically between 1800 and 1900. In 1820, rock had a downfall, but it quickly rose again by 1830. So for about 10 years, the word rock was less popular. Metal was not very popular in the years 1820-1830 either.

Screen Shot 2014-10-19 at 8.41.05 PM

The second group of words that I looked up had to do with movie genres. The graph shows that over the years horror has fluctuated up and down, but kept heading in a downward direction from 1800 to 1900. While horror was slowly going down, comedy and drama also fluctuated but ended up on an uphill slope closer to the 1900s. Between the years 1800 and 1830 there was a major fluctuation among these genres. Horror and drama cross over one another in the late 1800s to early 1900s, while drama and comedy intertwined around 1830, when both were at the same level of popularity.

All of the data I looked at was fascinating. When you think about these words, you never really understand the importance of them. We say words every day, but never actually realize or understand how much we really use them, or how much they are really used on an everyday basis. Even setting these genres aside, it is incredible to look up data on any word to see how popular it is and the time period when it was at its highest peak.

Sammy Harris

Get with the Ngram

I’ve had another chance to use the digital tool Google Books Ngram Viewer. I searched the words “Khaleel” and “war”. I graphed them independently and together to see which would be the best way to view the information. When I graphed the words together, only the data for one of the words was visible on the chart, while the other seemed to have no informative value at all. In this chart you can see how the word “war” dwarfs the word “khaleel”, and I find this a little tragic because it means that not all words can be meaningfully compared from this graph alone.

Here you can see the graph of both words:

Screen Shot 2014-10-19 at 6.27.22 PM
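One way around the scale problem, if you have the yearly numbers outside the viewer, is to rescale each word to its own peak so that both lines fill the chart. This is only a rough sketch under my own assumptions: the function name and the frequencies are made up, and the viewer itself does not do this.

```python
def rescale_to_peak(series):
    """Divide every value by the series' own maximum, so the peak becomes 1.0."""
    peak = max(series)
    if peak == 0:
        # A word that never appears stays flat at zero.
        return [0.0 for _ in series]
    return [value / peak for value in series]


# Made-up frequencies (percent of all words) for four sample years.
war = [0.020, 0.030, 0.060, 0.040]
khaleel = [0.000002, 0.000001, 0.0000005, 0.0000005]

print(rescale_to_peak(war))
print(rescale_to_peak(khaleel))
```

After rescaling you can compare the shapes of the two curves (when each word rises and falls), but not how common the words are relative to each other.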

When I searched my name “Khaleel” I found it pretty weird that my name was mentioned more in books of the 19th century than in those of the 20th or 21st. This could mean that my name isn’t used as much in books of today or that the popularity of my name has decreased since its introduction in the history of English literature.

Screen Shot 2014-10-19 at 6.18.52 PM

I searched the word “war” because war is one of the few things in history that has been a part of every generation. The value of the word spiked twice, around 1918 and 1942, which lines up almost exactly with World War I and World War II. This shows that the literature of the time revolved around current events, and the biggest event going on at those times was war.

Screen Shot 2014-10-19 at 6.21.40 PM

Google Books Ngram Viewer

The first 3 terms I decided to look up were “Frankenstein”, “Dracula”, and “Werewolf”. I chose these 3 fictional characters because they were very popular in fictional stories and folklore between 1800 and 2000, more so during the 1800s, when superstitions were high among villagers and there was still a lot of unexplored territory and fear of the forest. I also picked these three terms because I am a fan of old and new horror movies. The three characters are some of the more notable in the horror genre. However, it’s odd to see that “Frankenstein” and “Dracula” increase in popularity in the 1970s and then peak around the late 1990s, while “Werewolf” remains low in the rankings through the years.

frankenstein

The next three terms I decided to look up were “Ford”, “Cadillac”, and “Dodge”. I chose these three because they are very different from the previous graph I looked at. I also chose them because practically everybody drives nowadays and I wanted to compare how popular 3 of the oldest car companies were. Ford is mentioned the most, which is not surprising given how well known the company is (and “Ford” is also a common surname). However, Cadillac and Dodge are very popular today, and with Cadillac being a luxury car I was surprised to see it as the lowest of all 3 rather than higher up on the graph. It was odd to see the drop in Ford’s popularity during the 1950s, though. I suppose it had something to do with newer manufacturers coming into the marketplace and consumers wanting more variety. It is also weird to see that all three reached their peak in the mid to late 1970s and then declined until the year 2000.

cars

Diseases and Kaiju

For the NGram project, I figured I’d geek out a little and check out how the English corpus treated two different topics- one serious, one not so much.

First, I opted to get serious- namely, the occurrence of the words, “virus”, “bacteria”, and “prion”, in literature, from 1800 to 2008.

Disease causing agents and their references over time.

Viruses are seen in blue, bacteria in red, and prions in green. For those who don’t know the difference, here’s a simple rundown:

  • Bacteria are your standard, run of the mill organisms: simple, single celled creatures, who exist to consume and reproduce. They cause disease in a wide variety of ways, but, generally, either by attacking body tissue, or (more often) as a byproduct of immune response. Antibiotics exist and are used to destroy them by preventing the microbes from reproducing (and letting your immune system do the dirty work), or by outright killing them. Due to the rapid reproductive rate of bacteria, mutations in a short span of time can make these robust organisms immune to antibiotics in subsequent generations- hence, the need to lower the selection pressure we’re putting on them and tone down our (often unnecessary) use of antibiotics. Famous examples: Yersinia pestis (the Black Plague), Salmonella, Staphylococcus.
  • Somewhere in between bacteria and prions, you find viruses. Viruses are simple: DNA or RNA encapsulated by a protein structure. They get in you, hijack your cells, use them to reproduce…and don’t stop doing so. Illness is caused much as it is with bacteria- either by the destruction of your cells, or the immune response of the human body (fevers, inflammation). I say viruses fall in between bacteria and prions because the debate as to whether they’re truly living organisms is very much still ongoing. They fit many of the characteristics (reproduce, a need to feed [of sorts], require shelter), but the sheer alien nature of their existence sets them apart from something like bacteria, which is fairly familiar (reproduces independently of a host, has its own cellular machinery, needs to actively eat something to survive). Few real cures exist for viral infections, though antiviral medication does exist and is used in cases of certain diseases, such as HIV. Luckily, viruses are very vulnerable to vaccination, and prevention is possible for many viral diseases. Famous examples: Smallpox, Ebola, Influenza, the common cold, HIV.
  • Prions are a somewhat newfound cause of disease, and, if the sliding scale goes from bacteria (living) to virus (uncertain), then prions comprise the final extreme. Prions are, simply, malformed (misfolded) proteins. They are not necessarily “malicious” like bacteria and viruses, which actively seek to spread, but are simply proteins that are stable and, when taken up by the body, cause the normal versions of the same protein to misfold as well. From there on, a wildfire of exponential growth occurs and the misfolded protein floods the tissue, destroying it and causing severe, incurable and 100% terminal illness. Due to their stability, prions are very hard to deal with and require incredibly vigorous, extraordinary methods of cleaning to destroy them- unlike viruses and bacteria, they do not simply “die”. Famous examples: Mad Cow Disease, Kuru.

The graph produced was fairly informative, and very surprising! Mostly, I was completely astounded that the word “virus” was actually used fairly often, even before the entity itself was first confirmed. I am told by Google that the word had a previous medical application, meaning “slimy, liquid poison”, often originating from the body of a sickly person- and something that could spread to others. When viruses were first identified in the 1890s (by Dmitri Ivanovsky in 1892 and Martinus Beijerinck in 1898, the latter applying the word “virus” to them), I suppose that seemed like the most apt description of the contagion (and, I believe, it still is). A massive uptick in references to the word “virus” occurs in about 1935, four years after the invention of the electron microscope, which would eventually allow us to physically see the entities that perplexed Pasteur and his ilk (Pasteur, who developed the rabies vaccine, could find no microorganism responsible for rabies- only symptoms). Since then, as microbiology, imaging, and a variety of health sciences advanced, a massive increase in usage of the word continued through to the 21st century- though a decrease in usage is noted towards the 1990s.

Bacteria, having been discovered and confirmed much earlier, is unsurprisingly much more popular in usage from an earlier point. A massive uptick in usage is documented between 1910 and 1930- possibly due to battlefield injuries from the first World War, as well as advances in microbiology. Usage increases again in 1940 (again coinciding with a World War), possibly due to the beginning of the usage of commercially available antibiotics. Usage of the word has decreased ever since, presumably since treatment is so effective (who wants to write a fiction book about a bacterial plague, when we can easily cure one?) and discovery of new bacterial forms is no longer revolutionary, as it once was.

Prions have not yet seen their day of popularity (and, I hope, never will)- since prion theory is very new (dating back to 1962, at the earliest), and only a handful of known diseases are caused by prions (and those require very special facilities to be worked on in a lab), I imagine not much literature would cover the topic. Interestingly, usage increases in the mid-nineties- coinciding with the release of Michael Crichton’s Jurassic Park and The Lost World- the former of which mentioned prions, and the latter of which used them centrally as a plot point.

Aside from disease, I opted to also take a look at something a little less serious- giant monsters.

Godzilla vs King Kong, and not for the first time...

Unsurprisingly, both movies enjoy heavy popularity and consistent mentions after their initial releases. Peaks exist, coinciding with major releases- most notably, the 1976 remake of King Kong, and the 1998 American version of Godzilla. Surprisingly, the peak for Peter Jackson’s 2005 remake of King Kong is fairly small. I would be very interested to see how the chart looks closer to 2014, with another American remake of Godzilla.

Incidentally, I find it very interesting that King Kong actually lags behind Godzilla in mentions in the English language- an American monster movie that has existed for much longer than Godzilla somehow has much less staying power (in terms of popularity) than its Japanese counterpart.

Google Books Ngram Viewer: (Crazy, Insane, Lunatic) (Romance, Comedy)

These two diagrams were created in Google Books Ngram Viewer. Both compare terms found in 19th century (1800 to 1900) literature and help display patterns in writing.

google books ngarm crazy insane lunatic

The first terms I decided to look up were “insane”, “crazy,” and “lunatic.” Today, these terms are thrown around pretty loosely but they have obvious derogatory connotations. I have heard a lot about the mistreatment of people thought to have mental illnesses and disorders in our society today. Mistreatment was especially bad before most of these cases were understood, so I was curious to look into the use of these words in the 19th century. While these terms were not used so much in the beginning, they increased significantly by the second half of the 19th century. “Insane” increased more around 1825 – 1830 which was only a few decades after the popularization of insane asylums and relocation.

Also, at the end of the 19th century (namely around 1880), phrenology and the measuring of cranial capacity were still used as methods to determine attributes such as criminality and mental ability/illness. After a few hours of looking at articles from the 19th century, it would seem that the interest in mental illness turned to fascination by the 1880s. This also makes sense because it was in 1880 that a lot of light was shed on the poor conditions of asylums and the terrible treatment of their occupants. People were more interested in mental illnesses and disorders by the end of the century. It makes sense, then, that the terms “insane” and “lunatic” increased the most in literature during these years.

google books ngarm romance comedy png

I also compared “romance” and “comedy.” I was not sure what to expect, but there was certainly a change in popularity and use. It would seem that comedy peaked around 1824 and romance had surpassed it by 1830. In England, the late 18th century until almost halfway through the 19th century was marked by romanticism. Authors such as Walter Scott and Jane Austen were extremely popular. The romantic movement reached beyond England (to France, for example) and into the U.S. By the early 19th century, romantic novels and literature swept across the country. This may have been because of religious restriction and the desire to celebrate the individual (and their emotions) instead of God or religion. The romanticization of frontier life and Native Americans was also very popular.

Dan Albrecht’s N-gram post.

Science fiction novels have made leaps and bounds over the past 65 years.  Just before 1950 was when they really started to take off, as nuclear weapons and the space race were gripping the imaginations of people all over the world.  Having just survived WWII, many were wondering what the future would hold for them.  With barely 20 years separating the two World Wars, it was understandable for many to take for granted that a third one would not be far off, and that the next war would bring a nuclear holocaust.

If one adds that fear to the rivalry of the Cold War, and the race for each side of that conflict to be technologically superior to the other, the imaginations of the world turned towards science in a way that they never had before. Thus it was my hypothesis that science fiction as a genre would take off around the outset of the Cold War.

To test that, I created a graph of the words “science fiction” in Google Ngram Viewer.  Here is what I got.

SciFi

As one can see, the words “science fiction” are almost nonexistent in literature prior to the mid 1940s.  At this point, there is a steady rise in the frequency of those words until 1970, where there is a sudden spike in frequency.  Since Neil Armstrong famously landed on the moon in 1969, it is reasonable to suppose that the moon landing had an impact on the popularity of science fiction.

One of the more popular science fiction novels of the time period, The Moon is a Harsh Mistress, had a notable quote which became quite popular: “There ain’t no such thing as a free lunch.”

free lunch

 

The popularity of the words “Free Lunch” soars after 1960, which is about the time when the novel was published (1966).  This does not prove that the novel was responsible for the phrase, but it is interesting nonetheless.

Google Ngrams Viewer

Capture1

While I was playing around with the Google Ngrams viewer this weekend, I came across an interesting discovery. If you search the terms “men, women” a very interesting graph displays itself. Women are mentioned far less than men are in Google’s databases, that is until you hit around the year 1983. From that year on, the balance (or lack thereof) begins to shift very dramatically. Not only is the word “women” mentioned more after this year, but “men” is mentioned less! To me, this is astonishing, because if you take a look at, say, the year 1810, “women” makes up a mere 0.01% of words while “men” makes up almost 0.1%! This may not seem like that big of a number, but considering the amount of words that Google is referencing here, that is a pretty hefty number, especially compared to how little “women” is mentioned! To me, this represents the constant war of the sexes that is going on in our world, and really makes me glad that it has shifted. Even though there is a lot of work to be done in the way of complete equality, I think it is quite amazing for this to have changed so radically from both directions, and to have these two lines meet in the middle.
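To put the 1810 numbers in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: the two percentages are the approximate values I read off the chart, and the variable and function names are my own.

```python
# Approximate 1810 values read off the chart, in percent of all words.
men_1810 = 0.1
women_1810 = 0.01


def per_million(percent):
    """Convert a percentage of all words into occurrences per million words."""
    return percent / 100 * 1_000_000


ratio = men_1810 / women_1810

print("men per million words:", round(per_million(men_1810)))
print("women per million words:", round(per_million(women_1810)))
print("men mentioned about", round(ratio), "times as often as women")
```

Seen this way, a gap that looks tiny as a percentage is roughly a tenfold difference in how often the two words show up.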

-Austin Carpentieri

Capture

P. S.-  While futzing around with the viewer, I typed in the word “google” and came across some…puzzling results. I don’t know if this is a glitch in the system or what, but I have no clue who was using the word “google” in the 1840s, or why 1950 is the peak for this word. I have no clue; it really is puzzling.

The Death of Romance due to the Rise in Sex in Literature

To explore Google Ngrams, I looked at the most recent century for trends between the words “romance” and “sex.” The 20th century was full of paradigm-shifting developments such as the two World Wars, United States Prohibition, the invention of television and the internet, progressive waves of feminist and gay rights movements, and drastic developments in medicine and science. Bearing all of this in mind, I was curious to look at how literature would reflect the frequency of romance versus the frequency of sex in this century. Based on the graphs below, as I suspected, the frequency of sex in literature spiked upward very quickly. I did not, however, suspect that romance would decline; I hoped it would remain somewhat steady over time (but that may just be the optimist in me).

Decline of Romance Ngram
Sex vs Romance Comparison in Literature

One of the biggest social revolutions in the 20th century was the sexual revolution, which first surfaced in the late 1910s and moved into the Roaring Twenties after World War I. This graph is a direct reflection of that shift in the time of hot jazz, speakeasies and flapper girls. We see a subtle drop in romance from 1910 to 1920, as sex steadily climbs to 1930, where it remains somewhat constant until 1960. In the Western World, the big sexual revolution took place from 1960-1980, where we see the biggest spike in the top curve. The last increase in the sex graph is in the late 1990s, with the advent of the internet and the shift toward hyper-sexualized advertising on television. With free pornography on the internet and shows like Baywatch on broadcast television, this last rise in the sex graph is not a big surprise.

*All historical references are from the following Wikipedia entry: http://en.wikipedia.org/wiki/Sexual_revolution