NLP, much like AI as a whole, has a history of ups and downs. IBM’s 1954 Georgetown demonstration showcased the promise of machine translation by translating more than 60 Russian sentences into English. This early system used six grammar rules and a 250-word dictionary, and it spurred large investments in machine translation, but rule-based approaches could not scale into production systems.

Early work in NLP

In the 1960s, work began on applying meaning to sequences of words. In a process called part-of-speech tagging, sentences are broken down into their parts of speech to understand the relationships among words. Early taggers relied on hand-constructed, rule-based algorithms to “tag” words with their grammatical role in a sentence (for example noun, verb, adjective, etc.). This tagging involves considerable complexity, since some English tagsets distinguish up to 150 part-of-speech tags.

>>> import nltk
>>> quote = "Knowing yourself is the beginning of all wisdom."
>>> tokens = nltk.word_tokenize(quote)
>>> tags = nltk.pos_tag(tokens)
>>> tags
[('Knowing', 'VBG'), ('yourself', 'PRP'), ('is', 'VBZ'),
('the', 'DT'), ('beginning', 'NN'), ('of', 'IN'), ('all', 'DT'),
('wisdom', 'NN'), ('.', '.')]
>>> sentence = "the man we saw saw a saw"
>>> tokens = nltk.word_tokenize(sentence)
>>> list(nltk.bigrams(tokens))
[('the', 'man'), ('man', 'we'), ('we', 'saw'), ('saw', 'saw'),
('saw', 'a'), ('a', 'saw')]

Modern approaches

Modern approaches to NLP primarily focus on neural network architectures. Because neural networks operate on numbers, words must first be encoded numerically. Two common methods are one-hot encodings and word vectors.

Word encodings

A one-hot encoding translates words into unique vectors that can then be numerically processed by a neural network. Consider the words from our last bigram example. We create a one-hot vector whose dimension equals the number of words to represent, and assign a single bit in that vector to each word. This creates a unique mapping that can be used as input to a neural network (each bit in the vector as input to a neuron) (see Figure 2). This encoding is advantageous compared with simply encoding the words as numbers (label encoding) because networks train more efficiently with one-hot vectors.
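
The encoding above can be sketched in a few lines of plain Python, reusing the vocabulary from the earlier bigram example (the helper name `one_hot` is ours, not part of any library):

```python
# Build a vocabulary from the bigram example and one-hot encode each word.
sentence = "the man we saw saw a saw"
vocab = sorted(set(sentence.split()))  # unique words, in a fixed order

def one_hot(word, vocab):
    """Return a vector with a single 1 bit at the word's vocabulary index."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec

for word in vocab:
    print(word, one_hot(word, vocab))
```

Note that every vector has exactly one bit set, so no two words share a representation; the dimension grows with the vocabulary, which is the main cost of this scheme.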

Recurrent neural networks

Developed in the 1980s, recurrent neural networks (RNNs) have found a unique place in NLP. As the name implies, RNNs — unlike typical feed-forward neural networks — operate in the time domain. RNNs unfold in time and operate in stages, where the output of one stage feeds the input of the next (see Figure 3 for an unrolled network example). This type of architecture applies well to NLP, since the network considers not just the words (or their encodings) but the context in which the words appear (what follows, what came before). In this contrived network example, input neurons are fed with the word encodings, and activations feed forward through the network to the output nodes (with the goal of producing an output word encoding for language translation). In practice, each word encoding is fed in one at a time and propagated through; on the next time step, the next word encoding is fed (with output produced only after the last word has been fed).

Reinforcement learning

Reinforcement learning focuses on selecting actions in an environment to maximize some cumulative reward (a reward that is not immediately apparent but learned over many actions). Actions are selected based on a policy that decides whether a given action should explore new states (unknown territory where learning can take place) or exploit known states (acting on past experience).
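
One common way to balance this explore/exploit trade-off is an epsilon-greedy policy: with probability epsilon the agent tries a random action, otherwise it takes the action with the highest estimated value. A minimal sketch (the function name and the action values are hypothetical):

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Pick an action index: explore with probability epsilon, else exploit."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))                   # explore
    return max(range(len(q_values)), key=q_values.__getitem__)   # exploit

q = [0.2, 0.8, 0.5]                       # estimated value of each action
action = epsilon_greedy(q, epsilon=0.0)   # epsilon=0 always exploits -> index 1
```

In practice the `q_values` estimates are themselves updated from observed rewards (for example via Q-learning), which is how the cumulative reward is learned over many actions.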

Deep learning

Deep learning/deep neural networks have been applied successfully to a variety of problems. You’ll find deep learning at the heart of Q&A systems, document summarization, image caption generation, text classification and modeling, and many others. Note that these cases represent natural language understanding and natural language generation.
