What is the difference between stemming and lemmatization?

Stemming (Rule-based approach)

  • Stemming reduces a word to its stem by removing suffixes such as “ing”, “ly”, and “s”. The result is often not an actual word, e.g. Entitling, Entitled -> Entitl
  • Stemming is faster because it simply chops off the end of the word without considering the word's context.

Lemmatizing (Dictionary-based approach)

  • Lemmatizing derives the canonical form (‘lemma’) of a word, using morphological analysis to find its root form, e.g. Entitling, Entitled -> Entitle
  • Lemmatizing is slower but more accurate, because it takes the context of the word into account.
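
To make the contrast concrete, here is a minimal toy sketch in plain Python. It is not how NLTK or any real library works: the suffix list, the tiny lookup table, and the function names `toy_stem` and `toy_lemmatize` are all made up for illustration. It only shows why blind suffix chopping can produce a non-word like "entitl", while a dictionary lookup returns the real word "entitle".

```python
# Toy illustration only: real stemmers and lemmatizers use far richer
# rules and dictionaries than the hypothetical ones below.

SUFFIXES = ("ing", "ed", "ly", "s")  # hypothetical rule list, checked in order

# Hypothetical mini-dictionary mapping inflected forms to their lemma.
LEMMA_TABLE = {
    "entitling": "entitle",
    "entitled": "entitle",
    "entitles": "entitle",
}


def toy_stem(word: str) -> str:
    """Strip the first matching suffix; never checks that the result is a real word."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word: str) -> str:
    """Look the word up in the dictionary; fall back to the lowercased word."""
    word = word.lower()
    return LEMMA_TABLE.get(word, word)


for w in ["Entitling", "Entitled"]:
    print(w, "->", toy_stem(w), "|", toy_lemmatize(w))
```

Both inputs stem to "entitl" and lemmatize to "entitle", which matches the examples in the bullets above. The stemmer needs no data at all, which is part of why it is fast; the lemmatizer is only as good as its dictionary.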

What is Stemming?

Stemming is the process of reducing the words of a sentence to their non-changing portions. In the example of amusing, amusement, and amused above, the stem would be amus.

Types of Stemmers

You’re probably wondering how to convert a series of words to their stems. Luckily, NLTK has a few built-in, established stemmers available for you to use! They work slightly differently because they follow different rules, so which one you use depends on what you happen to be working on.
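
As a rough sketch of what this looks like in practice, the snippet below runs three of the stemmers that ship with NLTK (`PorterStemmer`, `SnowballStemmer`, and `LancasterStemmer`) over the amusing / amusement / amused example. It assumes NLTK is installed (`pip install nltk`); these three stemmers are purely rule-based and should not need any extra corpus download. The choice of these three is only for illustration, not a recommendation.

```python
from nltk.stem import LancasterStemmer, PorterStemmer, SnowballStemmer

words = ["amusing", "amusement", "amused"]

# Each stemmer follows its own rule set, so results can differ between them.
stemmers = {
    "Porter": PorterStemmer(),
    "Snowball": SnowballStemmer("english"),  # Snowball needs a language name
    "Lancaster": LancasterStemmer(),
}

# Apply every stemmer to every word and collect the results by stemmer name.
stems = {name: [s.stem(w) for w in words] for name, s in stemmers.items()}

for name, result in stems.items():
    print(f"{name:10s}", result)
```

Running this on your own vocabulary is a quick way to see how aggressively each stemmer trims words before committing to one.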
