Natural language processing
22 Feb 2019
This is the first post in the blog series ‘Understanding Natural Language Processing’.
Dive into NLP
Technical definition: Natural language processing (NLP) is a subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.
The history of natural language processing generally starts in the 1950s, although earlier work can be found. In 1950, Alan Turing published an article titled “Computing Machinery and Intelligence”, which proposed what is now called the Turing test as a criterion of intelligence.
Major evaluations and tasks
Though natural language processing tasks are closely intertwined, they are frequently subdivided into categories for convenience: syntax, semantics, and speech.
Syntax
In linguistics, syntax is the set of rules, principles, and processes that govern the structure of sentences (sentence structure) in a given language, usually including word order. The term syntax is also used to refer to the study of such principles and processes.
Grammar induction: generate a formal grammar that describes a language’s syntax.
Lemmatization: remove only the inflectional endings of a word and return its base dictionary form, known as the lemma.
Part-of-speech tagging: given a sentence, determine the part of speech for each word. Many words, especially common ones, can serve as multiple parts of speech. For example, “book” can be a noun (“the book on the table”) or a verb (“to book a flight”).
Parsing: determine the parse tree (grammatical analysis) of a given sentence. The grammar of natural languages is ambiguous, and typical sentences have multiple possible analyses. Perhaps surprisingly, a typical sentence may have thousands of potential parses, most of which will seem completely nonsensical to a human.
Sentence breaking: given a chunk of text, find the sentence boundaries. Sentence boundaries are often marked by periods or other punctuation marks, but these same characters can serve other purposes, such as marking abbreviations.
Terminology extraction: automatically extract relevant terms from a given corpus.
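To make two of the tasks above a bit more concrete, here is a minimal Python sketch of sentence breaking and terminology extraction. It is only a toy under stated assumptions, not how production NLP libraries work: the abbreviation list, the stop-word list, and the function names `split_sentences` and `extract_terms` are all made up for this illustration, and ranking terms by raw frequency is just one simple stand-in for real terminology extraction.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny abbreviation list (illustration only).
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "ms.", "prof.", "e.g.", "i.e.", "etc."}

# Hypothetical stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "for", "each"}


def split_sentences(text):
    """Naive sentence breaker.

    A sentence ends at a token whose last character is '.', '!' or '?',
    unless that token is a known abbreviation such as 'Dr.'.
    """
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token[-1] in ".!?" and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text without final punctuation
        sentences.append(" ".join(current))
    return sentences


def extract_terms(corpus, top_n=3):
    """Toy terminology extraction: rank non-stop-words by raw frequency."""
    words = re.findall(r"[a-z]+", corpus.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return [term for term, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    text = "Dr. Smith studies parsing. Parsing is hard! Is tagging easier?"
    for sentence in split_sentences(text):
        print(sentence)

    corpus = ("The parser builds a parse tree. The tagger labels each word. "
              "The parser uses the tagger.")
    print(extract_terms(corpus, top_n=2))
```

Notice that the period in “Dr.” does not end a sentence here only because it is in the abbreviation list. Any abbreviation missing from that list would be split incorrectly, which is exactly why sentence breaking is harder than it first looks.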
Thank you for reading my post.
***