CBSE Class 10 Artificial Intelligence Question Bank for Board Examination
CHAPTER 7: NATURAL LANGUAGE PROCESSING
One (01) Mark Questions
1. What is a chatbot?
Answer: A chatbot is a computer program designed to simulate human conversation through voice commands, text chats, or both. Examples: Mitsuku Bot, Jabberwacky, etc.
OR
A chatbot is a computer program that can learn over time how to best interact with humans. It can answer questions and troubleshoot customer problems, evaluate and qualify prospects, generate sales leads and increase sales on an ecommerce site.
OR
A chatbot is a computer program designed to simulate conversation with human users. A chatbot is also known as an artificial conversational entity (ACE), chat robot, talk bot, chatterbot or chatterbox.
OR
A chatbot is a software application used to conduct an on-line chat conversation via text or text-to-speech, in lieu of providing direct contact with a live human agent.
2. What is the full form of NLP?
Answer: Natural Language Processing
3. While working with NLP, what is the meaning of the following?
a. Syntax
b. Semantics
Answer:
Syntax: Syntax refers to the grammatical structure of a sentence.
Semantics: It refers to the meaning of the sentence.
4. What is the difference between stemming and lemmatization?
Answer: Stemming is a technique used to extract the base form of the words by removing affixes from them. It is just like cutting down the branches of a tree to its stems. For example, the stem of the words eating, eats, eaten is eat.
Lemmatization is the grouping together of different forms of the same word. In search queries, lemmatization allows end users to query any version of a base word and get relevant results.
OR
Stemming is the process in which the affixes of words are removed and the words are converted to their base form.
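The difference can be sketched in a few lines of Python. This is only a toy illustration, not how a real stemmer or lemmatizer is implemented: the suffix list `SUFFIXES`, the lookup table `LEMMA_TABLE` and the function names are all hypothetical and deliberately tiny.

```python
# Toy illustration only: real stemmers use many more rules than this.
SUFFIXES = ("ing", "en", "es", "s")  # hypothetical, deliberately small rule list


def toy_stem(word):
    """Strip the first matching suffix without checking that the result is a real word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical mini-dictionary that maps inflected forms to their lemma.
LEMMA_TABLE = {
    "eating": "eat", "eats": "eat", "eaten": "eat", "ate": "eat",
    "studies": "study", "caring": "care",
}


def toy_lemmatize(word):
    """Look the word up in the dictionary; return it unchanged if it is unknown."""
    return LEMMA_TABLE.get(word, word)


for w in ["eating", "eats", "eaten", "studies", "caring", "ate"]:
    print(w, "| stem:", toy_stem(w), "| lemma:", toy_lemmatize(w))
```

With these rules, "studies" is stemmed to "studi" and "caring" to "car", which are not meaningful words, while the dictionary lookup returns "study" and "care". The stemmer also leaves "ate" unchanged, because no suffix rule applies, whereas the lookup maps it to "eat".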
5. What is the full form of TFIDF?
Answer: Term Frequency-Inverse Document Frequency
6. What is meant by a dictionary in NLP?
Answer: Dictionary in NLP means a list of all the unique words occurring in the corpus. If some words are repeated in different documents, they are written just once while creating the dictionary.
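As a minimal sketch, a dictionary can be built by collecting each word the first time it appears. The three-document corpus and the function name `build_dictionary` are assumptions made for illustration, and the text is assumed to be already lower-cased with punctuation removed.

```python
# Hypothetical three-document corpus, already lower-cased and free of punctuation.
corpus = [
    "aman and anil are stressed",
    "aman went to a therapist",
    "anil went to download a health chatbot",
]


def build_dictionary(documents):
    """Return the unique words of the corpus in order of first appearance."""
    vocabulary = []
    seen = set()
    for document in documents:
        for word in document.split():
            if word not in seen:
                seen.add(word)
                vocabulary.append(word)
    return vocabulary


dictionary = build_dictionary(corpus)
print(dictionary)
print(len(dictionary), "unique words")
```

Words such as "aman", "anil", "went", "to" and "a" occur in more than one document but appear only once in the resulting list.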
7. What is term frequency?
Answer: Term frequency is the frequency of a word in one document. Term frequency can easily be found from the document vector table as in that table we mention the frequency of each word of the vocabulary in each document.
8. Which package is used for Natural Language Processing in Python programming?
Answer: Natural Language Toolkit (NLTK). NLTK is one of the leading platforms for building Python programs that can work with human language data.
9. What is a document vector table?
Answer: Document Vector Table is used while implementing Bag of Words algorithm.
In a document vector table, the header row contains the vocabulary of the corpus and other rows correspond to different documents.
If the document contains a particular word it is represented by 1 and absence of word is represented by 0 value.
OR
Document Vector Table is a table containing the frequency of each word of the vocabulary in each document.
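Both forms of the table described above (1/0 presence and word frequency) can be sketched as follows. The two-document corpus, the function name `document_vector` and its `binary` flag are hypothetical choices for this example only.

```python
from collections import Counter

# Hypothetical corpus, already normalized (lower case, no punctuation).
corpus = [
    "the cat sat on the mat",
    "the dog sat",
]

# Header row of the table: the vocabulary of the whole corpus, sorted alphabetically.
vocabulary = sorted({word for document in corpus for word in document.split()})


def document_vector(document, vocabulary, binary=False):
    """Return one table row: a count (or a 1/0 flag) for each vocabulary word."""
    counts = Counter(document.split())
    if binary:
        return [1 if counts[word] > 0 else 0 for word in vocabulary]
    return [counts[word] for word in vocabulary]


frequency_table = [document_vector(doc, vocabulary) for doc in corpus]
presence_table = [document_vector(doc, vocabulary, binary=True) for doc in corpus]

print(vocabulary)
for row in frequency_table:
    print(row)
```

In the first document the word "the" occurs twice, so its entry is 2 in the frequency table and 1 in the presence table.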
10. What do you mean by corpus?
Answer: In text normalization, we go through several steps to reduce the text to a simpler, standard form. We work on text from multiple documents, and the whole textual data from all the documents taken together is known as the corpus.
OR
A corpus is a large and structured set of machine-readable texts that have been produced in a natural communicative setting.
OR
A corpus can be defined as a collection of text documents. It can be thought of as just a bunch of text files in a directory, often alongside many other directories of text files.
Two (02) Mark Questions
1. What are the types of data used for Natural Language Processing applications?
Answer: Natural Language Processing takes in natural-language data in the form of written words and spoken words, which humans use in their daily lives, and operates on it.
2. Differentiate between a script-bot and a smart-bot. (Any 2 differences)
Answer:
Script-bot | Smart-bot |
a. A scripted chatbot doesn’t carry even a glimpse of AI. | a. Smart-bots are built on NLP and ML. |
b. Script-bots are easy to make. | b. Smart-bots are comparatively difficult to make. |
c. Script-bot functioning is very limited as they are less powerful. | c. Smart-bots are flexible and powerful. |
d. Script-bots work around a script which is programmed in them. | d. Smart-bots work on bigger databases and other resources directly. |
e. No or little language processing skills. | e. NLP and machine learning skills are required. |
f. Limited functionality. | f. Wide functionality. |
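To illustrate what "working around a script" can look like, here is a minimal sketch of a script-bot. The keywords, replies and the function name `script_bot_reply` are hypothetical; the bot simply matches keywords and has no language processing beyond that.

```python
# Hypothetical script: each keyword is mapped to one canned reply.
SCRIPT = {
    "hello": "Hi! How can I help you?",
    "hours": "We are open from 9 am to 5 pm.",
    "bye": "Goodbye!",
}
FALLBACK = "Sorry, I did not understand that."


def script_bot_reply(message):
    """Return the reply for the first scripted keyword found in the message."""
    # Lower-case the message and strip basic punctuation from each word.
    words = [word.strip(".,!?") for word in message.lower().split()]
    for keyword, reply in SCRIPT.items():
        if keyword in words:
            return reply
    return FALLBACK


print(script_bot_reply("Hello there"))
print(script_bot_reply("What are your hours?"))
print(script_bot_reply("When do you open?"))
```

The last message asks about opening hours in different words, but it contains no scripted keyword, so the bot falls back to its default reply. This is the limited functionality listed in the table above.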
3. Give an example of the following:
- Multiple meanings of a word
- Perfect syntax, no meaning
Answer: Example of Multiple meanings of a word – His face turns red after consuming the medicine
Meaning – Is he having an allergic reaction? Or is he not able to bear the taste of that medicine?
Example of Perfect syntax, no meaning-
Chickens feed extravagantly while the moon drinks tea.
This statement is grammatically correct but it does not make any sense. In human language, a perfect balance of syntax and semantics is important for better understanding.
4. What is inverse document frequency?
Answer: To understand inverse document frequency, first we need to understand document frequency.
Document Frequency is the number of documents in which the word occurs irrespective of how many times it has occurred in those documents.
In inverse document frequency, the total number of documents is the numerator and the document frequency is the denominator.
For example, if the corpus contains 3 documents and the word “AMAN” occurs in 2 of them, its document frequency is 2 and its inverse document frequency is 3/2.
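The calculation can be sketched as below. The corpus and function names are hypothetical. The `tfidf` function additionally assumes the common practice of multiplying term frequency by the logarithm of the inverse document frequency; the logarithm (and base 10 in particular) is an assumption of this sketch and is not stated in the answer above.

```python
import math

# Hypothetical corpus of 3 normalized documents.
corpus = [
    "aman and anil are stressed",
    "aman went to a therapist",
    "anil went to download a health chatbot",
]


def document_frequency(word, documents):
    """Number of documents containing the word, however often it occurs in each."""
    return sum(1 for document in documents if word in document.split())


def inverse_document_frequency(word, documents):
    """Total number of documents divided by the document frequency of the word."""
    df = document_frequency(word, documents)
    if df == 0:
        raise ValueError(f"'{word}' does not occur in the corpus")
    return len(documents) / df


def tfidf(word, document, documents):
    """Assumed variant: term frequency times log10 of the inverse document frequency."""
    term_frequency = document.split().count(word)
    return term_frequency * math.log10(inverse_document_frequency(word, documents))


print(document_frequency("aman", corpus))          # occurs in 2 of the 3 documents
print(inverse_document_frequency("aman", corpus))  # 3 / 2
print(tfidf("aman", corpus[0], corpus))
```

A word that occurs in every document would have an inverse document frequency of 1, and under the assumed logarithmic variant its TFIDF value would be 0.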
5. Define the following:
- Stemming
- Lemmatization
Answer: Stemming: Stemming is a rudimentary rule-based process of stripping the suffixes (“ing”, “ly”, “es”, “s” etc) from a word.
Stemming is a process of reducing words to their word stem, base or root form (for example, books — book, looked — look).
Lemmatization: Lemmatization, on the other hand, is an organized, step-by-step procedure for obtaining the root form of a word. It makes use of vocabulary (the dictionary meaning of words) and morphological analysis (word structure and grammar relations).
The aim of lemmatization, like stemming, is to reduce inflectional forms to a common base form. As opposed to stemming, lemmatization does not simply chop off inflections. Instead it uses lexical knowledge bases to get the correct base forms of words.
OR
Stemming is a technique used to extract the base form of the words by removing affixes from them. It is just like cutting down the branches of a tree to its stems. For example, the stem of the words eating, eats, eaten is eat.
Lemmatization is the grouping together of different forms of the same word. In search queries, lemmatization allows end users to query any version of a base word and get relevant results.
OR
Stemming is the process in which the affixes of words are removed and the words are converted to their base form.
In lemmatization, the word we get after affix removal (also known as lemma) is a meaningful one. Lemmatization makes sure that lemma is a word with meaning and hence it takes a longer time to execute than stemming.
OR
Stemming algorithms work by cutting off the end or the beginning of the word, taking into account a list of common prefixes and suffixes that can be found in an inflected word.
Lemmatization on the other hand, takes into consideration the morphological analysis of the words. To do so, it is necessary to have detailed dictionaries which the algorithm can look through to link the form back to its lemma.