Q81: Classification and regression are the properties of___
Answer option
- A. data analysis
- B. data manipulation
- C. data mining
- D. none of these
Answer
Answer C. data mining
Q82: A class of learning algorithm that tries to find an optimum classification of a set of examples using the probabilistic theory is named as___
Answer option
- A. Bayesian classifiers
- B. Dijkstra classifiers
- C. doppler classifiers
- D. all of these
Answer
Answer A. Bayesian classifiers
Q83: Which of the following can be used for finding deep knowledge?
Answer option
- A. stacks
- B. algorithms
- C. clues
- D. none of these
Answer
Answer C. clues
Q84: We define a___as a subdivision of a set of examples into a number of classes
Answer option
- A. kingdom
- B. tree
- C. classification
- D. array
Answer
Answer C. classification
Q85: Group of similar objects that differ significantly from other objects is named as___
Answer option
- A. classification
- B. cluster
- C. community
- D. none of these
Answer
Answer B. cluster
Q86: Combining different types of methods or information is___
Answer option
- A. analysis
- B. computation
- C. stack
- D. hybrid
Answer
Answer D. hybrid
Q87: Which of the following is not a data discretization method?
Answer option
- A. Histogram analysis
- B. Cluster Analysis
- C. Data compression
- D. Binning
Answer
Answer C. Data compression
Q88: Which of the following data mining tasks is known as Market Basket Analysis?
Answer option
- A. Association Analysis
- B. Regression
- C. Classification
- D. Outlier Analysis
Answer
Answer A. Association Analysis
Q89: We define a___as a subdivision of a set of examples into a number of classes
Answer option
- A. kingdom
- B. tree
- C. classification
- D. array
Answer
Answer C. classification
Q90: What is the name of a database made up of a set of databases from different vendors, possibly using different database paradigms?
Answer option
- A. homogeneous database
- B. heterogeneous database
- C. hybrid database
- D. none of these
Answer
Answer B. heterogeneous database
Q91: What is the strategic value of data mining?
Answer option
- A. design sensitive
- B. cost sensitive
- C. technical sensitive
- D. time sensitive
Answer
Answer D. time sensitive
Q92: The amount of information within data as opposed to the amount of redundancy or noise is known as___
Answer option
- A. paragraph content
- B. text content
- C. information content
- D. none of these
Answer
Answer C. information content
Q93: What is inductive learning?
Answer option
- A. learning by hypothesis
- B. learning by analyzing
- C. learning by generalizing
- D. none of these
Answer
Answer C. learning by generalizing
Q94: Which of the following is true for Classification?
Answer option
- A. A subdivision of a set
- B. A measure of the accuracy
- C. The task of assigning a classification
- D. All of these
Answer
Answer A. A subdivision of a set
Q95: This clustering approach initially assumes that each data instance represents a single cluster.
Answer option
- A. expectation maximization
- B. K-Means clustering
- C. agglomerative clustering
- D. conceptual clustering
Answer
Answer C. agglomerative clustering
Q96: The correlation coefficient for two real-valued attributes is [value missing]. What does this value tell you?
Answer option
- A. The attributes are not linearly related
- B. As the value of one attribute decreases the value of the second attribute increases
- C. As the value of one attribute increases the value of the second attribute also increases
- D. The attributes show a linear relationship
Answer
Answer B. As the value of one attribute decreases the value of the second attribute increases
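The sign of a correlation coefficient can be checked numerically. The sketch below is an illustration only: the data are made up and `pearson` is a hypothetical helper, not something taken from the quiz. A negative result means one attribute tends to rise as the other falls.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up data: y falls as x rises, so r should be negative.
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 1]
r = pearson(x, y)
print(r < 0)
```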
Q97: Time Complexity of k-means is given by
Answer option
- A. O(mn)
- B. O(tkn)
- C. O(kn)
- D. O(t^2kn)
Answer
Answer B. O(tkn)
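The O(tkn) bound comes from three nested loops: t iterations, n points, and k centroids per point. A minimal sketch on one-dimensional data, counting distance evaluations, makes this visible. The function name, the data, and the fixed iteration count are assumptions chosen for illustration.

```python
def kmeans_1d(points, centroids, iterations):
    """Run a fixed number of k-means iterations on 1-D data, counting distance evaluations."""
    centroids = list(centroids)
    distance_evals = 0
    for _ in range(iterations):                  # t iterations
        clusters = [[] for _ in centroids]
        for p in points:                         # n points
            best, best_dist = 0, None
            for i, c in enumerate(centroids):    # k centroids
                d = abs(p - c)
                distance_evals += 1
                if best_dist is None or d < best_dist:
                    best, best_dist = i, d
            clusters[best].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, distance_evals

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, evals = kmeans_1d(points, centroids=[0.0, 5.0], iterations=4)
print(centroids, evals)
```

With t = 4, k = 2 and n = 6 the counter reaches 4 * 2 * 6 distance evaluations.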
Q98: Given a rule of the form IF X THEN Y, rule confidence is defined as the conditional probability that
Answer option
- A. Y is false when X is known to be false
- B. Y is true when X is known to be true
- C. X is true when Y is known to be true
- D. X is false when Y is known to be false
Answer
Answer B. Y is true when X is known to be true
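Rule confidence for IF X THEN Y is the conditional probability P(Y | X), which can be estimated from transaction counts. The sketch below assumes a small made-up basket list; `confidence` is a hypothetical helper.

```python
def confidence(transactions, x, y):
    """Estimate P(Y | X): among transactions containing all of X,
    the fraction that also contain all of Y."""
    x, y = set(x), set(y)
    with_x = [t for t in transactions if x <= set(t)]
    if not with_x:
        raise ValueError("antecedent never occurs")
    return sum(1 for t in with_x if y <= set(t)) / len(with_x)

# Made-up baskets for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(confidence(baskets, {"bread"}, {"milk"}))
```

Here bread appears in three baskets and two of those also contain milk, so the estimate is 2/3.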
Q99: Chameleon is
Answer option
- A. Density based clustering algorithm
- B. Partitioning based algorithm
- C. Model based algorithm
- D. Hierarchical clustering algorithm
Answer
Answer D. Hierarchical clustering algorithm
Q100: In___clusterings, points may belong to multiple clusters
Answer option
- A. Non exclusive
- B. Partial
- C. Fuzzy
- D. Exclusive
Answer
Answer A. Non exclusive
Q101: Find odd man out
Answer option
- A. DBSCAN
- B. K-means
- C. PAM
- D. K-medoids
Answer
Answer A. DBSCAN
Q102: Which statement is true about the K-Means algorithm?
Answer option
- A. The output attribute must be categorical
- B. All attribute values must be categorical
- C. All attributes must be numeric
- D. Attribute values may be either categorical or numeric
Answer
Answer C. All attributes must be numeric
Q103: This data transformation technique works well when minimum and maximum values for a real-valued attribute are known
Answer option
- A. z-score normalization
- B. min-max normalization
- C. logarithmic normalization
- D. decimal scaling
Answer
Answer B. min-max normalization
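Min-max normalization rescales a value linearly using the known minimum and maximum of the attribute, which is why it needs both bounds in advance. A minimal sketch follows; the function name and the income figures are made up for illustration.

```python
def min_max_normalize(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map value from [old_min, old_max] onto [new_min, new_max]."""
    if old_max == old_min:
        raise ValueError("old_min and old_max must differ")
    scaled = (value - old_min) / (old_max - old_min)
    return scaled * (new_max - new_min) + new_min

# Made-up incomes with a known range of 12000..98000.
incomes = [12000, 54000, 98000]
scaled = [min_max_normalize(v, 12000, 98000) for v in incomes]
print(scaled)
```

The smallest value maps to the lower bound of the target range and the largest to the upper bound.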
Q104: The number of iterations in apriori___
Answer option
- A. increases with the size of the data
- B. decreases with the increase in size of the data
- C. increases with the size of the maximum frequent set
- D. decreases with increase in size of the maximum frequent set
Answer
Answer C. increases with the size of the maximum frequent set
Q105: Which of the following is an interestingness measure for association rules?
Answer option
- A. recall
- B. lift
- C. accuracy
- D. compactness
Answer
Answer B. lift
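Lift compares how often X and Y occur together with how often they would co-occur if they were independent. A value above 1 suggests a positive association. The sketch below uses made-up baskets; `support` and `lift` are hypothetical helper names.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)

def lift(transactions, x, y):
    """lift(X -> Y) = support(X and Y) / (support(X) * support(Y))."""
    joint = support(transactions, set(x) | set(y))
    return joint / (support(transactions, x) * support(transactions, y))

# Made-up baskets for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread"},
    {"eggs"},
]
print(lift(baskets, {"bread"}, {"milk"}))
```

Here support(bread) = 3/4, support(milk) = 2/4 and their joint support is 2/4, so the lift is 4/3.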
Q106: Which one of the following is not a major strength of the neural network approach?
Answer option
- A. Neural network learning algorithms are guaranteed to converge to an optimal solution
- B. Neural networks work well with datasets containing noisy data
- C. Neural networks can be used for both supervised learning and unsupervised clustering
- D. Neural networks can be used for applications that require a time element to be included in the data
Answer
Answer A. Neural network learning algorithms are guaranteed to converge to an optimal solution
Q107: Given a frequent itemset L, if |L| = k, then there are
Answer option
- A. 2^k - 1 candidate association rules
- B. 2^k candidate association rules
- C. 2^k - 2 candidate association rules
- D. 2k - 2 candidate association rules
Answer
Answer C. 2^k - 2 candidate association rules
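For a frequent itemset of size k, every non-empty proper subset can serve as a rule antecedent, with the remaining items as the consequent. That gives 2^k - 2 candidate rules, since the empty set and the full set are excluded. A short enumeration, with a hypothetical helper name, confirms the count:

```python
from itertools import combinations

def candidate_rules(itemset):
    """List (antecedent, consequent) pairs where both sides are non-empty."""
    items = sorted(itemset)
    rules = []
    for size in range(1, len(items)):            # antecedent sizes 1 .. k-1
        for antecedent in combinations(items, size):
            consequent = tuple(i for i in items if i not in antecedent)
            rules.append((antecedent, consequent))
    return rules

rules = candidate_rules({"a", "b", "c"})
print(len(rules))  # 6, which equals 2**3 - 2
```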
Q108: ___is an example of case-based learning
Answer option
- A. Decision trees
- B. Neural networks
- C. Genetic algorithm
- D. K-nearest neighbor
Answer
Answer D. K-nearest neighbor
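K-nearest neighbor is case-based in the sense that it stores the training cases and classifies a new instance by looking up the most similar stored ones. A minimal sketch follows, assuming squared Euclidean distance and majority voting; the data and the name `knn_predict` are made up for illustration.

```python
from collections import Counter

def knn_predict(cases, query, k=3):
    """cases: list of (feature_tuple, label) pairs.
    Returns the majority label among the k stored cases nearest to query."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    nearest = sorted(cases, key=lambda case: sq_dist(case[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up stored cases.
cases = [
    ((1.0, 1.0), "low"), ((1.2, 0.8), "low"), ((0.9, 1.1), "low"),
    ((5.0, 5.0), "high"), ((5.2, 4.9), "high"), ((4.8, 5.1), "high"),
]
print(knn_predict(cases, (1.1, 1.0)))
```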
Q109: The average positive difference between computed and desired outcome values is the___
Answer option
- A. mean positive error
- B. mean squared error
- C. mean absolute error
- D. root mean squared error
Answer
Answer C. mean absolute error
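The difference between the error measures in the options is easiest to see side by side. Mean absolute error averages the positive (absolute) differences, while mean squared error averages their squares. The values below are made up for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average of (actual - predicted) squared."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Made-up desired and computed outcomes; the absolute errors are 1, 0, 2, 3.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 4.0]
print(mean_absolute_error(actual, predicted))  # 1.5
print(mean_squared_error(actual, predicted))   # 3.5
```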
Q110: The set of frequent item sets is a
Answer option
- A. Superset of only closed frequent item sets
- B. Superset of only maximal frequent item sets
- C. Subset of maximal frequent item sets
- D. Superset of both closed frequent item sets and maximal frequent item sets
Answer
Answer D. Superset of both closed frequent item sets and maximal frequent item sets
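The containment can be checked by brute force on a tiny example: every maximal frequent item set is closed, and every closed frequent item set is frequent. The sketch below assumes a made-up transaction list and a minimum support count of 2; the helper names are hypothetical, and the exhaustive enumeration is only practical for toy data.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_count):
    """Brute force: map each frequent item set (frozenset) to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            s = frozenset(combo)
            count = sum(1 for t in transactions if s <= t)
            if count >= min_count:
                result[s] = count
    return result

def closed_and_maximal(frequent):
    """Split out the closed and the maximal frequent item sets."""
    closed, maximal = set(), set()
    for s, count in frequent.items():
        supersets = [t for t in frequent if s < t]
        if not supersets:
            maximal.add(s)   # no frequent proper superset
        if all(frequent[t] < count for t in supersets):
            closed.add(s)    # no proper superset with the same support
    return closed, maximal

transactions = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"a"}]
frequent = frequent_itemsets(transactions, min_count=2)
closed, maximal = closed_and_maximal(frequent)
print(len(frequent), len(closed), len(maximal))
```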
Q111: Assume that we have a dataset containing information about 200 individuals. A supervised data mining session has discovered the following rule:
IF age < 30 & credit card insurance = yes THEN life insurance = yes
Rule Accuracy: 70% and Rule Coverage: 63%
How many individuals in the class life insurance = no have credit card insurance and are less than 30 years old?
Answer option
- A. 63
- B. 30
- C. 38
- D. 70
Answer
Answer C. 38
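One way to work Q111 through is sketched below. It assumes that rule coverage means the share of all 200 instances that satisfy the IF part, and that rule accuracy is the share of those matching instances with life insurance = yes. This is an assumption: some texts define coverage relative to the target class instead, which would need the class size, and that is not given here.

```python
total = 200
coverage = 0.63   # assumed: fraction of all instances matching the IF part
accuracy = 0.70   # fraction of matching instances with life insurance = yes

matching = coverage * total               # about 126 instances match the antecedent
matching_no = matching * (1 - accuracy)   # matching instances in class "no", about 37.8
print(round(matching_no))
```

Under this reading, about 126 individuals are under 30 with credit card insurance, and roughly 30% of them (about 38) fall in the class life insurance = no.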
Q112: Value set {poor, average, good, excellent} is an example of
Answer option
- A. Nominal attribute
- B. Numeric attribute
- C. Continuous attribute
- D. Ordinal attribute
Answer
Answer D. Ordinal attribute
Q113: Removing duplicate records is a data mining process called___
Answer option
- A. data isolation
- B. recovery
- C. data pruning
- D. data cleaning
Answer
Answer D. data cleaning
Q114: Various visualization techniques are used in___step of KDD
Answer option
- A. selection
- B. interpretation
- C. transformation
- D. data mining
Answer
Answer B. interpretation
Q115: Which of the following is not a Visualization Method?
Answer option
- A. Hierarchical visualization technique
- B. Tuple based visualization Technique
- C. Icon based visualization techniques
- D. Pixel oriented visualization technique
Answer
Answer B. Tuple based visualization Technique
Q116: Data set {brown, black, blue, green, red} is an example of
Answer option
- A. Continuous attribute
- B. Ordinal attribute
- C. Numeric attribute
- D. Nominal attribute
Answer
Answer D. Nominal attribute
Q117: Which of the following is NOT a data quality related issue?
Answer option
- A. Attribute value range
- B. Outlier records
- C. Missing values
- D. Duplicate records
Answer
Answer A. Attribute value range
Q118: To detect fraudulent usage of credit cards, the following data mining task should be used
Answer option
- A. Outlier analysis
- B. prediction
- C. association analysis
- D. feature selection
Answer
Answer A. Outlier analysis
Q119: Which of the following is NOT an example of an ordinal attribute?
Answer option
- A. Ordered numbers
- B. Military ranks
- C. Zip codes
- D. Movie ratings
Answer
Answer C. Zip codes
Q120: Which of the following is not a data pre-processing method?
Answer option
- A. Data Cleaning
- B. Data Visualization
- C. Data Discretization
- D. Data Reduction
Answer
Answer B. Data Visualization