Data mining MCQ set 3

Q81: Classification and regression are the properties of___

Answer option

A. data analysis
B. data manipulation
C. data mining
D. none of these

Answer

Answer C. data mining

Q82: A class of learning algorithm that tries to find an optimum classification of a set of examples using the probabilistic theory is named as___

Answer option

A. Bayesian classifiers
B. Dijkstra classifiers
C. doppler classifiers
D. all of these

Answer

Answer A. Bayesian classifiers

Q83: Which of the following can be used for finding deep knowledge?

Answer option

A. stacks
B. algorithms
C. clues
D. none of these

Answer

Answer C. clues

Q84: We define a___as a subdivision of a set of examples into a number of classes

Answer option

A. kingdom
B. tree
C. classification
D. array

Answer

Answer C. classification

Q85: A group of similar objects that differ significantly from other objects is called a___

Answer option

A. classification
B. cluster
C. community
D. none of these

Answer

Answer B. cluster

Q86: Combining different types of methods or information is___

Answer option

A. analysis
B. computation
C. stack
D. hybrid

Answer

Answer D. hybrid

Q87: Which of the following is not a data discretization method?

Answer option

A. Histogram analysis
B. Cluster Analysis
C. Data compression
D. Binning

Answer

Answer C. Data compression
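
Binning, one of the discretization methods listed in this question, can be sketched in a few lines. The following is a minimal equal-width example; the function name `equal_width_bins` and the sample ages are made up for illustration.

```python
def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width intervals (0-based index)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls into the first bin.
        return [0 for _ in values]
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [18, 22, 25, 31, 40, 47, 52, 60]  # hypothetical sample data
print(equal_width_bins(ages, 3))  # [0, 0, 0, 0, 1, 2, 2, 2]
```

Each continuous age is replaced by a small integer label, which is the essence of discretization.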

Q88: Which of the following data mining tasks is known as Market Basket Analysis?

Answer option

A. Association Analysis
B. Regression
C. Classification
D. Outlier Analysis

Answer

Answer A. Association Analysis

Q89: We define a___as a subdivision of a set of examples into a number of classes

Answer option

A. kingdom
B. tree
C. classification
D. array

Answer

Answer C. classification

Q90: What is the name of a database consisting of a set of databases from different vendors, possibly using different database paradigms?

Answer option

A. homogeneous database
B. heterogeneous database
C. hybrid database
D. none of these

Answer

Answer B. heterogeneous database

Q91: What is the strategic value of data mining?

Answer option

A. design sensitive
B. cost sensitive
C. technical sensitive
D. time sensitive

Answer

Answer D. time sensitive

Q92: The amount of information within data as opposed to the amount of redundancy or noise is known as___

Answer option

A. paragraph content
B. text content
C. information content
D. none of these

Answer

Answer C. information content

Q93: What is inductive learning?

Answer option

A. learning by hypothesis
B. learning by analyzing
C. learning by generalizing
D. none of these

Answer

Answer C. learning by generalizing

Q94: Which of the following is true for Classification?

Answer option

A. A subdivision of a set
B. A measure of the accuracy
C. The task of assigning a classification
D. All of these

Answer

Answer A. A subdivision of a set

Q95: This clustering approach initially assumes that each data instance represents a single cluster.

Answer option

A. expectation maximization
B. K-Means clustering
C. agglomerative clustering
D. conceptual clustering

Answer

Answer C. agglomerative clustering
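
The bottom-up idea, where every instance starts as its own cluster and the closest clusters are merged repeatedly, can be sketched as follows. This is a naive single-linkage version on 1-D points, written only for illustration; the function name and the data are assumptions.

```python
def agglomerative_1d(points, n_clusters):
    """Naive single-linkage agglomerative clustering of 1-D points."""
    # Start with every point in its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the two closest clusters into one.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


print(agglomerative_1d([1, 2, 10, 11, 30], 3))  # [[1, 2], [10, 11], [30]]
```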

Q96: The correlation coefficient for two real-valued attributes is negative. What does this tell you?

Answer option

A. The attributes are not linearly related
B. As the value of one attribute decreases the value of the second attribute increases
C. As the value of one attribute increases the value of the second attribute also increases
D. The attributes show a linear relationship

Answer

Answer B. As the value of one attribute decreases the value of the second attribute increases
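
A negative correlation coefficient means the two attributes tend to move in opposite directions. The sketch below computes Pearson's r from its textbook definition; the function name `pearson_r` and the sample values are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# As x increases, y decreases, so r is negative (here exactly -1).
print(pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2]))
```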

Q97: Time Complexity of k-means is given by

Answer option

A. O(mn)
B. O(tkn)
C. O(kn)
D. O(t^2kn)

Answer

Answer B. O(tkn)
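
To see where a cost of O(tkn) comes from, here is a minimal sketch of Lloyd's k-means on 1-D points that counts distance computations: each of the t iterations compares each of the n points with each of the k centroids. The function name, the counter, and the data are assumptions made for illustration.

```python
def kmeans_1d(points, centroids, max_iter=10):
    """Lloyd's algorithm on 1-D points; returns (centroids, distance_computations)."""
    centroids = list(centroids)
    distance_computations = 0
    for _ in range(max_iter):  # t iterations
        groups = [[] for _ in centroids]
        for p in points:  # n points
            dists = [abs(p - c) for c in centroids]  # k distances per point
            distance_computations += len(centroids)
            groups[dists.index(min(dists))].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        new_centroids = [sum(g) / len(g) if g else c
                         for g, c in zip(groups, centroids)]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, distance_computations


# t = 2 iterations, k = 2 centroids, n = 6 points -> 2 * 2 * 6 = 24 distances
print(kmeans_1d([1, 2, 3, 10, 11, 12], [1, 10]))  # ([2.0, 11.0], 24)
```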

Q98: Given a rule of the form IF X THEN Y, rule confidence is defined as the conditional probability that

Answer option

A. Y is false when X is known to be false
B. Y is true when X is known to be true
C. X is true when Y is known to be true
D. X is false when Y is known to be false

Answer

Answer B. Y is true when X is known to be true
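
Rule confidence for IF X THEN Y is usually computed as the fraction of records containing X that also contain Y, that is, P(Y | X). A minimal sketch, assuming transactions are represented as Python sets (the function name and the baskets are hypothetical):

```python
def confidence(transactions, x, y):
    """Estimate P(Y | X): share of transactions with X that also contain Y."""
    x, y = set(x), set(y)
    with_x = [t for t in transactions if x <= t]
    if not with_x:
        raise ValueError("antecedent never occurs in the data")
    with_both = [t for t in with_x if y <= t]
    return len(with_both) / len(with_x)


baskets = [  # hypothetical market-basket data
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]
# 4 baskets contain bread, 3 of those also contain milk.
print(confidence(baskets, {"bread"}, {"milk"}))  # 0.75
```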

Q99: Chameleon is

Answer option

A. Density based clustering algorithm
B. Partitioning based algorithm
C. Model based algorithm
D. Hierarchical clustering algorithm

Answer

Answer D. Hierarchical clustering algorithm

Q100: In___clusterings, points may belong to multiple clusters

Answer option

A. Non exclusive
B. Partial
C. Fuzzy
D. Exclusive

Answer

Answer A. Non exclusive

Q101: Find the odd one out

Answer option

A. DBSCAN
B. K-means
C. PAM
D. K-medoids

Answer

Answer A. DBSCAN

Q102: Which statement is true about the K-Means algorithm?

Answer option

A. The output attribute must be categorical
B. All attribute values must be categorical
C. All attributes must be numeric
D. Attribute values may be either categorical or numeric

Answer

Answer C. All attributes must be numeric

Q103: This data transformation technique works well when minimum and maximum values for a real-valued attribute are known

Answer option

A. z-score normalization
B. min-max normalization
C. logarithmic normalization
D. decimal scaling

Answer

Answer B. min-max normalization
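
Min-max normalization rescales an attribute linearly using its known minimum and maximum, which is why those two values must be available up front. A minimal sketch (the function name, the parameter names, and the sample values are assumptions):

```python
def min_max_normalize(values, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map values from [old_min, old_max] to [new_min, new_max]."""
    if old_max == old_min:
        raise ValueError("old_min and old_max must differ")
    scale = (new_max - new_min) / (old_max - old_min)
    return [new_min + (v - old_min) * scale for v in values]


# Hypothetical attribute known to range from 20 to 60.
print(min_max_normalize([20, 30, 40, 60], old_min=20, old_max=60))
```

The smallest value maps to `new_min`, the largest to `new_max`, and everything else falls proportionally in between.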

Q104: The number of iterations in apriori___

Answer option

A. increases with the size of the data
B. decreases with the increase in size of the data
C. increases with the size of the maximum frequent set
D. decreases with increase in size of the maximum frequent set

Answer

Answer C. increases with the size of the maximum frequent set
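
Apriori works level by level: pass 1 finds frequent 1-itemsets, pass 2 frequent 2-itemsets, and so on, so the number of passes tracks the size of the largest frequent itemset rather than the number of records. The following is a simplified sketch for illustration (the function name and the data are made up, and no attempt is made at efficiency):

```python
from itertools import combinations


def apriori_levels(transactions, min_count):
    """Return a list of levels; level i holds the frequent itemsets of size i + 1."""
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    levels = []
    while candidates:
        # One pass over the data per level: count each candidate's support.
        frequent = [c for c in candidates
                    if sum(1 for t in transactions if c <= t) >= min_count]
        if not frequent:
            break
        levels.append(frequent)
        size = len(frequent[0]) + 1
        # Join step: unions of frequent sets that are exactly one item larger.
        joined = {a | b for a, b in combinations(frequent, 2) if len(a | b) == size}
        # Prune step: every subset one item smaller must itself be frequent.
        known = set(frequent)
        candidates = [c for c in joined
                      if all(frozenset(s) in known for s in combinations(c, size - 1))]
    return levels


data = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"d"}]  # hypothetical
print(len(apriori_levels(data, min_count=2)))      # 3 levels
print(len(apriori_levels(data * 2, min_count=4)))  # still 3 with twice the data
```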

Q105: Which of the following is an interestingness measure for association rules?

Answer option

A. recall
B. lift
C. accuracy
D. compactness

Answer

Answer B. lift
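
Lift compares how often X and Y occur together with how often they would co-occur if they were independent: lift(X -> Y) = P(X and Y) / (P(X) * P(Y)). A minimal sketch, assuming transactions are Python sets (the function name and the baskets are hypothetical):

```python
def lift(transactions, x, y):
    """Lift of the rule X -> Y; values above 1 suggest a positive association."""
    x, y = set(x), set(y)
    n = len(transactions)
    count_x = sum(1 for t in transactions if x <= t)
    count_y = sum(1 for t in transactions if y <= t)
    count_xy = sum(1 for t in transactions if (x | y) <= t)
    if count_x == 0 or count_y == 0:
        raise ValueError("both sides of the rule must occur at least once")
    # (count_xy / n) / ((count_x / n) * (count_y / n)), simplified
    return (count_xy * n) / (count_x * count_y)


baskets = [  # hypothetical market-basket data
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]
print(lift(baskets, {"bread"}, {"milk"}))  # 0.9375
```

A lift of 1 means the two sides look independent, and a value below 1 (as in this toy data) means they co-occur slightly less often than independence would predict.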

Q106: Which one of the following is not a major strength of the neural network approach?

Answer option

A. Neural network learning algorithms are guaranteed to converge to an optimal solution
B. Neural networks work well with datasets containing noisy data
C. Neural networks can be used for both supervised learning and unsupervised clustering
D. Neural networks can be used for applications that require a time element to be included in the data

Answer

Answer A. Neural network learning algorithms are guaranteed to converge to an optimal solution

Q107: Given a frequent itemset L, if |L| = k, then there are

Answer option

A. 2^k - 1 candidate association rules
B. 2^k candidate association rules
C. 2^k - 2 candidate association rules
D. 2k - 2 candidate association rules

Answer

Answer C. 2^k - 2 candidate association rules
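
A frequent itemset with k items yields 2^k - 2 candidate rules, because every non-empty proper subset can serve as the antecedent, with the remaining items as the consequent. The sketch below enumerates them; the function name and the example itemset are illustrative.

```python
from itertools import combinations


def candidate_rules(itemset):
    """List every rule antecedent -> consequent with both sides non-empty."""
    items = sorted(itemset)
    rules = []
    for r in range(1, len(items)):  # antecedent sizes 1 .. k-1
        for antecedent in combinations(items, r):
            consequent = tuple(i for i in items if i not in antecedent)
            rules.append((antecedent, consequent))
    return rules


rules = candidate_rules({"a", "b", "c"})
print(len(rules))  # 6, which equals 2**3 - 2
```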

Q108: ___is an example of case-based learning

Answer option

A. Decision trees
B. Neural networks
C. Genetic algorithm
D. K-nearest neighbor

Answer

Answer D. K-nearest neighbor
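
K-nearest neighbor is case-based in the sense that it builds no model ahead of time: it stores the training cases and classifies a new case by looking up the most similar stored ones. A minimal sketch for 2-D points with majority voting (the function name and the data are assumptions):

```python
from collections import Counter


def knn_predict(training, query, k=3):
    """training: list of ((x, y), label) pairs; returns the majority label."""
    # Rank stored cases by squared Euclidean distance to the query.
    by_distance = sorted(
        training,
        key=lambda ex: (ex[0][0] - query[0]) ** 2 + (ex[0][1] - query[1]) ** 2,
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


cases = [  # hypothetical stored cases
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((8, 8), "b"), ((9, 8), "b"), ((8, 9), "b"),
]
print(knn_predict(cases, (2, 2)))  # a
print(knn_predict(cases, (9, 9)))  # b
```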

Q109: The average positive difference between computed and desired outcome values

Answer option

A. mean positive error
B. mean squared error
C. mean absolute error
D. root mean squared error

Answer

Answer C. mean absolute error
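
Mean absolute error averages the absolute (always non-negative) differences between computed and desired outputs. A minimal sketch with made-up values; the function name is illustrative:

```python
def mean_absolute_error(computed, desired):
    """Average of |computed - desired| over all paired outcomes."""
    if len(computed) != len(desired):
        raise ValueError("sequences must have the same length")
    return sum(abs(c - d) for c, d in zip(computed, desired)) / len(computed)


# Absolute errors are 1, 0, 2 and 1, so the mean is 4 / 4.
print(mean_absolute_error([3, 5, 2, 8], [2, 5, 4, 7]))  # 1.0
```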

Q110: The set of all frequent item sets is a

Answer option

A. Superset of only closed frequent item sets
B. Superset of only maximal frequent item sets
C. Subset of maximal frequent item sets
D. Superset of both closed frequent item sets and maximal frequent item sets

Answer

Answer D. Superset of both closed frequent item sets and maximal frequent item sets
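
The containment can be checked on a small example: an itemset is maximal if no frequent proper superset exists, and closed if no proper superset has the same support. The sketch below assumes the frequent itemsets and their support counts are already known; the function name and the counts are hypothetical.

```python
def closed_and_maximal(frequent):
    """frequent: dict mapping frozenset -> support count of each frequent itemset."""
    closed, maximal = set(), set()
    for itemset, support in frequent.items():
        supersets = [s for s in frequent if itemset < s]  # proper supersets
        if not supersets:
            maximal.add(itemset)
        # Any superset with equal support would itself be frequent,
        # so checking the frequent supersets is sufficient here.
        if all(frequent[s] < support for s in supersets):
            closed.add(itemset)
    return closed, maximal


frequent = {  # hypothetical support counts
    frozenset("a"): 3, frozenset("b"): 3, frozenset("c"): 2,
    frozenset("ab"): 3, frozenset("ac"): 2, frozenset("bc"): 2,
    frozenset("abc"): 2,
}
closed, maximal = closed_and_maximal(frequent)
print(maximal <= closed <= set(frequent))  # True
```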

Q111: Assume that we have a dataset containing information about 200 individuals. A supervised data mining session has discovered the following rule

IF age < 30 & credit card insurance = yes THEN life insurance = yes

Rule Accuracy: 70% and Rule Coverage: 63%

How many individuals in the class life insurance = no have credit card insurance and are less than 30 years old?

Answer option

A. 63
B. 30
C. 38
D. 70

Answer

Answer C. 38
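
One way to work through the arithmetic, under the assumption that rule coverage is the share of all 200 records matching the IF part and rule accuracy is the share of those matches that satisfy the THEN part (some textbooks define coverage relative to the class size instead, which would need a figure this question does not give):

```python
def rule_counts(n_records, coverage, accuracy):
    """Return (covered, correct, incorrect) record counts for a rule.

    Assumes coverage is measured against all records; this is an
    interpretation, not something the question states explicitly.
    """
    covered = n_records * coverage   # records matching the IF part
    correct = covered * accuracy     # of those, THEN part holds
    incorrect = covered - correct    # of those, THEN part fails
    return covered, correct, incorrect


covered, correct, incorrect = rule_counts(200, 0.63, 0.70)
# About 126 records match the IF part; roughly 30% of them (about 37.8)
# have life insurance = no, which rounds to 38.
print(round(covered), round(incorrect))  # 126 38
```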

Q112: Value set {poor, average, good, excellent} is an example of

Answer option

A. Nominal attribute
B. Numeric attribute
C. Continuous attribute
D. Ordinal attribute

Answer

Answer D. Ordinal attribute

Q113: Removing duplicate records is a data mining process called___

Answer option

A. data isolation
B. recovery
C. data pruning
D. data cleaning

Answer

Answer D. data cleaning

Q114: Various visualization techniques are used in the___step of KDD

Answer option

A. selection
B. interpretation
C. transformation
D. data mining

Answer

Answer B. interpretation

Q115: Which of the following is not a Visualization Method?

Answer option

A. Hierarchical visualization technique
B. Tuple based visualization Technique
C. Icon based visualization techniques
D. Pixel oriented visualization technique

Answer

Answer B. Tuple based visualization Technique

Q116: Data set {brown, black, blue, green, red} is an example of

Answer option

A. Continuous attribute
B. Ordinal attribute
C. Numeric attribute
D. Nominal attribute

Answer

Answer D. Nominal attribute

Q117: Which of the following is NOT a data quality related issue?

Answer option

A. Attribute value range
B. Outlier records
C. Missing values
D. Duplicate records

Answer

Answer A. Attribute value range

Q118: To detect fraudulent usage of credit cards, the following data mining task should be used

Answer option

A. Outlier analysis
B. prediction
C. association analysis
D. feature selection

Answer

Answer A. Outlier analysis

Q119: Which of the following is NOT an example of an ordinal attribute?

Answer option

A. Ordered numbers
B. Military ranks
C. Zip codes
D. Movie ratings

Answer

Answer C. Zip codes

Q120: Which of the following is not a data pre-processing method?

Answer option

A. Data Cleaning
B. Data Visualization
C. Data Discretization
D. Data Reduction

Answer

Answer B. Data Visualization

DistPub Team

Distance Publisher (DistPub.com) has provided project writing help since 2007 and provides writing and editing help to hundreds of students every year.