Data mining mcq set 3

Q81: Classification and regression are the properties of___

Answer option

  • A. data analysis
  • B. data manipulation
  • C. data mining
  • D. none of these
Answer

Answer C. data mining

Q82: A class of learning algorithm that tries to find an optimum classification of a set of examples using the probabilistic theory is named as___

Answer option

  • A. Bayesian classifiers
  • B. Dijkstra classifiers
  • C. doppler classifiers
  • D. all of these
Answer

Answer A. Bayesian classifiers

Q83: Which of the following can be used for finding deep knowledge?

Answer option

  • A. stacks
  • B. algorithms
  • C. clues
  • D. none of these
Answer

Answer C. clues

Q84: We define a___as a subdivision of a set of examples into a number of classes

Answer option

  • A. kingdom
  • B. tree
  • C. classification
  • D. array
Answer

Answer C. classification

Q85: Group of similar objects that differ significantly from other objects is named as___

Answer option

  • A. classification
  • B. cluster
  • C. community
  • D. none of these
Answer

Answer B. cluster

Q86: Combining different types of methods or information is___

Answer option

  • A. analysis
  • B. computation
  • C. stack
  • D. hybrid
Answer

Answer D. hybrid

Q87: Which of the following is not a data discretization method?

Answer option

  • A. Histogram analysis
  • B. Cluster Analysis
  • C. Data compression
  • D. Binning
Answer

Answer C. Data compression
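
For comparison, binning is a discretization method because it maps continuous values to a small number of intervals. The following is a minimal sketch of equal-width binning; the function name `equal_width_bins` and the sample ages are illustrative assumptions, not part of the quiz.

```python
def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width intervals (0-based index)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls in the first bin.
        return [0] * len(values)
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical data: five ages split into three bins of width 14.
ages = [18, 22, 35, 47, 60]
bins = equal_width_bins(ages, 3)
print(bins)  # [0, 0, 1, 2, 2]
```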

Q88: Which of the following data mining tasks is known as Market Basket Analysis?

Answer option

  • A. Association Analysis
  • B. Regression
  • C. Classification
  • D. Outlier Analysis
Answer

Answer A. Association Analysis

Q89: We define a___as a subdivision of a set of examples into a number of classes

Answer option

  • A. kingdom
  • B. tree
  • C. classification
  • D. array
Answer

Answer C. classification

Q90: What is the name of a database made up of a set of databases from different vendors, possibly using different database paradigms?

Answer option

  • A. homogeneous database
  • B. heterogeneous database
  • C. hybrid database
  • D. none of these
Answer

Answer B. heterogeneous database

Q91: What is the strategic value of data mining?

Answer option

  • A. design sensitive
  • B. cost sensitive
  • C. technical sensitive
  • D. time sensitive
Answer

Answer D. time sensitive

Q92: The amount of information within data as opposed to the amount of redundancy or noise is known as___

Answer option

  • A. paragraph content
  • B. text content
  • C. information content
  • D. none of these
Answer

Answer C. information content

Q93: What is inductive learning?

Answer option

  • A. learning by hypothesis
  • B. learning by analyzing
  • C. learning by generalizing
  • D. none of these
Answer

Answer C. learning by generalizing

Q94: Which of the following is true for Classification?

Answer option

  • A. A subdivision of a set
  • B. A measure of the accuracy
  • C. The task of assigning a classification
  • D. All of these
Answer

Answer A. A subdivision of a set

Q95: This clustering approach initially assumes that each data instance represents a single cluster.

Answer option

  • A. expectation maximization
  • B. K-Means clustering
  • C. agglomerative clustering
  • D. conceptual clustering
Answer

Answer C. agglomerative clustering

Q96: The correlation coefficient for two real-valued attributes is negative. What does this value tell you?

Answer option

  • A. The attributes are not linearly related
  • B. As the value of one attribute decreases the value of the second attribute increases
  • C. As the value of one attribute increases the value of the second attribute also increases
  • D. The attributes show a linear relationship
Answer

Answer B. As the value of one attribute decreases the value of the second attribute increases
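
To see why a negative coefficient means the attributes move in opposite directions, here is a minimal sketch of Pearson's correlation coefficient. The function name `pearson_r` and the sample data are assumptions made for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: y tends to fall as x rises.
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 1]
r = pearson_r(x, y)
print(r < 0)  # True
```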

Q97: Time Complexity of k-means is given by

Answer option

  • A. O(mn)
  • B. O(tkn)
  • C. O(kn)
  • D. O(t^2 kn)
Answer

Answer B. O(tkn)

Q98: Given a rule of the form IF X THEN Y, rule confidence is defined as the conditional probability that

Answer option

  • A. Y is false when X is known to be false
  • B. Y is true when X is known to be true
  • C. X is true when Y is known to be true
  • D. X is false when Y is known to be false
Answer

Answer B. Y is true when X is known to be true
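
Rule confidence for IF X THEN Y is the conditional probability of Y given X. A minimal sketch that estimates it from transactions follows; the function name `confidence` and the basket data are hypothetical.

```python
def confidence(transactions, x, y):
    """Estimate P(Y | X): share of transactions containing X that also contain Y."""
    x, y = set(x), set(y)
    with_x = [t for t in transactions if x <= t]
    if not with_x:
        raise ValueError("antecedent never occurs in the data")
    return sum(1 for t in with_x if y <= t) / len(with_x)


# Hypothetical baskets: bread appears in 3, and 2 of those also contain milk.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
conf = confidence(baskets, {"bread"}, {"milk"})
print(round(conf, 3))  # 0.667
```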

Q99: Chameleon is

Answer option

  • A. Density based clustering algorithm
  • B. Partitioning based algorithm
  • C. Model based algorithm
  • D. Hierarchical clustering algorithm
Answer

Answer D. Hierarchical clustering algorithm

Q100: In___clusterings, points may belong to multiple clusters

Answer option

  • A. Non exclusive
  • B. Partial
  • C. Fuzzy
  • D. Exclusive
Answer

Answer A. Non exclusive

Q101: Find odd man out

Answer option

  • A. DBSCAN
  • B. K-means
  • C. PAM
  • D. K-medoids
Answer

Answer A. DBSCAN

Q102: Which statement is true about the K-Means algorithm?

Answer option

  • A. The output attribute must be categorical
  • B. All attribute values must be categorical
  • C. All attributes must be numeric
  • D. Attribute values may be either categorical or numeric
Answer

Answer C. All attributes must be numeric

Q103: This data transformation technique works well when minimum and maximum values for a real-valued attribute are known

Answer option

  • A. z-score normalization
  • B. min-max normalization
  • C. logarithmic normalization
  • D. decimal scaling
Answer

Answer B. min-max normalization
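
Min-max normalization rescales a value linearly using the known minimum and maximum of the attribute. A minimal sketch follows, assuming a target range of [0, 1] by default; the function name, parameter names and income figures are illustrative.

```python
def min_max_normalize(values, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map values from [old_min, old_max] to [new_min, new_max]."""
    if old_max == old_min:
        raise ValueError("old_min and old_max must differ")
    span = old_max - old_min
    return [
        (v - old_min) / span * (new_max - new_min) + new_min
        for v in values
    ]


# Hypothetical attribute with known bounds 12000 and 98000.
incomes = [12000, 55000, 98000]
scaled = min_max_normalize(incomes, old_min=12000, old_max=98000)
print(scaled)  # [0.0, 0.5, 1.0]
```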

Q104: The number of iterations in Apriori___

Answer option

  • A. increases with the size of the data
  • B. decreases with the increase in size of the data
  • C. increases with the size of the maximum frequent set
  • D. decreases with increase in size of the maximum frequent set
Answer

Answer C. increases with the size of the maximum frequent set

Q105: Which of the following is an interestingness measure for association rules?

Answer option

  • A. recall
  • B. lift
  • C. accuracy
  • D. compactness
Answer

Answer B. lift
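
Lift compares how often X and Y occur together with how often they would co-occur if they were independent: lift = support(X ∪ Y) / (support(X) × support(Y)). A minimal sketch follows; the helper names and basket data are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def lift(transactions, x, y):
    """Lift of the rule X -> Y; assumes X and Y each occur at least once."""
    joint = support(transactions, set(x) | set(y))
    return joint / (support(transactions, x) * support(transactions, y))


# Hypothetical baskets: support(bread) = support(milk) = 3/4, joint = 2/4.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
value = lift(baskets, {"bread"}, {"milk"})
print(value < 1)  # True: bread and milk co-occur less than independence predicts
```

A lift above 1 suggests a positive association, a value near 1 suggests independence, and a value below 1 suggests a negative association.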

Q106: Which one of the following is not a major strength of the neural network approach?

Answer option

  • A. Neural network learning algorithms are guaranteed to converge to an optimal solution
  • B. Neural networks work well with datasets containing noisy data
  • C. Neural networks can be used for both supervised learning and unsupervised clustering
  • D. Neural networks can be used for applications that require a time element to be included in the data
Answer

Answer A. Neural network learning algorithms are guaranteed to converge to an optimal solution

Q107: Given a frequent itemset L, If |L| = k, then there are

Answer option

  • A. 2^k - 1 candidate association rules
  • B. 2^k candidate association rules
  • C. 2^k - 2 candidate association rules
  • D. 2k - 2 candidate association rules
Answer

Answer C. 2^k - 2 candidate association rules
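
Each candidate rule splits L into a non-empty antecedent and a non-empty consequent, so a k-itemset yields 2^k - 2 candidate rules (the two splits with an empty side are ignored). The following sketch enumerates them to check the count; the function name `candidate_rules` is an illustrative assumption.

```python
from itertools import combinations


def candidate_rules(itemset):
    """List every (antecedent, consequent) split with both sides non-empty."""
    items = sorted(itemset)
    rules = []
    for size in range(1, len(items)):  # sizes 1 .. k-1 keep both sides non-empty
        for antecedent in combinations(items, size):
            consequent = tuple(i for i in items if i not in antecedent)
            rules.append((antecedent, consequent))
    return rules


rules = candidate_rules({"a", "b", "c"})
print(len(rules))  # 6, which equals 2**3 - 2
```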

Q108: ___is an example of case-based learning

Answer option

  • A. Decision trees
  • B. Neural networks
  • C. Genetic algorithm
  • D. K-nearest neighbor
Answer

Answer D. K-nearest neighbor

Q109: The average positive difference between computed and desired outcome values

Answer option

  • A. mean positive error
  • B. mean squared error
  • C. mean absolute error
  • D. root mean squared error
Answer

Answer C. mean absolute error
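
Mean absolute error averages the absolute (always non-negative) differences between desired and computed outputs. A minimal sketch follows; the function name and sample values are illustrative.

```python
def mean_absolute_error(desired, computed):
    """Average of |desired - computed| over paired outcome values."""
    if len(desired) != len(computed):
        raise ValueError("sequences must have the same length")
    return sum(abs(d - c) for d, c in zip(desired, computed)) / len(desired)


# Hypothetical outcomes: the absolute errors are 0.5, 0.0 and 2.0.
mae = mean_absolute_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(round(mae, 4))  # 0.8333
```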

Q110: The set of frequent item sets is a

Answer option

  • A. Superset of only closed frequent item sets
  • B. Superset of only maximal frequent item sets
  • C. Subset of maximal frequent item sets
  • D. Superset of both closed frequent item sets and maximal frequent item sets
Answer

Answer D. Superset of both closed frequent item sets and maximal frequent item sets
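
The containment can be checked on a small example: an item set is maximal if no proper superset is frequent, and closed if no proper superset has the same support. The sketch below assumes a hypothetical dictionary of frequent item sets with support counts (minimum support 2); the function name is illustrative.

```python
def closed_and_maximal(frequent):
    """Split frequent item sets (frozenset -> support count) into closed and maximal."""
    closed, maximal = set(), set()
    for itemset, count in frequent.items():
        # Proper supersets that are themselves frequent.
        supersets = [s for s in frequent if itemset < s]
        if not supersets:
            maximal.add(itemset)
        # A superset with equal support would also be frequent, so this check suffices.
        if all(frequent[s] < count for s in supersets):
            closed.add(itemset)
    return closed, maximal


# Hypothetical frequent item sets with their support counts.
frequent = {
    frozenset({"bread"}): 3,
    frozenset({"milk"}): 3,
    frozenset({"butter"}): 2,
    frozenset({"bread", "milk"}): 2,
    frozenset({"bread", "butter"}): 2,
}
closed, maximal = closed_and_maximal(frequent)
print(maximal <= closed <= set(frequent))  # True
```

Here {butter} is frequent but not closed, because {bread, butter} has the same support.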

Q111: Assume that we have a dataset containing information about 200 individuals. A supervised data mining session has discovered the following rule

IF age < 30 & credit card insurance = yes THEN life insurance = yes

Rule Accuracy: 70% and Rule Coverage: 63%

How many individuals in the class life insurance = no have credit card insurance and are less than 30 years old?

Answer option

  • A. 63
  • B. 30
  • C. 38
  • D. 70
Answer

Answer A. 63

Q112: Value set {poor, average, good, excellent} is an example of

Answer option

  • A. Nominal attribute
  • B. Numeric attribute
  • C. Continuous attribute
  • D. Ordinal attribute
Answer

Answer D. Ordinal attribute

Q113: Removing duplicate records is a data mining process called___

Answer option

  • A. data isolation
  • B. recovery
  • C. data pruning
  • D. data cleaning
Answer

Answer D. data cleaning

Q114: Various visualization techniques are used in the___step of KDD

Answer option

  • A. selection
  • B. interpretation
  • C. transformation
  • D. data mining
Answer

Answer B. interpretation

Q115: Which of the following is not a Visualization Method?

Answer option

  • A. Hierarchical visualization technique
  • B. Tuple based visualization Technique
  • C. Icon based visualization techniques
  • D. Pixel oriented visualization technique
Answer

Answer B. Tuple based visualization Technique

Q116: Data set {brown, black, blue, green, red} is an example of

Answer option

  • A. Continuous attribute
  • B. Ordinal attribute
  • C. Numeric attribute
  • D. Nominal attribute
Answer

Answer D. Nominal attribute

Q117: Which of the following is NOT a data quality related issue?

Answer option

  • A. Attribute value range
  • B. Outlier records
  • C. Missing values
  • D. Duplicate records
Answer

Answer A. Attribute value range

Q118: To detect fraudulent usage of credit cards, the following data mining task should be used

Answer option

  • A. Outlier analysis
  • B. prediction
  • C. association analysis
  • D. feature selection
Answer

Answer A. Outlier analysis

Q119: Which of the following is NOT an example of ordinal attributes?

Answer option

  • A. Ordered numbers
  • B. Military ranks
  • C. Zip codes
  • D. Movie ratings
Answer

Answer C. Zip codes

Q120: Which of the following is not a data pre-processing method?

Answer option

  • A. Data Cleaning
  • B. Data Visualization
  • C. Data Discretization
  • D. Data Reduction
Answer

Answer B. Data Visualization

DistPub Team

Distance Publisher (DistPub.com) has provided project writing help since 2007 and gives writing and editing help to hundreds of students every year.