Data Mining MCQ Set 1

Q1: All of the following accurately describe Hadoop, EXCEPT

Answer option

  • A. Open source
  • B. Real-time
  • C. Java-based
  • D. Distributed computing approach
Answer

Answer B. Real-time

Q2: ___ has the world’s largest Hadoop cluster

Answer option

  • A. Apple
  • B. Datamatics
  • C. Facebook
  • D. None of the mentioned
Answer

Answer C. Facebook

Q3: What are the five V’s of Big Data?

Answer option

  • A. Volume
  • B. Velocity
  • C. Variety
  • D. All the above
Answer

Answer D. All the above

Q4: ___ hides the limitations of Java behind a powerful and concise Clojure API for Cascading

Answer option

  • A. Scalding
  • B. Cascalog
  • C. Hcatalog
  • D. Hcalding
Answer

Answer B. Cascalog

Q5: What are the main components of Big Data?

Answer option

  • A. MapReduce
  • B. HDFS
  • C. YARN
  • D. All of these
Answer

Answer D. All of these

Q6: What are the different features of Big Data Analytics?

Answer option

  • A. Open-Source
  • B. Scalability
  • C. Data Recovery
  • D. All the above
Answer

Answer D. All the above

Q7: Define the Port Numbers for NameNode, Task Tracker and Job Tracker

Answer option

  • A. NameNode
  • B. Task Tracker
  • C. Job Tracker
  • D. All of the above
Answer

Answer D. All of the above

Q8: This is an approach to selling goods and services in which a prospect explicitly agrees in advance to receive marketing information

Answer option

  • A. customer managed relationship
  • B. data mining
  • C. permission marketing
  • D. one-to-one marketing
  • E. batch processing
Answer

Answer C. permission marketing

Q9: This is an XML-based metalanguage developed by the Business Process Management Initiative (BPMI) as a means of modeling business processes, much as XML is, itself, a metalanguage with the ability to model enterprise data

Answer option

  • A. BizTalk
  • B. BPML
  • C. e-biz
  • D. ebXML
  • E. ECB
Answer

Answer B. BPML

Q10: This is a central point in an enterprise from which all customer contacts are managed

Answer option

  • A. contact center
  • B. help system
  • C. multichannel marketing
  • D. call center
  • E. help desk
Answer

Answer A. contact center

Q11: This is the practice of dividing a customer base into groups of individuals that are similar in specific ways relevant to marketing, such as age, gender, interests, spending habits, and so on

Answer option

  • A. customer service chat
  • B. customer managed relationship
  • C. customer life cycle
  • D. customer segmentation
  • E. change management
Answer

Answer D. customer segmentation

Q12: Can decision trees be used for performing clustering?

Answer option

  • A. True
  • B. False
Answer

Answer A. True

Q13: Which of the following is the most appropriate strategy for data cleaning before performing clustering analysis, given a less-than-desirable number of data points?

  1. Capping and flooring of variables
  2. Removal of outliers

Answer option

A. 1 only

B. 2 only

C. 1 and 2

D. None of the above

Answer

Answer A. 1 only
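
Capping and flooring keeps every data point and only pulls extreme values in to chosen percentile limits, which is why it suits small data sets. A minimal sketch follows; the function name `cap_and_floor`, the 10th/90th percentile limits, the rounded-rank percentile lookup, and the sample data are all illustrative assumptions rather than part of the original question.

```python
def cap_and_floor(values, lower_pct=0.10, upper_pct=0.90):
    """Clamp values to the chosen percentiles instead of dropping them."""
    ordered = sorted(values)
    n = len(ordered)
    # Rounded-rank percentile lookup: one simple convention among several.
    floor_value = ordered[round(lower_pct * (n - 1))]
    cap_value = ordered[round(upper_pct * (n - 1))]
    return [min(max(v, floor_value), cap_value) for v in values]


# Hypothetical sample with one very low and one very high value.
data = [1, 12, 13, 14, 15, 16, 17, 18, 19, 200]
capped = cap_and_floor(data)
print(capped)
```

The result has the same number of points as the input, whereas removing outliers would shrink an already small sample.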

Q14: The problem of finding hidden structure in unlabeled data is called

Answer option

  • A. Supervised learning
  • B. Unsupervised learning
  • C. Reinforcement learning
Answer

Answer B. Unsupervised learning

Q15: Task of inferring a model from labeled training data is called

Answer option

  • A. Unsupervised learning
  • B. Supervised learning
  • C. Reinforcement learning
Answer

Answer B. Supervised learning

Q16: A telecommunications company wants to segment its customers into distinct groups in order to send appropriate subscription offers. This is an example of

Answer option

  • A. Supervised learning
  • B. Data extraction
  • C. Serration
  • D. Unsupervised learning
Answer

Answer D. Unsupervised learning

Q17: Self-organizing maps are an example of

Answer option

  • A. Unsupervised learning
  • B. Supervised learning
  • C. Reinforcement learning
  • D. Missing data imputation
Answer

Answer A. Unsupervised learning

Q18: You are given data about seismic activity in Japan, and you want to predict the magnitude of the next earthquake. This is an example of

Answer option

  • A. Supervised learning
  • B. Unsupervised learning
  • C. Serration
  • D. Dimensionality reduction
Answer

Answer A. Supervised learning

Q19: Assume you want to perform supervised learning and predict the number of newborns according to the size of the stork population. This is an example of

Answer option

  • A. Classification
  • B. Regression
  • C. Clustering
  • D. Structural equation modelling
Answer

Answer B. Regression

Q20: Discriminating between spam and ham e-mails is a classification task, true or false?

Answer option

  • A. True
  • B. False
Answer

Answer A. True

Q21: In the example of predicting the number of babies based on the size of the stork population, the number of babies is the

Answer option

  • A. outcome
  • B. feature
  • C. attribute
  • D. observation
Answer

Answer A. outcome

Q22: The data set {brown, black, blue, green, red} is an example of a

Answer option

  • A. Continuous attribute
  • B. Ordinal attribute
  • C. Numeric attribute
  • D. Nominal attribute
Answer

Answer D. Nominal attribute

Q23: Which of the following activities is NOT a data mining task?

Answer option

  • A. Predicting the future stock price of a company using historical records
  • B. Monitoring and predicting failures in a hydropower plant
  • C. Extracting the frequencies of a sound wave
  • D. Monitoring the heart rate of a patient for abnormalities
Answer

Answer C. Extracting the frequencies of a sound wave

Q24: Data visualization in mining cannot be done using

Answer option

  • A. Photos
  • B. Graphs
  • C. Charts
  • D. Information Graphics
Answer

Answer A. Photos

Q25: Which of the following is not a data pre-processing method?

Answer option

  • A. Data Visualization
  • B. Data Discretization
  • C. Data Cleaning
  • D. Data Reduction
Answer

Answer A. Data Visualization

Q26: Dimensionality reduction reduces the data set size by removing ___

Answer option

  • A. composite attributes
  • B. derived attributes
  • C. relevant attributes
  • D. irrelevant attributes
Answer

Answer D. irrelevant attributes

Q27: The difference between supervised learning and unsupervised learning is given by

Answer option

  • A. unlike unsupervised learning, supervised learning needs labeled data
  • B. unlike unsupervised learning, supervised learning can be used to detect outliers
  • C. there is no difference
  • D. unlike supervised learning, unsupervised learning can form new classes
Answer

Answer A. unlike unsupervised learning, supervised learning needs labeled data

Q28: Which of the following activities is a data mining task?

Answer option

  • A. Monitoring the heart rate of a patient for abnormalities
  • B. Extracting the frequencies of a sound wave
  • C. Predicting the outcomes of tossing a (fair) pair of dice
  • D. Dividing the customers of a company according to their profitability
Answer

Answer A. Monitoring the heart rate of a patient for abnormalities

Q29: Identify the example of sequence data

Answer option

  • A. weather forecast
  • B. data matrix
  • C. market basket data
  • D. genomic data
Answer

Answer D. genomic data

Q30: To detect fraudulent usage of credit cards, the following data mining task should be used

Answer option

  • A. Outlier analysis
  • B. prediction
  • C. association analysis
  • D. feature selection
Answer

Answer A. Outlier analysis
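
Outlier analysis, one of the options above, looks for records that deviate strongly from the rest. The sketch below flags card transaction amounts using a modified z-score based on the median absolute deviation. The function name `flag_outliers`, the 3.5 threshold, and the sample amounts are illustrative assumptions; production fraud detection uses far richer features than the amount alone.

```python
from statistics import median


def flag_outliers(amounts, threshold=3.5):
    """Return amounts whose modified z-score exceeds the threshold."""
    med = median(amounts)
    # Median absolute deviation: a spread measure that resists extreme values.
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        return []  # No spread to measure against, so nothing is flagged.
    return [a for a in amounts if 0.6745 * abs(a - med) / mad > threshold]


# Hypothetical transaction amounts with one unusually large purchase.
amounts = [23.5, 41.0, 18.2, 35.9, 27.4, 30.1, 2250.0, 22.8]
suspicious = flag_outliers(amounts)
print(suspicious)
```

Using the median rather than the mean keeps the single extreme amount from inflating the spread estimate and hiding itself.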

Q31: Which of the following is NOT an example of ordinal attributes?

Answer option

  • A. Zip codes
  • B. Ordered numbers
  • C. Movie ratings
  • D. Military ranks
Answer

Answer A. Zip codes

Q32: Data scrubbing can be defined as

Answer option

  • A. Check field overloading
  • B. Delete redundant tuples
  • C. Use simple domain knowledge (e.g., postal code, spell-check) to detect errors and make corrections
  • D. Analyzing data to discover rules and relationship to detect violators
Answer

Answer C. Use simple domain knowledge (e.g., postal code, spell-check) to detect errors and make corrections

Q33: Which data mining task can be used for predicting wind velocities as a function of temperature, humidity, air pressure, etc.?

Answer option

  • A. Cluster Analysis
  • B. Regression
  • C. Classification
  • D. Sequential pattern discovery
Answer

Answer B. Regression

Q34: Which statement is not TRUE regarding a data mining task?

Answer option

  • A. Clustering is a descriptive data mining task
  • B. Classification is a predictive data mining task
  • C. Regression is a descriptive data mining task
  • D. Deviation detection is a predictive data mining task
Answer

Answer C. Regression is a descriptive data mining task

Q35: Identify the example of a nominal attribute

Answer option

  • A. Temperature
  • B. Salary
  • C. Mass
  • D. Gender
Answer

Answer D. Gender

Q36: A synonym for data mining is

Answer option

  • A. Data Warehouse
  • B. Knowledge discovery in database
  • C. Business intelligence
  • D. OLAP
Answer

Answer B. Knowledge discovery in database

Q37: Nominal and ordinal attributes can be collectively referred to as ___ attributes

Answer option

  • A. perfect
  • B. qualitative
  • C. consistent
  • D. optimized
Answer

Answer B. qualitative

Q38: Which of the following is not a data mining task?

Answer option

  • A. Feature Subset Detection
  • B. Association Rule Discovery
  • C. Regression
  • D. Sequential Pattern Discovery
Answer

Answer A. Feature Subset Detection

Q39: Which of the following is an entity identification problem?

Answer option

  • A. One person with different email address
  • B. One person’s name written in different way
  • C. Title for person
  • D. One person with multiple phone numbers
Answer

Answer B. One person’s name written in different way

Q40: In binning, we first sort the data and partition it into (equal-frequency) bins. Which of the following is then not a valid step?

Answer option

  • A. smooth by bin boundaries
  • B. smooth by bin median
  • C. smooth by bin means
  • D. smooth by bin values
Answer

Answer D. smooth by bin values
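
Equal-frequency binning followed by smoothing can be sketched as below, covering smoothing by bin means, bin medians, and bin boundaries. The function names, the bin size of 3, and the sample prices are illustrative assumptions; sending ties to the lower boundary is one possible convention.

```python
from statistics import mean, median


def equal_frequency_bins(values, bin_size):
    """Sort the values and split them into bins holding bin_size items each."""
    ordered = sorted(values)
    return [ordered[i:i + bin_size] for i in range(0, len(ordered), bin_size)]


def smooth_by_means(bins):
    """Replace every value in a bin with that bin's mean."""
    return [[mean(b)] * len(b) for b in bins]


def smooth_by_medians(bins):
    """Replace every value in a bin with that bin's median."""
    return [[median(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value with the closer of its bin's minimum and maximum."""
    # Bins are sorted, so b[0] is the minimum and b[-1] the maximum.
    return [[b[0] if v - b[0] <= b[-1] - v else b[-1] for v in b] for b in bins]


# Hypothetical sorted price data, split into three bins of three values.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
bins = equal_frequency_bins(prices, 3)
print(smooth_by_means(bins))
print(smooth_by_medians(bins))
print(smooth_by_boundaries(bins))
```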

DistPub Team

Distance Publisher (DistPub.com) has provided project-writing help since 2007 and provides writing and editing help to hundreds of students every year.