Data warehousing and Mining-1



Data warehousing and Mining

Assignment A

Q1: Find the frequent itemset of maximum length for the given database D using the Apriori algorithm, with a minimum support count of 2 and a minimum confidence of 50%.

TID    List of item IDs
T100   I1, I2, I5
T101   I2, I4
T102   I2, I3
T103   I1, I2, I4
T104   I1, I3
T105   I2, I3
T106   I1, I3
T107   I1, I2, I3, I5
T108   I1, I2, I3
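
The level-wise search that Q1 asks for can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference solution: the names TRANSACTIONS, support_count, apriori, and rules are hypothetical helpers introduced here, the join step is a simple pairwise union rather than the prefix-based join often shown in textbooks, and confidence is computed as support_count(A ∪ B) / support_count(A).

```python
from itertools import combinations

# Transactions copied from the table in Q1 (database D).
TRANSACTIONS = {
    "T100": {"I1", "I2", "I5"},
    "T101": {"I2", "I4"},
    "T102": {"I2", "I3"},
    "T103": {"I1", "I2", "I4"},
    "T104": {"I1", "I3"},
    "T105": {"I2", "I3"},
    "T106": {"I1", "I3"},
    "T107": {"I1", "I2", "I3", "I5"},
    "T108": {"I1", "I2", "I3"},
}


def support_count(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for items in transactions.values() if itemset <= items)


def apriori(transactions, min_support_count):
    """Return a dict mapping each frequent itemset (frozenset) to its support count."""
    all_items = {item for items in transactions.values() for item in items}
    candidates = {frozenset([item]) for item in all_items}
    frequent = {}
    k = 1
    while candidates:
        # Scan the database and keep the candidates that meet the threshold.
        level = {}
        for candidate in candidates:
            count = support_count(candidate, transactions)
            if count >= min_support_count:
                level[candidate] = count
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets whose union has k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(itemset, frequent, min_confidence):
    """Yield (antecedent, consequent, confidence) for rules meeting min_confidence."""
    for size in range(1, len(itemset)):
        for antecedent in map(frozenset, combinations(sorted(itemset), size)):
            confidence = frequent[itemset] / frequent[antecedent]
            if confidence >= min_confidence:
                yield antecedent, itemset - antecedent, confidence


if __name__ == "__main__":
    frequent = apriori(TRANSACTIONS, min_support_count=2)
    max_len = max(len(s) for s in frequent)
    for itemset in (s for s in frequent if len(s) == max_len):
        print(sorted(itemset), "support count:", frequent[itemset])
        for lhs, rhs, conf in rules(itemset, frequent, min_confidence=0.5):
            print("  ", sorted(lhs), "=>", sorted(rhs), f"confidence {conf:.0%}")
```

Running the script prints each frequent itemset of maximum length with its support count, followed by the association rules derived from it that reach 50% confidence. The pairwise-union join generates more raw candidates than a prefix-based join would, but the prune step removes them before the database is scanned again.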

Q2: What is clustering? What are the main challenges in clustering?

Q3: What is classification? Explain the decision tree technique with an example.

Q4: What is bitmap indexing? Explain with an example. Describe the techniques used to optimize the extraction process.

Q5: The retail industry is a major application area for data mining since it collects huge amounts of data on sales, customer shopping history, goods transportation, consumption, and service records. Justify this statement.

Q6: What are the different schemas for a data warehouse? Explain with examples.

Write a query that defines a cube in a star schema.

Q7: What is data mining in KDD? Explain the architecture of a typical data mining system.

Q8: What are the different OLAP operations in the multidimensional data model?


Case Study

Most banks and financial institutions offer a wide variety of banking services (such as checking, savings, and business and individual customer transactions), credit (such as business, mortgage, and automobile loans), and investment services (such as mutual funds). Some also offer insurance services and stock investment services.

Financial data collected in the banking and financial industries are often relatively complete, reliable, and of high quality, which facilitates systematic data analysis and data mining.


Q1: Give some typical cases that support these facts, i.e., how you would design a data warehouse for such an institution, which techniques can be used in it, and how to predict loan payment, analyze customer credit policy, etc.


Assignment C

Q1: Which type of decision is one that happens repeatedly, and often periodically, whether weekly, monthly, quarterly, or yearly?

  1. a) Structured decision
  2. b) Nonstructured decision
  3. c) Nonrecurring decision
  4. d) None of the above


Q2: In multi-dimensional analysis, roll-up can be achieved by:

  1. a) Moving up a dimension hierarchy like city -> state, etc.
  2. b) Moving down a dimension hierarchy like state -> city, etc.
  3. c) Adding a new dimension
  4. d) Removing one or more dimensions


Q3: Which technique can be used to reduce the size of the candidate k-itemsets in the Apriori algorithm?

  1. a) Pruning
  2. b) Cleaning
  3. c) Hash based
  4. d) Partitioning


Q4: Classification belongs to which type of learning?

  1. a) Supervised
  2. b) Unsupervised
  3. c) Rote
  4. d) Machine


Q5: Structured query language (SQL) is a standardized fourth-generation query language found in most DBMSs.

  1. a) True
  2. b) False
  3. c) nil
  4. d) nil


Q6: A data warehouse is a logical collection of information – gathered from many different operational databases – used to create business intelligence that supports business analysis activities and decision-making tasks.

  1. a) True
  2. b) False
  3. c) nil
  4. d) nil


Q7: Online transaction processing (OLTP) is the manipulation of information to support decision making.

  1. a) True
  2. b) False
  3. c) nil
  4. d) nil


Q8: Data warehouses support only OLAP.

  1. a) True
  2. b) False
  3. c) nil
  4. d) nil


Q9: A database is a collection of information that you organize and access according to the physical structure of that information.

  1. a) True
  2. b) False
  3. c) nil
  4. d) nil


Q10: The physical view of information focuses on how you need to arrange and access information to meet the needs of the business.

  1. a) True
  2. b) False
  3. c) nil
  4. d) nil


Q11: A view helps you add, change, and delete information in a database and mine it for valuable information.

  1. a) True
  2. b) False
  3. c) nil
  4. d) nil


Q12: The Apriori algorithm is used in–

  1. a) Clustering
  2. b) Classification
  3. c) Association Rule mining
  4. d) None of these


Q13: The DBSCAN technique comes under–

  1. a) Classification
  2. b) Clustering
  3. c) Prediction
  4. d) Other


Q14: A data warehouse schema can be–

  1. a) Star schema
  2. b) Asynchronous table
  3. c) Formula
  4. d) None of these


Q15: In association rule mining, support(A => B) is–

  1. a) Probability P(A ∪ B)
  2. b) Subtraction i.e. (A-B)
  3. c) Addition i.e. (A+B)
  4. d) None of these


Q16: Select the technique that is not a data mining technique–

  1. a) Association rule mining
  2. b) Clustering
  3. c) Classification
  4. d) Selection Sort


Q17: Which step in the Apriori algorithm uses the Apriori property?

  1. a) Join step
  2. b) Prune step
  3. c) Candidate generation step
  4. d) Query processing


Q18: Which of the following is an accuracy measure of a classifier?

  1. a) Support
  2. b) Confidence
  3. c) Specificity
  4. d) Recall


Q19: Which decision type below represents employing a new marketing campaign?

  1. a) Structured decision
  2. b) Nonstructured decision
  3. c) Nonrecurring decision
  4. d) Recurring decision


Q20: Which phrase describes a decision support system?

  1. a) Highly flexible IT system
  2. b) Interactive IT system
  3. c) Designed to support decision making
  4. d) All of the above


Q21: Which component of a DSS consists of both the DSS models and the DSS model management system?

  1. a) Data management
  2. b) Model management
  3. c) User interface management
  4. d) All of the above


Q22: Which type of AI system is an adaptive system that works independently, carrying out specific, repetitive, or predictable tasks?

  1. a) Expert systems
  2. b) Neural networks
  3. c) Genetic algorithms
  4. d) None of the above


Q23: All of the following are IT components of an expert system, except:

  1. a) Knowledge base
  2. b) User acquisition
  3. c) Inference engine
  4. d) User interface


Q24: An intelligent agent that checks your e-mail and sorts it according to priority is called

  1. a) User agent
  2. b) Monitoring-and-surveillance agent
  3. c) Data-mining agent
  4. d) Buyer agent


Q25: Which of the following uses a series of logically related two-dimensional tables or files to store information in the form of a database?

  1. a) Database
  2. b) Database management system
  3. c) Data warehouse
  4. d) None of the above


Q26: All of the following terms describe OLAP, except

  1. a) The gathering of input information
  2. b) Processing input information
  3. c) Updating existing information to reflect the gathered and processed information
  4. d) None of the above


Q27: Which tool is used to help an organization build and use business intelligence?

  1. a) Data warehouse
  2. b) Data mining tools
  3. c) Database management systems
  4. d) All of the above


Q28: What does the data dictionary identify?

  1. a) Field names
  2. b) Field types
  3. c) Field formats
  4. d) All of the above


Q29: What DBMS component contains facilities to help you develop transaction-intensive applications?

  1. a) DBMS engine
  2. b) Data definition subsystem
  3. c) Application generation subsystem
  4. d) Data administration subsystem


Q30: Which of the following is a data manipulation tool?

  1. a) File generators
  2. b) Query by example tool
  3. c) Structured query language
  4. d) All of the above


Q31: What data manipulation tool is a standardized fourth-generation query language found in most DBMSs?

  1. a) Report generator
  2. b) Query-by-example tool
  3. c) Statistical tool
  4. d) None of the above


Q32: The data administration subsystem helps you perform all of the following, except

  1. a) Backups and recovery
  2. b) Query optimization
  3. c) Security management
  4. d) Create, change, and delete information


Q33: Which data administration subsystem periodically backs up information contained in a database?

  1. a) Concurrency control facilities
  2. b) Reorganization facilities
  3. c) Backup and recovery facilities
  4. d) Security management facilities


Q34: Who is the person responsible for the more technical and operational aspects of managing the information contained in organizational databases?

  1. a) Chief information officer
  2. b) Data administration
  3. c) Data information officer
  4. d) None of the above


Q35: Which phase of decision making carries out the chosen solution, monitors the results, and makes adjustments as necessary?

  1. a) Intelligence
  2. b) Design
  3. c) Choice
  4. d) None of the above


Q36: Which type of intelligent agent travels around a network finding information and bringing it back to you?

  1. a) Personal agent
  2. b) User agent
  3. c) Monitoring-and-surveillance agent
  4. d) Shopping bots


Q37: Intelligent agents are software that does all of the following, except–

  1. a) Assists you
  2. b) Acts on your behalf
  3. c) Performs repetitive computer-related tasks
  4. d) Performs repetitive information-processing tasks


Q38: Business intelligence is information about your–

  1. a) Customers
  2. b) Competitors
  3. c) Partners
  4. d) Competitive environment


Q39: Operational databases are databases that support OLTP.

  1. a) True
  2. b) False
  3. c) nil
  4. d) nil


Q40: A primary key field can be blank.

  1. a) True
  2. b) False
  3. c) nil
  4. d) nil
