Hadoop MCQ Set 1

1. Apache __________ is a data repository containing device information, images and other relevant information for all sorts of mobile devices.
a) DirectMemory
b) Directory
c) DeviceMap
d) Drill

View Answer

Answer: c [Reason:] DeviceMap is the device data repository described here; Drill, by contrast, is a distributed system for interactive analysis of large-scale datasets.

2. Point out the correct statement:
a) Drill is a build system based on Apache Ant and Apache Ivy
b) DirectMemory’s main purpose is to act as a second level cache
c) Easyant is inspired by Google’s Dremel
d) None of the mentioned

View Answer

Answer: b [Reason:] DirectMemory is used to store large amounts of data without filling up the Java heap and thus avoiding long garbage collection cycles.

3. ____________ is a secure and highly scalable microsharing and micromessaging platform.
a) ESME
b) Directory
c) Empire-db
d) All of the mentioned

View Answer

Answer: a [Reason:] ESME allows people to discover and meet one another and get controlled access to other sources of information, all in a business process context.

4. Which of the following frameworks is used for building and consuming network services?
a) ESME
b) DirectoryMap
c) Empire-db
d) Etch

View Answer

Answer: d [Reason:] Etch is a cross-platform, language- and transport-independent framework.

5. Point out the wrong statement:
a) Felix is an implementation of the OSGi R4 specification
b) Falcon is a data processing and management solution
c) Flex is an application framework for building Flash-based applications
d) None of the mentioned

View Answer

Answer: d [Reason:] None of the statements is wrong; Falcon is a data processing and management solution used for coordination of data pipelines, lifecycle management, and data discovery.

6. _____________ is an open source system for expressive, declarative, fast, and efficient data analysis.
a) Flume
b) Flink
c) Flex
d) ESME

View Answer

Answer: b [Reason:] Flink (formerly known as Stratosphere) combines the scalability and programming flexibility of distributed MapReduce-like platforms with the efficiency, out-of-core execution, and query optimization capabilities found in parallel databases.

7. ________________ is a complete FTP server based on the MINA I/O system.
a) Giraph
b) Gereition
c) FtpServer
d) Oozie

View Answer

Answer: c [Reason:] FtpServer is the FTP server; Giraph, by contrast, is a large-scale, fault-tolerant, Bulk Synchronous Parallel (BSP)-based graph processing framework.

8. _____________ is a distributed computing framework based on BSP
a) HCataMan
b) HCatalog
c) Hama
d) All of the mentioned

View Answer

Answer: c [Reason:] BSP stands for Bulk Synchronous Parallel.
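
The BSP model mentioned above proceeds in supersteps: every peer sends messages, all peers wait at a barrier, and then each peer computes on what it received. The following is a minimal single-process sketch of that idea, not Hama's API; the function name bsp_max and the neighbor-list layout are assumptions made for illustration. It spreads the largest value through a small network.

```python
from typing import List, Tuple


def bsp_max(values: List[int], neighbors: List[List[int]]) -> Tuple[List[int], int]:
    """Simulate BSP supersteps until every peer's value stops changing.

    values[i] is the starting value of peer i; neighbors[i] lists the
    peers that peer i sends to. Both names are hypothetical.
    """
    values = list(values)
    supersteps = 0
    while True:
        # Communication phase: each peer sends its current value to its neighbors.
        inbox: List[List[int]] = [[] for _ in values]
        for peer, value in enumerate(values):
            for target in neighbors[peer]:
                inbox[target].append(value)
        # Barrier: in this sketch, the loop above finishing means that
        # all messages are delivered before any peer computes.
        # Computation phase: each peer keeps the largest value it has seen.
        new_values = [max([own] + received) for own, received in zip(values, inbox)]
        supersteps += 1
        if new_values == values:
            return values, supersteps
        values = new_values
```

Calling bsp_max([3, 1, 4, 2], [[1], [2], [3], [0]]) on a four-peer ring leaves every peer holding the maximum, 4.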

9. Apache __________ is a generic cluster management framework used to build distributed systems
a) Helix
b) Gereition
c) FtpServer
d) None of the mentioned

View Answer

Answer: a [Reason:] Helix provides automatic partition management, fault tolerance and elasticity.

10. The __________ Data Mapper framework makes it easier to use a database with Java or .NET applications
a) iBix
b) Helix
c) iBATIS
d) iBAT

View Answer

Answer: c [Reason:] iBATIS couples objects with stored procedures or SQL statements using an XML descriptor.

Hadoop MCQ Set 2

1. Microsoft and Hortonworks joined forces to make Hadoop available on ___________ for on-premises deployments
a) Windows 7
b) Windows Server
c) Windows 8
d) Ubuntu

View Answer

Answer: b [Reason:] Win32 is supported as a development platform.

2. Point out the correct statement:
a) Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes
b) GNU/Linux is supported as a development and production platform
c) Distributed operation has not been well tested on Win32, so it is not supported as a production platform
d) All of the mentioned

View Answer

Answer: d [Reason:] Microsoft and Hortonworks joined forces to make Hadoop available on Windows Azure to support big data in the cloud.

3. Hadoop ___________ is a utility to support running external map and reduce jobs.
a) Orchestration
b) Streaming
c) Collection
d) All of the mentioned

View Answer

Answer: b [Reason:] These external jobs can be written in various programming languages such as Python or Ruby.
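
As an illustration of such an external job, here is a minimal sketch of a word-count mapper in Python. It assumes the usual streaming convention of reading lines on standard input and writing tab-separated key/value records on standard output; the function names map_words and run are hypothetical.

```python
import sys
from typing import Iterable, Iterator


def map_words(lines: Iterable[str]) -> Iterator[str]:
    """Emit one "word<TAB>1" record per whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def run(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Entry point a streaming task would call: read stdin, write stdout."""
    for record in map_words(stdin):
        stdout.write(record + "\n")
```

A script that calls run() could then be passed to the streaming utility as the mapper executable.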

4. In Hadoop _____________, go to the Hadoop distribution directory for HDInsight
a) Shell
b) Command Line
c) Compaction
d) None of the mentioned

View Answer

Answer: b [Reason:] In order to run the Hadoop command line from the Windows command prompt, you need to log in to the HDInsight headnode using Remote Desktop.

5. Point out the wrong statement:
a) The other flavor of HDInsight interactive console is based on JavaScript
b) Microsoft and Hortonworks have re-implemented the key binaries as executables
c) The distribution consists of Hadoop 1.1.0, Pig-0.9.3, Hive 0.9.0, Mahout 0.5 and Sqoop 1.4.2
d) All of the mentioned

View Answer

Answer: d [Reason:] JavaScript commands are converted to Pig statements.

6. Microsoft Azure HDInsight comes with __________ types of interactive console.
a) two
b) three
c) four
d) five

View Answer

Answer: a [Reason:] One is the standard Hadoop Hive console; the other, which is unique in the Hadoop world, is based on JavaScript.

7. The key _________ command – which is traditionally a bash script – is also re-implemented as hadoop.cmd.
a) start
b) hadoop
c) had
d) hadstrat

View Answer

Answer: b [Reason:] HDInsight is the framework for the Microsoft Azure cloud implementation of Hadoop.

8. Which of the following individual components are included on HDInsight clusters?
a) Hive
b) Pig
c) Oozie
d) All of the mentioned

View Answer

Answer: d [Reason:] HDInsight provides several configurations for specific workloads, or you can customize clusters using Script Actions.

9. Microsoft .NET Library for Avro provides data serialization for the Microsoft ___________ environment
a) .NET
b) Hadoop
c) Ubuntu
d) None of the mentioned

View Answer

Answer: a [Reason:] The Microsoft .NET Library for Avro implements the Apache Avro compact binary data interchange format for serialization for the Microsoft .NET environment.

10. Which of the following is not a feature of HDInsight?
a) High availability
b) High reliability
c) High cost
d) All of the mentioned

View Answer

Answer: c [Reason:] HDInsight clusters are much easier to create than manually configuring Hadoop clusters.

Hadoop MCQ Set 3

1. Streaming supports streaming command options as well as _________ command options.
a) generic
b) tool
c) library
d) task

View Answer

Answer: a [Reason:] Place the generic options before the streaming options, otherwise the command will fail.

2. Point out the correct statement:
a) You can specify any executable as the mapper and/or the reducer
b) You cannot supply a Java class as the mapper and/or the reducer
c) The class you supply for the output format should return key/value pairs of Text class
d) All of the mentioned

View Answer

Answer: a [Reason:] If you do not specify an input format class, the TextInputFormat is used as the default.

3. Which of the following Hadoop streaming command option parameters is required?
a) output directoryname
b) mapper executable
c) input directoryname
d) all of the mentioned

View Answer

Answer: d [Reason:] The input location, the output location, and the mapper executable are all required parameters.

4. To set an environment variable in a streaming command use:
a) -cmden EXAMPLE_DIR=/home/example/dictionaries/
b) -cmdev EXAMPLE_DIR=/home/example/dictionaries/
c) -cmdenv EXAMPLE_DIR=/home/example/dictionaries/
d) -cmenv EXAMPLE_DIR=/home/example/dictionaries/

View Answer

Answer: c [Reason:] Environment variables are set using the -cmdenv option.
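
A variable passed with -cmdenv shows up in the task's environment, so a mapper or reducer script can read it like any other environment variable. The sketch below shows one way a Python streaming script might do that; the helper name dictionary_dir and the empty-string fallback are assumptions for illustration.

```python
import os
from typing import Mapping


def dictionary_dir(environ: Mapping[str, str] = os.environ) -> str:
    """Return the value of EXAMPLE_DIR, or an empty string if it was not set."""
    return environ.get("EXAMPLE_DIR", "")
```

With -cmdenv EXAMPLE_DIR=/home/example/dictionaries/ on the command line, dictionary_dir() would return that path inside the task.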

5. Point out the wrong statement:
a) Hadoop has a library package called Aggregate
b) Aggregate allows you to define a mapper plugin class that is expected to generate “aggregatable items” for each input key/value pair of the mappers
c) To use Aggregate, simply specify “-mapper aggregate”
d) None of the mentioned

View Answer

Answer: c [Reason:] To use Aggregate, simply specify “-reducer aggregate”.
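
When the reducer is set to aggregate, the mapper is expected to emit "aggregatable items". The sketch below follows the convention shown in the Hadoop streaming documentation, where each record is prefixed with the aggregator name (here LongValueSum) so that the Aggregate reducer sums the values per key. The function name aggregate_records is hypothetical.

```python
from typing import Iterable, Iterator


def aggregate_records(lines: Iterable[str]) -> Iterator[str]:
    """Emit "LongValueSum:<word><TAB>1" for every word, for use with -reducer aggregate."""
    for line in lines:
        for word in line.split():
            # The "LongValueSum:" prefix names the aggregator to apply to this key.
            yield "LongValueSum:" + word + "\t" + "1"
```

Each emitted line asks the Aggregate reducer to add 1 to the running sum for that word.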

6. The ________ option allows you to copy jars locally to the current working directory of tasks and automatically unjar the files.
a) archives
b) files
c) task
d) none of the mentioned

View Answer

Answer: a [Reason:] The -archives option is also a generic option.

7. ______________ class allows the Map/Reduce framework to partition the map outputs based on certain key fields, not the whole keys.
a) KeyFieldPartitioner
b) KeyFieldBasedPartitioner
c) KeyFieldBased
d) None of the mentioned

View Answer

Answer: b [Reason:] The primary key is used for partitioning, and the combination of the primary and secondary keys is used for sorting.
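
The idea of partitioning on only some key fields can be sketched in a few lines. This is an illustration of the concept, not Hadoop's implementation: the function name partition_for, the "." separator, and the use of crc32 as a deterministic stand-in for the real hash are all assumptions.

```python
import zlib


def partition_for(key: str, num_key_fields: int, num_partitions: int,
                  separator: str = ".") -> int:
    """Choose a partition from the first num_key_fields fields of the key only."""
    fields = key.split(separator)
    partition_key = separator.join(fields[:num_key_fields])
    # crc32 is a stand-in hash, chosen only because it is deterministic.
    return zlib.crc32(partition_key.encode("utf-8")) % num_partitions
```

Keys such as 11.12.1.2 and 11.12.1.1 share their first two fields, so partition_for(key, 2, n) sends both to the same reducer, where the full key can still be used for sorting.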

8. Which of the following classes provides a subset of the features provided by Unix/GNU sort?
a) KeyFieldBased
b) KeyFieldComparator
c) KeyFieldBasedComparator
d) All of the mentioned

View Answer

Answer: c [Reason:] Hadoop has a library class, KeyFieldBasedComparator, that is useful for many applications.
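
To make the sort analogy concrete, the sketch below orders dotted keys by their second field, numerically and in descending order, in the spirit of sort -k2,2nr. It only illustrates field-based comparison in plain Python; the function name sort_by_second_field and the "." separator are assumptions, and none of this is Hadoop code.

```python
from typing import List


def sort_by_second_field(keys: List[str], separator: str = ".") -> List[str]:
    """Sort keys by their second field as a number, largest first."""
    # sorted() is stable, so keys with equal second fields keep their input order.
    return sorted(keys, key=lambda k: int(k.split(separator)[1]), reverse=True)
```

For example, keys whose second field is 14 come before those with 12, which come before those with 11.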

9. Which of the following classes is provided by the Aggregate package?
a) Map
b) Reducer
c) Reduce
d) None of the mentioned

View Answer

Answer: b [Reason:] Aggregate provides a special reducer class and a special combiner class, and a list of simple aggregators that perform aggregations such as “sum”, “max”, “min” and so on over a sequence of values.

10. Hadoop has a library class, org.apache.hadoop.mapred.lib.FieldSelectionMapReduce, that effectively allows you to process text data like the Unix ______ utility.
a) Copy
b) Cut
c) Paste
d) Move

View Answer

Answer: b [Reason:] The map function defined in the class treats each input key/value pair as a list of fields.
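
Treating a record as a list of fields and keeping only some of them is what cut does. Here is a minimal Python sketch of that behavior, assuming tab-separated input; the function name select_fields and its zero-based indexes are choices made for this example and do not reflect the class's own configuration syntax.

```python
from typing import Sequence


def select_fields(line: str, field_indexes: Sequence[int], separator: str = "\t") -> str:
    """Keep only the requested fields of a separated line, in the order given."""
    fields = line.rstrip("\n").split(separator)
    # Indexes beyond the end of the record are skipped rather than raising.
    return separator.join(fields[i] for i in field_indexes if i < len(fields))
```

select_fields("a\tb\tc\td\n", [0, 2]) keeps the first and third fields, much like cut -f1,3.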

Hadoop MCQ Set 4

1. Which of the following is Web search software?
a) Imphala
b) Nutch
c) Oozie
d) Manmgy

View Answer

Answer: b [Reason:] Nutch is the Web search software; Oozie, by contrast, is a server-based workflow scheduling and coordination system to manage data processing jobs for Apache Hadoop.

2. Point out the correct statement:
a) OFBiz stands for “The Open For Business Project”
b) Od stands for “Orchestration Director Engine”
c) OCG is Object-Graph Notation Language implementation in Java
d) All of the mentioned

View Answer

Answer: a [Reason:] The Open For Business Project (OFBiz) is an open source enterprise automation software project.

3. ___________ defines an open application programming interface for common cloud application services.
a) Bigred
b) Nuvem
c) Oozie
d) All of the mentioned

View Answer

Answer: b [Reason:] Nuvem allowed applications to be easily ported across the most popular cloud platforms.

4. __________ is an OData implementation in Java.
a) Bigred
b) Nuvem
c) Olingo
d) Onami

View Answer

Answer: c [Reason:] Olingo is the OData implementation; Apache Onami, by contrast, aims to create a community focused on the development and maintenance of a set of Google Guice extensions.

5. Point out the wrong statement:
a) OpenOffice.org is comprised of six personal productivity applications
b) Open Climate Workbench is a tool for scalable comparison of remote sensing observations
c) OpenNLP is a machine learning based toolkit for the processing of natural language text
d) None of the mentioned

View Answer

Answer: b [Reason:] Open Climate Workbench is used to observe climate model outputs, regionally and globally.

6. ___________ is an open source SQL query engine for Apache HBase
a) Pig
b) Phoenix
c) Pivot
d) None of the mentioned

View Answer

Answer: b [Reason:] Phoenix is the SQL query engine for HBase; Pig, by contrast, is a platform for analyzing large datasets.

7. ___________ provides multiple language implementations of the Advanced Message Queuing Protocol (AMQP)
a) RTA
b) Qpid
c) RAT
d) All of the mentioned

View Answer

Answer: b [Reason:] Qpid provides the AMQP implementations; RAT, by contrast, became part of the new Apache Creadur top-level project.

8. ___________ is A WEb And SOcial Mashup Engine.
a) ServiceMix
b) Samza
c) Rave
d) All of the mentioned

View Answer

Answer: c [Reason:] Rave is the mashup engine; Samza, by contrast, is a stream processing system for running continuous computation on infinite streams of data.

9. The ___________ project will create an ESB and component suite based on the Java Business Interface (JBI) standard – JSR 208.
a) ServiceMix
b) Samza
c) Rave
d) All of the mentioned

View Answer

Answer: a [Reason:] ServiceMix project is Geronimo developed by James.

10. Which of the following is a spatial information system?
a) Sling
b) Solr
c) SIS
d) All of the mentioned

View Answer

Answer: c [Reason:] The Spatial Information System (SIS) Project is a toolkit that spatial information system builders or users can leverage to build applications containing location context.

Hadoop MCQ Set 5

1. How many modes are present in Hama?
a) 2
b) 3
c) 4
d) 5

View Answer

Answer: b [Reason:] Just like Hadoop, Hama distinguishes between three modes.

2. Point out the correct statement:
a) In local mode, nothing must be launched via the start scripts
b) Distributed Mode is just like the “Pseudo Distributed Mode”
c) Apache Hama is one of the under-hyped projects in the Hadoop ecosystem
d) All of the mentioned

View Answer

Answer: b [Reason:] You can adjust the number of threads used in this utility by setting the bsp.local.tasks.maximum property.

3. __________ is the default mode if you download Hama.
a) Local Mode
b) Pseudo Distributed Mode
c) Distributed Mode
d) All of the mentioned

View Answer

Answer: a [Reason:] This mode is configured by setting the bsp.master.address property to local.

4. _________ mode is used when you just have a single server and want to launch all the daemon processes
a) Local Mode
b) Pseudo Distributed Mode
c) Distributed Mode
d) All of the mentioned

View Answer

Answer: b [Reason:] Pseudo Distributed Mode can be configured when you set the bsp.master.address to a host address.

5. Point out the wrong statement:
a) Apache Hama is not a pure Bulk Synchronous Parallel Engine
b) Hama uses the Hadoop Core for RPC calls
c) Apache Hama is optimized for massive scientific computations such as matrix, graph and network algorithms
d) Hama is a relatively newer project than Hadoop

View Answer

Answer: a [Reason:] Apache Hama is a pure BSP engine.

6. In Distributed Mode, the machines are mapped in the __________ file.
a) groomservers
b) grervers
c) grsvers
d) groom

View Answer

Answer: a [Reason:] Distributed Mode is used when you have multiple machines.

7. The web UI provides information about ________ job statistics of the Hama cluster
a) MPP
b) BSP
c) USP
d) ISP

View Answer

Answer: b [Reason:] Running, completed, and failed jobs are detailed in the web UI.

8. Apache Hama provides a complete clone of:
a) Pragmatic
b) Pregel
c) ServePreg
d) All of the mentioned

View Answer

Answer: b [Reason:] Pregel is used for large-scale graph processing.

9. A __________ in a social graph is a group of people who interact frequently with each other and less frequently with others.
a) semi-cluster
b) partial cluster
c) full cluster
d) none of the mentioned

View Answer

Answer: a [Reason:] Semi-clustering differs from ordinary clustering in that a vertex may belong to more than one semi-cluster.

10. Which of the following Apache projects is steadily gaining traction through the efforts of its committers?
a) Hama
b) Hadoop
c) Hive
d) Pig

View Answer

Answer: a [Reason:] HAMA is a distributed framework on Hadoop for massive matrix algorithms.
