Hadoop MCQ Set 1

1. ___________ provides an intuitive, easy-to-use Hadoop management web UI backed by its RESTful APIs.
a) Oozie
b) Ambari
c) Hive
d) Impala

View Answer

Answer: b [Reason:] The Apache Ambari project is aimed at making Hadoop management simpler by developing software for provisioning, managing, and monitoring Apache Hadoop clusters.

2. Point out the correct statement:
a) Ambari provides a dashboard for monitoring health and status of the Hadoop cluster
b) Ambari provides a step-by-step wizard for installing Hadoop services across any number of hosts
c) Ambari handles configuration of Hadoop services for the cluster
d) All of the mentioned

View Answer

Answer: a [Reason:] Ambari provides central management for starting, stopping, and reconfiguring Hadoop services across the entire cluster.

3. Ambari leverages ________ for metrics collection.
a) Nagios
b) Nagaond
c) Ganglia
d) All of the mentioned

View Answer

Answer: c [Reason:] Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids.

4. Ambari leverages ___________ for system alerting and will send emails when your attention is needed
a) Nagios
b) Nagaond
c) Ganglia
d) All of the mentioned

View Answer

Answer: a [Reason:] Nagios is an industry-standard tool for IT infrastructure monitoring.

5. Point out the wrong statement:
a) Ambari Views framework was greatly improved to better support instantiating and loading custom views
b) The Ambari shell is written in Java, and uses the Groovy-based Ambari REST client
c) Ambari-Shell is distributed as a single-file executable jar
d) None of the mentioned

View Answer

Answer: d [Reason:] The uber jar is generated with the help of spring-boot-maven-plugin.

6. A ________ is a way of extending Ambari that allows 3rd parties to plug in new resource types along with the APIs
a) trigger
b) view
c) schema
d) none of the mentioned

View Answer

Answer: b [Reason:] A view is an application that is deployed into the Ambari container.

7. Ambari ___________ deliver a template approach to cluster deployment.
a) View
b) Stack Advisor
c) Blueprints
d) All of the mentioned

View Answer

Answer: c [Reason:] Ambari Blueprints deliver a template approach to cluster deployment.

8. ___________ facilitates installation of Hadoop across any number of hosts.
a) API-driven installations
b) Wizard-driven interface
c) Extensible framework
d) All of the mentioned

View Answer

Answer: b [Reason:] The wizard-driven interface walks administrators step by step through installing Hadoop services across any number of hosts.

9. Ambari provides a ________ API that enables integration with existing tools, such as Microsoft System Center
a) RestLess
b) Web Service
c) RESTful
d) None of the mentioned

View Answer

Answer: c [Reason:] The RESTful API enables integration of Ambari with existing tools and enterprise systems.
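
For illustration only, here is a minimal Java sketch of calling that RESTful API; the host name, port, and admin/admin credentials are assumptions for the example, not facts from the question.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class AmbariClustersExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical Ambari server and credentials; adjust for a real cluster.
        URL url = new URL("http://ambari-host:8080/api/v1/clusters");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        String auth = Base64.getEncoder()
                .encodeToString("admin:admin".getBytes("UTF-8"));
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Authorization", "Basic " + auth);

        // The response is JSON describing the clusters Ambari manages.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}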

10. If Ambari Agent has any output in /var/log/ambari-agent/ambari-agent.out, it is indicative of a __________ problem.
a) Less Severe
b) Significant
c) Extremely Severe
d) None of the mentioned

View Answer

Answer: b [Reason:] Ambari enables system administrators to provision, manage and monitor a Hadoop cluster, and also to integrate Hadoop with the existing enterprise infrastructure.

Hadoop MCQ Set 2

1. Mapper implementations are passed the JobConf for the job via the ________ method
a) JobConfigure.configure
b) JobConfigurable.configure
c) JobConfigurable.configureable
d) None of the mentioned

View Answer

Answer: b [Reason:] Mapper implementations override the JobConfigurable.configure(JobConf) method to initialize themselves.
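
As an illustration, a minimal old-API (org.apache.hadoop.mapred) Mapper overriding configure(JobConf) might look like the sketch below; the parameter name my.filter.word is hypothetical.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class ConfiguredMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    private String filterWord;

    // Called once with the job's JobConf before any map() calls.
    @Override
    public void configure(JobConf job) {
        filterWord = job.get("my.filter.word", "");  // hypothetical parameter
    }

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        if (value.toString().contains(filterWord)) {
            output.collect(new Text(filterWord), new IntWritable(1));
        }
    }
}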

2. Point out the correct statement:
a) Applications can use the Reporter to report progress
b) The Hadoop MapReduce framework spawns one map task for each InputSplit generated by the InputFormat for the job
c) The intermediate, sorted outputs are always stored in a simple (key-len, key, value-len, value) format
d) All of the mentioned

View Answer

Answer: d [Reason:] Reporters can be used to set application-level status messages and update Counters.

3. Input to the _______ is the sorted output of the mappers.
a) Reducer
b) Mapper
c) Shuffle
d) All of the mentioned

View Answer

Answer: a [Reason:] In the shuffle phase, the framework fetches the relevant partition of the output of all the mappers via HTTP.

4. The right number of reduces seems to be:
a) 0.90
b) 0.80
c) 0.36
d) 0.95

View Answer

Answer: d [Reason:] The right number of reduces seems to be 0.95 or 1.75 multiplied by (number of nodes * mapred.tasktracker.reduce.tasks.maximum).
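
A rough sketch of applying that guideline with the old JobConf API is shown below; the node count and per-node reduce-slot count are assumed values.

import org.apache.hadoop.mapred.JobConf;

public class ReduceCountExample {
    public static void main(String[] args) {
        int nodes = 10;             // assumed cluster size
        int reduceSlotsPerNode = 2; // assumed mapred.tasktracker.reduce.tasks.maximum

        // 0.95 lets all reduces launch as soon as the maps finish;
        // 1.75 gives faster nodes a second wave of reduces.
        int reduces = (int) (0.95 * nodes * reduceSlotsPerNode);

        JobConf conf = new JobConf(ReduceCountExample.class);
        conf.setNumReduceTasks(reduces);
        System.out.println("numReduceTasks = " + reduces);
    }
}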

5. Point out the wrong statement:
a) Reducer has 2 primary phases
b) Increasing the number of reduces increases the framework overhead, but increases load balancing and lowers the cost of failures
c) It is legal to set the number of reduce-tasks to zero if no reduction is desired
d) The framework groups Reducer inputs by keys (since different mappers may have output the same key) in sort stage

View Answer

Answer: a [Reason:] Reducer has 3 primary phases: shuffle, sort and reduce.

6. The output of the _______ is not sorted in the MapReduce framework for Hadoop.
a) Mapper
b) Cascader
c) Scalding
d) None of the mentioned

View Answer

Answer: d [Reason:] The output of the reduce task is typically written to the FileSystem. The output of the Reducer is not sorted.

7. Which of the following phases occur simultaneously?
a) Shuffle and Sort
b) Reduce and Sort
c) Shuffle and Map
d) All of the mentioned

View Answer

Answer: a [Reason:] The shuffle and sort phases occur simultaneously; while map-outputs are being fetched they are merged.

8. Mapper and Reducer implementations can use the ________ to report progress or just indicate that they are alive.
a) Partitioner
b) OutputCollector
c) Reporter
d) All of the mentioned

View Answer

Answer: c [Reason:] Reporter is a facility for MapReduce applications to report progress, set application-level status messages and update Counters.

9. __________ is a generalization of the facility provided by the MapReduce framework to collect data output by the Mapper or the Reducer
a) Partitioner
b) OutputCollector
c) Reporter
d) All of the mentioned

View Answer

Answer: b [Reason:] OutputCollector generalizes the facility provided by the MapReduce framework to collect the data output by the Mapper or the Reducer.
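
For illustration, a minimal old-API Reducer that uses OutputCollector to emit output and Reporter to stay alive and update a counter might look like this sketch.

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class SumReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
            reporter.progress();                    // just indicate we are alive
        }
        reporter.incrCounter("example", "keys", 1); // update an application counter
        output.collect(key, new IntWritable(sum));  // collect the reduced value
    }
}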

10. _________ is the primary interface for a user to describe a MapReduce job to the Hadoop framework for execution.
a) Map Parameters
b) JobConf
c) MemoryConf
d) None of the mentioned

View Answer

Answer: b [Reason:] JobConf represents a MapReduce job configuration.
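
A minimal sketch of describing a job through JobConf follows; the job name and input/output paths are placeholders, and the identity mapper and reducer are used only to keep the example self-contained.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

public class JobConfExample {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(JobConfExample.class);
        conf.setJobName("pass-through");            // hypothetical job name

        conf.setMapperClass(IdentityMapper.class);  // placeholder map logic
        conf.setReducerClass(IdentityReducer.class);
        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(Text.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);                     // submit and wait for completion
    }
}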

Hadoop MCQ Set 3

1. Avro schemas are defined with _____
a) JSON
b) XML
c) JAVA
d) All of the mentioned

View Answer

Answer: a [Reason:] Defining schemas in JSON facilitates implementation in languages that already have JSON libraries.
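
For example, an Avro record schema is ordinary JSON and can be parsed at runtime with Schema.Parser; the User record below is a made-up example.

import org.apache.avro.Schema;

public class AvroSchemaExample {
    public static void main(String[] args) {
        // A record schema written as plain JSON text.
        String json =
            "{\"type\": \"record\", \"name\": \"User\", \"fields\": ["
          + " {\"name\": \"name\", \"type\": \"string\"},"
          + " {\"name\": \"age\",  \"type\": \"int\"}"
          + "]}";

        Schema schema = new Schema.Parser().parse(json);
        System.out.println(schema.getName() + ": " + schema.getFields());
    }
}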

2. Point out the correct statement:
a) Avro provides functionality similar to systems such as Thrift
b) When Avro is used in RPC, the client and server exchange data in the connection handshake
c) Apache Avro, Avro, Apache, and the Avro and Apache logos are trademarks of The Java Foundation
d) None of the mentioned

View Answer

Answer: a [Reason:] Avro differs from these systems in fundamental aspects such as untagged data.

3. __________ facilitates construction of generic data-processing systems and languages.
a) Untagged data
b) Dynamic typing
c) No manually-assigned field IDs
d) All of the mentioned

View Answer

Answer: b [Reason:] Avro does not require that code be generated.

4. With ______ we can store data and read it easily with various programming languages
a) Thrift
b) Protocol Buffers
c) Avro
d) None of the mentioned

View Answer

Answer: c [Reason:] Avro is optimized to minimize the disk space needed by our data and it is flexible.

5. Point out the wrong statement:
a) Apache Avro™ is a data serialization system
b) Avro provides simple integration with dynamic languages
c) Avro provides rich data structures
d) All of the mentioned

View Answer

Answer: d [Reason:] Code generation is not required to read or write data files nor to use or implement RPC protocols in Avro.

6. ________ are a way of encoding structured data in an efficient yet extensible format.
a) Thrift
b) Protocol Buffers
c) Avro
d) None of the mentioned

View Answer

Answer: b [Reason:] Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats.

7. Thrift resolves possible conflicts through _________ of the field.
a) Name
b) Static number
c) UID
d) None of the mentioned

View Answer

Answer: b [Reason:] Thrift resolves possible conflicts through a static field number, whereas Avro resolves them through the name of the field.

8. Avro is said to be the future _______ layer of Hadoop.
a) RMC
b) RPC
c) RDC
d) All of the mentioned

View Answer

Answer: b [Reason:] When Avro is used in RPC, the client and server exchange schemas in the connection handshake.

9. When using reflection to automatically build our schemas without code generation, we need to configure Avro using:
a) AvroJob.Reflect(jConf);
b) AvroJob.setReflect(jConf);
c) Job.setReflect(jConf);
d) None of the mentioned

View Answer

Answer: b [Reason:] When schemas are built automatically through reflection, the job is configured through Avro's AvroJob helper rather than the plain Hadoop job classes.

10. We can declare the schema of our data in a ______ file.
a) JSON
b) XML
c) SQL
d) R

View Answer

Answer: a [Reason:] Avro schemas are declared in JSON; alternatively, a schema can be expressed in Avro IDL or built from Java beans using reflection-based schema building.
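
As a sketch of the reflection route mentioned above, Avro's ReflectData can derive a schema directly from a plain Java class; the Point class here is hypothetical.

import org.apache.avro.Schema;
import org.apache.avro.reflect.ReflectData;

public class ReflectSchemaExample {
    // A plain Java class; Avro reflects over its fields to build the schema.
    public static class Point {
        public int x;
        public int y;
    }

    public static void main(String[] args) {
        Schema schema = ReflectData.get().getSchema(Point.class);
        System.out.println(schema.toString(true));  // prints the derived JSON schema
    }
}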

Hadoop MCQ Set 4

1. Which of the following is a project for Infrastructure Engineers and Data Scientists?
a) Impala
b) BigTop
c) Oozie
d) Flume

View Answer

Answer: b [Reason:] Bigtop supports a wide range of components/projects, including, but not limited to, Hadoop, HBase and Spark.

2. Point out the correct statement:
a) Bigtop provides an integrated smoke testing framework, alongside a suite of over 10 test files
b) Bigtop includes tools and a framework for testing at various levels
c) Bigtop components support only one operating system
d) All of the mentioned

View Answer

Answer: b [Reason:] Bigtop is used for both initial deployments as well as upgrade scenarios for the entire data platform, not just the individual components.

3. Which of the following tasks is performed by BigTop in the Hadoop framework?
a) Packaging
b) Smoke Testing
c) Virtualization
d) All of the mentioned

View Answer

Answer: d [Reason:] BigTop provides comprehensive packaging, testing, and configuration of the leading open source big data components.

4. Which of the following operating systems is not supported by BigTop?
a) Fedora
b) Solaris
c) Ubuntu
d) SUSE

View Answer

Answer: b [Reason:] Bigtop components power the leading Hadoop distros and support many Operating Systems, including Debian/Ubuntu, CentOS, Fedora, SUSE and many others.

5. Point out the wrong statement:
a) Bigtop-0.5.0 builds the 0.5.0 release
b) Bigtop-trunk-HBase builds the HCatalog packages only
c) There are also jobs for building virtual machine images
d) All of the mentioned

View Answer

Answer: b [Reason:] The Bigtop-trunk-HBase job builds the HBase packages, not the HCatalog packages; Bigtop does provide jobs for building virtual machine images, along with vagrant recipes, raw images, and (work-in-progress) docker recipes.

6. Apache Bigtop uses ___________ for continuous integration testing.
a) Jenkinstop
b) Jerry
c) Jenkins
d) None of the mentioned

View Answer

Answer: c [Reason:] There are 2 Jenkins servers running for the project.

7. The Apache Jenkins server runs the ______________ job whenever code is committed to the trunk branch.
a) “Bigtop-trunk”
b) “Bigtop”
c) “Big-trunk”
d) None of the mentioned

View Answer

Answer: a [Reason:] The Jenkins server in turn runs several test jobs.

8. The Bigtop Jenkins server runs daily jobs for the _______ and trunk branches.
a) 0.1
b) 0.2
c) 0.3
d) 0.4

View Answer

Answer: c [Reason:] Each job has a configuration for each supported operating system. In each branch there is a job to build each component.

9. Which of the following builds an APT or YUM package repository?
a) Bigtop-trunk-packagetest
b) Bigtop-trunk-repository
c) Bigtop-VM-matrix
d) None of the mentioned

View Answer

Answer: b [Reason:] Bigtop-trunk-packagetest runs the package tests.

10. ___________ builds virtual machines of branches trunk and 0.3 for KVM, VMWare and VirtualBox.
a) Bigtop-trunk-packagetest
b) Bigtop-trunk-repository
c) Bigtop-VM-matrix
d) None of the mentioned

View Answer

Answer: c [Reason:] Bigtop-trunk-repository builds an APT or YUM package repository.

Hadoop MCQ Set 5

1. ________ includes a flexible and powerful toolkit for displaying monitoring and analyzing results.
a) Impala
b) Chukwa
c) BigTop
d) Oozie

View Answer

Answer: b [Reason:] Chukwa is built on top of the Hadoop distributed filesystem (HDFS) and MapReduce framework and inherits Hadoop’s scalability and robustness.

2. Point out the correct statement:
a) Log processing was one of the original purposes of MapReduce
b) Chukwa is a Hadoop subproject devoted to bridging that gap between logs processing and Hadoop ecosystem
c) HICC stands for Hadoop Infrastructure Care Center
d) None of the mentioned

View Answer

Answer: b [Reason:] Chukwa is a scalable distributed monitoring and analysis system, particularly logs from Hadoop and other large systems.

3. The items stored on _______ are organized in a hierarchy of widget categories.
a) HICE
b) HICC
c) HIEC
d) All of the mentioned

View Answer

Answer: b [Reason:] HICC stands for Hadoop Infrastructure Care Center. It is the central dashboard for visualizing and monitoring the metrics collected by Chukwa.

4. HICC, the Chukwa visualization interface, requires HBase version:
a) 0.90.5+.
b) 0.10.4+.
c) 0.90.4+.
d) None of the mentioned

View Answer

Answer: c [Reason:] HICC, the Chukwa visualization interface, requires HBase version 0.90.4 or later.

5. Point out the wrong statement:
a) Using Hadoop for MapReduce processing of logs is easy
b) Chukwa should work on any POSIX platform
c) Chukwa is a system for large-scale reliable log collection and processing with Hadoop
d) All of the mentioned

View Answer

Answer: a [Reason:] Logs are generated incrementally across many machines, but Hadoop MapReduce works best on a small number of large files.

6. __________ are the Chukwa processes that actually produce data.
a) Collectors
b) Agents
c) HBase Table
d) HCatalog

View Answer

Answer: b [Reason:] Setting the option chukwaAgent.control.remote will disallow remote connections to the agent control socket.

7. Chukwa ___________ are responsible for accepting incoming data from Agents, and storing the data.
a) HBase Table
b) Agents
c) Collectors
d) None of the mentioned

View Answer

Answer: c [Reason:] Most commonly, collectors simply write all received data to HBase or HDFS.

8. To enable streaming data to _________, the Chukwa collector writer class can be configured in chukwa-collector-conf.xml.
a) HCatalog
b) HBase
c) Hive
d) All of the mentioned

View Answer

Answer: b [Reason:] In this mode, the filesystem to write to is determined by the option writer.hdfs.filesystem in chukwa-collector-conf.xml.

9. By default, collectors listen on port:
a) 8008
b) 8070
c) 8080
d) None of the mentioned

View Answer

Answer: c [Reason:] The port number can be configured in chukwa-collector-conf.xml.

10. _________ class allows other programs to get incoming chunks fed to them over a socket by the collector.
a) PipelineStageWriter
b) PipelineWriter
c) SocketTeeWriter
d) None of the mentioned

View Answer

Answer: c [Reason:] SocketTeeWriter allows other programs to get incoming chunks fed to them over a socket; PipelineStageWriter, by contrast, lets you string together a series of PipelineableWriters for pre-processing or post-processing incoming data.