Hadoop MCQ Set 1

1. Apache Cassandra™ is a massively scalable open source _______ database.
a) SQL
b) NoSQL
c) NewSQL
d) All of the mentioned

View Answer

Answer: b [Reason:] Cassandra is perfect for managing large amounts of data across multiple data centers and the cloud.

2. Point out the correct statement:
a) Cassandra delivers continuous availability, linear scalability, and operational simplicity across many commodity servers
b) Cassandra has a “masterless” architecture, meaning all nodes are the same
c) Cassandra also provides customizable replication, storing redundant copies of data across nodes that participate in a Cassandra ring
d) All of the mentioned

View Answer

Answer: d [Reason:] Cassandra provides automatic data distribution across all nodes that participate in a “ring” or database cluster.

3. Cassandra uses a protocol called _______ to discover location and state information.
a) gossip
b) intergos
c) goss
d) all of the mentioned

View Answer

Answer: a [Reason:] Gossip is used for internode communication.

4. A __________ determines which data centers and racks nodes belong to.
a) Client requests
b) Snitch
c) Partitioner
d) None of the mentioned

View Answer

Answer: b [Reason:] A snitch determines which data centers and racks nodes belong to; Cassandra uses this topology information to place replicas and route requests efficiently.

5. Point out the wrong statement:
a) Cassandra supplies linear scalability, meaning that capacity may be easily added simply by adding new nodes online
b) Cassandra 2.0 included major enhancements to CQL, security, and performance
c) CQL for Cassandra 2.0.6 adds several important features including batching of conditional updates, static columns, and increased control over slicing of clustering columns
d) None of the Mentioned

View Answer

Answer: d [Reason:] Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store.

6. User accounts may be altered and dropped using the __________ Query Language.
a) Hive
b) Cassandra
c) Sqoop
d) None of the mentioned

View Answer

Answer: b [Reason:] Cassandra manages user accounts and access to the database cluster using passwords.
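
For illustration, in cqlsh the corresponding CQL statements look like this; the user name and passwords are placeholders:

    CREATE USER jane WITH PASSWORD 'secret' NOSUPERUSER;
    ALTER USER jane WITH PASSWORD 'newsecret';
    DROP USER jane;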

7. Authorization capabilities for Cassandra use the familiar _________ security paradigm to manage object permissions.
a) COMMIT
b) GRANT
c) ROLLBACK
d) None of the mentioned

View Answer

Answer: b [Reason:] Once a user has been authenticated to a database cluster (for example, with internal authentication), the next security issue to be tackled is permission management.
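
For example, object permissions are granted and revoked with CQL; the keyspace, table, and user names below are placeholders:

    GRANT SELECT ON KEYSPACE sales TO jane;
    GRANT MODIFY ON sales.orders TO jane;
    REVOKE SELECT ON KEYSPACE sales FROM jane;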

8. Client-to-node encryption protects data in flight from client machines to a database cluster using:
a) SSL
b) SSH
c) SSN
d) All of the mentioned

View Answer

Answer: a [Reason:] Client-to-node encryption establishes a secure channel between the client and the coordinator node.
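
A sketch of the corresponding cassandra.yaml settings; the keystore path and password are placeholders:

    client_encryption_options:
        enabled: true
        keystore: conf/.keystore
        keystore_password: myKeystorePass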

9. Using a ___________ file means you don't have to override the SSL_CERTFILE environment variable every time.
a) qlshrc
b) cqshrc
c) cqlshrc
d) none of the mentioned

View Answer

Answer: c [Reason:] The cqlshrc file stores cqlsh settings, including the SSL options used when connecting with encryption enabled.
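
A minimal sketch of the relevant cqlshrc section; the certificate path is a placeholder and option names can vary slightly between releases:

    [ssl]
    certfile = ~/keys/cassandra.cert
    validate = true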

10. Internal authentication stores usernames and bcrypt-hashed passwords in the ____________ table.
a) system_auth.creds
b) system_auth.credentials
c) system.credentials
d) sys_auth.credentials

View Answer

Answer: b [Reason:] PasswordAuthenticator is an IAuthenticator implementation that you can use to configure Cassandra for internal authentication out-of-the-box.
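
Internal authentication is enabled by setting authenticator: PasswordAuthenticator in cassandra.yaml; a superuser can then inspect the stored credentials from cqlsh, for example:

    SELECT username, salted_hash FROM system_auth.credentials;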

Hadoop MCQ Set 2

1. HBase is a distributed ________ database built on top of the Hadoop file system.
a) Column-oriented
b) Row-oriented
c) Tuple-oriented
d) None of the mentioned

View Answer

Answer: a [Reason:] HBase's data model is similar to Google's Bigtable and is designed to provide quick random access to huge amounts of structured data.

2. Point out the correct statement:
a) HDFS provides low latency access to single rows from billions of records (Random access)
b) HBase sits on top of the Hadoop File System and provides read and write access
c) HBase is a distributed file system suitable for storing large files
d) None of the mentioned

View Answer

Answer: b [Reason:] One can store the data in HDFS either directly or through HBase. Data consumer reads/accesses the data in HDFS randomly using HBase.

3. HBase is ________; it defines only column families.
a) Row Oriented
b) Schema-less
c) Fixed Schema
d) All of the mentioned

View Answer

Answer: b [Reason:] HBase doesn’t have the concept of fixed columns schema.

4. Apache HBase is a non-relational database modeled after Google’s _________
a) BigTop
b) Bigtable
c) Scanner
d) FoundationDB

View Answer

Answer: b [Reason:] Bigtable runs on top of the Google File System; likewise, Apache HBase works on top of Hadoop and HDFS.

5. Point out the wrong statement:
a) HBase provides only sequential access of data
b) HBase provides high latency batch processing
c) HBase internally provides serialized access
d) All of the mentioned

View Answer

Answer: c [Reason:] HBase internally uses Hash tables and provides random access.

6. The _________ Server assigns regions to the region servers and takes the help of Apache ZooKeeper for this task.
a) Region
b) Master
c) Zookeeper
d) All of the mentioned

View Answer

Answer: b [Reason:] The Master Server maintains the state of the cluster by negotiating load balancing across the region servers.

7. Which of the following commands provides information about the user?
a) status
b) version
c) whoami
d) user

View Answer

Answer: c [Reason:] The whoami command returns the current HBase user; the status command, by contrast, reports cluster status such as the number of servers.
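
For example, in the HBase shell:

    whoami     # prints the current HBase user
    status     # reports cluster status, e.g. the number of servers
    version    # prints the HBase version in use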

8. Which of the following commands does not operate on tables?
a) enabled
b) disabled
c) drop
d) all of the mentioned

View Answer

Answer: b [Reason:] is_disabled command verifies whether a table is disabled.
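
In the HBase shell, a table must be disabled before it can be dropped; the table name here is a placeholder:

    disable 'demo_table'
    is_disabled 'demo_table'
    drop 'demo_table'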

9. The _________ command fetches the contents of a row or a cell.
a) select
b) get
c) put
d) none of the mentioned

View Answer

Answer: b [Reason:] get fetches the contents of a row or a cell, while put places a cell value at a specified column in a specified row of a particular table.
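
A small HBase shell sketch using a hypothetical table and column family:

    create 'demo_table', 'cf'
    put 'demo_table', 'row1', 'cf:city', 'Mumbai'
    get 'demo_table', 'row1'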

10. HBaseAdmin and ____________ are the two important classes in this package that provide DDL functionalities.
a) HTableDescriptor
b) HDescriptor
c) HTable
d) HTabDescriptor

View Answer

Answer: a [Reason:] Java provides an Admin API to achieve DDL functionalities through programming.

Hadoop MCQ Set 3

1. HCatalog supports reading and writing files in any format for which a ________ can be written.
a) SerDE
b) SaerDear
c) DocSear
d) All of the mentioned

View Answer

Answer: a [Reason:] By default, HCatalog supports the RCFile, CSV, JSON, SequenceFile, and ORC file formats. To use a custom format, you must provide the InputFormat, OutputFormat, and SerDe.

2. Point out the correct statement:
a) HCat provides connectors for MapReduce
b) Apache HCatalog provides table data access for CDH components such as Pig and MapReduce
c) HCat makes Hive metadata available to users of other Hadoop tools like Pig, MapReduce and Hive
d) All of the mentioned

View Answer

Answer: b [Reason:] Table definitions are maintained in the Hive metastore.

3. Hive version ___________ is the first release that includes HCatalog.
a) 0.10.0
b) 0.11.0
c) 0.12.0
d) All of the mentioned

View Answer

Answer: b [Reason:] HCatalog graduated from the Apache incubator and merged with the Hive project on March 26, 2013.

4. HCatalog is built on top of the Hive metastore and incorporates Hive's:
a) DDL
b) DML
c) TCL
d) DCL

View Answer

Answer: a [Reason:] HCatalog provides read and write interfaces for Pig and MapReduce and uses Hive’s command line interface for issuing data definition and metadata exploration commands.

5. Point out the wrong statement:
a) HCatalog is a table and storage management layer for Hadoop that enables users with different data processing tools
b) There is a Hive-specific interface for HCatalog
c) Data is defined using HCatalog’s command line interface (CLI)
d) All of the mentioned

View Answer

Answer: b [Reason:] Since HCatalog uses Hive’s metastore, Hive can read data in HCatalog directly.

6. The HCatalog interface for Pig consists of ____________ and HCatStorer, which implement the Pig load and store interfaces, respectively.
a) HCLoader
b) HCatLoader
c) HCatLoad
d) None of the mentioned

View Answer

Answer: b [Reason:] HCatLoader accepts a table to read data from; you can indicate which partitions to scan by immediately following the load statement with a partition filter statement.
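
A Pig sketch of these two classes; the table names, column, and partition value are placeholders, and the package prefix (org.apache.hive.hcatalog versus the older org.apache.hcatalog) depends on the HCatalog release:

    A = LOAD 'web_logs' USING org.apache.hive.hcatalog.pig.HCatLoader();
    B = FILTER A BY datestamp == '20240101';   -- partition filter right after the load
    STORE B INTO 'web_logs_archive' USING org.apache.hive.hcatalog.pig.HCatStorer();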

7. _____________ accepts a table to read data from and optionally a selection predicate to indicate which partitions to scan.
a) HCatOutputFormat
b) HCatInputFormat
c) OutputFormat
d) InputFormat

View Answer

Answer: b [Reason:] The HCatalog interface for MapReduce — HCatInputFormat and HCatOutputFormat — is an implementation of Hadoop InputFormat and OutputFormat.

8. The HCatalog __________ supports all Hive DDL that does not require MapReduce to execute.
a) Powershell
b) CLI
c) CMD
d) All of the mentioned

View Answer

Answer: b [Reason:] Data is defined using HCatalog’s command line interface (CLI).
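
The same kind of DDL can be issued through the HCatalog command line tool; the table definition below is purely illustrative:

    hcat -e "CREATE TABLE web_logs (ip STRING, url STRING) PARTITIONED BY (datestamp STRING) STORED AS RCFILE;"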

9. You can write to a single partition by specifying the partition key(s) and value(s) in the ___________ method.
a) setOutput
b) setOut
c) put
d) get

View Answer

Answer: a [Reason:] You can write to multiple partitions if the partition key(s) are columns in the data being stored.

10. HCatalog supports the same datatypes as:
a) Pig
b) Hama
c) Hive
d) Oozie

View Answer

Answer: c [Reason:] HCatalog is built on the Hive metastore and therefore presents tables using the same datatypes as Hive.

Hadoop MCQ Set 4

1. Which of the following commands sets the value of a particular configuration variable (key)?
a) set -v
b) set <key>=<value>
c) set
d) reset

View Answer

Answer: b [Reason:] If you misspell the variable name, the CLI will not show an error.
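
For example, inside the Hive CLI, set <key>=<value> assigns a value, set <key> prints the current value, and set -v lists all variables including Hadoop defaults; the property name below is only an illustration:

    set mapreduce.job.reduces=2;
    set mapreduce.job.reduces;
    set -v;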

2. Point out the correct statement:
a) Hive Commands are non-SQL statement such as setting a property or adding a resource
b) Set -v prints a list of configuration variables that are overridden by the user or Hive
c) Set sets a list of variables that are overridden by the user or Hive
d) None of the mentioned

View Answer

Answer: a [Reason:] Commands can be used in HiveQL scripts or directly in the CLI or Beeline.

3. Which of the following operators executes a shell command from the Hive shell?
a) |
b) !
c) ^
d) +

View Answer

Answer: b [Reason:] The exclamation (!) operator executes a shell command from the Hive shell.
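
For example, from the hive> prompt:

    !pwd;
    !ls /tmp;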

4. Which of the following will remove the resource(s) from the distributed cache?
a) delete FILE[S] <filepath>*
b) delete JAR[S] <filepath>*
c) delete ARCHIVE[S] <filepath>*
d) all of the mentioned

View Answer

Answer: d [Reason:] The delete command removes existing resources from the distributed cache.
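
A typical add/list/delete sequence, using a hypothetical file path:

    add FILE /tmp/my_script.py;
    list FILES;
    delete FILE /tmp/my_script.py;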

5. Point out the wrong statement:
a) source FILE executes a script file inside the CLI
b) bfs executes a dfs command from the Hive shell
c) Hive query language is similar to SQL
d) none of the mentioned

View Answer

Answer: b [Reason:] dfs executes a dfs command from the Hive shell.
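
For example, both can be run from the hive> prompt; the script path is a placeholder:

    dfs -ls /user/hive/warehouse;
    source /tmp/nightly_queries.hql;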

6. _________ is a shell utility which can be used to run Hive queries in either interactive or batch mode.
a) $HIVE/bin/hive
b) $HIVE_HOME/hive
c) $HIVE_HOME/bin/hive
d) All of the mentioned

View Answer

Answer: c [Reason:] Various types of command line operations are available in the shell utility.
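
For example, the utility can run an inline query with -e or a script with -f; the query and path are placeholders:

    $HIVE_HOME/bin/hive -e 'SELECT count(*) FROM web_logs;'
    $HIVE_HOME/bin/hive -f /tmp/nightly_queries.hql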

7. Which of the following is a command line option?
a) -d, --define <key=value>
b) -e, --define <key=value>
c) -f, --define <key=value>
d) None of the mentioned

View Answer

Answer: a [Reason:] The -d/--define option applies variable substitution to Hive commands, e.g. -d A=B or --define A=B.
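
A sketch of variable substitution, assuming the defined variable is referenced through the hivevar namespace; the table name is a placeholder:

    hive --define src_table=web_logs -e 'SELECT count(*) FROM ${hivevar:src_table};'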

8. Which additional command line option is available in Hive 0.10.0?
a) --database <dbname>
b) --db <dbname>
c) --dbase <dbname>
d) All of the mentioned

View Answer

Answer: a [Reason:] The --database option specifies which database is to be used.
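
For example, with a hypothetical database name:

    hive --database sales_db -e 'SHOW TABLES;'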

9. The CLI, when invoked without the -i option, will attempt to load $HIVE_HOME/bin/.hiverc and $HOME/.hiverc as _______ files.
a) processing
b) termination
c) initialization
d) none of the mentioned

View Answer

Answer: c [Reason:] The .hiverc files are loaded as initialization files when the CLI is invoked without the -i option.
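
A minimal $HOME/.hiverc sketch; the JAR path is a placeholder:

    set hive.cli.print.current.db=true;
    add JAR /usr/lib/hive/aux/my_udfs.jar;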

10. When $HIVE_HOME/bin/hive is run without either the -e or -f option, it enters _______ mode.
a) Batch
b) Interactive shell
c) Multiple
d) None of the mentioned

View Answer

Answer: b [Reason:] In interactive shell mode, commands are terminated with ";" (a semicolon).
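
For example:

    $HIVE_HOME/bin/hive
    hive> SHOW TABLES;
    hive> quit;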

Hadoop MCQ Set 5

1. BigDecimal is composed of a ________ with an integer ‘scale’ field.
a) BigInt
b) BigInteger
c) MediumInt
d) SmallInt

View Answer

Answer: b [Reason:] The BigDecimal/BigInteger can also return itself as a ‘long’ value.

2. Point out the correct statement:
a) BooleanSerializer is used to parse string representations of boolean values into boolean scalar types
b) BlobRef is a wrapper that holds a BLOB either directly, or as a reference to a file that holds the BLOB data
c) BooleanParse is used to parse string representations of boolean values into boolean scalar types
d) All of the mentioned

View Answer

Answer: b [Reason:] BlobRef is used for reference to a file that holds the BLOB data.

3. ClobRef is a wrapper that holds a CLOB either directly, or a reference to a file that holds the ______ data.
a) CLOB
b) BLOB
c) MLOB
d) All of the mentioned

View Answer

Answer: a [Reason:] Create a ClobRef based on parsed data from a line of text.

4. __________ encapsulates a set of delimiters used to encode a record.
a) LargeObjectLoader
b) FieldMapProcessor
c) DelimiterSet
d) LobSerializer

View Answer

Answer: c [Reason:] Delimiter set is created with the specified delimiters.

5. Point out the wrong statement:
a) Abstract base class that holds a reference to a Blob or a Clob
b) ACCESSORTYPE is the type used to access this data in a streaming fashion
c) CONTAINERTYPE is the type used to hold this data (e.g., BytesWritable)
d) None of the mentioned

View Answer

Answer: d [Reason:] DATATYPE is the type being held (e.g., a byte array).

6. _________ supports null values for all types.
a) SmallObjectLoader
b) FieldMapProcessor
c) DelimiterSet
d) JdbcWritableBridge

View Answer

Answer: d [Reason:] JdbcWritableBridge class contains a set of methods which can read db columns from a ResultSet into Java types.

7. Which of the following is a singleton instance class?
a) LargeObjectLoader
b) FieldMapProcessor
c) DelimiterSet
d) LobSerializer

View Answer

Answer: a [Reason:] The LargeObjectLoader's lifetime is limited to the life of the current TaskInputOutputContext.

8. Which of the following classes is used for general error processing?
a) LargeObjectLoader
b) ProcessingException
c) DelimiterSet
d) LobSerializer

View Answer

Answer: b [Reason:] ProcessingException signals a general error that occurs during the processing of a SqoopRecord.

9. Records are terminated by a __________ character.
a) RECORD_DELIMITER
b) FIELD_DELIMITER
c) FIELD_LIMITER
d) None of the mentioned

View Answer

Answer: a [Reason:] Class RecordParser parses a record containing one or more fields.

10. The fields parsed by ____________ are backed by an internal buffer.
a) LargeObjectLoader
b) ProcessingException
c) RecordParser
d) None of the Mentioned

View Answer

Answer: c [Reason:] Multiple threads must use separate instances of RecordParser.