# Multiple choice questions for engineering

## Set 1

1. Which of the following is correct about regularized regression?
a) Can help with the bias trade-off
b) Cannot help with model selection
c) Cannot help with the variance trade-off
d) All of the Mentioned

Answer: a [Reason:] Regularized regression can help with the bias/variance trade-off and with model selection, although it often does not perform as well as random forests.
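To make the bias/variance idea concrete, here is a minimal NumPy sketch of ridge regression using the closed-form solution. The helper name `ridge_fit`, the synthetic data, and the penalty value are assumptions for illustration only. A larger penalty shrinks the coefficients toward zero, which trades some bias for lower variance.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: (X^T X + lam * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Synthetic data (hypothetical): 50 samples, 5 predictors, two of them irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
true_beta = np.array([3.0, -2.0, 0.0, 0.0, 1.0])
y = X @ true_beta + rng.normal(scale=0.5, size=50)

beta_ols = ridge_fit(X, y, lam=0.0)     # no penalty: ordinary least squares
beta_ridge = ridge_fit(X, y, lam=10.0)  # penalized fit

ols_norm = np.linalg.norm(beta_ols)
ridge_norm = np.linalg.norm(beta_ridge)
print("OLS coefficient norm:  ", ols_norm)
print("Ridge coefficient norm:", ridge_norm)
```

The penalized coefficient vector always has a smaller norm than the unpenalized one; how much shrinkage is useful depends on the data and is usually chosen by cross-validation.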

2. Point out the wrong statement:
a) Model-based approaches may be computationally convenient
b) Model-based approaches use Bayes' theorem
c) Model-based approaches are reasonably inaccurate on real problems
d) All of the Mentioned

Answer: c [Reason:] Model-based approaches are reasonably accurate on real problems.

3. Which of the following methods are present in caret for regularized regression?
a) ridge
b) lasso
c) relaxo
d) all of the Mentioned

Answer: d [Reason:] In caret, one can tune over the number of predictors to retain instead of defined values for the penalty.

4. Which of the following methods can be used to combine different classifiers?
a) Model stacking
b) Model combining
c) Model structuring
d) None of the Mentioned

Answer: a [Reason:] Model ensembling is also used for combining different classifiers.

5. Point out the correct statement:
a) Combining classifiers improves interpretability
b) Combining classifiers reduces accuracy
c) Combining classifiers improves accuracy
d) All of the Mentioned

Answer: c [Reason:] You can combine classifiers by averaging their predictions.
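A minimal sketch of combining classifiers by majority vote, written in plain Python. The three prediction lists are made-up toy data, constructed so that each classifier errs on different samples; real classifiers rarely have errors this cleanly separated, so the gain is usually smaller.

```python
# Ground-truth labels for ten samples (toy data).
truth = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]

# Three hypothetical classifiers, each wrong on two different samples.
clf_a = [0, 1, 1, 1, 0, 0, 1, 0, 1, 0]  # wrong at positions 0 and 1
clf_b = [1, 0, 0, 0, 0, 0, 1, 0, 1, 0]  # wrong at positions 2 and 3
clf_c = [1, 0, 1, 1, 1, 1, 1, 0, 1, 0]  # wrong at positions 4 and 5

def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Majority vote: predict 1 when at least two of the three classifiers say 1.
combined = [1 if (a + b + c) >= 2 else 0 for a, b, c in zip(clf_a, clf_b, clf_c)]

individual_acc = [accuracy(c, truth) for c in (clf_a, clf_b, clf_c)]
combined_acc = accuracy(combined, truth)
print(individual_acc, combined_acc)
```

The combined vote is harder to interpret than any single classifier, which is the usual price paid for the accuracy gain.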

6. Which of the following functions provides unsupervised prediction?
a) cl_forecast
b) cl_nowcast
c) cl_precast
d) none of the Mentioned

Answer: d [Reason:] The cl_predict function in the clue package provides unsupervised prediction.

7. Model-based prediction considers a relatively easy version of the covariance matrix.
a) True
b) False

Answer: b [Reason:] Model-based prediction considers more complicated versions of the covariance matrix.

8. Which of the following is used to assist the quantitative trader in the development of trading models?
a) quantmod
b) quantile
c) quantity
d) mboost

Answer: a [Reason:] The Quandl package is similar to quantmod.

9. Which of the following functions can be used for forecasting?
a) predict
b) forecast
c) ets
d) all of the Mentioned

Answer: b [Reason:] Forecasting is the process of making predictions of the future based on past and present data and analysis of trends.

10. Predictive analytics is the same as forecasting.
a) True
b) False

Answer: b [Reason:] Predictive analytics goes beyond forecasting.

## Set 2

1. Which of the following can be data in pandas?
a) a Python dict
b) an ndarray
c) a scalar value
d) all of the Mentioned

Answer: d [Reason:] The passed index is a list of axis labels.

2. Point out the correct statement:
a) If data is a list and an index is passed, the values in data corresponding to the labels in the index will be pulled out
b) NaN is the standard missing data marker used in pandas
c) Series acts very similarly to an array
d) None of the Mentioned

Answer: b [Reason:] If data is a dict and an index is passed, the values in data corresponding to the labels in the index will be pulled out.
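A short pandas sketch of that behaviour, with made-up labels and values. When a dict is passed together with an index, values are looked up by label, and labels missing from the dict are filled with NaN.

```python
import pandas as pd

# Toy dict: the keys become the candidate labels.
data = {"a": 1.0, "b": 2.0, "c": 3.0}

# Only the labels listed in `index` are kept; "d" is not in the dict.
s = pd.Series(data, index=["b", "c", "d"])
print(s)
```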

3. The result of an operation between unaligned Series will have the ________ of the indexes involved.
a) intersection
b) union
c) total
d) all of the Mentioned

Answer: b [Reason:] If a label is not found in one Series or the other, the result will be marked as missing (NaN).
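The alignment rule can be seen in a small sketch with two toy Series whose indexes only partly overlap; the labels and values are assumptions for illustration.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# Addition aligns on labels; the result carries the union of both indexes.
total = s1 + s2
print(total)
```

Labels present in only one operand ("a" and "d" here) come out as NaN, while shared labels are added normally.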

4. Which of the following inputs can be accepted by DataFrame?
a) Structured ndarray
b) Series
c) DataFrame
d) All of the Mentioned

Answer: d [Reason:] DataFrame is a 2-dimensional labeled data structure with columns of potentially different types.

5. Point out the wrong statement:
a) A DataFrame is like a fixed-size dict in that you can get and set values by index label
b) Series can be passed into most NumPy methods expecting an ndarray
c) A key difference between Series and ndarray is that operations between Series automatically align the data based on label
d) None of the Mentioned

Answer: a [Reason:] A Series is like a fixed-size dict in that you can get and set values by index label.

6. Which of the following takes a dict of dicts or a dict of array-like sequences and returns a DataFrame?
a) DataFrame.from_items
b) DataFrame.from_records
c) DataFrame.from_dict
d) All of the Mentioned

Answer: c [Reason:] DataFrame.from_dict takes a dict of dicts or a dict of array-like sequences and returns a DataFrame. It operates like the DataFrame constructor except for the orient parameter, which is ‘columns’ by default.
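As a rough illustration of the orient parameter, the sketch below builds two frames from the same toy dict. The column names passed with `orient="index"` are arbitrary choices for the example.

```python
import pandas as pd

raw = {"A": [1, 2, 3], "B": [4, 5, 6]}

# Default orient="columns": dict keys become column labels.
df_cols = pd.DataFrame.from_dict(raw)

# orient="index": dict keys become row labels; column names are supplied separately.
df_rows = pd.DataFrame.from_dict(raw, orient="index", columns=["x", "y", "z"])

print(df_cols)
print(df_rows)
```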

7. Series is a one-dimensional labeled array capable of holding any data type.
a) True
b) False

Answer: a [Reason:] The axis labels are collectively referred to as the index.

8. Which of the following works analogously to the form of the dict constructor?
a) DataFrame.from_items
b) DataFrame.from_records
c) DataFrame.from_dict
d) All of the Mentioned

Answer: a [Reason:] DataFrame.from_records takes a list of tuples or an ndarray with structured dtype.

9. Which of the following operations work with the same syntax as the analogous dict operations?
a) Getting columns
b) Setting columns
c) Deleting columns
d) All of the Mentioned

Answer: d [Reason:] You can treat a DataFrame semantically like a dict of like-indexed Series objects.
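A brief sketch of the dict-like column operations, using a toy frame whose column names are invented for the example.

```python
import pandas as pd

df = pd.DataFrame({"one": [1.0, 2.0, 3.0], "two": [4.0, 5.0, 6.0]})

col = df["one"]                       # getting a column, like d[key]
df["three"] = df["one"] * df["two"]   # setting a column, like d[key] = value
del df["two"]                         # deleting a column, like del d[key]

print(df)
```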

10. If data is an ndarray, index must be the same length as data.
a) True
b) False

Answer: a [Reason:] If no index is passed, one will be created having values [0, …, len(data) – 1].
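The following sketch shows both cases for ndarray data: the default integer index, and the error raised when the index length does not match. The values and labels are placeholders.

```python
import numpy as np
import pandas as pd

values = np.array([0.1, 0.2, 0.3])

# No index passed: pandas creates 0, 1, ..., len(data) - 1.
s_default = pd.Series(values)

# Index of matching length: accepted.
s_labeled = pd.Series(values, index=["x", "y", "z"])

# Index of a different length: pandas raises ValueError.
try:
    pd.Series(values, index=["x", "y"])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(list(s_default.index), list(s_labeled.index), mismatch_raised)
```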

## Set 3

1. The plot method on Series and DataFrame is just a simple wrapper around:
a) gplt.plot()
b) plt.plot()
c) plt.plotgraph()
d) none of the Mentioned

Answer: b [Reason:] If the index consists of dates, it calls gcf().autofmt_xdate() to try to format the x-axis nicely.

2. Point out the correct combination with regards to kind keyword for graph plotting:
a) ‘hist’ for histogram
b) ‘box’ for boxplot
c) ‘area’ for area plots
d) all of the Mentioned

Answer: d [Reason:] The kind keyword argument of plot() accepts a handful of values for plots other than the default line plot.

3. Which of the following values of the kind keyword produces a bar plot?
a) barh
b) kde
c) hexbin
d) none of the Mentioned

Answer: a [Reason:] ‘bar’ can also be used for a bar plot.

4. You can create a scatter plot matrix using the __________ method in pandas.tools.plotting.
a) sca_matrix
b) scatter_matrix
c) DataFrame.plot
d) all of the Mentioned

Answer: b [Reason:] You can create density plots using the Series/DataFrame.plot.

5. Point out the wrong combination with regards to kind keyword for graph plotting:
a) ‘scatter’ for scatter plots
b) ‘kde’ for hexagonal bin plots
c) ‘pie’ for pie plots
d) none of the Mentioned

Answer: b [Reason:] kde is used for density plots.

6. Which of the following plots are used to check if a data set or time series is random?
a) Lag
b) Random
d) None of the Mentioned

Answer: a [Reason:] Random data should not exhibit any structure in the lag plot.

7. Plots may also be adorned with errorbars or tables.
a) True
b) False

Answer: a [Reason:] There are several plotting functions in pandas.tools.plotting.

8. Which of the following plots are often used for checking randomness in time series?
a) Autocausation
b) Autorank
c) Autocorrelation
d) None of the Mentioned

Answer: c [Reason:] If the time series is random, such autocorrelations should be near zero for any and all time-lag separations.
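The numbers behind an autocorrelation plot can be inspected without drawing anything. This sketch uses `Series.autocorr` on two synthetic series, one pure noise and one a straight-line trend; the series length, seed, and lags are arbitrary assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# White noise: no structure, so autocorrelations should hover near zero.
noise = pd.Series(rng.normal(size=2000))

# A straight-line trend: strongly non-random, so lag-1 autocorrelation is close to 1.
trend = pd.Series(np.arange(2000, dtype=float))

noise_acf = [noise.autocorr(lag=k) for k in range(1, 6)]
trend_acf = trend.autocorr(lag=1)

print("noise:", noise_acf)
print("trend:", trend_acf)
```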

9. __________ plots are used to visually assess the uncertainty of a statistic.
a) Lag
c) Bootstrap
d) None of the Mentioned

Answer: c [Reason:] The resulting plots and histograms constitute the bootstrap plot.

10. Andrews curves allow one to plot multivariate data.
a) True
b) False

Answer: a [Reason:] Curves belonging to samples of the same class will usually be closer together and form larger structures.

## Set 4

1. Predicting with trees evaluates _____________ within each group of data.
a) equality
b) homogeneity
c) heterogeneity
d) all of the Mentioned

Answer: b [Reason:] Predicting with trees is easy to interpret.

2. Point out the wrong statement:
a) Training and testing data must be processed in different ways
b) Test transformation would mostly be imperfect
c) The first goal is statistical and the second is data compression in PCA
d) All of the Mentioned

Answer: a [Reason:] Training and testing data must be processed in the same way.

3. Which of the following method options are provided by the train function for bagging?
a) bagEarth
b) treebag
c) bagFDA
d) all of the Mentioned

Answer: d [Reason:] Bagging can be done using the bag function as well.

4. Which of the following is correct with respect to random forest?
a) Random forests are difficult to interpret but often very accurate
b) Random forests are easy to interpret but often very accurate
c) Random forests are difficult to interpret and much less accurate
d) None of the Mentioned

Answer: a [Reason:] Random forests are among the top-performing algorithms in prediction.

5. Point out the correct statement:
a) Prediction with regression is easy to implement
b) Prediction with regression is easy to interpret
c) Prediction with regression performs well when the linear model is correct
d) All of the Mentioned

Answer: d [Reason:] Prediction with regression gives poor performance in nonlinear settings.

6. Which of the following libraries is used for boosting generalized additive models?
a) gamBoost
b) gbm
d) all of the Mentioned

Answer: a [Reason:] Boosting can be used with any subset of classifiers.

7. The principal components are equal to left singular values if you first scale the variables.
a) True
b) False

Answer: b [Reason:] The principal components are equal to the right singular vectors, not the left, if you first scale the variables.
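The link between PCA and the SVD can be checked numerically. The sketch below is an illustration under stated assumptions: "scaling" is taken to mean subtracting the mean and dividing by the standard deviation, and the data are synthetic. It compares the eigenvectors of the covariance matrix of the scaled data with the right singular vectors from `np.linalg.svd`.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic correlated data: 200 samples, 3 variables (mixing matrix is arbitrary).
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

# Scale: subtract the mean and divide by the standard deviation.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# SVD of the scaled data: rows of Vt are the right singular vectors.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

# PCA via eigendecomposition of the covariance matrix, sorted by decreasing variance.
eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Directions agree up to sign, so compare absolute dot products column by column.
alignment = np.abs(np.sum(eigvecs * Vt.T, axis=0))
print(alignment)
```

The component variances match as well: each eigenvalue equals the squared singular value divided by n - 1.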

8. Which of the following is statistical boosting based on additive logistic regression?
a) gamBoost
b) gbm
d) mboost

Answer: a [Reason:] mboost is used for model based boosting.

9. Which of the following is one of the largest subclasses of boosting?
a) variance boosting
c) mean boosting
d) all of the Mentioned

Answer: b [Reason:] R has multiple boosting libraries.

10. PCA is most useful for nonlinear models.
a) True
b) False

Answer: b [Reason:] PCA is most useful for linear models.

## Set 5

1. Which of the following is a valid component of a predictor?
a) data
b) question
c) algorithm
d) all of the Mentioned

2. Point out the wrong statement:
a) In Sample Error is also called generalization error
b) Out of Sample Error is the error rate you get on the new dataset
c) In Sample Error is also called resubstitution error
d) All of the Mentioned

Answer: a [Reason:] Out of Sample Error is also called generalization error.
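The gap between the two error rates can be shown with a deliberately overfit predictor. In this sketch, which uses synthetic data and a hypothetical `predict` helper, the "model" simply memorizes the training responses. Its in-sample error is exactly zero, yet it still makes errors on a fresh sample drawn from the same process.

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.arange(20)

def sample_y():
    """Draw noisy responses from the same underlying process y = 2x + noise."""
    return 2.0 * x + rng.normal(scale=1.0, size=x.size)

y_train = sample_y()  # dataset used to build the predictor
y_new = sample_y()    # new dataset from the same process

# Memorizing predictor: returns the training response seen for each x.
lookup = dict(zip(x.tolist(), y_train.tolist()))

def predict(xs):
    return np.array([lookup[v] for v in xs.tolist()])

in_sample_mse = np.mean((predict(x) - y_train) ** 2)    # resubstitution error
out_of_sample_mse = np.mean((predict(x) - y_new) ** 2)  # generalization error

print(in_sample_mse, out_of_sample_mse)
```

The memorizing predictor has fitted the noise as well as the signal, which is why its resubstitution error says nothing useful about how it generalizes.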

3. Which of the following is the correct order of working?
a) questions -> input data -> algorithms
b) questions -> evaluation -> algorithms
c) evaluation -> input data -> algorithms
d) all of the Mentioned

Answer: a [Reason:] Evaluation is done last.

4. Which of the following shows the correct relative order of importance?
a) question->features->data->algorithms
b) question->data->features->algorithms
c) algorithms->data->features->question
d) none of the Mentioned

Answer: b [Reason:] Garbage in, garbage out: the data matter more than the algorithm.

5. Point out the correct statement:
a) In Sample Error is the error rate you get on the same dataset used to build the predictor
b) Data have two parts: signal and noise
c) The goal of a predictor is to find the signal
d) All of the Mentioned

Answer: d [Reason:] A perfect in-sample predictor can always be built.

6. Which of the following is a characteristic of the best machine learning method?
a) Fast
b) Accuracy
c) Scalable
d) All of the Mentioned

7. True positive means correctly rejected.
a) True
b) False

Answer: b [Reason:] True positive means correctly identified.
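The four outcomes of a binary prediction can be counted directly. This is a plain-Python sketch with made-up labels, where 1 stands for the positive class.

```python
# Toy labels: 1 = positive, 0 = negative.
actual    = [1, 1, 0, 0, 1, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

pairs = list(zip(actual, predicted))

tp = sum(a == 1 and p == 1 for a, p in pairs)  # true positive: correctly identified
fn = sum(a == 1 and p == 0 for a, p in pairs)  # false negative: incorrectly rejected
tn = sum(a == 0 and p == 0 for a, p in pairs)  # true negative: correctly rejected
fp = sum(a == 0 and p == 1 for a, p in pairs)  # false positive: incorrectly identified

print(tp, fn, tn, fp)
```

"Correctly rejected" therefore describes a true negative, not a true positive.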

8. Which of the following trade-offs occurs during prediction?
a) Speed vs Accuracy
b) Simplicity vs Accuracy
c) Scalability vs Accuracy
d) All of the Mentioned

Answer: d [Reason:] Interpretability also matters during prediction.

9. Which of the following expressions is true?
a) In sample error < out of sample error
b) In sample error > out of sample error
c) In sample error = out of sample error
d) All of the Mentioned