Multiple choice questions for engineering

Set 1

1. Quandl API for Python wraps the ________ REST API to return Pandas DataFrames with timeseries indexes.
a) Quandl
b) PyDatastream
c) PyData
d) None of the Mentioned

Answer: a [Reason:] PyDatastream is a Python interface to the Thomson Dataworks Enterprise (DWE/Datastream) SOAP API to return indexed Pandas DataFrames or Panels with financial data.

2. Point out the correct statement:
a) Statsmodels provides powerful statistics, econometrics, analysis and modeling functionality that is out of pandas’ scope
b) Vintage leverages pandas objects as the underlying data container for computation
c) Bokeh is a Python interactive visualization library for small datasets
d) All of the Mentioned

Answer: a [Reason:] Bokeh’s goal is to provide elegant, concise construction of novel graphics in the style of D3.

3. Which of the following libraries is used to retrieve and acquire statistical data and metadata disseminated in SDMX 2.1?
a) pandaSDMX
b) fredapi
c) Geopandas
d) All of the Mentioned

Answer: a [Reason:] Geopandas extends pandas data objects to include geographic information which support geometric operations.

4. Which of the following provides a standard API for doing computations with MongoDB?
a) Blaze
b) Geopandas
c) FRED
d) All of the Mentioned

Answer: a [Reason:] If your work entails maps and geographical coordinates, and you love pandas, you should take a close look at Geopandas.

5. Point out the wrong statement:
a) qgrid is an interactive grid for sorting and filtering DataFrames
b) Pandas DataFrames implement _repr_html_ methods which are utilized by IPython Notebook
c) Spyder is a cross-platform Qt-based open-source R IDE
d) None of the Mentioned

Answer: c [Reason:] Spyder is a cross-platform Qt-based open-source Python IDE.
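
The `_repr_html_` hook mentioned in option (b) is a plain method that returns an HTML string, which a rich front end such as the notebook can render. The sketch below illustrates the protocol with a hypothetical `TinyTable` class; it is not pandas code, and the class name and sample data are made up for illustration.

```python
from html import escape


class TinyTable:
    """Hypothetical stand-in for a DataFrame-like object (not part of pandas)."""

    def __init__(self, columns, rows):
        self.columns = columns
        self.rows = rows

    def _repr_html_(self):
        # A rich front end calls this method and renders the returned HTML string.
        head = "".join(f"<th>{escape(str(c))}</th>" for c in self.columns)
        body = "".join(
            "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
            for row in self.rows
        )
        return f"<table><tr>{head}</tr>{body}</table>"


table = TinyTable(["name", "score"], [("a", 1), ("b", 2)])
html_output = table._repr_html_()
```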

6. Which of the following makes use of pandas and returns data in a Series or DataFrame?
a) pandaSDMX
b) fredapi
c) OutPy
d) none of the Mentioned

Answer: b [Reason:] The fredapi module requires a FRED API key that you can obtain for free on the FRED website.

7. Spyder can introspect and display Pandas DataFrames.
a) True
b) False

Answer: a [Reason:] Spyder shows both column-wise min/max and global min/max coloring.

8. Which of the following is used for machine learning in Python?
a) scikit-learn
b) seaborn-learn
c) stats-learn
d) none of the Mentioned

Answer: a [Reason:] scikit-learn is built on NumPy, SciPy, and matplotlib.

9. The ________ project builds on top of pandas and matplotlib to provide easy plotting of data.
a) yhat
b) Seaborn
c) Vincent
d) None of the Mentioned

Answer: b [Reason:] Seaborn has great support for pandas data objects.

10. xray brings the labeled data power of pandas to the physical sciences.
a) True
b) False

Answer: a [Reason:] It aims to provide a pandas-like and pandas-compatible toolkit for analytics on multi-dimensional arrays.

Set 2

1. varImp is a wrapper around the evimp function in the _______ package.
a) numpy
b) earth
c) plot
d) none of the Mentioned

Answer: b [Reason:] The earth package is an implementation of Jerome Friedman’s Multivariate Adaptive Regression Splines.

2. Point out the wrong statement:
a) The trapezoidal rule is used to compute the area under the ROC curve
b) For regression, the relationship between each predictor and the outcome is evaluated
c) An argument, para, is used to pick the model fitting technique
d) All of the Mentioned

Answer: c [Reason:] An argument, nonpara, is used to pick the model fitting technique.

3. Which of the following curve analyses is conducted on each predictor for classification?
a) NOC
b) ROC
c) COC
d) All of the Mentioned

Answer: b [Reason:] For two class problems, a series of cutoffs is applied to the predictor data to predict the class.
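
As a rough illustration of the cutoff sweep described here, combined with the trapezoidal rule from question 2, the sketch below builds ROC points for a two-class problem and integrates them. The function names and the sample scores are hypothetical, and this is not how any particular package implements it.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by sweeping a cutoff over the scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cutoff above every score: nothing is predicted positive
    for cutoff in sorted(set(scores), reverse=True):
        predicted = [s >= cutoff for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def trapezoid_auc(points):
    """Area under a piecewise-linear curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical predictor values and class labels (1 = positive, 0 = negative).
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
auc = trapezoid_auc(roc_points(scores, labels))
```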

4. Which of the following functions tracks the changes in model statistics?
a) varImp
b) varImpTrack
c) findTrack
d) none of the Mentioned

Answer: a [Reason:] GCV change value can also be tracked.

5. Point out the correct statement:
a) The difference between the class centroids and the overall centroid is used to measure the variable influence
b) The Bagged Trees output contains variable usage statistics
c) Boosted Trees uses a different approach than a single tree
d) None of the Mentioned

Answer: a [Reason:] The larger the difference between the class centroid and the overall center of the data, the larger the separation between the classes.

6. Which of the following models includes a backwards elimination feature selection routine?
a) MCV
b) MARS
c) MCRS
d) All of the Mentioned

Answer: b [Reason:] MARS stands for Multivariate Adaptive Regression Splines.

7. The advantage of using a model-based approach is that it is more closely tied to the model performance.
a) True
b) False

Answer: a [Reason:] Model-based approach is able to incorporate the correlation structure between the predictors into the importance calculation.

8. Which of the following models sums the importance over each boosting iteration?
a) Boosted Trees
b) Bagged Trees
c) Partial Least Squares
d) None of the Mentioned

Answer: a [Reason:] gbm package can be used here.

9. Which of the following arguments is used to set importance values?
a) scale
b) set
c) value
d) all of the Mentioned

Answer: a [Reason:] All measures of importance are scaled to have a maximum value of 100.

10. For most classification models, each predictor will have a separate variable importance for each class.
a) True
b) False

Answer: a [Reason:] The exceptions are classification trees, bagged trees and boosted trees.

Set 3

1. How many components are present in generalized linear models?
a) 2
b) 4
c) 6
d) None of the Mentioned

Answer: d [Reason:] Generalized linear models involve three components.

2. Point out the wrong statement:
a) Additive response models don’t make much sense if the response is discrete, or strictly positive
b) Transformations are often easy to interpret in linear models
c) Regression models are used to predict one variable from one or more other variables
d) All of the Mentioned

Answer: b [Reason:] Transformations are often hard to interpret in linear models.

3. Which of the following components is involved in generalized linear models?
a) An exponential family model for the response
b) A systematic component via a linear predictor
c) A link function that connects the means of the response to the linear predictor
d) All of the Mentioned

Answer: d [Reason:] GLM is a flexible generalization of ordinary linear regression that allows for response variables that have error distribution models other than a normal distribution.
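
To make the three components concrete, here is a minimal sketch for one common special case, logistic regression: a Bernoulli response (exponential family), a linear predictor, and the logit link. The coefficient values and function names are hypothetical and chosen only for illustration.

```python
import math


def linear_predictor(x, beta):
    # Systematic component: eta = beta_0 + beta_1 * x_1 + ...
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))


def logit(mu):
    # Link function: maps the mean of the response to the linear predictor scale.
    return math.log(mu / (1.0 - mu))


def inverse_logit(eta):
    # Inverse link: maps the linear predictor back to a mean in (0, 1).
    return 1.0 / (1.0 + math.exp(-eta))


def bernoulli_log_likelihood(y, mu):
    # Random component: Bernoulli response with mean mu.
    return y * math.log(mu) + (1 - y) * math.log(1.0 - mu)


beta = [-1.0, 0.5]                   # hypothetical intercept and slope
eta = linear_predictor([2.0], beta)  # -1.0 + 0.5 * 2.0
mu = inverse_logit(eta)              # mean response implied by eta
```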

4. Collections of exchangeable binary outcomes for the same covariate data are called _______ outcomes.
a) random
b) direct
c) binomial
d) none of the Mentioned

Answer: c [Reason:] The multivariate regression model for binary outcomes gives odds ratios, not risk ratios.

5. Point out the wrong statement:
a) Asymptotics are used for inference usually
b) Adding squared terms makes it continuously differentiable at the knot points
c) Adding squared terms makes it twice continuously differentiable at the knot points
d) None of the Mentioned

Answer: c [Reason:] Adding cubic terms makes it twice continuously differentiable at the knot points.

6. Which of the following is an example use of the Poisson distribution?
a) Analyzing contingency table data
b) Modeling web traffic hits
c) Incidence rates
d) All of the Mentioned

Answer: d [Reason:] The Poisson distribution is a useful model for counts and rates.
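
For reference, the Poisson probability of observing k events at rate lambda is exp(-lambda) * lambda^k / k!. The sketch below evaluates it with the standard library only; the rate value is a hypothetical example, and the sum is truncated at 50 terms, which is ample for a small rate.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


lam = 3.0  # hypothetical rate, e.g. expected hits per minute
probabilities = [poisson_pmf(k, lam) for k in range(50)]
mean = sum(k * p for k, p in enumerate(probabilities))  # close to lam
```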

7. Principal components or factor analytic models on covariates are often useful for reducing complex covariate spaces.
a) True
b) False

Answer: a [Reason:] The space of models explodes quickly as you add interactions and polynomial terms.

8. How many outcomes are possible with a Bernoulli trial?
a) 2
b) 3
c) 4
d) None of the Mentioned

Answer: a [Reason:] A Bernoulli trial is a random experiment with exactly two possible outcomes.

9. Which of the following analyses is a statistical process for estimating the relationships among variables?
a) Causal
b) Regression
c) Multivariate
d) All of the Mentioned

Answer: b [Reason:] Regression models provide the scientist with a powerful tool, allowing predictions about past, present, or future events to be made with information about past or present events.

10. Linear models are the most useful applied statistical technique.
a) True
b) False

Answer: b [Reason:] Linear models do have limitations.

Set 4

1. Which of the following is contained in the NumPy library?
a) n-dimensional array object
b) tools for integrating C/C++ and Fortran code
c) fourier transform
d) all of the Mentioned

Answer: d [Reason:] NumPy is the fundamental package for scientific computing with Python.

2. Point out the wrong statement:
a) ipython is an enhanced interactive Python shell
b) matplotlib will enable you to plot graphics
c) rPy provides a lot of scientific routines that work on top of NumPy
d) all of the Mentioned

Answer: c [Reason:] SciPy provides a lot of scientific routines that work on top of NumPy.

3. The ________ function returns its argument with a modified shape, whereas the ________ method modifies the array itself.
a) reshape, resize
b) resize, reshape
c) reshape2, resize
d) all of the Mentioned

Answer: a [Reason:] If a dimension is given as -1 in a reshaping operation, the other dimensions are automatically calculated.
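
A short sketch of the difference, assuming NumPy is installed. The array contents are arbitrary, and the in-place `resize` keeps the same number of elements so no reallocation is needed.

```python
import numpy as np

a = np.arange(6)
b = np.reshape(a, (2, 3))  # function: returns an array with the new shape
c = a.reshape(3, -1)       # -1 lets NumPy work out the remaining dimension
# a itself is untouched and still has shape (6,)

d = np.arange(6)
d.resize((2, 3))           # method: changes d in place and returns None
```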

4. To create sequences of numbers, NumPy provides a function __________ analogous to range that returns arrays instead of lists.
a) arange
b) aspace
c) aline
d) all of the Mentioned

Answer: a [Reason:] When arange is used with floating point arguments, it is generally not possible to predict the number of elements obtained.
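
The sketch below shows `arange` with integer arguments, and `linspace` as an alternative when the number of elements should be fixed up front. The specific ranges are arbitrary examples.

```python
import numpy as np

ints = np.arange(10, 30, 5)    # start, stop (exclusive), step
floats = np.linspace(0, 2, 9)  # exactly 9 evenly spaced values from 0 to 2 inclusive
```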

5. Point out the correct statement:
a) NumPy’s main object is the homogeneous multidimensional array
b) In NumPy, dimensions are called axes
c) NumPy’s array class is called ndarray
d) All of the Mentioned

Answer: d [Reason:] The number of axes is called the rank and is available as ndarray.ndim.
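
These statements can be checked directly on a small array. The sketch assumes NumPy is installed; the array is an arbitrary example.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

class_name = type(a).__name__  # the array class
number_of_axes = a.ndim        # number of axes (dimensions)
shape = a.shape                # length of the array along each axis
total_elements = a.size        # product of the shape entries
```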

6. Which of the following functions stacks 1D arrays as columns into a 2D array?
a) row_stack
b) column_stack
c) com_stack
d) all of the Mentioned

Answer: b [Reason:] column_stack is equivalent to hstack only for 2D arrays; for 1D arrays it places each input in its own column.
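
The sketch below stacks two small 1D arrays with `column_stack` and, for comparison, with `vstack` and `hstack`. The values are arbitrary; NumPy is assumed to be installed.

```python
import numpy as np

a = np.array([4.0, 2.0])
b = np.array([3.0, 8.0])

cols = np.column_stack((a, b))  # each 1D input becomes one column: shape (2, 2)
rows = np.vstack((a, b))        # each 1D input becomes one row: shape (2, 2)
flat = np.hstack((a, b))        # 1D inputs are simply concatenated: shape (4,)

# With 2D column inputs, column_stack and hstack give the same result.
a2 = a[:, np.newaxis]
b2 = b[:, np.newaxis]
same_for_2d = np.array_equal(np.column_stack((a2, b2)), np.hstack((a2, b2)))
```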

7. ndarray is also known as the alias array.
a) True
b) False

Answer: a [Reason:] numpy.array is not the same as the Standard Python Library class array.array.

8. Which of the following methods creates a new array object that looks at the same data?
a) view
b) copy
c) paste
d) all of the Mentioned

Answer: a [Reason:] The copy method makes a complete copy of the array and its data.
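
A minimal sketch of the difference, assuming NumPy is installed: writing through a view changes the original array, while writing to a copy does not.

```python
import numpy as np

a = np.arange(6)

v = a.view()  # new array object that shares a's data
v[0] = 99     # visible through a as well

c = a.copy()  # new array object with its own data
c[1] = -1     # a is unaffected

view_shares_data = v.base is a
copy_owns_data = c.base is None
```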

9. Which of the following functions can be used to combine different vectors so as to obtain the result for each n-uplet?
a) iid_
b) ix_
c) ixd_
d) all of the Mentioned

Answer: b [Reason:] Length of the 1D boolean array must coincide with the length of the dimension (or axis) you want to slice.
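
As an illustration of `ix_`, the sketch below computes a + b * c for every triplet drawn from three vectors by letting broadcasting do the combination. The vectors are arbitrary examples; NumPy is assumed to be installed.

```python
import numpy as np

a = np.array([2, 3, 4, 5])
b = np.array([8, 5, 4])
c = np.array([5, 4, 6, 8, 3])

# ix_ reshapes each vector so that it varies along its own axis.
ax, bx, cx = np.ix_(a, b, c)
result = ax + bx * cx  # broadcasts to shape (4, 3, 5)

# Entry [i, j, k] equals a[i] + b[j] * c[k].
sample = result[3, 2, 4]
```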

10. ndarray.dataitemSize is the buffer containing the actual elements of the array.
a) True
b) False

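Answer: b [Reason:] ndarray.data is the buffer containing the actual elements of the array; ndarray has no dataitemSize attribute, and the size in bytes of each element is given by ndarray.itemsize.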
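
The two attributes can be inspected as follows. This sketch fixes the dtype to 64-bit integers so the byte counts are predictable; NumPy is assumed to be installed.

```python
import numpy as np

a = np.arange(15, dtype=np.int64).reshape(3, 5)

bytes_per_element = a.itemsize  # size in bytes of one element
buffer = a.data                 # buffer object exposing the raw elements
buffer_bytes = buffer.nbytes    # total bytes held by the buffer
```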

Set 5

1. Which of the following can be used to generate balanced cross-validation groupings from a set of data?
a) createFolds
b) createSample
c) createResample
d) none of the Mentioned

Answer: a [Reason:] createResample can be used to make simple bootstrap samples.
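
To show what "balanced" groupings mean in practice, here is a rough Python sketch that deals the indices of each class round-robin into k folds. It is an illustration of the idea only, not caret's implementation; the function name `stratified_folds` and the sample labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign indices to k folds so each class is spread evenly across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal this class's indices to the folds one at a time.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


labels = ["a"] * 6 + ["b"] * 3  # hypothetical class labels
folds = stratified_folds(labels, k=3)
```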

2. Point out the wrong statement:
a) Simple random sampling of time series is probably the best way to resample time series data
b) Three parameters are used for time series splitting
c) Horizon parameter is the number of consecutive values in test set sample
d) All of the Mentioned

Answer: a [Reason:] Simple random sampling of time series is probably not the best way to resample time series data.

3. Which of the following functions can be used to maximize the minimum dissimilarities?
a) sumDiss
b) minDiss
c) avgDiss
d) all of the Mentioned

Answer: b [Reason:] minDiss can be used to maximize the minimum dissimilarities, while sumDiss can be used to maximize the total dissimilarities.

4. Which of the following functions can create the indices for time series type of splitting?
a) newTimeSlices
b) createTimeSlices
c) binTimeSlices
d) none of the Mentioned

Answer: b [Reason:] Rolling forecasting origin techniques are associated with time series type of splitting.
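
A rolling forecasting origin can be sketched in a few lines of Python. The function below is an illustration of the idea, not caret's code; the name `time_slices` and the parameter names `initial_window`, `horizon`, and `fixed_window` are chosen for this sketch, with `horizon` being the number of consecutive values in each test sample.

```python
def time_slices(n, initial_window, horizon, fixed_window=True):
    """Return (train, test) index lists for a rolling forecasting origin."""
    slices = []
    last_start = n - initial_window - horizon
    for start in range(last_start + 1):
        train_end = start + initial_window
        # A fixed window slides forward; otherwise the training set grows from index 0.
        train_start = start if fixed_window else 0
        train = list(range(train_start, train_end))
        test = list(range(train_end, train_end + horizon))
        slices.append((train, test))
    return slices


fixed = time_slices(6, initial_window=3, horizon=2)
growing = time_slices(6, initial_window=3, horizon=2, fixed_window=False)
```

Every test sample follows its training sample in time, which simple random sampling would not guarantee.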

5. Point out the correct statement:
a) Asymptotics are used for inference usually
b) caret includes several functions to pre-process the predictor data
c) The function dummyVars can be used to generate a complete set of dummy variables from one or more factors
d) All of the Mentioned

Answer: d [Reason:] The function dummyVars takes a formula and a data set and outputs an object that can be used to create the dummy variables using the predict method.

6. Which of the following can be used to create sub-samples using a maximum dissimilarity approach?
a) minDissim
b) maxDissim
c) inmaxDissim
d) all of the Mentioned

Answer: b [Reason:] Splitting is based on the predictors.

7. caret does not use the proxy package.
a) True
b) False

Answer: b [Reason:] caret uses the proxy package.

8. Which of the following functions can be used to create balanced splits of the data?
a) newDataPartition
b) createDataPartition
c) renameDataPartition
d) none of the Mentioned

Answer: b [Reason:] If the y argument to this function is a factor, the random sampling occurs within each class and should preserve the overall class distribution of the data.

9. Which of the following tools are present in the caret package?
a) pre-processing
b) feature selection
c) model tuning
d) all of the Mentioned

Answer: d [Reason:] There are many different modeling functions in R.

10. caret stands for Classification And REgression Training.
a) True
b) False

Answer: a [Reason:] The caret package is a set of functions that attempt to streamline the process for creating predictive models.
