Multiple choice question for engineering

Set 1

1. Rigorously evaluating the performance of RNA prediction programs has traditionally been hindered by the dearth of three-dimensional structural information for RNA.
a) True
b) False

Answer: a [Reason:] The availability of recently solved crystal structures of the entire ribosome provides a wealth of structural details relating to diverse types of RNA molecules. The high resolution structural information can then be used as a benchmark for evaluating state-of-the-art RNA structure prediction programs in all categories.

2. If prediction accuracy can be represented using a ______, the ______ programs score roughly 20% to 60%, depending on the length of the sequences.
a) multiple parameter, ab initio–based
b) single parameter, ab initio–based
c) multiple parameter, comparative–based
d) single parameter, comparative–based

Answer: b [Reason:] As mentioned, the scores depend on the length of the sequences. Generally speaking, the programs perform better for shorter RNA sequences than for longer ones.

3. For ____ RNA sequences, such as tRNA, some programs may be able to produce ___% accuracy.
a) small, 70
b) small, 40
c) large, 90
d) large, 75

Answer: a [Reason:] The exact percentage may vary, but the qualitative idea is that for small RNA sequences, some programs may produce better accuracy. The major limitation for performance gains of this category appears to be dependence on energy parameters alone, which may not be sufficient to distinguish different structural possibilities of the same molecule.

4. The pre-alignment independent programs fare _____ for predicting long sequences.
a) slightly better
b) much better
c) a bit worse
d) much worse

Answer: d [Reason:] For small RNA sequences such as tRNA, both subtypes can achieve very high accuracy (up to 100%). This illustrates that the comparative approach is consistently more accurate than the ab initio one.

5. Based on recent benchmark comparisons, the comparative-type algorithms can reach an accuracy range of 20% to 80%.
a) True
b) False

Answer: a [Reason:] The results depend on whether a program is pre-alignment dependent or not. Most of the superior performance comes from pre-alignment-dependent programs such as RNAalifold.

6. In the comparative approach to RNA structure prediction, algorithms that do not use pre-alignment align multiple input sequences and infer a consensus structure.
a) True
b) False

Answer: a [Reason:] The alignment is produced using dynamic programming with a scoring scheme that incorporates sequence similarity as well as energy terms. Because the full dynamic programming for multiple alignment is computationally too demanding, currently available programs limit the input to two sequences.

7. In the comparative approach to RNA structure prediction, Foldalign is a web-based program for RNA alignment only.
a) True
b) False

Answer: b [Reason:] Foldalign is a web-based program for RNA alignment and structure prediction. The user provides a pair of unaligned sequences.

8. In the comparative approach to RNA structure prediction, the Foldalign program doesn’t use covariation information.
a) True
b) False

Answer: b [Reason:] The program uses a combination of Clustal and dynamic programming with a scoring scheme that includes covariation information to construct the alignment. A commonly conserved structure for both sequences is subsequently derived based on the alignment. To reduce computational complexity, the program ignores multi-branch loops and is only suitable for handling short RNA sequences.

9. In the comparative approach to RNA structure prediction, Dynalign is a ________ program.
a) Windows based
b) Fedora
c) UNIX
d) iOS based

Answer: c [Reason:] Dynalign is a UNIX program with freely downloadable source code. Here, the user again provides two input sequences. The program calculates the possible secondary structures of each using a method similar to Mfold.

10. In the comparative approach to RNA structure prediction, in the Dynalign program, by comparing _________ from each sequence, a ______ structure common to both sequences is selected that serves as the basis for sequence alignment.
a) multiple alternative structures, lowest energy
b) single structure, lowest energy
c) single structure, highest energy
d) multiple alternative structures, highest energy

Answer: a [Reason:] The unique feature of this program is that it does not require sequence similarity and therefore can handle very divergent sequences. However, because of the computational complexity, the program only predicts small RNA sequences such as tRNA with reasonable accuracy.

Set 2

1. PAUP is a Macintosh program (UNIX version available in the GCG package) with a very user-friendly graphical interface.
a) True
b) False

Answer: a [Reason:] PAUP stands for Phylogenetic Analysis Using Parsimony. It is a commercial phylogenetic package available from Sinauer Publishers and is probably one of the most widely used phylogenetic programs. PAUP was originally developed as a parsimony program but has expanded into a comprehensive package that is capable of performing distance, parsimony, and likelihood analyses.

2. In PAUP, the distance options include NJ, ME, FM, and UPGMA.
a) True
b) False

Answer: a [Reason:] For distance or ML analyses, PAUP has the option for detailed specifications of substitution models, base frequencies, and among-site rate heterogeneity (γ-shape parameters, proportion of invariant sites). PAUP is also able to perform nonparametric bootstrapping, jackknifing, KH testing, and SH testing.

3. Phylip stands for Phylogenetic Inference Package (by Joe Felsenstein).
a) True
b) False

Answer: a [Reason:] Phylip is a free multiplatform comprehensive package containing thirty-five subprograms for performing distance, parsimony, and likelihood analysis, as well as bootstrapping for both nucleotide and amino acid sequences.

4. In PAUP, to complete an analysis, the user is not required to move between different subprograms while repeatedly renaming the intermediate output files.
a) True
b) False

Answer: a [Reason:] That limitation applies to Phylip, not PAUP. In Phylip, the only problem is that to complete an analysis the user is required to move between different subprograms while repeatedly renaming the intermediate output files. Phylip is command-line based, but each single program is relatively easy to use.

5. Which of the following is untrue regarding TREE-PUZZLE?
a) It is a program performing quartet puzzling
b) It allows various substitution models for likelihood score estimation
c) It doesn’t incorporate a discrete γ model
d) Because of the heuristic nature of the program, it allows ML analyses of large datasets

Answer: c [Reason:] The advantage is that it allows various substitution models for likelihood score estimation. Also, it incorporates a discrete γ model for rate heterogeneity among sites.

6. Which of the following is untrue regarding TREE-PUZZLE?
a) The resulting puzzle trees are automatically assigned puzzle support values to internal branches
b) The support values are percentages of consistent quartet trees
c) The support values do not have the same meaning as bootstrap values
d) The support values have the same meaning as bootstrap values

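Answer: d [Reason:] The puzzle support values are percentages of consistent quartet trees and do not have the same meaning as bootstrap values. Because of the heuristic nature of the program, it allows ML analyses of large datasets. TREE-PUZZLE version 5.0 is available for Mac, UNIX, and Windows.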

7. PHYML is a web-based _______ program using the _____
a) phylogenetic, GA (Genetic Algorithm)
b) sequence based alignment, GA (Genetic Algorithm)
c) phylogenetic, dynamic programming
d) sequence based alignment, dynamic programming

Answer: a [Reason:] It first builds an NJ tree and then uses it as a starting tree for subsequent iterative refinement through subtree swapping. Branch lengths are simultaneously optimized during this process.

8. In PHYML, the tree searching _____ when the total ML score no longer ______
a) ceases, increases
b) stops, decreases
c) terminates, decreases
d) stops, increases

Answer: d [Reason:] The tree searching stops when the total ML score no longer increases. PHYML is a web-based phylogenetic program using the GA. The main advantage of this program is the ability to build trees from very large datasets with hundreds of taxa and to complete tree searching within a relatively short time frame.

9. MrBayes is a Bayesian phylogenetic inference program.
a) True
b) False

Answer: a [Reason:] It randomly samples tree topologies using the MCMC procedure. Next it infers the posterior distribution of tree topologies.

10. MrBayes has a range of probabilistic models available to search for a set of trees with the lowest posterior probability.
a) True
b) False

Answer: b [Reason:] MrBayes has a range of probabilistic models available to search for a set of trees with the highest posterior probability. It is fast and capable of handling large datasets. The program is available in multiplatform versions. A web program that also employs Bayesian inference for phylogenetic analysis is BAMBE.

Set 3

1. Which of the following is incorrect about Bootstrapping?
a) It is a statistical technique that tests the sampling errors of a phylogenetic tree
b) It does the tests by repeatedly sampling trees through slightly perturbed datasets
c) A newly constructed tree is not biased at all
d) The robustness of the original tree can be assessed here

Answer: c [Reason:] The rationale for bootstrapping is that a newly constructed tree is possibly biased owing to incorrect alignment or chance fluctuations of distance measurements. To determine the robustness or reproducibility of the current tree, trees are repeatedly constructed with slightly perturbed alignments that have some random fluctuations introduced.

2. A truly robust phylogenetic relationship should have enough characters to support the relationship even if the dataset is perturbed in such a way.
a) True
b) False

Answer: a [Reason:] Otherwise, the noise introduced in the resampling process is sufficient to generate different trees, indicating that the original topology may be derived from weak phylogenetic signals. Thus, this type of analysis gives an idea of the statistical confidence of the tree topology.

3. Which of the following is incorrect about nonparametric bootstrapping?
a) A new multiple sequence alignment of the same length is generated with random duplication of some of the sites
b) A new multiple sequence alignment of the distinct lengths is generated with random duplication of some of the sites
c) Certain sites are randomly replaced by other existing sites
d) Certain sites may appear multiple times, and other sites may not appear at all in the new alignment

Answer: b [Reason:] In nonparametric bootstrapping, a new multiple sequence alignment of the same length is generated with random duplication of some of the sites (i.e., the columns in an alignment) at the expense of some other sites. This process is repeated 100 to 1,000 times to create 100 to 1,000 new alignments that are used to reconstruct phylogenetic trees using the same method as the originally inferred tree.
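
The resampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alignment is a list of equal-length strings, and the function name `bootstrap_alignment` and the toy sequences are hypothetical, not taken from any phylogenetics package.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return a pseudo-replicate with the same number of columns as the input.

    Columns (sites) are drawn with replacement, so some sites appear several
    times and others not at all.
    """
    length = len(alignment[0])
    # Sample column indices with replacement.
    columns = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in columns) for seq in alignment]

# Toy alignment (hypothetical data): three sequences, six sites.
alignment = ["ACGTAC", "ACGTTC", "ATGTAC"]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]
```

Each replicate would then be passed to the same tree-building method that produced the original tree.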

4. Which of the following is incorrect about nonparametric bootstrapping?
a) All the bootstrapped trees are summarized into a consensus tree based on a majority rule
b) The most supported branching patterns shown at each node are labeled with bootstrap values
c) The bootstrap values are the percentages of appearance of a particular clade.
d) This test doesn’t provide a measure for evaluating the confidence levels of the tree topology.

Answer: d [Reason:] The bootstrap test provides a measure for evaluating the confidence levels of the tree topology. Analysis has shown that a bootstrap value of 70% approximately corresponds to 95% statistical confidence, although the issue is still a subject of debate.
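
As a rough illustration of how a bootstrap value is the percentage of replicate trees containing a clade, the sketch below represents each tree simply as a list of clades, with each clade a frozenset of taxon names. That representation, the function name `clade_support`, and the example trees are assumptions made for this example only.

```python
from collections import Counter

def clade_support(original_clades, bootstrap_trees):
    """Percentage of bootstrap trees in which each original clade appears."""
    counts = Counter()
    for tree in bootstrap_trees:
        # Count each clade at most once per tree.
        counts.update(set(tree))
    n = len(bootstrap_trees)
    return {clade: 100.0 * counts[clade] / n for clade in original_clades}

AB = frozenset({"A", "B"})
CD = frozenset({"C", "D"})
AC = frozenset({"A", "C"})

original = [AB, CD]
# Four hypothetical bootstrap trees.
bootstrap_trees = [[AB, CD], [AB, CD], [AB], [AC]]
support = clade_support(original, bootstrap_trees)
```

Here clade {A, B} appears in three of four replicates and {C, D} in two of four, so their support values are 75% and 50%.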

5. Which of the following is incorrect about the caveats of bootstrapping?
a) Unusually high GC content in the original dataset is a potential cause of biased trees
b) Unusually accelerated evolutionary rates are a potential cause of biased trees
c) Unusually accelerated evolutionary rates are a potential cause of biased bootstrap estimates
d) A large number of bootstrap resampling steps is not needed to achieve meaningful results

Answer: d [Reason:] In addition, from a statistical point of view, a large number of bootstrap resampling steps are needed to achieve meaningful results. It is generally recommended that a phylogenetic tree should be bootstrapped 500 to 1,000 times. However, this presents a practical dilemma.

6. Which of the following statements about jackknifing is incorrect?
a) In this method one half of the sites in a dataset are randomly deleted
b) It creates datasets half as long as the original
c) Each new dataset is subjected to phylogenetic tree construction using a different method from the original
d) One criticism of this approach is that the size of datasets has been changed into one half and that the datasets are no longer considered replicates

Answer: c [Reason:] Each new dataset is subjected to phylogenetic tree construction using the same method as the original. The advantage of jackknifing is that sites are not duplicated relative to the original dataset and that computing time is much shortened because of shorter sequences.
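
A minimal sketch of the jackknife resampling step, assuming the alignment is a list of equal-length strings. The function name `jackknife_alignment` and the toy data are hypothetical.

```python
import random

def jackknife_alignment(alignment, rng):
    """Randomly delete half of the sites, keeping the rest in original order."""
    length = len(alignment[0])
    # Sample without replacement, so no site is duplicated.
    keep = sorted(rng.sample(range(length), length // 2))
    return ["".join(seq[i] for i in keep) for seq in alignment]

# Toy alignment (hypothetical data): three sequences, eight sites.
alignment = ["ACGTACGT", "ACGTTCGT", "ATGTACGA"]
rng = random.Random(1)
half = jackknife_alignment(alignment, rng)
```

Unlike the bootstrap, the sampling is without replacement, which is why the resulting dataset is half as long as the original.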

7. Which of the following is incorrect about Bayesian Simulation?
a) It does not require bootstrapping
b) It requires bootstrapping
c) The MCMC procedure itself involves thousands or millions of steps of resampling
d) Posterior probabilities are assigned at each node of a best Bayesian tree as statistical support

Answer: b [Reason:] Because of fast computational speed of MCMC tree searching, the Bayesian method offers a practical advantage over regular ML and makes the statistical evaluation of ML trees more feasible. Unlike bootstrap values, Bayesian probabilities are normally higher because most trees are sampled near a small number of optimal trees. Therefore, they have a different statistical meaning from bootstrap.

8. In phylogenetic analysis, it is also important to test whether two competing tree topologies can be distinguished and whether one tree is significantly better than the other.
a) True
b) False

Answer: a [Reason:] The task is different from bootstrapping in that it tests the statistical significance of the entire phylogeny, not just portions of it. For that purpose, several statistical tests have been developed specifically for each of the three types of tree reconstruction methods, distance, parsimony, and likelihood. A test devised specifically for MP trees is called the Kishino–Hasegawa (KH) test.

9. The KH test sets out to test the null hypothesis that the two competing tree topologies are not significantly different.
a) True
b) False

Answer: a [Reason:] A paired Student t-test is used to assess whether the null hypothesis can be rejected at a statistically significant level. In this test, the difference of branch lengths at each informative site between the two trees is calculated.
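
The paired t statistic mentioned above can be sketched as follows. This is a generic paired Student t computation, not the implementation of any particular phylogenetics program; the per-site values and the function name `paired_t_statistic` are hypothetical.

```python
import math
import statistics

def paired_t_statistic(site_values_a, site_values_b):
    """Return (t, degrees_of_freedom) for paired per-site values of two trees."""
    diffs = [a - b for a, b in zip(site_values_a, site_values_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Sample standard deviation of the per-site differences.
    sd_diff = statistics.stdev(diffs)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical per-site values for tree A and tree B at five informative sites.
tree_a = [2.0, 3.0, 1.0, 2.0, 3.0]
tree_b = [1.0, 2.0, 1.0, 1.0, 1.0]
t, df = paired_t_statistic(tree_a, tree_b)
```

The resulting t value would be compared against the Student t distribution with `df` degrees of freedom to decide whether the null hypothesis can be rejected.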

10. In the Shimodaira–Hasegawa test, the degrees of freedom used for the analysis depend on the substitution model used. It relies on the following test formula: d = 2(ln LA – ln LB) = 2 ln(LA/LB). Here, d is the log likelihood ratio score, and ln LA and ln LB are the log likelihood scores for tree A and tree B, respectively.
a) True
b) False

Answer: a [Reason:] A frequently used statistical test for ML trees is the Shimodaira–Hasegawa (SH) test (likelihood ratio test). It tests the goodness of fit of two competing trees using the χ² test. For this test, log likelihood scores of two competing trees have to be obtained first.
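
The test formula d = 2(ln LA – ln LB) can be computed directly from the two log likelihood scores. The sketch below only evaluates the statistic; the comparison against a χ² distribution is left out because the degrees of freedom depend on the substitution model. The function name and the numeric scores are hypothetical.

```python
import math

def log_likelihood_ratio(ln_l_a, ln_l_b):
    """d = 2 (ln LA - ln LB), computed from log likelihood scores."""
    return 2.0 * (ln_l_a - ln_l_b)

# Hypothetical log likelihood scores for tree A and tree B.
d = log_likelihood_ratio(-1234.5, -1240.0)

# The same statistic written as 2 ln(LA / LB), using small toy likelihoods.
d_from_ratio = 2.0 * math.log(0.02 / 0.005)
d_from_logs = log_likelihood_ratio(math.log(0.02), math.log(0.005))
```

Working with log likelihoods rather than raw likelihoods avoids numerical underflow, since real likelihood values for trees are extremely small.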

Set 4

1. Phylogenetics is the study of the evolutionary history of living organisms using treelike diagrams to represent pedigrees of these organisms.
a) True
b) False

Answer: a [Reason:] Tree branching patterns representing the evolutionary divergence are referred to as phylogeny. Phylogenetics can be studied in various ways. It is often studied using fossil records, which contain morphological information about ancestors of current species and the timeline of divergence.

2. The descriptions of morphological traits are often _____, which is due to _____
a) ambiguous, multiple genetic factors
b) lucid, more than one genetic factor
c) clear, multiple genetic factors
d) ambiguous, one or two genetic factors

Answer: a [Reason:] Thus, using fossil records to determine phylogenetic relationships can often be biased. For microorganisms, fossils are essentially nonexistent, which makes it impossible to study phylogeny with this approach.

3. Which of the following is incorrect regarding the advantages of Molecular data for phylogenetics study?
a) They are more numerous than fossil records
b) They are easier to obtain as compared to fossil records
c) Sampling bias is involved
d) More clear-cut and robust phylogenetic trees can be constructed with the molecular data

Answer: c [Reason:] There is no sampling bias involved, which helps to mend the gaps in real fossil records. Therefore, they have become favorite and sometimes the only information available for researchers to reconstruct evolutionary history. The advent of the genomic era with tremendous amounts of molecular sequence data has led to the rapid development of molecular phylogenetics.

4. To use molecular data to reconstruct evolutionary history requires making a number of reasonable assumptions. Which of the following is incorrect about it?
a) The molecular sequences used in phylogenetic construction are homologous
b) The molecular sequences used in phylogenetic construction share a common origin
c) Phylogenetic divergence cannot be bifurcating
d) Parent branch splits into two daughter branches at any given point.

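Answer: c [Reason:] Options c and d contradict each other: phylogenetic divergence is assumed to be strictly bifurcating, meaning that a parent branch splits into two daughter branches at any given point. Another assumption in phylogenetics is that each position in a sequence evolved independently. The variability among sequences is also assumed to be sufficiently informative for constructing unambiguous phylogenetic trees.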

5. Building a phylogenetic tree involves bifurcation and multifurcation.
a) True
b) False

Answer: a [Reason:] Sometimes, a branch point on a phylogenetic tree may have more than two descendants, resulting in a multifurcating node. Multifurcation is normally a result of insufficient evidence to fully resolve the tree or a result of an evolutionary process known as radiation.

6. Which of the following is incorrect regarding the terminologies of phylogenetics?
a) The connecting point where two adjacent branches join is called a node
b) Node represents an inferred ancestor of extant taxa
c) At the tips of the branches are long lost species or sequences
d) The lines in the tree are called branches

Answer: c [Reason:] At the tips of the branches are present-day species or sequences known as taxa (the singular form is taxon) or operational taxonomic units. The bifurcating point at the very bottom of the tree is the root node, which represents the common ancestor of all members of the tree.

7. Which of the following is incorrect regarding the terminologies of phylogenetics?
a) A group of taxa descended from a single common ancestor is defined as a clade or monophyletic group
b) In a monophyletic group, two taxa share a unique common ancestor shared by other taxa as well
c) Lineage is often synonymous with a tree branch leading to a defined monophyletic group
d) When a number of taxa share more than one closest common ancestor, they do not fit the definition of a clade. In this case, they are referred to as paraphyletic

Answer: b [Reason:] In a monophyletic group, two taxa share a unique common ancestor not shared by any other taxa. They are also referred to as sister taxa to each other. The branch path depicting an ancestor–descendant relationship on a tree is called a lineage.

8. Which of the following is incorrect regarding the terminologies of phylogenetics?
a) The branching pattern in a tree is called tree topology
b) When all branches bifurcate on a phylogenetic tree, it is referred to as dichotomy
c) In case of dichotomy, each ancestor divides and gives rise to multiple descendants
d) An unrooted phylogenetic tree does not assume knowledge of a common ancestor

Answer: c [Reason:] In a dichotomy, each ancestor divides and gives rise to exactly two descendants. Sometimes, a branch point on a phylogenetic tree may have more than two descendants, resulting in a multifurcating node. The phylogeny with multifurcating branches is called polytomy. A polytomy can be a result of either an ancestral taxon giving rise to more than two immediate descendants simultaneously during evolution, a process known as radiation, or an unresolved phylogeny in which the exact order of bifurcations cannot be determined precisely.

9. Because there is no indication of which node represents an ancestor, there is no direction of an evolutionary path in an unrooted tree.
a) True
b) False

Answer: a [Reason:] To define the direction of an evolution path, a tree must be rooted. In a rooted tree, all the sequences under study have a common ancestor or root node from which a unique evolutionary path leads to all other nodes.

10. Molecular clock is an assumption by which molecular sequences evolve at varying rates.
a) True
b) False

Answer: b [Reason:] Molecular clock is an assumption by which molecular sequences evolve at constant rates so that the amount of accumulated mutations is proportional to evolutionary time. Based on this hypothesis, branch lengths on a tree can be used to estimate divergence time. This assumption of uniformity of evolutionary rates, however, rarely holds true in reality.
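
To make the proportionality concrete, here is a small sketch under a strict molecular clock. It assumes that the distance is a pairwise distance between two present-day sequences, so both lineages have accumulated changes since the split (distance = 2 × rate × time). The function name and the numbers are hypothetical.

```python
def divergence_time(pairwise_distance, rate_per_site_per_year):
    """Estimate time since divergence under a strict molecular clock.

    Assumes both lineages evolve at the same constant rate, so the pairwise
    distance equals 2 * rate * time.
    """
    return pairwise_distance / (2.0 * rate_per_site_per_year)

# Hypothetical values: 0.1 substitutions per site, 1e-9 substitutions
# per site per year.
years = divergence_time(0.1, 1e-9)
```

With these toy numbers the estimate is about 50 million years. Because the constant-rate assumption rarely holds, such an estimate should be read as a rough guide only.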

Set 5

1. Analysis of MSAs for conserved blocks of sequence leads to production of the position-specific scoring matrix.
a) True
b) False

Answer: a [Reason:] The analysis of MSAs (Multiple Sequence Alignment) for conserved blocks of sequence leads to production of the position-specific scoring matrix, or PSSM. The PSSM may be used to search a sequence to obtain the most probable location or locations of the motif represented by the PSSM. Alternatively, the PSSM may be used to search an entire database to identify additional sequences that also have the same motif.
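
Searching a sequence with a PSSM amounts to sliding the matrix along the sequence and summing the column scores for each window. The sketch below uses a tiny three-column matrix over a DNA alphabet purely to keep the example short; the scores, the sequence, and the function name `best_motif_position` are made up for illustration.

```python
def best_motif_position(sequence, pssm):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pssm)
    best_pos, best_score = -1, float("-inf")
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Sum the score of each residue in its own column.
        score = sum(column[res] for column, res in zip(pssm, window))
        if score > best_score:
            best_pos, best_score = start, score
    return best_pos, best_score

# Hypothetical PSSM: one dict of residue scores per motif column.
pssm = [
    {"A": 1.2, "C": -1.0, "G": -1.0, "T": -0.5},
    {"A": -1.0, "C": 1.5, "G": -0.8, "T": -1.0},
    {"A": -0.7, "C": -1.0, "G": 1.3, "T": -1.0},
]
position, score = best_motif_position("TTACGTT", pssm)
```

In this toy case the best window is "ACG" at index 2. Searching a database would repeat the same scan over every sequence and keep the high-scoring hits.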

2. The quality and quantity of information provided by the PSSM also varies for ________ in the motif.
a) each row
b) each column
c) rows and columns
d) neither the rows nor the columns

Answer: b [Reason:] The quality and quantity of information provided by the PSSM also varies for each column in the motif, and this variation profoundly influences the matches found with sequences. This situation can be accurately described by information theory, and the results can be displayed by a colored graph called a sequence logo.
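
The per-column information that a sequence logo displays can be sketched with the usual Shannon formula: maximum possible entropy minus observed entropy, in bits. This version omits any small-sample correction, and the function name `column_information` and example columns are assumptions for illustration.

```python
import math
from collections import Counter

def column_information(column, alphabet_size):
    """Information content (bits) of one alignment column."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    # Maximum entropy is log2 of the alphabet size (4 for DNA, 20 for protein).
    return math.log2(alphabet_size) - entropy

conserved = column_information("AAAA", 4)  # fully conserved DNA column
variable = column_information("ACGT", 4)   # every base equally frequent
```

A fully conserved column carries the maximum information (2 bits for DNA), while a column with all residues equally frequent carries none.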

3. Two considerations arise in trying to tune the PSSM so that it adequately represents the training sequences. Which of the following is not their description?
a) If a given column in 20 sequences has only isoleucine, it is not very likely that a different amino acid will be found in other sequences with that motif because the residue is probably important for function
b) If a given column in 20 sequences has only isoleucine, it is very likely that a different amino acid will be found in other sequences with that motif because the residue is probably important for function
c) If the number of sequences with the found motif is large and reasonably diverse, the sequences represent a good statistical sampling of all sequences that are ever likely to be found with that same motif
d) Another column in the motif from the 20 sequences may have several amino acids, and some amino acids may not be represented at all

Answer: b [Reason:] The PSSM is constructed by a simple logarithmic transformation of a matrix giving the frequency of each amino acid in the motif. Even more variation may be expected at that position in other sequences, although the more abundant amino acids already found in that column would probably be favored.
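
One common form of the logarithmic transformation is a log-odds score, log2(frequency / background). The sketch below assumes that form with a uniform background of 1/20 per amino acid; both are assumptions for illustration, as are the function name and the example column.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_log_odds(counts, background=1.0 / 20):
    """Convert raw residue counts for one column into log-odds scores."""
    total = sum(counts.values())
    scores = {}
    for aa in AMINO_ACIDS:
        freq = counts.get(aa, 0) / total
        # An unobserved residue has zero frequency, so its score is -infinity.
        scores[aa] = math.log2(freq / background) if freq > 0 else float("-inf")
    return scores

# Column from 20 sequences that all have isoleucine at this position.
scores = column_log_odds({"I": 20})
```

The infinite penalty for unobserved residues is one reason raw frequencies are usually adjusted before the transformation.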

4. If a good sampling of sequences is _______ the number of sequences is _________ and the motif structure is ________ it should, in principle, be possible to obtain frequencies highly representative of the same motif in other sequences also.
a) available, sufficiently large, not too complex
b) unavailable, sufficiently large, not too complex
c) unavailable, sufficiently small, not too complex
d) available, sufficiently large, too complex

Answer: a [Reason:] The more abundant amino acids already found in a column would probably be favored in other sequences. Thus, if a good sampling of sequences is available, the number of sequences is sufficiently large, and the motif structure is not too complex, it should, in principle, be possible to obtain frequencies highly representative of the same motif in other sequences also (Henikoff and Henikoff 1996).

5. If the data set is _______, then unless the motif has __________ amino acids in each column, the column frequencies in the motif may not be highly representative of all other occurrences of the motif.
a) small, distinct
b) small, almost identical
c) large, almost identical
d) large, distinct

Answer: b [Reason:] The number of sequences for producing the motif may be small, highly diverse, or complex, giving rise to a second level of consideration. If the data set is small, then unless the motif has almost identical amino acids in each column, the column frequencies in the motif may not be highly representative of all other occurrences of the motif. In such cases, it is desirable to improve the estimates of the amino acid frequencies by adding extra amino acid counts, called pseudocounts, to obtain a more reasonable distribution of amino acid frequencies in the column.

6. Even if many pseudocounts are added in comparison to real sequence counts, the pseudocounts will not have any influence on the amino acid frequencies.
a) True
b) False

Answer: b [Reason:] Knowing how many counts to add is a difficult but fortunately solvable problem. On the one hand, if too many pseudocounts are added in comparison to real sequence counts, the pseudocounts will become the dominant influence in the amino acid frequencies and searches using the motif will not work. On the other hand, if there are relatively few real counts, many amino acid variations may not be present because of the small sample of sequences.
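
The trade-off can be seen with a simple pseudocount scheme in which B extra counts are spread over the residues in proportion to a background frequency: f = (count + B × background) / (N + B). This particular scheme, the uniform background, and the function name are assumptions for illustration, not the only way pseudocounts are assigned.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def adjusted_frequencies(counts, total_pseudocounts, background=1.0 / 20):
    """Residue frequencies for one column after adding pseudocounts."""
    n = sum(counts.values())
    return {
        aa: (counts.get(aa, 0) + total_pseudocounts * background)
        / (n + total_pseudocounts)
        for aa in AMINO_ACIDS
    }

column = {"I": 20}  # 20 sequences, all isoleucine at this position

few = adjusted_frequencies(column, total_pseudocounts=1)
many = adjusted_frequencies(column, total_pseudocounts=1000)
```

With one pseudocount, isoleucine keeps a frequency of about 0.95 and the other residues get a small nonzero frequency. With 1000 pseudocounts, isoleucine drops to about 0.07, close to the background of 0.05, so the real data no longer drive the motif.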

7. Which of the following is not a feature of editors and formatters?
a) provision for displaying the sequence on a color monitor with residue colors to aid in a clear visual representation of the alignment
b) recognition of the multiple sequence format that was output by the MSA (Multiple Sequence Alignment) program
c) maintenance of the alignment in a suitable format when the editing is completed
d) disallowing shading conserved residues in the alignment

Answer: d [Reason:] Another feature is provision of a suitable windows interface that allows use of the mouse to add, delete, or move sequence, followed by an updated display of the alignment. In addition, there are other types of editing that are commonly performed on MSAs (Multiple Sequence Alignments), for example, shading conserved residues in the alignment.

8. GDE (Genetic Data Environment) provides a general interface on UNIX machines for sequence analysis, sequence alignment editing, and display.
a) True
b) False

Answer: a [Reason:] It is available from several anonymous FTP sites. This interface requires communication with a host UNIX machine running the Genetics Computer Group software. Interface with MS-DOS or Macintosh is possible if the computer is equipped with the appropriate X-Windows client software.

9. MACAW is a local multiple sequence alignment program only.
a) True
b) False

Answer: b [Reason:] MACAW is both a local multiple sequence alignment program and a sequence editing tool. Given a set of sequences, the program finds ungapped blocks in the sequences and gives their statistical significance. Later versions of the program find blocks by one of three user-chosen methods.

10. Two commonly encountered multiple sequence alignment formats are the Genetics Computer Group’s MSF format and the CLUSTALW ALN format.
a) True
b) False

Answer: a [Reason:] Because these formats follow a precise outline, one may be readily converted to another by computer programs. READSEQ by D.G. Gilbert at Indiana University at Bloomington is one such program.