
# Multiple choice question for engineering

## Set 1

1. Which of the following is incorrect about the prediction of RNA secondary structure?
a) Every base is first compared to every other base by a type of analysis very similar to the dot matrix analysis
b) A row of matches in the RNA matrix indicates a succession of complementary nucleotides that can potentially form a double-stranded region
c) A row of matches in the RNA matrix indicates a failure of complementary nucleotides that can potentially form a double-stranded region
d) The sequence is listed across the top and down the side of the page, and G/C, A/U, and G/U base pairs are scored

Answer: c [Reason:] A row of matches indicates a succession, not a failure, of complementary nucleotides. The energy of each predicted structure is estimated by the nearest-neighbor rule: the negative base-stacking energies for each pair of bases in double-stranded regions are summed, and the estimated positive energies of destabilizing regions, such as loops at the end of hairpins, bulges within hairpins, internal bulges, and other unpaired regions, are added.
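
The comparison of every base with every other base can be sketched in a few lines of Python. This is only an illustration of the matrix described in the options above, assuming a simple 0/1 score for G/C, A/U, and G/U pairs; the names `PAIRS` and `pairing_matrix` are made up for this example and do not come from any particular program.

```python
# Pairs scored as complementary: Watson-Crick pairs plus the G/U wobble pair.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}

def pairing_matrix(seq):
    """Compare every base with every other base.

    Entry [i][j] is 1 when seq[i] can pair with seq[j], otherwise 0.
    The sequence labels both the rows and the columns.
    """
    return [[1 if (a, b) in PAIRS else 0 for b in seq] for a in seq]

def print_matrix(seq):
    """Print the matrix with the sequence across the top and down the side."""
    print("  " + " ".join(seq))
    for base, row in zip(seq, pairing_matrix(seq)):
        print(base + " " + " ".join("*" if cell else "." for cell in row))

print_matrix("GGGAAAUCCC")
```

Runs of `*` along a diagonal in the printed matrix correspond to successive complementary nucleotides that could form a double-stranded region.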

2. Through a single scoring matrix, evaluation of all the different possible configurations is done.
a) True
b) False

Answer: b [Reason:] To evaluate all the different possible configurations and to find the most energetically favorable, several types of scoring matrices are used. The complementary regions are evaluated by a dynamic programming algorithm to predict the most energetically stable molecule. The method is similar to the dynamic programming method used for sequence alignment.

3. The object is to find a diagonal row of matches that goes from upper left to lower right.
a) True
b) False

Answer: b [Reason:] The object is to find a diagonal row of matches that goes from upper right to lower left. In general, each matrix value is obtained by considering the minimum energy values, obtained by all previous complementary pairs, decreased by the stacking energy of any additional complementary base pairs or increased by the destabilizing energy associated with non-complementary bases.

4. The increase depends on the type and length of loop that is introduced by the non-complementary base pair, whether internal loop, bulge loop, or hairpin loop.
a) True
b) False

Answer: a [Reason:] This comparison of all possible matches and energy values is continued until all nucleotides have been compared. There is a pattern followed in comparing bases within the RNA molecule.

5. The sequence is listed down the first column of base comparisons’ table and free energy calculations’ table in the 5’→3’ orientation.
a) True
b) False

Answer: a [Reason:] The first four bases of the sequence are also listed in the first row of the tables in the 5’→3’ direction. Several complementary base pairs between the first and last four bases that could lead to secondary structure are shown in the tables.

6. A general theory for modeling strings of symbols, such as bases in DNA sequences, has been developed by linguists. There is a hierarchy of these so-called transformational grammars that deal with situations of increasing complexity.
a) True
b) False

Answer: a [Reason:] The application of these grammars to sequence analysis has been extensively discussed elsewhere. The context-free grammar is suitable for finding groups of symbols in different parts of the input sequence that thus are not in the same context.

7. ______ regions in sequences, such as those in RNA that will form secondary structures, are an example of such context-free sequences.
a) non-interlocking
b) non-Complementary
c) complementary
d) non-compatible

Answer: c [Reason:] Stochastic context-free grammars (SCFGs) introduce uncertainty into the definition of such regions, allowing them to use alternative symbols, as found in the evolution of RNA molecules.

8. The use of SCFGs in RNA secondary structure production analysis is in fact very similar to that of the covariance model, with the grammatical productions resembling the nodes in the ordered binary tree.
a) True
b) False

Answer: a [Reason:] As with hidden Markov models, the probability distribution of each production must be derived by training with known sequences. The algorithms used for training the SCFG and for aligning a sequence with the SCFG are somewhat different from those used with hidden Markov models, and the time and memory requirements are greater.

9. In a SCFG, each production of a non-terminal symbol has an associated probability for giving rise to the resulting product, and there are a set of productions, each giving a different result.
a) True
b) False

Answer: a [Reason:] For example, the production S1 →C S2 G could also be represented by 15 other base-pair combinations, and each of these has a corresponding probability. Thus, each production can be considered to be represented by a probability distribution over the possible outcomes.
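
A production of the form S1 → x S2 y can be pictured as a probability table over the 16 possible (x, y) base combinations. The sketch below estimates such a table from training counts; the counts and the pseudocount are hypothetical values chosen for illustration, and `production_distribution` is an invented helper name rather than part of any SCFG package.

```python
from itertools import product

BASES = "ACGU"

def production_distribution(pair_counts, pseudocount=1.0):
    """Turn observed (x, y) counts into probabilities for S1 -> x S2 y.

    Every one of the 16 base combinations receives a pseudocount so that
    unseen combinations keep a small non-zero probability.
    """
    outcomes = list(product(BASES, repeat=2))
    totals = {o: pair_counts.get(o, 0) + pseudocount for o in outcomes}
    norm = sum(totals.values())
    return {o: count / norm for o, count in totals.items()}

# Hypothetical counts from one column pair of a training alignment.
counts = {("C", "G"): 40, ("G", "C"): 35, ("A", "U"): 10,
          ("U", "A"): 8, ("G", "U"): 3}
dist = production_distribution(counts)

for (x, y), p in sorted(dist.items(), key=lambda item: -item[1])[:5]:
    print(f"S1 -> {x} S2 {y}   p = {p:.3f}")
```

The production S1 → C S2 G from the explanation above is then just one entry of this table, alongside the 15 alternatives.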

10. The application of SCFGs to RNA secondary structure analysis is very similar in form to the probabilistic covariance models.
a) True
b) False

Answer: a [Reason:] For RNA, the symbols of the alphabet are A, C, G, and U. The context-free grammar establishes a set of rules called productions for generating the sequence from the alphabet, in this case an RNA molecule with sections that can base-pair and others that cannot base-pair.

## Set 2

1. More than _____ of the human genome consists of interspersed repetitive sequences derived from TEs (transposable elements).
a) one-third
b) one-eighth
c) one-fifth
d) half

Answer: a [Reason:] The presence of these elements may be demonstrated using programs that detect low-complexity regions in sequences. For example, about 15% of the genome of the fruit fly Drosophila is made up of transposable elements.

2. The retroposons include short _________ interspersed nuclear elements (SINES)
a) 90–4000 bp long
b) 80–500 Mbp long
c) 80–300 bp long
d) 100–3000 bp long

Answer: c [Reason:] There are also long (6–8 kbp) interspersed nuclear elements (LINES). Different types of transposable elements are present in high copy numbers in mammalian genomes, in varying proportions.

3. _____ of the human genome comprises one particular family of the SINE element, designated Alu (1.2 million copies).
a) 10%
b) 20%
c) 60%
d) 40%

Answer: a [Reason:] Ten percent of the human genome comprises one particular family of the SINE element, Alu, and 14.6% comprises one particular LINE, designated LINE1 (593,000 copies).

4. Vertebrate chromosomes have long (>300 kb) regions of distinct GC richness, repeat content, and gene density, designated isochores in a model of genome organization proposing that genomes are made up of distinct segments of unique composition.
a) True
b) False

Answer: a [Reason:] Human and mouse chromosomal regions that have a high density of genes are GC-rich and have more Alu or B1/B2 (SINES) than LINE1 elements, whereas the reverse is true for regions that have a low gene density, which are more AT-rich.

5. The human genome contains about _____ of class II of elements that probably predate human evolution (Smit 1996).
a) 2,000 copies
b) 200,000 copies
c) 20,000,000 copies
d) 2,000,000 copies

Answer: b [Reason:] The class of TEs, class II, is made up of elements that employ a DNA-based mechanism of transposition. Class II elements also include the Activation-Dissociation (Ac-Ds) family in maize and the P element in Drosophila.

6. A third category of TEs has features of both class I and class II TEs. These miniature, inverted repeat TEs (MITES) are ____ in length.
a) 400 bp
b) 500 Mbp
c) 300 kbp
d) 600 kbp

Answer: a [Reason:] They were discovered in diverse flowering plants where they are frequently associated with regulatory regions of genes. Hence, they could be exerting an influence on regulation of gene expression.

7. Which of the given features is incorrect?
a) TEs are present in few particular chromosomes
b) TEs are present in all of the chromosomes
c) Abundance of TEs varies
d) TEs can comprise a large portion of the genomes of higher eukaryotes, both plants and animals

Answer: a [Reason:] TEs are present in all of the chromosomes, ranging from bacteria to humans, but their abundance varies. They can comprise a large portion of the genomes of higher eukaryotes, thus, only a small fraction of the genome of these organisms carries gene sequences.

8. Eukaryotic genes that encode proteins are interrupted by ________
a) exons of varying length and number
b) introns of varying length and number
c) exons of varying length but the same number
d) introns of varying number but same length

Answer: b [Reason:] In S. cerevisiae (budding yeast), only a small fraction of the genes contain introns, and there are a total of 239 introns in the entire genome. In contrast, in individual human genes, introns may be present in numbers exceeding 100 and comprise more than 95% of the gene.

9. Introns can remain at a corresponding position in a eukaryotic gene for long periods of evolutionary time.
a) True
b) False

Answer: a [Reason:] The origin of introns in eukaryotic genes is not understood but has been accounted for by two models. The “introns-early” view proposes that introns were used to assemble the first genes from sets of ancient conserved exons, whereas the “introns-late” view proposes that introns broke up previously continuous genes by inserting into them.

10. The intron structure of genes in a particular eukaryote is used for predicting the location of genes in genome sequences.
a) True
b) False

Answer: a [Reason:] Other features of eukaryotic genes in a particular organism that are useful for gene prediction include the consensus sequences at exon–intron and intron–exon splice junctions, base composition, codon usage, and preference for neighboring codons. Computational methods incorporate this information into a gene model that may be used to predict the presence of genes in a genome sequence.
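
As a rough illustration of how a splice-junction signal can enter a gene model, the sketch below lists candidate introns that start with the common GT dinucleotide and end with AG. It is a deliberately naive scan under stated assumptions: real gene finders score much longer, weighted consensus sequences and combine them with codon usage and base composition, and the function name and `min_len` default are invented for this example.

```python
def candidate_introns(dna, min_len=20):
    """Return (start, end) spans that begin with GT and end with AG.

    Spans use Python slice coordinates, so dna[start:end] is the candidate
    intron. Only spans of at least min_len bases are reported.
    """
    dna = dna.upper()
    donors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "AG"]
    return [(d, a + 2) for d in donors for a in acceptors if a + 2 - d >= min_len]

toy = "AAGTCCCCCCAGTT"
for start, end in candidate_introns(toy, min_len=8):
    print(start, end, toy[start:end])
```

Because GT and AG occur very often by chance, such a scan produces many false candidates on real sequences, which is why the other features listed above are needed.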

## Set 3

1. In dot matrix in ab initio methods, the diagonals _____ to the main diagonal represent regions that can self hybridize.
a) parallel
b) cutting in random fashion
c) perpendicular

Answer: c [Reason:] The diagonals perpendicular to the main diagonal represent regions that can self hybridize to form double-stranded structure with traditional A–U and G–C base pairs. In reality, the pattern detection in a dot matrix is often obscured by high noise levels.

2. In dot matrix in ab initio methods, one way to reduce the noise in the matrix is to select an appropriate window size of a maximum number of contiguous base matches.
a) True
b) False

Answer: b [Reason:] One way to reduce the noise in the matrix is to select an appropriate window size of a minimum, not maximum, number of contiguous base matches. Normally, a window size of four consecutive base matches is used. If the dot plot reveals more than one feasible structure, the lowest-energy one is chosen.
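
The window filter can be sketched as follows: a dot is kept only when it starts a run of at least `window` consecutive complementary pairs along a diagonal perpendicular to the main diagonal. The function name, the `min_loop` requirement of three unpaired bases inside a hairpin, and the test sequence are assumptions made for this example.

```python
# Watson-Crick pairs plus the G/U wobble pair.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}

def find_stems(seq, window=4, min_loop=3):
    """Return (i, j) positions that open a stem of `window` stacked pairs.

    Position i + k must pair with position j - k for k = 0 .. window - 1,
    and at least `min_loop` unpaired bases must remain between the strands.
    """
    n = len(seq)
    stems = []
    for i in range(n):
        # Smallest j that leaves room for both strands and the loop.
        for j in range(i + 2 * window - 1 + min_loop, n):
            if all((seq[i + k], seq[j - k]) in PAIRS for k in range(window)):
                stems.append((i, j))
    return stems

print(find_stems("GGGGAAAACCCC"))  # [(0, 11)]
```

Lowering `window` lets more isolated, probably spurious matches through, while raising it keeps only longer candidate stems.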

3. In dynamic programming, in ab initio methods, the use of a dot plot can be effective in finding a ____ secondary structure in a ____ molecule.
a) multiple, large
b) single, large
c) single, small
d) multiple, small

Answer: c [Reason:] In most cases, the use of a dot plot can be effective in finding a single secondary structure in a small molecule. However, if a large molecule contains multiple secondary structure segments, choosing a combination that is energetically most stable among a large number of possibilities can be a daunting task.

4. In ab initio methods, a quantitative approach such as dynamic programming can be used to assemble a final structure with optimal base-paired regions.
a) True
b) False

Answer: a [Reason:] In this approach, an RNA sequence is compared with itself. A scoring scheme is applied to fill the matrix with match scores based on Watson–Crick base complementarity.
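
A minimal sketch of this idea is the classic Nussinov-style recursion, which compares the sequence with itself and maximizes the number of base pairs. It is a simplification: it counts pairs instead of summing energy terms, and it is not the algorithm of any specific program named on this page. The `min_loop` value of three unpaired bases is an assumption.

```python
# Watson-Crick pairs plus the G/U wobble pair.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}

def max_base_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in seq.

    dp[i][j] holds the best pair count for the subsequence seq[i..j].
    A pair (k, j) is allowed only if more than min_loop positions separate
    k and j, so every hairpin loop keeps at least min_loop unpaired bases.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]

print(max_base_pairs("GGGGAAAACCCC"))  # 4
```

Replacing the `+ 1` with stacking and loop energy terms is the step that turns this pair-counting scheme into an energy-based one.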

5. In dynamic programming in ab initio methods, ____ base pairing and energy terms of the base pairing _____ often incorporated into the scoring process.
a) G-C, are
b) G–U, are not
c) G–U, are also
d) G-C, are not

Answer: c [Reason:] Although the traditional structure comprises A–U and G–C base pairs, G–U base pairing and the energy terms of the base pairing are also incorporated into the scoring process. A path with the maximal score within a scoring matrix, after taking into account the entire sequence information, represents the most probable secondary structure form.

6. The dynamic programming method produces ____ structure with _____ score.
a) one, single best
b) multiple, single best
c) multiple, multiple
d) single, multiple

Answer: a [Reason:] Producing one structure with a single best score is potentially a drawback of this approach. In reality, an RNA may exist in multiple alternative forms with near-minimum energy, which are not necessarily the ones with the maximum number of base pairs.

7. The problem of dynamic programming to select one single structure can be complemented by adding a probability distribution function, known as the _________ which calculates a mathematical distribution of probable base pairs in a thermodynamic equilibrium.
a) partition function
b) division function
c) increment function
d) fold function

Answer: a [Reason:] This function helps to select a number of suboptimal structures within a certain energy range. Mfold and RNAfold are two well-known programs that use the ab initio prediction method.
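
The role of the partition function can be illustrated with Boltzmann weights over a handful of candidate structures. This sketch assumes the free energies are already known and uses made-up values; folding programs compute the partition function over all possible structures with dynamic programming rather than over an explicit list like this one.

```python
import math

R = 0.0019872  # gas constant in kcal/(mol*K)

def boltzmann_probabilities(energies, temperature=310.15):
    """Map each structure's free energy (kcal/mol) to its equilibrium probability.

    The partition function Z is the sum of exp(-E / RT) over all structures;
    each structure's probability is its own weight divided by Z.
    """
    rt = R * temperature
    e_min = min(energies.values())  # shift energies for numerical stability
    weights = {name: math.exp(-(e - e_min) / rt) for name, e in energies.items()}
    z = sum(weights.values())
    return {name: w / z for name, w in weights.items()}

# Hypothetical free energies for three alternative foldings.
structures = {"hairpin A": -5.2, "hairpin B": -4.8, "open chain": 0.0}
for name, p in boltzmann_probabilities(structures).items():
    print(f"{name}: {p:.3f}")
```

Structures whose energies lie close to the minimum keep a substantial share of the probability, which is why suboptimal foldings are worth reporting.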

8. Which of the following is correct about MFOLD?
a) It uses dynamic programming only
b) It uses thermodynamic calculations only
c) It uses both dynamic programming and thermodynamic calculations
d) It doesn’t take into account the stability of the secondary structures

Answer: c [Reason:] It is a web-based program for RNA secondary structure prediction. It combines dynamic programming and thermodynamic calculations to identify the most stable secondary structures with the lowest energy. It also produces dot plots coupled with energy terms. This method is reliable for short sequences, but becomes less accurate as the sequence length increases.

9. Like Mfold, RNAfold only examines the energy terms of the optimal alignment in a dot plot.
a) True
b) False

Answer: b [Reason:] RNAfold is one of the web programs in the Vienna package. Unlike Mfold, which only examines the energy terms of the optimal alignment in a dot plot, RNAfold extends the sequence alignment to the vicinity of the optimal diagonals to calculate the thermodynamic stability of alternative structures.

10. Which of the following about RNAfold is incorrect?
a) It extends the sequence alignment to the vicinity of the optimal diagonals to calculate thermodynamic stability of alternative structures
b) It incorporates a partition function
c) It doesn’t necessarily use a partition function
d) It aims to select a number of statistically most probable structures in one of its steps

Answer: c [Reason:] Based on both thermodynamic calculations and the partition function, a number of alternative structures that may be suboptimal are provided. The collection of the predicted structures may provide a better estimate of plausible foldings of an RNA molecule than the predictions by Mfold.

## Set 4

1. Which of the following is untrue about homology modeling?
a) Homology modeling predicts protein structures based on sequence homology with known structures
b) It is also known as comparative modeling
c) The principle behind it is that if two proteins share a high enough sequence similarity, they are likely to have very similar three-dimensional structures
d) It doesn’t involve the evolutionary distances anywhere

Answer: d [Reason:] As the name suggests, homology modeling predicts protein structures based on sequence homology with known structures. Homology modeling produces an all-atom model based on alignment with template proteins.

2. Which of the following is untrue about template Selection Step?
a) The first step in protein structural modeling is to select appropriate structural templates
b) This forms the foundation for rest of the modeling process
c) There is no use of heuristic alignment search programs
d) The template selection involves searching the Protein Data Bank (PDB) for homologous proteins with determined structures

Answer: c [Reason:] The search can be performed using a heuristic pairwise alignment search program such as BLAST or FASTA. However, the use of dynamic programming–based search programs such as SSEARCH or ScanPS can give more sensitive search results. The relatively small size of the structural database means that the search time using the exhaustive method is still within reasonable limits, while giving a more sensitive result to ensure the best possible similarity hits.

3. Which of the following is untrue about Sequence Alignment Step?
a) Once the structure with the highest sequence similarity is identified as a template, the full-length sequences of the template and target proteins need to be realigned using refined alignment algorithms to obtain optimal alignment
b) The realignment is the most critical step in homology modeling
c) The realignment directly affects the quality of the final model
d) Errors made in the alignment step can be corrected in the following modeling steps

Answer: d [Reason:] Incorrect alignment at this stage leads to incorrect designation of homologous residues and therefore to incorrect structural models. Errors made in the alignment step cannot be corrected in the following modeling steps. Therefore, the best possible multiple alignment algorithms, such as Praline and T-Coffee, should be used for this purpose.

4. Which of the following is untrue about Backbone Model Building Step?
a) Once optimal alignment is achieved, residues in the aligned regions of the target protein can assume a similar structure as the template proteins
b) Coordinates of the corresponding residues of the template proteins can be simply copied onto the target protein
c) If the two residues differ, everything other than the backbone atoms can be copied
d) If the two aligned residues are identical, coordinates of the side chain atoms are copied along with the main chain atoms

Answer: c [Reason:] Options a and b describe the same idea. If the two residues differ, only the backbone atoms can be copied. The side chain atoms are rebuilt in a subsequent procedure. In backbone modeling, it is simplest to use only one template structure. The structure with the best quality and highest resolution is normally chosen if multiple options are available.

5. Which of the following is untrue about Loop Modeling Step?
a) In the sequence alignment for modeling, there are often regions caused by insertions and deletions producing gaps in sequence alignment
b) In the sequence alignment for modeling, there are no regions producing gaps in sequence alignment
c) The gaps cannot be directly modeled
d) Loop modeling is required for closing the gaps

Answer: b [Reason:] Closing the gaps requires loop modeling, which is a very difficult problem in homology modeling and is also a major source of error. Loop modeling can be considered a mini–protein modeling problem by itself. Unfortunately, there are no mature methods available that can model loops reliably. Currently, there are two main techniques used to approach the problem: the database searching method and the ab initio method.

6. The procedure begins by measuring the orientation and distance of the anchor regions in the stems and searching PDB for segments of the same length that also match the above endpoint conformation.
a) True
b) False

Answer: a [Reason:] Usually, many different alternative segments that fit the endpoints of the stems are available. The best loop can be selected based on sequence similarity as well as minimal steric clashes with the neighboring parts of the structure. The conformation of the best matching fragments is then copied onto the anchoring points of the stems.

7. Which of the following is untrue about specialized programs for loop modeling?
a) PETRA is a web server that models loops using the database approach
b) FREAD is a web server that models loops using the database approach
c) CODA is a web server that uses a consensus method based on the prediction results from FREAD and PETRA
d) For loops of three to eight residues, CODA uses consensus conformation of both methods

Answer: a [Reason:] PETRA is a web server that uses the ab initio method to model loops. For nine to thirty residues, CODA uses FREAD prediction only.

8. In the Side Chain Refinement step, a side chain can be built by searching every possible conformation at every torsion angle of the side chain to select the one that has the lowest interaction energy with neighboring atoms.
a) True
b) False

Answer: a [Reason:] However, this approach is computationally prohibitive in most cases. In fact, most current side chain prediction programs use the concept of rotamers, which are favored side chain torsion angles extracted from known protein crystal structures. In prediction of side chain conformation, only the possible rotamers with the lowest interaction energy with nearby atoms are selected.

9. In the step of Model Refinement Using Energy Function, the structural irregularities can be corrected by applying the energy minimization procedure on the entire model, which moves the atoms in such a way that the overall conformation has the lowest energy potential.
a) True
b) False

Answer: a [Reason:] In the loop modeling and side chain modeling steps, potential energy calculations are applied to improve the model. However, this does not guarantee that the entire raw homology model is free of structural irregularities such as unfavorable bond angles, bond lengths, or close atomic contacts. Therefore, this step is used. The goal of energy minimization is to relieve steric collisions and strains without significantly altering the overall structure.

10. Energy minimization has to be used with caution because excessive energy minimization often moves residues away from their correct positions.
a) True
b) False

Answer: a [Reason:] Only limited energy minimization is recommended (a few hundred iterations) to remove major errors, such as short bond distances and close atomic clashes. Key conserved residues and those involved in cofactor binding have to be restrained if necessary during the process.
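
The idea of a capped number of minimization steps can be shown on a toy problem: steepest descent on a single harmonic bond. The ideal length, force constant, and step size below are hypothetical, and a real refinement works on a full force field with many coupled terms, so this is only a sketch of the iteration limit, not of a modeling package.

```python
def minimize_bond(length, ideal=1.53, k=300.0, step=0.001, max_iter=200):
    """Relax one bond length by steepest descent on E = 0.5 * k * (x - ideal)**2.

    The loop stops after max_iter steps, mirroring the advice to run only
    a limited number of iterations.
    """
    for _ in range(max_iter):
        gradient = k * (length - ideal)  # dE/dx of the harmonic term
        length -= step * gradient        # move downhill along the gradient
    return length

# A bond that starts too short is pulled back toward the ideal length.
print(round(minimize_bond(1.20), 4))
```

With these values each step removes 30% of the remaining deviation, so the major error disappears within the first few dozen iterations and further steps change little.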

11. Which of the following is untrue about Ab initio prediction?
a) The limited knowledge of protein folding forms the basis of ab initio prediction
b) The ab initio prediction method attempts to produce all-atom protein models based on sequence information alone without the aid of known protein structures
c) The ab initio prediction method attempts to produce all-atom protein models based on sequence information alone with some aid of known protein structures
d) The perceived advantage of this method is that predictions are not restricted by known folds and that novel protein folds can be identified

Answer: c [Reason:] Alongside the advantages, because the physicochemical laws governing protein folding are not yet well understood, the energy functions used in the ab initio prediction are, at present, rather inaccurate. The folding problem remains one of the greatest challenges in bioinformatics today.

12. The prediction programs are thus designed using the energy minimization principle.
a) True
b) False

Answer: a [Reason:] Current ab initio algorithms are not yet able to accurately simulate the protein folding process. They work by using some type of heuristics. Because the native state of a protein structure is near energy minimum, the prediction programs are thus designed using the energy minimization principle.

13. Searching for a fold with the absolute minimum energy may not be valid in reality.
a) True
b) False

Answer: a [Reason:] These algorithms search for every possible conformation to find the one with the lowest global energy. However, searching for a fold with the absolute minimum energy may not be valid in reality. This contributes to one of the fundamental flaws of this approach. In addition, searching for all possible structural conformations is not yet computationally feasible.

14. Rosetta is a web server that predicts protein three-dimensional conformations using the ab initio method.
a) True
b) False

Answer: a [Reason:] Rosetta in fact relies on a “mini-threading” method. The method first breaks down the query sequence into many very short segments (three to nine residues) and predicts the secondary structure of the small segments using a hidden Markov model–based program, HMMSTR.

15. In Rosetta, the segments with assigned _______ structures are subsequently assembled into a ______ dimensional configuration.
a) primary, three
b) secondary, three
c) secondary, two
d) primary, three

Answer: b [Reason:] Through random combinations of the fragments, a large number of models are built and their overall energy potentials calculated. The conformation with the lowest global free energy is chosen as the best model.

## Set 5

1. Which of the following is untrue about FGENES?
a) It stands for FindGenes
b) It is a web-based program that uses LDA
c) It is used to determine whether a signal is an exon
d) It does not make a use of HMMs

Answer: d [Reason:] In addition to FGENES, there are many variants of the program. Some programs, such as FGENESH, make use of HMMs. There are others, such as FGENESH C, that are similarity based. Some programs, such as FGENESH+, combine both ab initio and similarity-based approaches.

2. GENSCAN is a web-based program that makes predictions based on fifth-order HMMs.
a) True
b) False

Answer: a [Reason:] It combines hexamer frequencies with coding signals (initiation codons, TATA box, cap site, poly-A, etc.) in prediction. Putative exons are assigned a probability score (P) of being a true exon. Only predictions with P > 0.5 are deemed reliable. This program is trained for sequences from vertebrates, Arabidopsis, and maize. It has been used extensively in annotating the human genome.

3. Which of the following is wrong about HMM GENE?
a) It is also an HMM-based web program
b) It uses a criterion called the conditional maximum likelihood to discriminate coding from non-coding features
c) HMM prediction is unbiased toward the locked region
d) If a sequence already has a sub-region identified as coding region, which may be based on similarity with cDNAs or proteins in a database, these regions are locked as coding regions

Answer: c [Reason:] An HMM prediction is subsequently made with a bias toward the locked region and is extended from the locked region to predict the rest of the gene coding regions and even neighboring genes. The program is in a way a hybrid algorithm that uses both ab initio-based and homology-based criteria.

4. Which of the following is untrue about Homology-Based Programs?
a) They are based on the fact that exon structures and exon sequences of related species are less conserved
b) This approach assumes that the database sequences are correct
c) It is a reasonable assumption in light of the fact that many homologous sequences to be compared with are derived from cDNA or expressed sequence tags (ESTs) of the same species
d) Potential coding frames in a query sequence are translated and used to align with closest protein homologs found in databases

Answer: a [Reason:] Homology-based programs are based on the fact that exon structures and exon sequences of related species are highly conserved. When potential coding frames in a query sequence are translated and used to align with closest protein homologs found in databases, near perfectly matched regions can be used to reveal the exon boundaries in the query.

5. The drawback of Homology-based approach is its reliance on the presence of homologs in databases.
a) True
b) False

Answer: a [Reason:] If the homologs are not available in the database, the method cannot be used. Novel genes in a new species cannot be discovered without matches in the database. A number of publicly available programs use this approach.

6. GenomeScan is a web-based server that combines GENSCAN prediction results with BLASTX similarity searches.
a) True
b) False

Answer: a [Reason:] The user provides genomic DNA and protein sequences from related species. The genomic DNA is translated in all six frames to cover all possible exons. The translated exons are then used to compare with the user-supplied protein sequences.
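
Six-frame translation itself is straightforward to sketch. The code below uses the standard genetic code and translates three reading frames on each strand; it illustrates the general technique only and is not GenomeScan's implementation. The frame labels (`+1` to `-3`) and the `X` placeholder for unrecognized codons are conventions chosen for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frame_translation(dna):
    """Translate three frames on the forward strand and three on the reverse."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

for name, protein in six_frame_translation("ATGGCCTAA").items():
    print(name, protein)
```

Each translated frame could then be compared with the user-supplied protein sequences, which is the similarity step described above.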

7. Which of the following is untrue about EST2Genome?
a) It is a web-based program purely based on the sequence alignment approach to define intron–exon boundaries
b) It compares an EST (or cDNA) sequence with a genomic DNA sequence containing the corresponding gene
c) The alignment is rarely done using a dynamic programming–based algorithm
d) Advantage of the approach is the ability to find very small exons and alternatively spliced exons that are very difficult to predict by any ab initio–type algorithms

Answer: c [Reason:] The alignment is done using a dynamic programming–based algorithm. Another advantage is that there is no need for model training, which provides much more flexibility for gene prediction. The limitation is that EST or cDNA sequences often contain errors or even introns if the transcripts are not completely spliced before reverse transcription.

8. Which of the following is untrue about SGP-1?
a) The program translates all potential exons in each sequence and does pairwise alignment for the translated protein sequences using a dynamic programming approach
b) The near-perfect matches at the protein level define coding regions
c) It is a similarity-based web program that aligns two genomic DNA sequences from distantly related organisms
d) It stands for Syntenic Gene Prediction

Answer: c [Reason:] It aligns two genomic DNA sequences from closely related organisms. Similar to EST2Genome, there is no training needed. The limitation is the need for two homologous sequences having similar genes with similar exon structures; if this condition is not met, a gene escapes detection from one sequence when there is no counterpart in another sequence.

9. TwinScan is also a similarity-based gene-finding Server and it is similar to GenomeScan in that it uses GenScan to predict all possible exons from the genomic sequence.
a) True
b) False