
Multiple choice question for engineering

Set 1

1. ______ molecules can simply be identified based on their sequence similarity with already-known sequences.
a) Larger, less conserved
b) Larger, highly conserved
c) Smaller, highly conserved
d) Shorter, highly conserved

View Answer

Answer: b [Reason:] For smaller sequences with more sequence variation, this method does not work. A number of methods for finding small RNA genes have been described and are available on the Web. A major problem with these methods in searches of large genomes is that a small false positive rate becomes quite unacceptable because there are so many false positives to check out.

2. One of the first methods used to find tRNA genes was to search for sequences that are complementary and can fold into a knot like the three found in tRNAs.
a) True
b) False

View Answer

Answer: b [Reason:] One of the first methods used to find tRNA genes was to search for sequences that are self-complementary and can fold into a hairpin like the three found in tRNAs (Staden 1980). These regions of self-complementarity made it possible to locate tRNA genes for the first time.
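
The idea of searching for self-complementary regions can be illustrated with a minimal sketch. It is not the Staden (1980) method itself; it only looks for a short stem whose two arms are exact Watson-Crick reverse complements separated by a loop. The function names and the `stem`, `min_loop` and `max_loop` parameters are assumptions chosen for illustration.

```python
# Toy hairpin search: only exact Watson-Crick stems, no G-U wobble pairs.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_hairpins(rna, stem=4, min_loop=3, max_loop=8):
    """Return (start, loop_length) for every candidate stem-loop."""
    hits = []
    for i in range(len(rna) - 2 * stem - min_loop + 1):
        left = rna[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = rna[j:j + stem]
            if len(right) < stem:
                break  # the right arm would run off the sequence
            if right == reverse_complement(left):
                hits.append((i, loop))
    return hits


print(find_hairpins("GGGCAAAAGCCC"))  # [(0, 4)]
```

A real tRNA search would require three such hairpins in the expected arrangement, which this sketch does not check.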

3. Fichant and Burks (1991) described a program, tRNAscan, that searches a genomic sequence with a sliding window searching simultaneously for matches to a set of invariant bases and conserved self-complementary regions in tRNAs with an accuracy of 97.5%.
a) True
b) False

View Answer

Answer: a [Reason:] A method for finding the RNA polymerase III transcriptional control regions of tRNA genes, using a scoring matrix derived from known control regions, was also developed, and it is likewise very accurate. Finally, Lowe and Eddy (1997) devised a search algorithm, tRNAscan-SE, that uses a combination of three methods to find tRNA genes in genomic sequences: tRNAscan, the Pavesi algorithm, and the COVELS program based on sequence covariance analysis (Eddy and Durbin 1994). This method is reportedly 99–100% accurate with an extremely low rate of false positives.

4. The probabilistic model was used to identify small nucleolar (sno) RNAs in the yeast genome that methylate ribosomal RNA.
a) True
b) False

View Answer

Answer: a [Reason:] The model is not used to search genomic sequences directly. Instead, a list of candidate sequences is first found by searching for patterns that match the sequences in the model (Lowe and Eddy 1999).

5. The probability model mentioned above was a hybrid combination of HMMs and SCFGs trained on sno RNAs.
a) True
b) False

View Answer

Answer: a [Reason:] These RNAs vary sufficiently in sequence and structure that they are not found by straightforward similarity searches. The RNAs found were shown to be sno RNAs by insertional mutagenesis.

6. Which of the following is untrue regarding the RNAstructure program?
a) RNAstructure 4.6 is a Windows implementation of the Zuker algorithm
b) It includes additional options for other folding algorithms and incorporation of experimental data
c) The authors of RNAstructure collaborate very closely with the Turner laboratory and keep the most up-to-date thermodynamic parameters
d) The OligoWalk program cannot be used for siRNA design

View Answer

Answer: d [Reason:] Experimental data can be incorporated into the RNA folding in two unique ways: with Dynalign and with chemical modification. The Dynalign program computes the lowest free-energy sequence alignment and secondary structure common to two RNA sequences.

7. Which of the following is untrue about Vienna RNA Websuite?
a) It introduced the Wuchty algorithm, developed applications of the McCaskill algorithm
b) It also offers a wide variety of algorithms and functions
c) The Wuchty algorithm generates a small but complete set of suboptimal structures
d) The Wuchty algorithm computes some possible tertiary structures within a narrow free-energy range

View Answer

Answer: d [Reason:] The Wuchty algorithm computes all possible secondary structures within a narrow free-energy range. The Wuchty algorithm generates a small but complete set of suboptimal structures that may include some very different secondary structures but also very many highly similar structures. However, structures containing more than one suboptimal region may occur in the Wuchty set of structures but would be absent if the Zuker method for sampling suboptimal structures were used.

8. Which of the following is untrue about the Sfold algorithm?
a) It uses a unique algorithm to aid in the design of siRNA
b) The algorithm combines thermodynamic stabilities, calculations of target accessibility, and empirical rules
c) The website offers specialized programs for the design of siRNA, antisense RNA, trans-cleaving RNA, and mRNA-microRNA interactions
d) The website doesn’t offer a general program for statistically sampling suboptimal RNA structures

View Answer

Answer: d [Reason:] The algorithm uses a partition function calculation and then groups suboptimal structures by similarity. The centroid structure is the most-representative structure that is closest in similarity to all the other structures.

9. If the centroid structure is different from the minimum free-energy structure, the centroid structure is often closer to the phylogenetic prediction and contains fewer base pairs, or fewer false-positive base pair predictions, than the minimum free-energy prediction.
a) True
b) False

View Answer

Answer: a [Reason:] The point is to show a structure that represents a group of structures rather than a single predicted structure. Many long RNA sequences, such as viral genomes or mRNA, may not have a single structure but instead have a dynamic structure that has some conserved features but also varies and changes, and these many conformations may all exist simultaneously in the cell.

10. The ILM program uses an iterative loop matching algorithm to maximize base pairs and allows pseudoknots to form by allowing base pairs to be added or removed in successive rounds.
a) True
b) False

View Answer

Answer: a [Reason:] The Nussinov algorithm, or maximum loop matching algorithm, is the basic framework for generating a structure with the most possible base pairs. The base pairs are ranked using both thermodynamic parameters and covariation data for aligned sequences. ILM requires the RnaViz program to visualize the RNA secondary structure with pseudoknots.
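
The basic Nussinov recursion mentioned above can be sketched as follows. This is only the maximum-base-pair framework, not the ILM program: it uses no thermodynamic parameters or covariation data and cannot produce pseudoknots. The function name, the `min_loop` parameter and the decision to allow G-U wobble pairs are assumptions made for illustration.

```python
# Pairs counted by this sketch (Watson-Crick plus G-U wobble, an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_max_pairs(rna, min_loop=3):
    """Return the maximum number of nested base pairs in rna."""
    n = len(rna)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence rna[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving at least min_loop
            # unpaired bases between them
            for k in range(i, j - min_loop):
                if (rna[k], rna[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


print(nussinov_max_pairs("GGGAAACCC"))  # 3
```

Recovering the actual structure would need a traceback through `dp`, which is omitted here to keep the sketch short.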

Set 2

1. In a similar way to structure prediction methods, models can be evaluated using RMSD to measure the similarity between two molecular complexes.
a) True
b) False

View Answer

Answer: a [Reason:] This is only the case if an experimental structure already exists. Several docking methods have been evaluated at ‘Critical Assessment of Structure Prediction 2’ (CASP2) for both protein-ligand and protein-protein interactions.
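
As a rough illustration of the RMSD measure, the sketch below compares a predicted and an experimental set of atomic coordinates. It assumes the two structures are already superimposed and that atoms appear in the same order in both lists; real evaluations normally perform an optimal superposition first. The function name and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates: the model is shifted by 1 unit along z.
experimental = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(experimental, model))  # 1.0
```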

2. In case of protein-protein docking, the level of success is dependent on the system under study.
a) True
b) False

View Answer

Answer: a [Reason:] For the protein-protein docking evaluation several of the same docking methodologies were used in both docking challenges. In the first challenge involving a protein inhibitor all the groups successfully predicted the complex.

3. There was little degree of success in docking an antibody-antigen complex in the second challenge for the protein-protein docking evaluation.
a) True
b) False

View Answer

Answer: a [Reason:] This may be because modeling molecular recognition is in general more difficult in antibody-antigen than protein-inhibitor systems. In the protein-ligand docking challenge at CASP2 there was generally a good level of success, however, again certain targets proved to be problematic.

4. The ______ is an ongoing community-wide experiment on the comparative evaluation of protein-protein docking for structure prediction.
a) Chronological Assignment of Prediction of Interactions (CAPRI)
b) Chronological Assessment of Prediction of Interactions (CAPRI)
c) Critical Assignment of Prediction of Interactions (CAPRI)
d) Critical Assessment of Prediction of Interactions (CAPRI)

View Answer

Answer: d [Reason:] Clearly there is an ongoing need for assessment of methods. There is currently no one universal method or scoring function that will work on all occasions.

5. An understanding of the importance of different factors in a particular interaction is important if confidence in the results is required. Which of the following is not one of those factors?
a) Shape
b) Hydrophobicity
c) Electrostatics
d) pH

View Answer

Answer: d [Reason:] This does not mean that pH is unrelated to the interaction; it is quite distinctly related to the factors mentioned. However, pH depends on the state of the molecule at a given instant. The other factors on the list are evolutionary relationships and conformational flexibility.

6. Information about shape, hydrophobicity, electrostatics, evolutionary relationships and conformational flexibility is not always available and the search for a universally applicable scoring function as well as an adequate treatment of conformational flexibility is ongoing.
a) True
b) False

View Answer

Answer: a [Reason:] The technique of virtual screening of small molecule drugs by computer has become commonplace and a very useful tool in narrowing down the very large number of drugs that might need to be screened by experimental methods. In one study virtual screening provided lead compounds where conventional experimental random screening had failed.

7. In practical circumstances there exists an experimental structure of the complex.
a) True
b) False

View Answer

Answer: b [Reason:] In practical circumstances an experimental structure of the complex does not already exist, so some other evaluation criterion is needed. By necessity, this is experimental validation. The hypothetical protein-biomolecular complex predicts a mode of interaction between the two molecules that can be tested experimentally.

8. Visualization methods are very important in viewing molecular properties on molecules. Of particular note is the rendering of molecular surfaces according to their various properties (that can be expressed numerically).
a) True
b) False

View Answer

Answer: a [Reason:] This approach has been popularized by the GRASP program. Also Virtual Reality Modeling Language (VRML) viewers can provide similar displays.

9. The popular program RasMol can be made to view molecular properties by assigning those properties to the temperature factor column of the sdf file in question only.
a) True
b) False

View Answer

Answer: b [Reason:] The popular program RasMol can also be made to view molecular properties by assigning those properties to the temperature factor column of the PDB file in question. The molecules are then best viewed in Spacefill mode, however, the color scheme used is not flexible.

10. GRASP and VRML have been incorporated into the GRASS server.
a) True
b) False

View Answer

Answer: a [Reason:] This allows a Web-based interactive exploration of molecules in the PDB allowing the molecular properties to be viewed on the molecular surface. There are also several other popular molecular graphics programs that allow a similar visualization.

Set 3

1. Related sequences are identified through the database similarity searching and as the process generates multiple matching sequence pairs, it is often necessary to convert the numerous pair wise alignments into a single alignment.
a) True
b) False

View Answer

Answer: a [Reason:] A natural extension of pair wise alignment is multiple sequence alignment, which is to align multiple related sequences to achieve optimal matching of the sequences. Related sequences are identified through the database similarity searching. As the process generates multiple matching sequence pairs, it is often necessary to convert the numerous pair wise alignments into a single alignment, which arranges sequences in such a way that evolutionarily equivalent positions across all sequences are matched.

2. There is a unique advantage of multiple sequence alignment because it reveals more biological information than many pair wise alignments can.
a) True
b) False

View Answer

Answer: a [Reason:] It is truly an advantage of multiple sequence alignment. For example, it allows the identification of conserved sequence patterns and motifs in the whole sequence family, which are not obvious to detect by comparing only two sequences.

3. Which of the following cannot be related to multiple sequence alignment?
a) Many conserved and functionally critical amino acid residues can be identified in a protein multiple alignment
b) Multiple sequence alignment is also an essential prerequisite to carrying out phylogenetic analysis of sequence families and prediction of protein secondary and tertiary structures
c) Multiple sequence alignment also has applications in designing degenerate polymerase chain reaction (PCR) primers based on multiple related sequences
d) This method does not contribute much to degenerate polymerase chain reaction (PCR) primers creation

View Answer

Answer: d [Reason:] In practice, heuristic approaches are most often used. Multiple sequence alignment has applications in designing degenerate (PCR) primers based on multiple related sequences.

4. The scoring function for multiple sequence alignment is based on the concept of sum of pairs (SP).
a) True
b) False

View Answer

Answer: a [Reason:] Multiple sequence alignment is to arrange sequences in such a way that a maximum number of residues from each sequence are matched up according to a particular scoring function and is based on the concept of sum of pairs (SP). As the name suggests, it is the sum of the scores of all possible pairs of sequences in a multiple alignment based on a particular scoring matrix.

5. Which of the following scores are not considered while calculating the SP scores?
a) All possible pair wise matches
b) All possible mismatches
c) All possible gap costs
d) Number of gap penalties

View Answer

Answer: d [Reason:] In calculating the SP scores, each column is scored by summing the scores for all possible pair wise matches, mismatches and gap costs. The score of the entire alignment is the sum of all of the column scores. Option d is therefore the irrelevant choice here.
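
The column-by-column sum-of-pairs calculation can be sketched as follows. The scoring scheme here (match +1, mismatch -1, residue against gap -2, gap against gap 0) is a toy assumption used only to keep the example short; it is not a real substitution matrix such as BLOSUM62. The function names are hypothetical.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one pair of aligned symbols under a toy scoring scheme."""
    if a == "-" and b == "-":
        return 0  # gap aligned with gap carries no cost in this sketch
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def sp_score(alignment):
    """Sum-of-pairs score of a list of equal-length aligned sequences."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    total = 0
    for column in zip(*alignment):
        # every possible pair of sequences contributes to the column score
        total += sum(pair_score(a, b) for a, b in combinations(column, 2))
    return total


print(sp_score(["AC-", "ACG", "A-G"]))  # 3 + (-3) + (-3) = -3
```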

6. Given a multiple alignment of three sequences, the sum of scores is calculated as the sum of the dissimilarity scores of every pair of sequences at each position.
a) True
b) False

View Answer

Answer: b [Reason:] Given a multiple alignment of three sequences, the sum of scores is calculated as the sum of the similarity scores of every pair of sequences at each position. The scoring is based on the BLOSUM62 matrix. If the total score for the alignment is 5, the alignment is 2^5 = 32 times more likely to occur among homologous sequences than by random chance.

7. There are two approaches viz. exhaustive and heuristic approaches used in multiple sequence alignment.
a) True
b) False

View Answer

Answer: a [Reason:] The exhaustive alignment method involves examining all possible aligned positions simultaneously. Similar to dynamic programming in pair wise alignment, which involves the use of a two-dimensional matrix to search for an optimal alignment, to use dynamic programming for multiple sequence alignment, extra dimensions are needed to take all possible ways of sequence matching into consideration.

8. In a multidimensional search matrix, for aligning N sequences, an (N+2)-dimensional matrix is needed to be filled with alignment scores.
a) True
b) False

View Answer

Answer: b [Reason:] In a multidimensional search matrix, for aligning N sequences, an N-dimensional matrix is needed to be filled with alignment scores. For instance, for three sequences, a three-dimensional matrix is required to account for all possible alignment scores. Back-tracking is applied through the three-dimensional matrix to find the highest scored path that represents the optimal alignment.

9. As the amount of computational time and memory space required increases exponentially with the number of sequences, it makes the multidimensional search matrix method computationally prohibitive to use for a large data set.
a) True
b) False

View Answer

Answer: a [Reason:] This is indeed the drawback of that method. For this reason, full dynamic programming is limited to small datasets of less than ten short sequences. For the same reason, few multiple alignment programs employing this “brute force” approach are publicly available.
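
The exponential growth can be made concrete by counting the cells of the search matrix. The sketch assumes the usual dynamic programming layout in which a sequence of length L occupies L + 1 positions along its axis (one extra for the leading gap row); the function name is hypothetical.

```python
def dp_cells(seq_lengths):
    """Number of cells in a full dynamic programming matrix for the given sequences."""
    cells = 1
    for length in seq_lengths:
        cells *= length + 1  # one extra position per axis for the initial gap
    return cells


# Sequences of 100 residues each: every added sequence multiplies the size by 101.
for n in (2, 3, 5, 10):
    print(n, dp_cells([100] * n))
```

For two sequences of 100 residues this gives 10,201 cells, for three it gives 1,030,301, and each additional sequence multiplies the total by 101 again.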

10. Which of the following is untrue about DCA?
a) It stands for Divide-and-Conquer Alignment
b) It works by breaking each of the sequences into two smaller sections
c) The breaking points during the process are determined based on regional similarity of the sequences
d) If the sections are not short enough, further divisions are restricted as well

View Answer

Answer: d [Reason:] DCA is a web-based program that is in fact semi-exhaustive because certain steps of computation are reduced to heuristics. If the sections are not short enough, further divisions are carried out. When the lengths of the sequences reach a predefined threshold, dynamic programming is applied for aligning each set of subsequences. The resulting short alignments are joined together head to tail to yield a multiple alignment of the entire length of all sequences.

Set 4

1. In scoring matrices, for convenience, odds scores are converted to log odds scores.
a) True
b) False

View Answer

Answer: a [Reason:] The odds scores are converted to log odds scores so that the values for amino acid pairs in an alignment may be summed to obtain the log odds score of the alignment. In this case, the logarithms are calculated to the base 2 and multiplied by 2 to give values designated as half-bits (a bit is the unit of an odds score that has been converted to a logarithm to the base 2). The value of 4 indicates that the 4 amino acid alignment is 2^(4/2) = 4, that is, four-fold more likely than expected by chance.
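
The half-bit conversion can be sketched in a few lines. The frequencies used below are made-up numbers for illustration only, and the function names are hypothetical.

```python
import math


def half_bit_score(observed_freq, expected_freq):
    """Log odds score in half-bits: 2 * log2(odds), rounded to an integer."""
    return round(2 * math.log2(observed_freq / expected_freq))


def odds_from_half_bits(score):
    """Convert a half-bit score back into an odds ratio."""
    return 2 ** (score / 2)


# Hypothetical pair seen 4 times more often than expected by chance.
print(half_bit_score(0.04, 0.01))  # 4
print(odds_from_half_bits(4))      # 4.0
```

Because the scores are logarithms, adding the scores of aligned pairs corresponds to multiplying their odds, which is why summing along an alignment is valid.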

2. Which of the following doesn’t describe PAM matrices?
a) This family of matrices lists the likelihood of change from one amino acid to another in homologous protein sequences during evolution
b) There is presently no other type of scoring matrix that is based on such sound evolutionary principles as are these matrices
c) Even though they were originally based on a relatively small data set, the PAM matrices remain a useful tool for sequence alignment
d) It stands for Percent Altered Mutation

View Answer

Answer: d [Reason:] PAM stands for Percent Accepted Mutation. In this, each matrix gives the changes expected for a given period of evolutionary time, evidenced by decreased sequence similarity as genes encoding the same protein diverge with increased evolutionary time.

3. The assumption in this evolutionary model is that the amino acid substitutions observed over short periods of evolutionary history can be extrapolated to longer distances.
a) True
b) False

View Answer

Answer: a [Reason:] The BLOSUM matrices are based on scoring substitutions found over a range of evolutionary periods and reveal that substitutions are not always as predicted by the PAM model. The purpose of assumption in this evolutionary model is to make predictions.

4. Which of the following is untrue about the modification of PAM matrices?
a) At one time, the PAM250 scoring matrix was modified in an attempt to improve the alignment obtained
b) All scores for matching a particular amino acid were normalized to the same mean and standard deviation, and all amino acid identities were given the same score to provide an equal contribution for each amino acid in a sequence alignment
c) This took place in 1976
d) These modifications were included as the default matrices for the GCG sequence alignment programs in versions 8 and earlier and are optional in later versions

View Answer

Answer: c [Reason:] This event took place in 1986 by Gribskov and Burgess. However, they are not recommended because they will not give an optimal alignment that is in accordance with the evolutionary model.

5. The Dayhoff model of protein evolution is not a Markov process.
a) True
b) False

View Answer

Answer: b [Reason:] The Dayhoff model of protein evolution, as used in PAM matrices, is a Markov process. In the Dayhoff model, each amino acid site in a protein can change at any time to any of the other 19 amino acids with probabilities given by the PAM table, and the changes that occur at each site are independent of the amino acids found at other sites in the protein and depend only on the current amino acid at the site.
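
The Markov property means that change over a long period can be obtained by applying the one-step transition probabilities repeatedly. The sketch below shows this with a made-up two-letter alphabet; the matrix values are assumptions for illustration and are not taken from any real PAM table.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(m, steps):
    """Apply the one-step transition matrix m the given number of times."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = mat_mul(result, m)
    return result


# Hypothetical one-step matrix: each row gives the probabilities of staying
# or changing, and depends only on the current state.
one_step = [[0.99, 0.01],
            [0.02, 0.98]]
many_steps = mat_power(one_step, 250)
print([round(p, 3) for p in many_steps[0]])
```

Each row of the result still sums to 1, and after many steps the rows approach the same equilibrium distribution regardless of the starting state.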

6. Which of the following is true regarding the assumptions in the method of constructing the Dayhoff scoring matrix?
a) it is assumed that each amino acid position is equally mutable
b) it is assumed that each amino acid position is not equally mutable
c) it is assumed that each amino acid position is not mutable at all
d) sites do not vary in their degree of mutability

View Answer

Answer: a [Reason:] In this process, first, it is assumed that each amino acid position is equally mutable, whereas, in fact, sites vary considerably in their degree of mutability. Mutagenesis hot spots are well known in molecular genetics, and variations in mutability of different amino acid sites in proteins are well known.

7. The more conserved amino acids in similar proteins from different species are ones that play an essential role in structure and function and the less conserved are in sites that can vary without having a significant effect on function.
a) True
b) False

View Answer

Answer: a [Reason:] There are many factors that influence both the location and types of amino acid changes that occur in proteins. Wilbur (1985) has tested the Markov model of evolution and has shown that it can be valid if certain changes are made in the way that the PAM matrices are calculated.

8. A gap opening penalty for any gap (g) and a gap extension penalty for each element in the gap (r) are most often used, to give a total gap score wx, according to the equation ______
a) wx – rx = -g
b) wx = g – rx
c) wx = g + rx
d) wx + g + rx = 0

View Answer

Answer: c [Reason:] wx = g + rx is the equation, where x is the length of the gap. In some formulations of the gap penalty, the equation wx = g + r (x – 1) is used. Thus, the gap extension penalty is not added to the gap opening penalty until the gap size is 2.
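
Both formulations of the gap score can be written as one small function. The penalty values in the example (g = -11, r = -1) are assumptions picked for illustration, and the function and parameter names are hypothetical.

```python
def gap_score(x, g, r, extend_from_second=False):
    """Total gap score for a gap of length x.

    Default:                 wx = g + r * x
    extend_from_second=True: wx = g + r * (x - 1)
    """
    if x < 1:
        raise ValueError("gap length must be at least 1")
    return g + r * (x - 1 if extend_from_second else x)


print(gap_score(3, g=-11, r=-1))                           # -14
print(gap_score(3, g=-11, r=-1, extend_from_second=True))  # -13
print(gap_score(1, g=-11, r=-1, extend_from_second=True))  # -11
```

The last call shows the point made above: in the second formulation a gap of length 1 costs only the opening penalty.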

9. In the GCG and FASTA program suites, the scoring matrix itself is formatted in a way that includes default ______
a) gap additions
b) alignment scores
c) score penalties
d) gap penalties

View Answer

Answer: d [Reason:] These program suites include default gap penalties. When deciding gap penalties for local alignment programs, a consideration is that the penalties should be large enough to provide a local alignment of the sequences.

10. When gaps are penalized heavily, the best scoring local alignment between the sequences will be one that optimizes the score between matches and mismatches, without any gaps.
a) True
b) False

View Answer

Answer: a [Reason:] If both mismatches and gaps are heavily penalized, the resulting alignment will also be a local alignment that contains the longest region of exact matches. In the above two cases, the alignment score of the highest-scoring local alignment will increase as the logarithm of the length of the sequences. Under these same conditions, the score of the corresponding global alignment between the sequences will be negative.

Set 5

1. Which of the following is not correct about FASTA?
a) It stands for FAST ALL
b) It was in fact the first database similarity search tool developed, preceding the development of BLAST
c) FASTA uses a ‘hashing’ strategy to find matches for a short stretch of identical residues with a length of k
d) The string of residues is known as blocks

View Answer

Answer: d [Reason:] The string of residues is known as ktuples or ktups, which are equivalent to words in BLAST, but are normally shorter than the words. Typically, a ktup is composed of two residues for protein sequences and six residues for DNA sequences.

2. The first step in FASTA alignment is to arrange the sequences in matrices’ rows and columns in order to be analyzed.
a) True
b) False

View Answer

Answer: b [Reason:] The first step in FASTA alignment is to identify ktups between two sequences by using the hashing strategy. This strategy works by constructing a look up table that shows the position of each ktup for the two sequences under consideration. The mentioned method is similar to that of BLAST.

3. The positional difference for each word between the two sequences is obtained by _____the position of the ____ sequence from that of the ____ sequence and is expressed as the offset.
a) subtracting, second, first
b) adding, second, first
c) adding, first, second
d) subtracting, first, second

View Answer

Answer: d [Reason:] The positional difference for each word between the two sequences is obtained by subtracting the position of the first sequence from that of the second sequence and is expressed as the offset. The ktups that have the same offset values are then linked to reveal a contiguous identical sequence region that corresponds to a stretch of diagonal in a two-dimensional matrix.
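
The hashing step and the offset calculation can be sketched as follows. This is a simplified illustration of the lookup-table idea, not the FASTA implementation; the function names and the example sequences are hypothetical.

```python
from collections import Counter, defaultdict


def ktup_positions(seq, k):
    """Lookup table mapping each ktup to the positions where it starts in seq."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i)
    return table


def offset_counts(first, second, k=2):
    """Count shared ktups per offset (position in second minus position in first)."""
    table = ktup_positions(first, k)
    counts = Counter()
    for j in range(len(second) - k + 1):
        for i in table.get(second[j:j + k], []):
            counts[j - i] += 1
    return counts


# All four shared ktups (AC, CD, DE, EF) have offset 2, so they lie on one diagonal.
print(dict(offset_counts("ACDEF", "GGACDEF")))  # {2: 4}
```

Offsets with many hits correspond to the diagonals that the later steps score and join.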

4. The second step in FASTA is to narrow down the high similarity regions between the two sequences.
a) True
b) False

View Answer

Answer: a [Reason:] Normally, many diagonals between the two sequences can be identified in the hashing step. The top ten regions with the highest density of diagonals are identified as high similarity regions. The diagonals in these regions are scored using a substitution matrix.

5. In FASTA, neighboring high-scoring segments along the same diagonal are selected and joined to form a single alignment.
a) True
b) False

View Answer

Answer: a [Reason:] This step allows introducing gaps between the diagonals while applying gap penalties. The score of the gapped alignment is calculated again. In step 3, the gapped alignment is refined further using the Smith–Waterman algorithm to produce a final alignment.

6. The last step is to perform a statistical evaluation of the final alignment as in BLAST, which produces the E-value.
a) True
b) False

View Answer

Answer: a [Reason:] The last step of FASTA is similar to that of BLAST. The determination of E-value is the most important part of the whole analysis as it gives the degree of alignment between the sequences.

7. The web-based FASTA program is offered by the European Bioinformatics Institute.
a) True
b) False

View Answer

Answer: a [Reason:] Similar to BLAST, FASTA has a number of subprograms. The web-based FASTA program offered by the European Bioinformatics Institute allows the use of either DNA or protein sequences as the query to search against a protein database or nucleotide database.

8. FASTX compares a protein query sequence to a translated DNA database.
a) True
b) False

View Answer

Answer: b [Reason:] Some available variants of the program are FASTX and TFASTX. FASTX translates a DNA sequence and uses the translated protein sequence to query a protein database. TFASTX compares a protein query sequence to a translated DNA database.

9. FASTA doesn’t use bit scores.
a) True
b) False

View Answer

Answer: b [Reason:] FASTA uses E-values and bit scores as well. Estimation of the two parameters in FASTA is essentially the same as in BLAST. However, the FASTA output provides one more statistical parameter, the Z-score.

10. Z-score describes the number of standard deviations from the mean score for the database search.
a) True
b) False

View Answer

Answer: a [Reason:] Because most of the alignments with the query sequence are with unrelated sequences, the higher the Z-score for a reported match, the further away from the mean of the score distribution, hence, the more significant the match. For a Z-score > 15, the match can be considered extremely significant, with certainty of a homologous relationship. If Z is in the range of 5 to 15, the sequence pair can be described as highly probable homologs. If Z<5, their relationship is described as less certain.
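
The definition of the Z-score and the thresholds above can be sketched as follows. This is a simplified illustration that takes the mean and standard deviation directly from a list of scores; the scores are made-up numbers, the function names are hypothetical, and the actual FASTA calculation may differ in detail.

```python
from statistics import mean, pstdev


def z_score(score, database_scores):
    """Number of standard deviations by which score exceeds the mean score."""
    sigma = pstdev(database_scores)
    if sigma == 0:
        raise ValueError("database scores have zero spread")
    return (score - mean(database_scores)) / sigma


def interpret(z):
    """Apply the thresholds given in the answer above."""
    if z > 15:
        return "extremely significant"
    if z >= 5:
        return "highly probable homologs"
    return "less certain"


background = [8, 12]  # hypothetical scores: mean 10, standard deviation 2
for s in (50, 30, 14):
    z = z_score(s, background)
    print(s, z, interpret(z))
```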