Multiple choice question for engineering
1. Which of the following is incorrect regarding pairwise sequence alignment?
a) The most fundamental process in this type of comparison is sequence alignment
b) It is an important first step toward structural and functional analysis of newly determined sequences
c) This is the process by which sequences are compared by searching for common character patterns and establishing residue–residue correspondence among related sequences
d) It is the process of aligning multiple sequences
Answer: d [Reason:] Pairwise sequence alignment is the process of aligning two sequences and is the basis of database similarity searching and multiple sequence alignment. As new biological sequences are being generated at exponential rates, sequence comparison is becoming increasingly important for drawing functional and evolutionary inferences about a new protein from proteins already existing in the database.
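The idea of aligning exactly two sequences can be illustrated with a minimal sketch of global alignment scoring by dynamic programming (the Needleman-Wunsch approach). The function name and the scoring values (match +1, mismatch -1, gap -2) are illustrative assumptions, not values taken from the question set.

```python
def global_alignment_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of two sequences (Needleman-Wunsch).

    The scoring parameters are illustrative defaults, not recommended values.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against an empty sequence costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            same = seq_a[i - 1] == seq_b[j - 1]
            diagonal = score[i - 1][j - 1] + (match if same else mismatch)
            gap_in_b = score[i - 1][j] + gap   # residue of seq_a aligned to a gap
            gap_in_a = score[i][j - 1] + gap   # residue of seq_b aligned to a gap
            score[i][j] = max(diagonal, gap_in_b, gap_in_a)

    return score[-1][-1]
```

Only the score is returned here; recovering the alignment itself would need a traceback through the same table.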
2. Which of the following is incorrect about evolution?
a) The macromolecules can be considered molecular fossils that encode the history of millions of years of evolution
b) The building blocks of these biological macromolecules, nucleotide bases, and amino acids form linear sequences that determine the primary structure of the molecules
c) DNA and proteins are products of evolution
d) The molecular sequences barely undergo changes
Answer: d [Reason:] During this time period, the molecular sequences undergo random changes, some of which are selected during the process of evolution. As the selected sequences gradually accumulate mutations and diverge over time, traces of evolution may still remain in certain portions of the sequences to allow identification of the common ancestry.
3. The presence of evolutionary traces is because some of the residues that perform key functional and structural roles tend to be preserved by natural selection; other residues that may be less crucial for structure and function tend to mutate more frequently.
Answer: a [Reason:] The residues that perform key functional and structural roles tend to be preserved by natural selection. For example, active site residues of an enzyme family tend to be conserved because they are responsible for catalytic functions. Therefore, by comparing sequences through alignment, patterns of conservation and variation can be identified.
4. The degree of sequence variation in the alignment reveals evolutionary relatedness of different sequences, whereas the conservation between sequences reflects the changes that have occurred during evolution in the form of substitutions, insertions, and deletions.
Answer: b [Reason:] The degree of sequence conservation in the alignment reveals evolutionary relatedness of different sequences, whereas the variation between sequences reflects the changes that have occurred during evolution in the form of substitutions, insertions, and deletions. Identifying the evolutionary relationships between sequences helps to characterize the function of unknown sequences. When a sequence alignment reveals significant similarity among a group of sequences, they can be considered as belonging to the same family.
5. If the two sequences share significant similarity, it is extremely ______ that the extensive similarity between the two sequences has been acquired randomly, meaning that the two sequences must have derived from a common evolutionary origin.
Answer: a [Reason:] Sequence alignment provides inference for the relatedness of two sequences under study. Regions that are aligned but not identical represent residue substitutions; regions where residues from one sequence correspond to nothing in the other represent insertions or deletions that have taken place on one of the sequences during evolution.
6. Sometimes, it is also possible that two sequences have derived from a common ancestor, but may have diverged to such an extent that the common ancestral relationships are not recognizable at the sequence level.
Answer: a [Reason:] There are examples of such paralogous genes that have distinct functions but similar origin. In that case, the distant evolutionary relationships have to be detected using other methods.
7. Which of the following is incorrect regarding sequence homology?
a) Two sequences can have a homologous relationship even if they do not have a common origin
b) It is an important concept in sequence analysis
c) When two sequences are descended from a common evolutionary origin, they are said to have a homologous relationship
d) When two sequences are descended from a common evolutionary origin, they are said to share homology
Answer: a [Reason:] Sequences have a homologous relationship only when they share a common evolutionary origin. A related but different term is sequence similarity, which is the percentage of aligned residues that are similar in physicochemical properties such as size, charge, and hydrophobicity.
8. Sequence similarity can be quantified using ________, whereas homology is a ______ statement.
a) percentages, quantitative
b) percentages, qualitative
c) ratios, qualitative
d) ratios, quantitative
Answer: b [Reason:] Similarity is a direct result of observation from the sequence alignment. For example, one may say that two sequences share 40% similarity. It is incorrect to say that the two sequences share 40% homology. They are either homologous or nonhomologous.
9. Shorter sequences require higher cutoffs for inferring homologous relationships than longer sequences.
Answer: a [Reason:] For determining a homology relationship of two protein sequences, for example, if both sequences are aligned at full length, which is 100 residues long, an identity of 30% or higher can be safely regarded as having close homology. If their identity level falls between 20% and 30%, determination of homologous relationships in this range becomes less certain.
10. Sequence similarity and sequence identity are synonymous for nucleotide sequences and protein sequences as well.
Answer: b [Reason:] Sequence similarity and sequence identity are synonymous for nucleotide sequences. For protein sequences, however, the two concepts are very different. In a protein sequence alignment, sequence identity refers to the percentage of matches of the same amino acid residues between two aligned sequences. Similarity refers to the percentage of aligned residues that have similar physicochemical characteristics and can be more readily substituted for each other.
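The difference between identity and similarity for proteins can be made concrete with a small sketch. The residue groupings below are a simplified assumption chosen for illustration (real tools usually derive similarity from a substitution matrix), and the choice to divide by the number of gap-free columns is one convention among several.

```python
# Simplified, assumed groupings of physicochemically similar amino acids.
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FWY"),   # aromatic
    set("KRH"),   # positively charged
    set("DE"),    # negatively charged
    set("ST"),    # small hydroxyl
    set("NQ"),    # amide
]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both inputs must have equal length and use '-' for gaps. Columns that
    contain a gap are skipped, which is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1

    if compared == 0:
        return 0.0, 0.0
    return 100 * identical / compared, 100 * similar / compared
```

For the aligned pair `KLDE` / `RLEE`, two columns are identical and the other two are conservative substitutions under these groupings, so identity is 50% while similarity is 100%.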
1. There are a large number of protein folds available, compared to millions of protein sequences
Answer: b [Reason:] There are only a small number of protein folds available (<1,000), compared to millions of protein sequences. This means that protein structures tend to be more conserved than protein sequences. Consequently, many proteins can share a similar fold even in the absence of sequence similarities. This allowed the development of computational methods to predict protein structures beyond sequence similarities.
2. Threading or structural fold recognition predicts the structural fold of an unknown protein sequence by fitting the sequence into a structural database and selecting the best-fitting fold.
Answer: a [Reason:] Determining whether a protein sequence adopts a known three-dimensional structural fold relies on threading and fold recognition methods. The comparison emphasizes matching of secondary structures, which are most evolutionarily conserved. Therefore, this approach can identify structurally similar proteins even without detectable sequence similarity.
3. The algorithms used here can be classified into two categories, pairwise energy based and profile based.
Answer: a [Reason:] The pairwise energy–based method was originally referred to as threading and the profile-based method was originally defined as fold recognition. However, the two terms are now often used interchangeably without distinction in the literature.
4. In the pairwise energy based method, a protein sequence is searched for in a structural fold database to find the best matching structural fold using ______ criteria.
Answer: c [Reason:] The detailed procedure involves aligning the query sequence with each structural fold in a fold library. The alignment is performed essentially at the sequence profile level using dynamic programming or heuristic approaches. Local alignment is often adjusted to get lower energy and thus better fitting. The adjustment can be achieved using algorithms such as double-dynamic programming.
5. The next step in the pairwise energy based method is to build a crude model for the target sequence by replacing aligned residues in the template structure with the corresponding residues in the query.
Answer: a [Reason:] After the mentioned step, the last step is to calculate the energy terms of the raw model, which include pairwise residue interaction energy, solvation energy, and hydrophobic energy. Finally, the models are ranked based on the energy terms to find the lowest energy fold that corresponds to the structurally most compatible fold.
6. Which of the following is untrue about profile method?
a) A profile is constructed for a group of related protein structures
b) The propensity of amino acids is not considered in this method
c) Statistical information from these aligned residues is then used to construct a profile
d) The structural profile is generated by superimposition of the structures to expose corresponding residues
Answer: b [Reason:] The profile contains scores that describe the propensity of each of the twenty amino acid residues to be at each profile position. The profile scores contain information for secondary structural types, the degree of solvent exposure, polarity, and hydrophobicity of the amino acids. To predict the structural fold of an unknown query sequence, the query sequence is first predicted for its secondary structure, solvent accessibility, and polarity.
7. Because threading and fold recognition detect structural homologs ________ relying on sequence similarities, they have been shown to be _______ than PSI-BLAST in finding distant evolutionary relationships
a) without completely, far more sensitive
b) completely, far more sensitive
c) completely, less sensitive
d) without completely, less sensitive
Answer: a [Reason:] In many cases, they can identify more than twice as many distant homologs as PSI-BLAST. However, this high sensitivity can also be their weakness because high sensitivity is often associated with low specificity. The predictions resulting from threading and fold recognition often come with very high rates of false positives. Therefore, much caution is required in accepting the prediction results.
8. Which of the following is untrue about threading and fold recognition?
a) It assesses the compatibility of an amino acid sequence with a known structure in a fold library
b) If the protein fold to be predicted does not exist in the fold library, the method won’t necessarily fail
c) If the protein fold to be predicted does not exist in the fold library, the method will fail
d) Threading and fold recognition do not generate fully refined atomic models for the query sequences
Answer: b [Reason:] A disadvantage compared to homology modeling lies in the fact that threading and fold recognition do not generate fully refined atomic models for the query sequences. This is because accurate alignment between distant homologs is difficult to achieve. Instead, threading and fold recognition procedures only provide a rough approximation of the overall topology of the native structure.
9. Which of the following is untrue about 3D-PSSM?
a) It is a web-based program that employs the structural profile method to identify protein folds
b) The profiles for each protein superfamily are constructed by combining multiple smaller profiles
c) A protein structural superfamily doesn’t have sequence-based PSI-BLAST profile
d) In initial steps, protein structures in a superfamily based on the SCOP classification are superimposed
Answer: c [Reason:] First, protein structures in a superfamily based on the SCOP classification are superimposed and are used to construct a structural profile by incorporating secondary structures and solvent accessibility information for corresponding residues. In addition, each member in a protein structural superfamily has its own sequence-based PSI-BLAST profile computed. These sequence profiles are used in combination with the structure profile to form a large superfamily profile in which each position contains both sequence and structural information.
10. Which of the following is true about GenTHREADER?
a) It is a web-based program that uses a hybrid of the profile and pairwise energy methods
b) It is a web-based program that uses profile methods only
c) It is a web-based program that uses pairwise energy methods only
d) The initial step is quite dissimilar to 3D-PSSM
Answer: a [Reason:] The initial step is similar to 3D-PSSM; the query protein sequence is subject to three rounds of PSI-BLAST. The resulting multiple sequence hits are used to generate a profile. Its secondary structure is predicted using PSIPRED. Both are used as input for threading computation based on a pairwise energy potential method. The threading results are evaluated using neural networks that combine energy potentials, sequence alignment scores, and length information to create a single score representing the relationship between the query and template proteins.
1. RNA structures can be experimentally determined using _____
a) x-ray crystallography techniques only
b) NMR techniques only
c) x-ray crystallography or NMR techniques
d) gel electrophoresis
Answer: c [Reason:] However, the two approaches are extremely time consuming and expensive. As a result, computational prediction has become an attractive alternative. Option d is irrelevant here, because gel electrophoresis is not a method for determining RNA structure.
2. Which of the following is not a form of RNA?
Answer: c [Reason:] It is known that RNA is a carrier of genetic information and exists in three main forms. They are messenger RNA (mRNA), ribosomal RNA (rRNA), and transfer RNA (tRNA). Their main roles are as follows: mRNA is responsible for directing protein synthesis; rRNA provides structural scaffolding within ribosomes; and tRNA serves as a carrier of amino acids for polypeptide synthesis.
3. Unlike DNA, which is mainly double stranded, RNA is double stranded, although an RNA molecule can self-hybridize at certain regions to form partial double-stranded structures.
Answer: b [Reason:] RNA is single stranded, although an RNA molecule can self-hybridize at certain regions to form partial double-stranded structures. Generally, mRNA is more or less linear and non-structured, whereas rRNA and tRNA can only function by forming particular secondary and tertiary structures.
4. Structures are not very important when it comes to studying the functions.
Answer: b [Reason:] Knowledge of the structures of these molecules is particularly important for understanding their functions. Difficulties in experimental determination of RNA structures make theoretical prediction a very desirable approach. In fact, computation-based analysis is a main tool in RNA-based drug design in the pharmaceutical industry. In addition, knowledge of the secondary structures of rRNA is key for RNA-based phylogenetic analysis.
5. Which of the following is not a base of RNA?
a) Thymine (T)
b) Adenine (A)
c) Cytosine (C)
d) Guanine (G)
Answer: a [Reason:] RNA structures can be described at three levels as in proteins: primary, secondary, and tertiary. The primary structure is the linear sequence of RNA, consisting of four bases, adenine (A), cytosine (C), guanine (G), and uracil (U).
6. Base pairing, in RNA, is A–G and U–C.
Answer: b [Reason:] The secondary structure refers to the planar representation that contains base-paired regions among single-stranded regions. The base pairing is mainly composed of traditional Watson–Crick base pairing, which is A–U and G–C.
7. In addition to the canonical base pairing, there often exists non-canonical base pairing such as ___ and __ base pairing.
a) G, U
b) G, C
c) U, C
d) A, C
Answer: a [Reason:] There often exists non-canonical base pairing such as G and U base pairing. The G–U base pair is less stable and normally occurs within a double-strand helix surrounded by Watson–Crick base pairs. Finally, the tertiary structure is the three-dimensional arrangement of bases of the RNA molecule.
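The pairing rules from questions 6 and 7 can be written down as a small lookup. This is only a sketch; the function name and the returned labels are hypothetical choices for illustration.

```python
# Canonical Watson-Crick pairs in RNA, listed in both orientations.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
# The non-canonical G-U wobble pair mentioned above.
WOBBLE = {("G", "U"), ("U", "G")}


def classify_pair(base_1, base_2):
    """Classify two RNA bases as 'watson-crick', 'wobble', or 'none'."""
    pair = (base_1.upper(), base_2.upper())
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "none"
```

With this table, `classify_pair("A", "G")` returns `"none"`, which matches the answer to question 6.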
8. Four main subtypes of secondary structures can be identified. They are hairpin loops, bulge loops, interior loops, and multi-branch loops.
Answer: a [Reason:] Because the RNA tertiary structure is very difficult to predict, attention has been mainly focused on secondary structure prediction. Based on the arrangement of helical base pairing in secondary structures, the mentioned four main subtypes of secondary structures can be identified.
9. The ______ refers to a structure with two ends of a single-stranded region (loop) connecting a base-paired region (stem).
a) helical junctions
b) hairpin loop
c) bulge loop
d) interior loop
Answer: b [Reason:] The bulge loop refers to a single stranded region connecting two adjacent base-paired segments so that it “bubbles” out in the middle of a double helix on one side. The multi-branch loop is also called helical junctions.
10. The ____ refers to two single-stranded regions on opposite strands connecting two adjacent base-paired segments.
a) hairpin loop
b) interior loop
c) pseudoknot loop
d) helical junctions
Answer: b [Reason:] In addition to the traditional secondary structural elements, base pairing between loops of different secondary structural elements can result in a higher level of structures such as pseudoknots, kissing hairpins, and hairpin–bulge contact. A pseudoknot loop refers to base pairing formed between loop residues within a hairpin loop and residues outside the hairpin loop.
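One common way to recognize a pseudoknot computationally is to look for crossing base pairs: a pair (i, j) and a pair (k, l) with i < k < j < l, so that a residue inside one stem-loop pairs with a residue outside it. The sketch below assumes the structure is given as a list of 0-based index pairs; that input format and the function name are assumptions made for illustration.

```python
def has_pseudoknot(pairs):
    """Return True if any two base pairs cross (i < k < j < l).

    `pairs` is an iterable of (i, j) position tuples. Properly nested or
    side-by-side pairs never cross, so they do not count as a pseudoknot.
    """
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for index, (i, j) in enumerate(ordered):
        for k, l in ordered[index + 1:]:
            if i < k < j < l:
                return True
    return False
```

A plain hairpin stem such as `[(0, 9), (1, 8), (2, 7)]` is fully nested, whereas `[(0, 5), (3, 8)]` pairs a residue inside the first loop with one outside it and is therefore reported as a pseudoknot.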
1. Which of the following is untrue regarding the transmembrane proteins?
a) Constitute up to 30% of all cellular proteins
b) They are responsible for performing a wide variety of important functions in a cell, such as signal transduction, cross-membrane transport, and energy conversion
c) The membrane proteins are also of tremendous biomedical importance
d) They are not drug targets or receptors
Answer: d [Reason:] The membrane proteins are also of tremendous biomedical importance, as they often serve as drug targets for pharmaceutical development. There are two types of integral membrane proteins: α-helical type and β-barrel type. Most transmembrane proteins contain solely α-helices, which are found in the cytoplasmic membrane. A few membrane proteins consist of β-strands forming a β-barrel topology, a cylindrical structure composed of antiparallel β-sheets.
2. The structures of this group of proteins, however, are comparatively much more difficult to resolve either by x-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy.
Answer: a [Reason:] For this group of proteins, prediction of the transmembrane secondary structural elements and their organization is particularly important. Fortunately, the prediction process is somewhat easier because of the hydrophobic environment of the lipid bilayers, which restricts the transmembrane segments to be hydrophobic as well.
3. Which of the following is untrue regarding Prediction of Helical Membrane Proteins?
a) For membrane proteins consisting of transmembrane α–helices, these transmembrane helices are predominantly hydrophobic with a specific distribution of positively charged residues
b) The α-helices generally run perpendicular to the membrane plane
c) The α-helices generally run parallel to the membrane plane
d) The α-helices have an average length between seventeen and twenty-five residues
Answer: c [Reason:] The hydrophobic helices are normally separated by hydrophilic loops with average lengths of fewer than sixty residues. The residues bordering the transmembrane spans are more positively charged. Another feature indicative of the presence of transmembrane segments is that residues at the cytosolic side near the hydrophobic anchor are more positively charged than those at the lumenal or periplasmic side. This is known as the positive-inside rule.
4. The early algorithms based their prediction on hydrophobicity scales.
Answer: a [Reason:] A number of algorithms for identifying transmembrane helices have been developed where the early algorithms based their prediction on hydrophobicity scales. They typically scan a window of seventeen to twenty-five residues and assign membrane spans based on hydrophobicity scores. Some are also able to determine the orientation of the membrane helices based on the positive-inside rule.
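The window scan described above can be sketched as follows. The Kyte-Doolittle hydropathy scale is used as one example of a hydrophobicity scale; the window of 19 residues (inside the 17 to 25 range mentioned above) and the 1.6 cutoff are assumptions commonly paired with that scale, not values stated in the question set.

```python
# Kyte-Doolittle hydropathy values (higher means more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Average hydropathy of every full window along the sequence."""
    values = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    if len(values) < window:
        return []
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


def candidate_span_starts(sequence, window=19, threshold=1.6):
    """Start positions (0-based) of windows whose average meets the cutoff."""
    profile = hydropathy_profile(sequence, window)
    return [start for start, average in enumerate(profile) if average >= threshold]
```

Residues outside the twenty standard amino acids raise a `KeyError` in this sketch. Overlapping windows above the cutoff would still need to be merged into single predicted helices, and a scan like this says nothing about signal peptides.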
5. Predictions solely based on hydrophobicity profiles have the lowest error rates.
Answer: b [Reason:] Predictions solely based on hydrophobicity profiles have high error rates. As with the third-generation predictions for globular proteins, applying evolutionary information with the help of neural networks or HMMs can improve the prediction accuracy significantly.
6. The presence of ______ signal peptides can significantly compromise the prediction _______ because the programs tend to confuse hydrophobic signal peptides with membrane helices.
a) hydrophobic, accuracy
b) hydrophobic, error
c) hydrophilic, accuracy
d) hydrophilic, error
Answer: a [Reason:] Predicting transmembrane helices is relatively easy. The accuracy of some of the best prediction programs, such as TMHMM or HMMTOP, can exceed 70%. To minimize errors, the presence of signal peptides can be detected using a number of specialized programs and then manually excluded.
7. Which of the following is untrue regarding TMHMM?
a) It is a web-based program based on an HMM algorithm
b) It is trained to recognize transmembrane helical patterns
c) It is not trained to recognize transmembrane helical patterns
d) When a query sequence is scanned, the probability of having an α-helical domain is given
Answer: c [Reason:] It is trained to recognize transmembrane helical patterns based on a training set of 160 well-characterized helical membrane proteins. The orientation of the α-helices is predicted based on the positive-inside rule. The prediction output returns the number of transmembrane helices, the boundaries of the helices, and a graphical representation of the helices. This program can also be used to simply distinguish between globular proteins and membrane proteins.
8. Which of the following is untrue regarding Phobius?
a) It is a web-based program designed to overcome false positives caused by the presence of signal peptides
b) The program incorporates distinct HMM models for signal peptides only
c) The program incorporates distinct HMM models for signal peptides as well as transmembrane helices
d) After distinguishing the putative signal peptides from the rest of the query sequence, prediction is made on the remainder of the sequence
Answer: b [Reason:] In addition to the given data, it has been shown that the prediction accuracy can be significantly improved compared to TMHMM (94% by Phobius compared to 70% by TMHMM). In addition to the normal prediction mode, the user can also define certain sequence regions as signal peptides or other nonmembrane sequences based on external knowledge.
9. Which of the following is true regarding Prediction of β-Barrel Membrane Proteins?
a) For membrane proteins with β-strands only, the β-strands forming the transmembrane segment are amphipathic in nature
b) For membrane proteins with β-strands only, the β-strands forming the transmembrane segment are only hydrophilic in nature
c) For membrane proteins with β-strands only, the β-strands forming the transmembrane segment are only hydrophobic in nature
d) They contain six to nine residues
Answer: a [Reason:] As stated, for membrane proteins with β-strands only, the β-strands forming the transmembrane segment are amphipathic in nature. They contain ten to twenty-two residues with every second residue being hydrophobic and facing the lipid bilayers whereas the other residues facing the pore of the β-barrel are more hydrophilic.
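The alternating pattern described above (every second residue hydrophobic) can be checked with a short sketch. The set of residues treated as hydrophobic is an assumed simplification, and the function is an illustration of the pattern, not a β-barrel predictor.

```python
# Assumed set of hydrophobic residues for this illustration.
HYDROPHOBIC = set("AVLIMFWYC")


def alternation_fraction(segment):
    """Fraction of positions that fit the better of the two alternating phases.

    1.0 means a perfect hydrophobic / non-hydrophobic alternation;
    0.5 is the lowest possible value.
    """
    flags = [residue in HYDROPHOBIC for residue in segment.upper()]
    if not flags:
        return 0.0
    # Phase A: hydrophobic residues at even positions, others at odd positions.
    phase_a = sum(1 for i, is_hydrophobic in enumerate(flags)
                  if is_hydrophobic == (i % 2 == 0))
    # Phase B is the exact complement of phase A.
    phase_b = len(flags) - phase_a
    return max(phase_a, phase_b) / len(flags)
```

A segment such as `VSVTLNLDVK` alternates perfectly and scores 1.0, while a uniformly hydrophobic stretch such as `LLLLLLLL` scores only 0.5, which illustrates why a plain hydrophobicity scan tuned for α-helices misses transmembrane β-strands.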
10. Scanning a sequence by hydrophobicity does not reveal transmembrane β-strands.
Answer: a [Reason:] These programs for predicting transmembrane α-helices are not applicable for this unique type of membrane proteins. To predict the β-barrel type of membrane proteins, a small number of algorithms have been made available based on neural networks and related techniques.
1. Which of the given statements is incorrect about Grouping Sequences?
a) The problem of deciding which sequences to include in the same group or cluster and which to separate into different groups or clusters is a recurring one
b) Divergence is necessary, but the sequences chosen should be clearly related based on inspection of each pair-wise alignment and a statistical analysis
c) The conservative approach is to group distinct sequences
d) The adventurous approach is to choose a set of marginally alignable sequences to pursue the difficult task of making a multiple sequence alignment and then to make profile models that may recognize divergence but will also give false predictions
Answer: c [Reason:] The conservative approach is to group only very similar sequences together. However, in making a conservative multiple sequence alignment with only very alike sequences, it is not possible to analyze the evolutionary divergence that may have occurred in a family of proteins. Furthermore, if a matrix or profile model is made from this alignment, that model will not be useful for identifying more divergent members of a family.
2. Which of the given statements is incorrect about Clusters of orthologous groups?
a) Using the protein from one of the organisms to search the proteome of the other for high-scoring matches should identify the ortholog as the highest-scoring match, or best hit
b) When entire proteomes of the two organisms are available, orthologs may be identified
c) A pair of orthologous genes in two organisms share so much sequence similarity that they may be assumed to have arisen from a common ancestor gene
d) Each of the orthologs belongs to a family composed of paralogous sequences that are unrelated to each other
Answer: d [Reason:] In many cases, each of the orthologs belongs to a family composed of paralogous sequences related to each other by gene duplication events. Hence, in the above database search, the ortholog will not only match the orthologous sequence in the second proteome but also these other paralogous sequences. The objective of the clusters of orthologous groups (COG) approach is to identify all matching proteins in the organisms, defined as an orthologous group related by both speciation and gene duplication events.
3. Which of the given statements is incorrect about Clusters of orthologous groups?
a) Paralogs may include a best hit or a high-scoring match of one of the sequences by another, but the reciprocal match can have low similarity that does not have to be significant
b) Paralogs defined by sets of three matching sequences in the selected organisms were kept separated from the clusters
c) Orthologous pairs were first defined by the best hits in reciprocal searches
d) To produce COGs, similarity searches were performed among the proteomes of phylogenetically distinct clades of prokaryotes
Answer: b [Reason:] Paralogs defined by sets of three matching sequences in the selected organisms were also added to these clusters. Sixty percent of the original set of 720 COGs does not include paralogs, or includes paralogs from one lineage only, suggesting that there has not been extensive duplication of this group.
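The first step mentioned above, defining orthologous pairs by best hits in reciprocal searches, can be sketched with plain dictionaries. The input format (a mapping from (query, subject) to an alignment score) and the protein identifiers in the usage example are hypothetical; this is not the actual COG pipeline, which goes on to add paralogs to the clusters.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    `scores` maps (query, subject) tuples to a numeric alignment score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```

A short usage example with made-up scores:

```python
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 150, ("a2", "b2"): 90}
b_vs_a = {("b1", "a1"): 210, ("b2", "a1"): 140, ("b2", "a2"): 80}
print(reciprocal_best_hits(a_vs_b, b_vs_a))  # [('a1', 'b1')]
```

Here `a2` finds `b2` as its best hit, but `b2` prefers `a1`, so that pair is not reciprocal and is left out.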
4. Which of the given statements is incorrect about the Comparison of proteomes to EST databases of an organism?
a) ESTs are single DNA sequence reads that contain a small fraction of incorrect base assessments, insertions, and deletions
b) Many sequences arise from near the 5’ end of the mRNA, although every effort is usually made to read as far 3’ as possible into the upstream portion of the cDNA
c) EST libraries are useful for preliminary identification of genes by database similarity searches
d) An EST database of an organism can be analyzed for the presence of gene families, orthologs, and paralogs
Answer: b [Reason:] Many sequences arise from near the 3’ end of the mRNA, although every effort is usually made to read as far 5’ as possible into the upstream portion of the cDNA. Because not all of the genes may be expressed in the tissues chosen for analysis, the library will often not be complete.
5. Which of the given statements is incorrect about Searching for orthologs to a protein family in an EST database?
a) Searches of EST databases for matches to a query sequence routinely produce minimal amounts of output that must be searched manually for significant hits
b) ESTs with a high percent identity with the query sequence, a long alignment with the query sequence, and a very low E value of the alignment score represent groups of paralogous and orthologous genes
c) To identify orthologs as the most closely related sequence, ESTs were aligned using the amino acid alignment as a guide
d) To identify orthologs as the most closely related sequence, a phylogenetic tree was produced by the maximum likelihood method
Answer: a [Reason:] Searches of EST databases for matches to a query sequence routinely produce large amounts of output that must be searched manually for significant hits. An automatic method was described in 1999 utilizing a computer script, FAST-PAN, that scans EST databases with multiple queries from a protein family, sorts the alignment scores, and produces charts and alignments of the matches found.
6. Which of the given statements is incorrect about Family and Domain Analysis?
a) Gene identification of predicted proteins in the genome is designed to discover the metabolic features of an organism
b) In a particular organism or group of organisms, one particular domain can be expanded to perform a particular function
c) Comparison of the domain content of an entire proteome with that of another proteome cannot help in revealing the biological roles of diverse domains in different organisms
d) Different proteins are mosaics of domains that occur in different combinations in a given protein
Answer: b [Reason:] In a particular organism or group of organisms, various domains can be expanded to perform a particular function. More than 2000 fly and worm proteins are multidomain proteins, compared to about one-third this number in yeast.
7. Which of the given statements is incorrect about Ancient Conserved Regions?
a) The method involves database similarity searches of the SwissProt database with human, worm, yeast, or E. coli genes and identification of matches with sequences from a different phylum than the query sequence
b) An analysis of ACRs that predate the radiation of the major animal phyla some 580–540 million years ago suggested that 50–60% of coding sequences are ACRs
c) These ACRs may represent proteins present at the time of the prokaryotic–eukaryotic divergence
d) Phylogenetically diverse groups of organisms have been analyzed for the presence of conserved proteins and protein domains that have been conserved over long periods of evolutionary time, called ancient conserved regions or ACRs
Answer: b [Reason:] The analysis of ACRs 580–540 million years ago suggested that 20–40% of coding sequences are ACRs. For example, a search with 1916 E. coli proteins detected 266 ACRs found in 439 sequences, roughly one-quarter of the SwissProt database.
8. Which of the given statements is incorrect about Horizontal Gene Transfer?
a) The genomes of most organisms are derived by vertical transmission, the inheritance of chromosomes from parents to offspring from one generation to the next
b) It is the acquisition of genetic material from a different organism
c) The transferred material becomes a temporary addition to the recipient genome
d) An extreme example is the proposed endosymbiont origin of mitochondria in eukaryotic cells and chloroplasts in plants
Answer: c [Reason:] The transferred material becomes a permanent addition to the recipient genome. Although these exchanges do not occur very often on a generation-to-generation basis, a significant number can occur over a period of hundreds of millions of years.
9. Which of the given statements is incorrect about Horizontal Gene Transfer?
a) It is a significant source of genome variation in bacteria, allowing them to exploit new environments
b) Such transfer is rendered possible by a variety of natural mechanisms in bacteria for transferring DNA from one species to another
c) Detection of HT is made possible by the fact that each genome of each bacterial species has a unique base composition
d) The time of transfer of DNA cannot be estimated by the composition of the HT DNA
Answer: d [Reason:] The time of transfer of DNA may be estimated by the degree to which the composition of the HT DNA has blended into that of the recipient genome. Transfer of a portion of a genome from one organism to another can generally be detected as an island of sequence of different composition in the recipient. If the amino acid composition of transferred genes is typical, these islands may be detected by a codon usage analysis.
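Detecting an island of atypical composition can be sketched with a sliding-window GC content scan. GC fraction is used here as a simple stand-in for base composition; the window size, step, and deviation cutoff are arbitrary illustrative assumptions, and real analyses may also use codon usage as noted above.

```python
def gc_fraction(sequence):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty string)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def atypical_windows(genome, window=1000, step=500, max_deviation=0.10):
    """Return (start, gc) for windows whose GC content deviates from the
    genome-wide GC content by more than `max_deviation`."""
    overall = gc_fraction(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_fraction(genome[start:start + window])
        if abs(gc - overall) > max_deviation:
            flagged.append((start, gc))
    return flagged
```

Under this view, a recently transferred segment would stand out strongly, while one whose composition has largely blended into the recipient genome would fall below the cutoff, which is the basis for the rough timing argument above.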
10. Annotation is based on finding significant alignment to sequences of known function in database similarity searches.
Answer: a [Reason:] Accurate annotation of genome sequences is an important first step in genome analysis. Matches of lesser significance provide only a tentative or hypothetical prediction and should be used as a working hypothesis of function.