Multiple choice question for engineering
1. The chromosomes comprise linear DNA molecules in a tightly compacted form, wrapped around protein complexes to form structures called nucleosomes.
Answer: a [Reason:] Nuclei and chromosomes were not observed in bacteria (a prokaryotic cell), but when bacterial DNA was eventually detected, the molecule was usually circular and was also in a compacted form. The following sections outline the structure and composition of prokaryotic and eukaryotic genomes.
2. The first bacterial genome to be sequenced was that of ________, a mild human pathogen.
a) Hemophilus influenzae
c) Vibrio cholerae
d) Clostridium botulinum
Answer: a [Reason:] This project was carried out at The Institute for Genomic Research (TIGR), in part to prove a new genome sequencing method, the shotgun method.
3. In sequencing the first bacterial genome, a large number of random overlapping fragments were sequenced, and then a consensus sequence of the entire ______ chromosome of Hemophilus was assembled by computer.
a) 8.6 x 10^9 bp
b) 1.8 x 10^6 bp
c) 6.9 x 10^5 bp
d) 1.8 x 10^4 bp
Answer: b [Reason:] The assembly was done by computer except for several regions that had to be assembled manually. Once the sequence was available, open reading frames were identified and compared to existing proteins by a database similarity search.
4. In the first bacterial genome sequenced, approximately ____ of the ____ predicted genes matched genes of another bacterial species, E. coli K-12, which had been the subject of many years of genetic and biochemical research.
a) 46%, 1500
b) 58%, 1496
c) 72%, 1743
d) 58%, 1743
Answer: d [Reason:] The function of the other 42% of the Hemophilus genes could not be identified, although some of them were similar to the 38% of E. coli genes that were also of unknown function. Other unique sequences that appeared to be associated with the ability of the organism to behave as a human pathogen were also found.
5. After the Hemophilus genome was sequenced, further organisms were selected for sequencing based on a few criteria. Which of the following is not one of them?
a) They had been subjected to a good deal of biological analysis
b) They were model eukaryotic organisms
c) They were an important human pathogen, e.g., Mycobacterium tuberculosis (tuberculosis)
d) They were of phylogenetic interest
Answer: b [Reason:] They were model prokaryotic organisms, e.g., E. coli and Bacillus subtilis. The success of sequencing the Hemophilus genome in a relatively short time and with a modest budget heralded the sequencing of a large number of additional prokaryotic organisms.
6. Analysis of the ribosomal RNA molecules of prokaryotes and eukaryotes had led to the prediction of three main branches in the tree of life.
Answer: a [Reason:] The three branches are represented by the Archaea, the Bacteria, and the Eukarya. For genome sequencing projects, organisms have been sampled from throughout the tree, including some that are in deeper branches of the tree and that have growth properties reminiscent of an ancient environment.
7. Annotation involves identifying open reading frames in the genome sequence.
Answer: a [Reason:] The open reading frames are translated, and the predicted proteins are used as query sequences in a database similarity search. Significant matches are then added to the genome sequence entry in the sequence database.
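The first annotation step, identifying open reading frames, can be illustrated with a minimal sketch. It assumes a simplified definition of an ORF (an ATG start codon followed in frame by a stop codon) and scans only the forward strand; the function name `find_orfs` and the `min_codons` parameter are hypothetical, and real gene finders are considerably more elaborate.

```python
# Minimal ORF scan: forward strand only, three reading frames.
# Assumption: an ORF starts at ATG and ends at the first in-frame stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) index pairs of ORFs; end is exclusive and includes the stop codon."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the current candidate start codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count codons from the start codon up to (not including) the stop.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


if __name__ == "__main__":
    example = "CCATGAAATTTGGGTAACC"
    for start, end in find_orfs(example):
        print(start, end, example[start:end])
```

Each ORF found this way would then be translated and used as a query in a similarity search; that step is not shown here.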
8. A simple way to retrieve sequences of viral and other extra-chromosomal genetic elements such as organelles is through the National Center for Biotechnology Information (NCBI)
Answer: a [Reason:] Prior to the sequencing of H. influenzae, the first free-living organism to be sequenced, a large number of viruses had been sequenced. Many of these organisms also serve as model systems for studying replication and gene expression. As an example, the nucleotide sequence of bacteriophage lambda was completed by Sanger.
9. Which of the given is wrongly matched?
a) Escherichia coli – Bacteria
b) Methanococcus jannaschii – Archaea
c) Synechocystis sp. – Archaea
d) Aquifex aeolicus – Bacteria
Answer: c [Reason:] Synechocystis sp. – Bacteria is the correct pair. It is an ancient organism that produces oxygen by light-harvesting.
10. In examining the results of analysis, it is important to look for the method used, the statistical significance of the result, and the overall degree of confidence in the alignments.
Answer: a [Reason:] The analysis should be repeated if necessary. Annotation errors occur when the above criteria are not followed.
1. The genome annotation process involves two steps: gene prediction and functional assignment.
Answer: a [Reason:] Before the assembled sequence is deposited into a database, it has to be analyzed for useful biological features. The genome annotation process provides comments for the features.
2. Which of the following is incorrect regarding gene annotation?
a) The gene annotation of the human genome employs a combination of theoretical prediction and experimental verification
b) Gene structures are first predicted by ab initio exon prediction programs
c) The predicted genes are compared with experimentally determined cDNA and EST sequences
d) The pairwise alignment programs are not involved
Answer: d [Reason:] The predictions are verified by BLAST searches against a sequence database. The predicted genes are further compared with experimentally determined cDNA and EST sequences using pairwise alignment programs such as GeneWise, Spidey, SIM4, and EST2Genome.
3. Which of the following is incorrect regarding gene ontology?
a) It exists because there is a need to standardize protein functional descriptions
b) It uses a limited vocabulary to describe molecular functions
c) Biological processes are not described though
d) The cellular components are described using limited vocabulary
Answer: c [Reason:] The controlled vocabulary is organized such that a protein function is linked to the cellular function through a hierarchy of descriptions with increasing specificity. The top of the hierarchy provides an overall picture of the functional class, whereas the lower level in the hierarchy specifies more precisely the functional role. This way, protein functionality can be defined in a standardized and unambiguous way.
4. Which of the following is incorrect regarding gene ontology?
a) There is standardization of the names and activities
b) There is no standardization of associated pathways
c) It provides consistency in describing overall protein functions
d) It facilitates grouping of proteins of related functions
Answer: b [Reason:] A GO description of a protein provides three sets of information: biological process, cellular component, and molecular function, each of which uses a unique set of non-overlapping vocabularies. The standardization of the names, activities, and associated pathways provides consistency in describing overall protein functions.
5. Which of the following is incorrect regarding Automated Genome Annotation?
a) It exists because of the need to develop fast and automated methods to annotate the genomic sequences
b) The automated approach relies on homology detection
c) The automated approach doesn’t rely on heuristic sequence similarity searching
d) Automation brings speed in gene annotation process
Answer: c [Reason:] The automated approach does rely on heuristic sequence similarity searching. If a newly sequenced gene or its gene product has significant matches with a database sequence beyond a certain threshold, a transfer of functional assignment takes place. In addition to sequence matching at the full length, detection of conserved motifs often offers additional functional clues.
6. Conserved functional sites can be identified by profile and hidden Markov model–based motif and domain search tools such as SMART and InterPro.
Answer: a [Reason:] Detecting remote homologs typically involves combined searches of protein motifs and domains and prediction for secondary and tertiary structures. The prediction can also be performed using structure-based approaches such as threading and fold recognition.
7. Remote homology detection helps to shed light on the possible functions of proteins that previously had no functional information at all.
Answer: a [Reason:] The bioinformatic analysis can spur an important advance in knowledge in many cases. Some hypothetical proteins, because of their novel structural folds, still cannot be predicted even with the advanced bioinformatics approaches and remain challenges for both experimental and computational work.
8. Which of the following is incorrect regarding genome economy?
a) It is a phenomenon of synthesizing more proteins from fewer genes
b) This is a major strategy that eukaryotic organisms use to achieve a myriad of genotypic diversities only
c) This is a major strategy that eukaryotic organisms use to achieve a myriad of phenotypic diversities
d) There are numerous underlying genetic mechanisms to help account for genome economy
Answer: b [Reason:] A major mechanism responsible for the protein diversity is alternative splicing, which refers to the splicing event that joins different exons from a single gene to form different transcripts. A related mechanism, known as exon shuffling, which joins exons from different genes to generate more transcripts, is also common in eukaryotes. It is known that, in humans, about two thirds of the genes exhibit alternative splicing and exon shuffling during expression, generating 90% of the total proteins.
9. In some circumstances, one mRNA transcript can lead to the translation of more than one protein.
Answer: a [Reason:] For example, human dentin phosphoprotein and dentin sialoprotein are proteins involved in tooth formation. An mRNA transcript that includes coding regions from both proteins is translated into a precursor protein that is cleaved to produce two different mature proteins.
10. Which of the following is incorrect regarding GeneQuiz?
a) It is a web server for protein DNA annotation
b) It is a web server for protein sequence annotation
c) It compares a query sequence against databases using BLAST and FASTA to identify homologs with high similarities
d) It performs domain analysis using the PROSITE and Blocks databases
Answer: a [Reason:] It performs domain analysis using the PROSITE and Blocks databases as well as analysis of secondary structures and super-secondary structures that includes prediction of coiled coils and transmembrane helices. Multiple search and analysis results are compiled to produce a summary of protein function with an assigned confidence level (clear, tentative, marginal, and negligible).
1. Which of the following is untrue about the genome mapping?
a) It doesn’t lead to the understanding of a genome structure
b) It involves identifying relative locations of genes
c) It involves identifying traits
d) It involves identifying mutations
Answer: a [Reason:] The first step to understanding a genome structure is through genome mapping, which is a process of identifying relative locations of genes, mutations or traits on a chromosome. A low-resolution approach to mapping genomes is to describe the order and relative distances of genetic markers on a chromosome.
2. Genetic markers are ______ portions of a _______ whose inheritance patterns can be followed.
a) unidentifiable, genes
b) unidentifiable, chromosome
c) identifiable, chromosome
d) identifiable, genes
Answer: c [Reason:] For many eukaryotes, genetic markers represent morphologic phenotypes. In addition to genetic linkage maps, there are also other types of genome maps such as physical maps and cytologic maps, which describe genomes at different levels of resolution.
3. Genetic linkage maps, also called genetic maps, identify the relative positions of genetic markers on a chromosome and are based on how frequently the markers are inherited together.
Answer: a [Reason:] The rationale behind genetic mapping is that the closer the two genetic markers are, the more likely it is that they are inherited together and are not separated in a genetic crossing event. The distance between the two genetic markers is measured in centiMorgans (cM), which is the frequency of recombination of genetic markers.
4. One centiMorgan is defined as ____ percentage of the total recombination events.
Answer: a [Reason:] One centiMorgan corresponds to a recombination frequency of one percent, that is, separation of the two genetic markers is observed in one percent of the offspring in a genetic crossing experiment. One centiMorgan is approximately 1 Mb in humans and 0.5 Mb in Drosophila.
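The relationship between recombination counts, centiMorgans, and approximate physical distance can be sketched as follows. The function names and the `MB_PER_CM` table are hypothetical; the conversion factors are the rough figures quoted above (1 Mb per cM in humans, 0.5 Mb per cM in Drosophila) and vary along real chromosomes.

```python
# Rough conversion factors taken from the text above; real values vary by region.
MB_PER_CM = {"human": 1.0, "drosophila": 0.5}


def map_distance_cm(recombinant, total):
    """Genetic distance in cM: the percentage of offspring that are recombinant."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * recombinant / total


def approx_physical_mb(distance_cm, organism):
    """Very rough physical distance in Mb for a given genetic distance."""
    return distance_cm * MB_PER_CM[organism]


if __name__ == "__main__":
    d = map_distance_cm(recombinant=12, total=400)
    print(d, "cM")
    print(approx_physical_mb(d, "human"), "Mb (human, approximate)")
    print(approx_physical_mb(d, "drosophila"), "Mb (Drosophila, approximate)")
```

This simple percentage is only a reasonable estimate for closely linked markers, since double crossovers between distant markers go undetected.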
5. Physical maps are maps of locations of identifiable landmarks on a genomic DNA _______ inheritance patterns.
a) remotely related to
b) related to
c) regardless of
d) associated with
Answer: c [Reason:] The distance between genetic markers is measured directly as kilobases (Kb) or megabases (Mb). Because the distance is expressed in physical units, it is more accurate and reliable than centiMorgans used in genetic maps.
6. Physical maps are constructed by using a chromosome walking technique.
Answer: a [Reason:] It uses a number of radiolabeled probes to hybridize to a library of DNA clone fragments. By identifying overlapping clones probed by common probes, a relative order of the cloned fragments can be established.
7. Which of the following is untrue about cytologic maps?
a) They cannot be directly observed under microscope
b) They refer to banding patterns
c) They can be viewed on stained chromosomes
d) They can be directly observed under microscope
Answer: a [Reason:] Cytologic maps refer to banding patterns seen on stained chromosomes, which can be directly observed under a microscope. The observable light and dark bands are the visually distinct markers on a chromosome.
8. Cytologic maps can be considered to be of _____ resolution and hence somewhat ______ physical maps.
a) very high, inaccurate
b) very low, accurate
c) very high, accurate
d) very low, inaccurate
Answer: d [Reason:] The banding patterns, however, are not always constant and are subject to change depending on the extent of chromosomal contraction. Thus, cytologic maps can be considered to be of very low resolution and hence somewhat inaccurate physical maps. The distance between two bands is expressed in relative units (Dustin units).
9. In medical applications, the ultimate goal of gene mapping is to clone disease genes.
Answer: a [Reason:] Once the gene is cloned, the determination of DNA sequence is possible. Further, the study of target protein is carried out.
10. One of the fundamental events that occur in meiosis is crossing over in which homologous chromosomes exchange segments causing a reshuffling of genes.
Answer: a [Reason:] If genes are far apart on the same chromosome, it is likely that recombination occurs. Conversely, if they are very close together, they are more likely to be transmitted as a block.
1. The major challenges in genome assembly are sequence errors, contamination by bacterial vectors, and repetitive sequence regions.
Answer: a [Reason:] Sequence errors can often be corrected by drawing a consensus from an alignment of multiple overlapped sequences. Bacterial vector sequences can be removed using filtering programs prior to assembly. To overcome the problem of sequence repeats, programs such as RepeatMasker can be used to detect and mask repeats. Additional constraints on the sequence reads can be applied to avoid misassembly caused by repeat sequences.
2. When a sequence is generated from ____ ends of a single clone, the distance between the two opposing fragments of a clone is fixed to ________ meaning that they are always separated by a distance defined by a _____ length (normally 1,000 to 9,000 bases).
a) both, an uncertain range, clone
b) one, an uncertain range, clone
c) both, a certain range, clone
d) both, a certain range, gene
Answer: c [Reason:] A commonly used constraint to avoid errors caused by sequence repeats is the so-called forward–reverse constraint. When the constraint is applied, even if one of the fragments has a perfect match with a repetitive element outside the range, it cannot be moved to that location and cause misassembly.
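The forward–reverse constraint can be pictured as a simple distance check on candidate placements of the two reads from one clone. The sketch below is illustrative only: the function names are hypothetical, positions are plain integer coordinates on a draft assembly, and the default range uses the 1,000 to 9,000 base clone lengths mentioned in the question.

```python
def satisfies_forward_reverse(pos_forward, pos_reverse, min_len=1000, max_len=9000):
    """True if the two end reads of a clone lie within the expected clone-length range."""
    distance = abs(pos_reverse - pos_forward)
    return min_len <= distance <= max_len


def filter_placements(pos_forward, candidate_reverse_positions, min_len=1000, max_len=9000):
    """Keep only candidate placements of the reverse read that respect the constraint."""
    return [
        pos
        for pos in candidate_reverse_positions
        if satisfies_forward_reverse(pos_forward, pos, min_len, max_len)
    ]


if __name__ == "__main__":
    # The reverse read matches a repeat at three places; only one is within range.
    print(filter_placements(10_000, [10_200, 14_500, 250_000]))
```

In this toy case, the matches at 10,200 and 250,000 are rejected even if they align perfectly, which is how the constraint prevents a read from being pulled into a distant repeat copy.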
3. Which of the following is untrue about base calling and assembly programs?
a) The first step toward genome assembly includes deriving base calls
b) The first step toward genome assembly includes assigning associated quality scores
c) One of the steps is to assemble the sequence reads into contiguous sequences
d) There is no identifying overlap between sequence fragments
Answer: d [Reason:] One of the steps includes identifying overlaps between sequence fragments, assigning the order of the fragments and deriving a consensus of an overall sequence.
Assembling all shotgun fragments into a full genome is a computationally very challenging step. There are a variety of programs available for processing the raw sequence data.
4. Which of the following is incorrect?
a) Initial DNA sequencing reactions generate short sequence reads from DNA clones
b) To assemble a whole genome sequence, these short fragments are joined to form larger fragments
c) The average length of the reads is about 50 bases
d) A number of overlapping contigs can be further merged to form scaffolds
Answer: c [Reason:] The average length of the reads is about 500 bases. To assemble a whole genome sequence, these short fragments are joined to form larger fragments after removing overlaps. These longer, merged sequences are termed contigs, which are usually 5,000 to 10,000 bases long. A number of overlapping contigs can be further merged to form scaffolds (30,000–50,000 bases, also called supercontigs), which are unidirectionally oriented along a physical map of a chromosome.
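Joining two overlapping reads into a longer fragment, the basic operation behind contig building, can be sketched with exact suffix-prefix matching. This is a deliberate simplification: the function names and the `min_overlap` parameter are hypothetical, and real assemblers use error-tolerant alignment and quality scores rather than exact string matches.

```python
def overlap_length(a, b, min_overlap=3):
    """Length of the longest suffix of `a` that equals a prefix of `b` (0 if too short)."""
    max_len = min(len(a), len(b))
    for k in range(max_len, min_overlap - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def merge_reads(a, b, min_overlap=3):
    """Merge `b` onto the end of `a`, removing the overlap; return None if they do not overlap."""
    k = overlap_length(a, b, min_overlap)
    if k == 0:
        return None
    return a + b[k:]


if __name__ == "__main__":
    print(merge_reads("ATGCGTAC", "GTACCTGA"))  # overlap is "GTAC"
    print(merge_reads("AAAA", "CCCC"))
```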
5. Which of the following is incorrect about Phred?
a) It is a UNIX program
b) It doesn’t give a probability score in output
c) It is used for base calling
d) It uses a Fourier analysis to resolve fluorescence traces and predict actual peak locations of bases
Answer: b [Reason:] It also gives a probability score for each base call that may be attributable to error. The commonly accepted score threshold is twenty, which corresponds to a 1% chance of error.
The higher the score, the better the quality of the sequence reads. If the score value falls below the threshold, human intervention is required.
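The link between a Phred score and its error probability follows the standard logarithmic definition Q = -10 log10(P). The short sketch below shows the conversion in both directions; the function names and the `needs_review` helper are hypothetical and are not part of the Phred program itself.

```python
import math


def phred_to_error_prob(q):
    """Probability that a base call with quality score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Quality score corresponding to an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def needs_review(q, threshold=20):
    """Flag base calls below the commonly used threshold of 20 for manual inspection."""
    return q < threshold


if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        print(q, phred_to_error_prob(q), needs_review(q))
```

A score of 20 gives an error probability of 0.01, matching the 1% figure above, and each additional 10 points reduces the error probability tenfold.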
6. Which of the following is incorrect about Phrap?
a) It aligns individual fragments in a pairwise fashion using the Smith–Waterman algorithm
b) It doesn’t take input from Phred
c) It is used for sequence assembly
d) It is a UNIX program
Answer: b [Reason:] It takes Phred base-call files with quality scores as input and aligns individual fragments in a pairwise fashion using the Smith–Waterman algorithm. The base quality information is taken into account during the pairwise alignment. After all the pairwise sequence similarities are identified, the program performs assembly by progressively merging sequence pairs with decreasing similarity scores while removing overlapped regions. Consensus contigs are derived after joining all possible overlapped reads.
7. VecScreen is primarily aimed at sequence assembly.
Answer: b [Reason:] VecScreen is a web-based program that helps detect contaminating bacterial vector sequences. It scans an input nucleotide sequence and compares it with a database of known vector sequences by using the BLAST program.
8. Which of the following is incorrect about EULER?
a) It is an assembly algorithm
b) It uses an Eulerian Superpath approach, which is a polynomial algorithm
c) In this approach, a sequence fragment is broken down to tuples of five nucleotides
d) The tuples are distributed in a diagram with numerous nodes that are all interconnected
Answer: c [Reason:] In this approach, a sequence fragment is broken down to tuples of twenty nucleotides, not five. The tuples are converted to binary vectors in the nodes. By using a Viterbi algorithm, the shortest path among the vectors can be found, which is the best way to connect the tuples into a full sequence. Because this approach does not directly rely on detecting overlaps, it may be advantageous in assembling sequences with repeat motifs.
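The idea of breaking reads into fixed-length tuples and linking them in a graph can be illustrated with a small sketch. It shows only the tuple decomposition and a de Bruijn-style graph built from it, with a short tuple length for readability; it is not the EULER Superpath algorithm, and the function names are hypothetical.

```python
from collections import defaultdict


def tuples_of(read, k):
    """All overlapping tuples (k-mers) of length k in a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_tuple_graph(reads, k):
    """Map each (k-1)-length prefix to the set of (k-1)-length suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for t in tuples_of(read, k):
            graph[t[:-1]].add(t[1:])
    return graph


if __name__ == "__main__":
    reads = ["ATGC", "TGCA"]
    print(tuples_of("ATGCA", 3))
    for node, successors in sorted(build_tuple_graph(reads, 3).items()):
        print(node, "->", sorted(successors))
```

Because shared tuples from different reads collapse onto the same edges, the reads are connected without any explicit pairwise overlap detection.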
9. TIGR Assembler is a UNIX program from TIGR for assembly of large shotgun sequence fragments.
Answer: a [Reason:] It treats the sequence input as clean reads without consideration of the sequence quality. A main feature of the program is the application of the forward–reverse constraints to avoid misassembly caused by sequence repeats. The sequence alignment in the assembly stage is performed using the Smith–Waterman algorithm.
10. Which of the following is incorrect about ARACHNE?
a) It accepts base calls with associated quality scores assigned by Phred as input
b) It is a free UNIX program
c) It is for the assembly of whole-genome shotgun reads
d) It doesn’t involve heuristic approach
Answer: d [Reason:] Its unique features include using a heuristic approach similar to FASTA to align overlapping fragments, evaluating alignments using statistical scores, correcting sequencing errors based on multiple sequence alignment, and using forward–reverse constraints. It accepts base calls with associated quality scores assigned by Phred as input and produces scaffolds or a fully assembled genome.
1. The _____ resolution genome map is the genomic DNA sequence that can be considered as a type of ______ map describing a genome at the single base-pair level.
a) highest, physical
b) lowest, physical
c) highest, cytological
d) lowest, cytological
Answer: a [Reason:] Cytological maps have quite low resolution, when compared to physical maps. They can be viewed under microscopes as well.
2. Which of the following is untrue about DNA sequencing?
a) It is now routinely carried out using the Sanger method
b) This doesn’t make use of DNA polymerases
c) This involves synthesis of DNA chains of varying length
d) The DNA synthesis is stopped by adding dideoxynucleotides
Answer: b [Reason:] DNA polymerases are used to synthesize DNA chains. The dideoxynucleotides are labeled with fluorescent dyes, which terminate the DNA synthesis at positions containing all four bases, resulting in nested fragments that vary in length by a single base. When the labeled DNA is subjected to electrophoresis, the banding patterns in the gel reveal the DNA sequence.
3. In DNA sequencing, the fluorescent traces of the DNA sequences are read by a computer program that assigns bases for each peak in a chromatogram.
Answer: a [Reason:] This process is called base calling. Automated base calling may generate errors and human intervention is often required to correct the sequence calls.
4. The shotgun approach _______ sequences clones from _____ of cloned DNA.
a) randomly, one end
b) randomly, both ends
c) specifically, both ends
d) specifically, one end
Answer: b [Reason:] There are two major strategies for whole genome sequencing: the shotgun approach and the hierarchical approach. The shotgun approach generates a large number of sequenced DNA fragments. The number of random fragments has to be very large, so large that the DNA fragments overlap sufficiently to cover the entire genome.
5. The shotgun approach does not require knowledge of physical mapping of the clone fragments, but rather a robust computer assembly program to join the pieces of random fragments into a single, whole-genome sequence.
Answer: a [Reason:] Generally, the genome has to be redundantly sequenced in such a way that the overall length of the fragments covers the entire genome multiple times. This is designed to minimize sequencing errors and ensure correct assembly of a contiguous sequence. Overlapping sequences with an overall length of six to ten times the genome size are normally obtained for this purpose.
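The redundancy described above is usually expressed as coverage: the total length of all reads divided by the genome size. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the example numbers (a 1.8 x 10^6 bp genome and 500-base reads) are the figures quoted elsewhere in this quiz, used purely for illustration.

```python
import math


def coverage(num_reads, read_length, genome_size):
    """Average number of times each base of the genome is sequenced."""
    return num_reads * read_length / genome_size


def reads_needed(genome_size, read_length, target_coverage):
    """Smallest number of reads whose total length reaches the target coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    genome_size = 1_800_000  # bp, illustrative
    read_length = 500        # bases, illustrative
    for target in (6, 10):
        n = reads_needed(genome_size, read_length, target)
        print(target, n, coverage(n, read_length, genome_size))
```

This is an average only; because the fragments are random, some regions end up with much lower coverage than the target, and some may be missed entirely.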
6. Despite the multiple coverage, sometimes certain genomic regions remain unsequenced, mainly owing to cloning difficulties.
Answer: a [Reason:] In such cases, the remaining gap sequences can be obtained by extending sequences from regions of known genomic sequence using a more traditional PCR technique, which requires the use of custom primers and performs genome walking in a stepwise fashion. This step of genome sequencing is also known as finishing, which is followed by computational assembly of all the sequence data into a final complete genome.
7. The hierarchical genome sequencing approach is ______
a) entirely dissimilar to the shotgun approach
b) dissimilar to the shotgun approach
c) similar to the shotgun approach, but on a larger scale
d) similar to the shotgun approach, but on a smaller scale
Answer: d [Reason:] In this approach, the chromosomes are initially mapped using the physical mapping strategy. Longer fragments of genomic DNA (100 to 300 kb) are obtained and cloned into a high-capacity bacterial vector called a bacterial artificial chromosome (BAC).
8. In hierarchical genome sequencing approach, based on the results of _______ mapping, _______of the BAC clones on a chromosome can be determined.
a) physical, the locations and orders
b) physical, only the locations
c) cytological, only the locations
d) physical, only the orders
Answer: a [Reason:] By successively sequencing adjacent BAC clone fragments, the entire genome can be covered. The complete sequence of each individual BAC clone can be obtained using the shotgun approach. Overlapping BAC clones are subsequently assembled into an entire genome sequence.
9. The hierarchical approach is ____ and _____ than the shotgun approach because it involves an initial clone-based physical mapping step.
a) slower, less costly
b) faster, more costly
c) faster, less costly
d) slower, more costly
Answer: d [Reason:] During the era of human genome sequencing, there was a heated debate on the merits of each of the two strategies. Although the hierarchical approach is slower and more costly, once the map is generated, assembly of the whole genome becomes relatively easy and less error prone.
10. The whole genome shotgun approach can produce a draft sequence very rapidly because it is based on the direct sequencing approach.
Answer: a [Reason:] However, it is computationally very demanding to assemble the short random fragments. Although the approach has been successfully employed in sequencing small microbial genomes, for a complex eukaryotic genome that contains high levels of repetitive sequences, such as the human genome, the full shotgun approach becomes less accurate and tends to leave more “holes” in the final assembled sequence than the hierarchical approach. Current genome sequencing of large organisms often uses a combination of both approaches.