Database MCQ Set 1
1. In which mode of search is the search string parsed into words and the search looks for rows?
a) Natural language
b) Boolean mode
c) Query expansion
d) Cross mode
Answer
Answer: a [Reason:] In MySQL, a full text search capability is provided, which enables to look for words or phrases without using pattern-matching operations. There are three kinds of full text searches.
2. The indicator of presence or absence of a word in search is used in which mode?
a) Natural language
b) Boolean mode
c) Query expansion
d) Cross mode
Answer
Answer: b [Reason:] A full text search capability is provided in MySQL. It facilitates to look for words or phrases without using pattern-matching operations. Boolean search is one of the three modes.
3. Which search mode uses natural language search as a subroutine?
a) Natural language
b) Boolean mode
c) Query expansion
d) Cross mode
Answer
Answer: c [Reason:] The query expansion search occurs in two phases. The natural language search is applied in the first phase. Query expansion search is one of the three modes of fulltext searching.
4. FULLTEXT indexes can be created only for ______
a) MyISAM
b) InnoDB
c) MEMORY
d) TRANSITION
Answer
Answer: a [Reason:] Full-text searches are based on the FULLTEXT indexes. They can be created for MyISAM tables only. In MySQL, a full text search enables to look for words or phrases without using pattern-matching operations.
5. A FULLTEXT index cannot be created for multiple columns.
a) True
b) False
Answer
Answer: b [Reason:] In MySQL, a full-text search is based on the FULLTEXT indexes. A FULLTEXT index can be created both for a single column and for multiple columns. Searching is possible across columns.
6. Which keyword is used to search through natural language fulltext?
a) MATCH
b) AGAINST
c) SEARCH
d) FIND
Answer
Answer: a [Reason:] In MySQL, fulltext searches are carried in three modes. Natural language fulltext search is carried out using the MATCH and AGAINST clause. MATCH keyword specifies the column to be searched.
7. What is xyz in the following statement?
SELECT * FROM my_table WHERE MATCH(abc) AGAINST('xyz');
a) column name
b) table name
c) search string
d) database name
Answer
Answer: c [Reason:] The natural language fulltext search in MySQL tables is carried out using the MATCH and AGAINST clause. The MATCH clause is used to specify the column and AGAINST specifies the search string.
8. What is abc in the following statement?
SELECT * FROM my_table WHERE MATCH(abc) AGAINST('xyz');
a) column name
b) table name
c) search string
d) database name
Answer
Answer: a [Reason:] The MATCH clause is used to specify the column and AGAINST specifies the search string in the natural language full text search syntax. It is one of the three modes of full-text search.
9. In Boolean search, results are sorted by relevance.
a) True
b) False
Answer
Answer: b [Reason:] Boolean mode FULLTEXT search is one of the three modes of FULLTEXT searches in MySQL. Boolean searches deal with the presence or absence of a word in a particular row to carry out the searches.
10. Which parameter determines the shortest words to index in FULLTEXT indexes?
a) ft_min_word_len
b) ft_min_len_word
c) ft_word_min_len
d) ft_word_len_word
Answer
Answer: a [Reason:] There are many full-text parameters that are configurable. They can be modified by setting the system variables accordingly. The ‘ft_min_word_len’ parameter determine the shortest words to index in FULLTEXT indexes.
Database MCQ Set 2
1. Which of these is also known as a virtual table?
a) SCHEMA
b) DATABASE
c) JOIN
d) VIEW
Answer
Answer: d [Reason:] In MySQL, a ‘View’ is also called a virtual table. It does not contain any data but only appears as a table. It is an alternative way to show or describe a table similar to the original table.
2. A View can be used to select a subset of the table columns.
a) True
b) False
Answer
Answer: a [Reason:] In MySQL, a ‘View’ is used to present a table in the database. It is a virtual table. It is also used to present a subset of the columns that are present in the original table of the database.
3. What is xyz in the following statement?
CREATE VIEW xyz AS SELECT a, b FROM t;
a) table
b) column
c) view
d) database
Answer
Answer: c [Reason:] A ‘View’ is created in MySQL by writing the clause ‘CREATE VIEW’ followed by the view name. Then follows the list of column names of the table that are to be presented as the view.
4. What is abc in the following statement?
CREATE VIEW xyz (abc) AS SELECT a FROM t;
a) row name
b) column name
c) view
d) database
Answer
Answer: b [Reason:] Column names for a view can be explicitly provided in the ‘CREATE VIEW’ clause itself. It presents the table view with the aliased column names instead of the original column names.
5. A view can refer to multiple tables via ____
a) UNION
b) JOIN
c) GROUP
d) SELECT
Answer
Answer: b [Reason:] In MySQL, a ‘View’ can refer to multiple tables. This makes it a lot simpler to execute the queries involving joins. When a selection is made from the view, the join is executed producing the results.
6. Views are not updatable.
a) True
b) False
Answer
Answer: b [Reason:] In MySQL, some ‘Views’ are updatable. The operators ‘INSERT’, ‘DELETE’ and ‘UPDATE’ can be used. They can be operated on the ‘Views’ to make changes to it. A table can be created and set as a view.
7. A view can be deleted using the command ____
a) REMOVE
b) DELETE
c) CLEAR
d) DROP
Answer
Answer: d [Reason:] In MySQL, ‘Views’ act as virtual tables. Many operations can be performed on views like the operations on tables. A view can be created, displayed and dropped. They can also be used to perform calculations.
8. What cannot be done on a view?
a) display
b) filter
c) index
d) drop
Answer
Answer: c [Reason:] In MySQL, ‘Views’ act as virtual tables. It is not possible to create indexes on a view. However, they can be used for the views that are processed using the merge algorithm.
9. What can be used to check for views that have been invalidated by DROP or ALTER operations?
a) CREATE TABLE
b) VERIFY TABLE
c) DETAILS TABLE
d) CHECK TABLE
Answer
Answer: d [Reason:] A ‘View’ is created in MySQL by writing the clause ‘CREATE VIEW’ followed by the view name. Once a view is dropped or altered after being created, it can be checked by the ‘CHECK TABLE’ clause.
10. Which of the following does not support materialized view?
a) MySQL
b) Oracle
c) PostgreSQL
d) SybaseSQL
Answer
Answer: a [Reason:] In MySQL, ‘Views’ are treated a little differently than Oracle, PostgreSQL and SybaseSQL since they support materialized views and MySQL does not use the materialized view.
Database MCQ Set 3
1. Which column attribute provides unique numbers for identification?
a) AUTO_INCREMENT
b) UNSIGNED
c) IDENTIFY
d) DESCRIBE
Answer
Answer: a [Reason:] In MySQL, the mechanism for providing unique numbers is through the AUTO_INCREMENT column attribute. It enables the generation of sequential numbers automatically. This facilitates identification.
2. There can be only one column per table with the AUTO_INCREMENT attribute.
a) True
b) False
Answer
Answer: a [Reason:] In MySQL, there can only be one column per table with the AUTO_INCREMENT attribute. It should also have an integer data type. The AUTO_INCREMENT is also allowed for floating point types.
3. The AUTO_INCREMENT sequences normally begin at ____
a) 0
b) 1
c) -1
d) 2
Answer
Answer: b [Reason:] The AUTO_INCREMENT column attribute provides unique numbers for column identification. AUTO_INCREMENT sequences normally begin at 1 and increase monotonically like 1, 2, 3, and so on.
4. The value of recently generated sequence number can be obtained by ______
a) LAST_INSERT_ID()
b) LATEST_INSERT_ID()
c) INITIAL_INSERT_ID()
d) INSERT_ID()
Answer
Answer: a [Reason:] The value of most recently generated sequence number can be obtained by calling the LAST_INSERT_ID() function. This enables to reference the AUTO_INCREMENT value in the subsequent statement.
5. When no AUTO_INCREMENT value has been generated during the current connection, LAST_INSERT_ID() returns ______
a) -1
b) 0
c) 1
d) 2
Answer
Answer: b [Reason:] The ‘LAST_INSERT_ID()’ function returns zero when no ‘AUTO_INCREMENT’ value has been generated during the current connection with the server. It is tied to the current connection.
6. The ‘LAST_INSERT_ID()’ is tied only to the ‘AUTO_INCREMENT’ values generated during the current connection to the server.
a) True
b) False
Answer
Answer: a [Reason:] The ‘LAST_INSERT_ID()’ is tied only to AUTO_INCREMENT values that are generated during the current connection to the server. It is not affected by AUTO_INCREMENT tied with the other clients.
7. Suppose the last row has the AUTO_INCREMENT column value 32. Suppose a new row is added by setting AUTO_INCREMENT value equal to 100. The next row added will have value _______
a) 32
b) 33
c) 100
d) 101
Answer
Answer: d [Reason:] If the new value is larger than the current next sequence number, then the sequence is reset to continue with the next value after that for the following rows. In this way, “bumping up” the counter is done.
8. The number of rows in the table is 10. Suppose all rows are deleted. The new row starts with sequence number _______
a) 11
b) 1
c) 100
d) 101
Answer
Answer: b [Reason:] When the row containing the largest value in an AUTO_INCREMENT column is deleted, that value is reused the next time a new value is generated. In this case the sequence number is 1.
9. In a MyISAM table, if the maximum value of an AUTO_INCREMENT increment column is 12 and that row is deleted, the next value generated is _______
a) 12
b) 13
c) 1
d) 14
Answer
Answer: b [Reason:] In the MyISAM tables, AUTO_INCREMENT sequences normally are monotonic. The values in an automatically generated series are strictly increasing. They are not reused when rows are deleted.
10. In MyISAM tables, when a table is emptied with the TRUNCATE TABLE, the counter begins at _______
a) 0
b) 1
c) -1
d) arbitrary
Answer
Answer: b [Reason:] In the MyISAM tables, whenever a table is truncated with the ‘TRUNCATE TABLE’ clause, the ‘AUTO_INCREMENT’ counter again gets reset. It begins with the value one and increases monotonically.
Database MCQ Set 4
1. While scanning for similarities in motifs, how regular expressions’ techniques work?
a) It represents a sequence family by a string of characters and further compares them
b) An algorithm similar to dynamic programming is used
c) Dot matrix analysis is used in this type of sequence analysis
d) Matrix analysis methods are used in this type
Answer
Answer: a [Reason:] In regular expressions’ techniques Pattern matching is defined as true or false in answer or outcome. In other words, if the pattern described in regex is found in a string of letters, the answer is true.
2. Which of the following best defines regular expressions?
a) They are made up of terms, operators and modifiers
b) They describe string or set of strings to find matching patterns
c) They are strictly restricted to alignment and corresponding score
d) They consist of set of rules for the connotations of various amino acid residues
Answer
Answer: b [Reason:] Regular expressions are powerful notable algebra that describe string or set of strings to find matching patterns. Pattern matching is defined as true or false in answer or outcome. And it is true that they are made up of terms, operators and modifiers but they are terminologies further used in matching process.
3. In regular expressions, which of the following pair of pattern is wrongly matched with its significance?
a) [ ] – Or
b) { } – Not
c) ( ) – Repeats
d) Z – Any
Answer
Answer: d [Reason:] Regular Expression Symbols have their own significances in regular expressions system means [GA] .g.e rFo ‘G or A’, {V,P} means not P or V, x(4) means (xxxx). Likewise, X denotes any character.
4. In terminologies related to regular expressions which of the following is false about terms and operators?
a) Terms are strings or substrings
b) Operators combine terms and expressions
c) Operators do not have precedence
d) Operators have precedence like arithmetic operators
Answer
Answer: c [Reason:] For harmonious, efficient and error-free functioning of the matching preocess, operators have precedence in order to set the priority of the operations to be carried out during the alignment.
5. In regular expressions, which of the following pair of pattern is wrongly matched with its significance?
a) ‘-’ – separator
b) < – N-terminal
c) > – C-terminal
d) ‘>>’ – end
Answer
Answer: d [Reason:] Regular Expression Symbols have their own significances in regular expressions’ system. For e.g. x(2,3) means x-x or x-x-x. Similarly, ‘.’ means end.
6. Emotif uses which databases for alignment of sequences?
a) BLOCKS and PRINTS databases
b) PROSITE
c) BLOCKS
d) PRINTS
Answer
Answer: a [Reason:] Emotif is a motif database that uses multiple sequence alignments from both the BLOCKS and PRINTS databases with an alignment collection much larger than PROSITE. It identifies patterns by allowing fuzzy matching of regular expressions. Therefore, it produces fewer false negatives than PROSITE.
7. While analysing motif sequences, what is the major disadvantageous feature of PROSITE?
a) The database constructs profiles to complement some of the sequence patterns
b) The functional information of these patterns is primarily based on published literature
c) Some of the sequence patterns are too short to be specific
d) Lack of specificity about probability and variation and relation between them
Answer
Answer: c [Reason:] The major pitfall with the PROSITE patterns is that some of the sequence patterns are too short to be specific. Rest of the options are advantages. The problem with these short sequence patterns is that the resulting match is very likely to be a result of random events. Overall, PROSITE has a greater than 20% error rate. Thus, either a match or non-match in PROSITE should be treated with caution.
8. Which of the following is not a characteristic of Fuzzy or approximate matches in regular expression?
a) This method is able to include more variant forms of a motif with a conserved function
b) the rule of matching is based on observations, not actual assumptions
c) with the more relaxed matching, there is increase of the noise level and false positives
d) the rule of matching is based on assumptions not actual observations
Answer
Answer: b [Reason:] The rule of matching is based on assumptions not actual observations in Fuzzy or approximate matches in regular expression. This provides more permissive matching by allowing more flexible matching of residues of similar biochemical properties. For example, if an original alignment only contains phenylalanine at a particular position, fuzzy matching allows other aromatic residues (including unobserved tyrosine and tryptophan) in a sequence to match with the expression.
9. Which of the following is not a characteristic of exact matches in regular expression?
a) There must be a strict match of sequence patterns
b) Any variations in the query sequence from the predefined patterns are not allowed
c) Provide more permissive matching by allowing more flexible matching of residues of similar biochemical properties
d) Searching a motif database using this approach results in either a match or non-match
Answer
Answer: c [Reason:] In this type of matching, there has to be a strict match of sequence patterns. This way of searching has a good chance of missing truly relevant motifs that have slight variations, thus generating false-negative results. As new sequences of motif are being accumulated, the rigid regular expression tends to become obsolete if not updated regularly to reflect the changes.
10. What does this representation mean- R.L.[EQD] ?
a) An arginine- Amino acid- Leucine- Amino acid- Either Apartic acid, glutamic acid or glutamine
b) An arginine- Leucine- Either Apartic acid, glutamic acid or glutamine
c) An arginine- Leucine- Amino acid- Either Apartic acid, glutamic acid or glutamine
d) An arginine- Leucine- Apartic acid and glutamic acid and glutamine
Answer
Answer: a [Reason:] This is an example of pexel motif. Here, the ‘.’ represents the ‘end’ i.e. the amino acid as mentioned in the answer and the [ ] means ‘or’ i.e. either of the mentioned residue is present in the given postion.
Database MCQ Set 5
1. Which of the following is not an advantage of Statistical models’ methods in analyzing protein motifs?
a) Sequence information is preserved from a multiple sequence alignment and expresses it with probabilistic models
b) Statistical models allow partial matches and compensate for unobserved sequence patterns using pseudo-counts
c) Statistical models have stronger predictive power than the regular expression based approach, even when they are derived from a limited set of sequences
d) The comparative flexibility is less in case of these methods when compared to regular expressions methods
Answer
Answer: d [Reason:] The major limitation of regular expressions is that this method does not take into account sequence probability information about the multiple alignment from which it is modeled making them less flexible. If a regular expression is derived from an incomplete sequence set, it has less predictive power because many more sequences with the same type of motifs are not represented. Unlike regular expressions, position-specific scoring matrices (PSSMs), profiles, and HMMs preserve the sequence information from a multiple sequence alignment and express it with probabilistic models.
2. For motif scanning which of the following programs or databases is for regulated sites curated from scientific literature?
a) ENSEMBL
b) ORegAnno
c) MAST
d) Clover
Answer
Answer: b [Reason:] Clover identifies overrepresented motifs in protein sequences whereas; MAST allows users to scan different databases for matches to motifs. ENSEMBL is another online genomic sequence repository which also includes online tools for data mining as well as BLAST searches.
3. Which of the following is not an advantageous feature or algorithm of the database PRINTS?
a) This program breaks down a motif into even smaller non-overlapping units called ‘fingerprints’, which are represented by unweighted PSSMs
b) To define a motif, at least a majority of fingerprints are required to match with a query sequence
c) A query that has simultaneous high-scoring matches to a majority of fingerprints belonging to a motif is a good indication of containing the functional motif
d) The difficulty to recognize short motifs when they reach the size of single fingerprints
Answer
Answer: d [Reason:] PRINTS is a protein fingerprint database containing ungapped, manually curated alignments corresponding to the most conserved regions among related sequences. The drawbacks of PRINTS are–the difficulty to recognize short motifs when they reach the size of single fingerprints and a relatively small database, which restricts detection of many motifs.
4. In which of the following multipurpose packages Gibbs sampling algorithm is used?
a) Consensus
b) BEST
c) AlignACE
d) PhyloCon
Answer
Answer: c [Reason:] The Gibbs sampling algorithm can identify multiple motifs in a sequence in a sequence set using iterative masking procedure. It is used in AlignACE whereas BEST is a suite of four motif discovery tools integrated in a graphical user interface. Also, Consensus program finds motifs in a set of unaligned sequences and PhyloCon builds on this framework by modeling conservation across orthologous genes from multiple species.
5. Which of the following is untrue in case of the database BLOCKS?
a) The alignments are automatically generated using the same data sets used for deriving the BLOSUM matrices
b) The derived ungapped alignments are called ‘blocks’, which are usually longer than motifs, are subsequently converted to PSSMs
c) A weighting scheme and pseudo counts are subsequently applied to the PSSMs to account for underrepresented and unobserved residues in alignments
d) The functional annotation of blocks is not consistent with that for the motifs
Answer
Answer: d [Reason:] BLOCKS is a database that uses multiple alignments derived from the most conserved, ungapped regions of homologous protein sequences. Because blocks often encompass motifs, the functional annotation of blocks is thus consistent with that for the motifs. A query sequence can be used to align with pre-computed profiles in the database to select the highest scored matches. Because of the use of the weighting scheme, the signal-to-noise ratio is improved relative to PRINTS.
6. Which of the following is false in case of the database Pfam and its algorithm?
a) Each motif or domain is represented by an HMM profile generated from the seed alignment of a number of conserved homologous proteins
b) Since the probability scoring mechanism is more complex in HMM than in a profile-based approach the use of HMM yields further increases in sensitivity of the database matches
c) Pfam-B only contains sequence families not covered in Pfam
d) The functional annotation of motifs in Pfam-A is often related to that in UNIPROT
Answer
Answer: d [Reason:] Pfam is a database with protein domain alignments derived from sequences in SWISSPROT and TrEMBL. The Pfam database is composed of two parts, Pfam-A and Pfam-B. Pfam-A involves manual alignments and Pfam-B, automatic alignment in a way similar to ProDom. The functional annotation of motifs in Pfam-A is often related to that in PROSITE. Because of the automatic nature, Pfam-B has a much larger coverage but is also more error prone because some HMMs are generated from unrelated sequences.
7. Which of the following is false in case of the database SMART and its algorithm?
a) Contains HMM profiles constructed from manually refined protein domain alignments
b) Alignments in the database are built based on tertiary structures whenever available or based on PSI-BLAST profiles
c) Alignments are further checked but not refined by human annotators before HMM profile construction
d) SMART stands for Simple Modular Architecture Research Tool
Answer
Answer: c [Reason:] Alignments are further checked and refined by human annotators before HMM profile construction. Protein functions are also manually curated. Thus, the database may be of better quality than Pfam with more extensive functional annotations. Compared to Pfam,
The SMART database contains an independent collection of HMMs, with emphasis on signaling, extracellular, and chromatin-associated motifs and domains. Sequence searching in this database produces a graphical output of domains with well-annotated information with respect to cellular localization, functional sites, super-family, and tertiary structure.
8. Which of the following is false in case of the database InterPro and its algorithm?
a) InterPro is an integrated pattern database designed to unify multiple databases for protein domains and functional sites
b) This database integrates information from PROSITE, Pfam, PRINTS, ProDom, and SMART databases
c) Only overlapping motifs and domains in a protein sequence derived by all five databases are included
d) All the motifs and domains in a protein sequence derived by all five databases are included
Answer
Answer: d [Reason:] The only overlapping motifs and domains in a protein sequence derived by all five databases are included in the database. The InterPro entries use a combination of regular expressions, fingerprints, profiles, and HMMs in pattern matching. However, an InterPro search does not obviate the need to search other databases because of its unique criteria of motif inclusion and thus may have lower sensitivity than exhaustive searches in individual databases. A popular feature of this database is a graphical output that summarizes motif matches and has links to more detailed information.
9. Which of the following is false in case of the CDART and its algorithm?
a) CDART is a domain search program that combines the results from RPS-BLAST, SMART, and Pfam
b) The program is now an integral part of the regular BLAST search function
c) CDART is substitute for individual database searches
d) It stands for Conserved Domain Architecture
Answer
Answer: c [Reason:] CDART is a domain search program that combines the results from various database searches. As with InterPro, CDART is not a substitute for individual database searches because it often misses certain features that can be found in SMART and Pfam.
10. Point out the wrong or irrelevant mathematical method in motif analysis.
a) Enumeration
b) Probabilistic Optimization
c) Deterministic Optimization
d) Literature mining
Answer
Answer: d [Reason:] All the rest of the options are indeed valid and proven mathematical methods that contain efficient algorithms in finding motifs in protein sequences. Literature mining is not a mathematical algorithm or tool as such to be used in identifying motifs. But it is definitely a part of research when it comes to find a function of various protein sequences.