Decision Science book summary and important keywords

Decision Science book summary and important keywords for MCQ and descriptive-type questions

Study the Decision Science book summary and important keywords below. You can also work through sample question answers, the complete Decision Science MCQ practice set with answers, and a quiz for your exam.


Book Summary of Decision Science

  • Statistics is an important decision-making tool in business and is used in virtually every area of business. In this course, the word statistics is defined as the science of gathering, analyzing, interpreting, and presenting numerical data.
  • The study of statistics can be subdivided into two main areas: descriptive statistics and inferential statistics. Descriptive statistics result from gathering data from a body, group, or population and reaching conclusions only about that group. Inferential statistics are generated by gathering sample data from a group, body, or population and reaching conclusions about the larger group from which the sample was drawn.
  • Most business statistics studies contain variables, measurements, and data. A variable is a characteristic of any entity being studied that is capable of taking on different values. Examples of variables might include monthly household food spending, time between arrivals at a restaurant, and patient satisfaction rating. A measurement is when a standard process is used to assign numbers to particular attributes or characteristics of a variable. Measurements on monthly household food spending might be taken in dollars, time between arrivals might be measured in minutes, and patient satisfaction might be measured using a 5-point scale. Data are recorded measurements. It is data that are analyzed by business statisticians in order to learn more about the variables being studied.
  • The appropriate type of statistical analysis depends on the level of data measurement, which can be (1) nominal, (2) ordinal, (3) interval, or (4) ratio. Nominal is the lowest level, representing classification only of such data as geographic location, sex, or Social Security number. The next level is ordinal, which provides rank ordering measurements in which the intervals between consecutive numbers do not necessarily represent equal distances. Interval is the next to highest level of data measurement in which the distances represented by consecutive numbers are equal. The highest level of data measurement is ratio, which has all the qualities of interval measurement, but ratio data contain an absolute zero and ratios between numbers are meaningful. Interval and ratio data sometimes are called metric or quantitative data. Nominal and ordinal data sometimes are called nonmetric or qualitative data.
  • Two major types of inferential statistics are (1) parametric statistics and (2) nonparametric statistics. Use of parametric statistics requires interval or ratio data and certain assumptions about the distribution of the data. The techniques presented in this text are largely parametric. If data are only nominal or ordinal in level, nonparametric statistics must be used.
  • The two types of data are grouped and ungrouped. Grouped data are data organized into a frequency distribution. Differentiating between grouped and ungrouped data is important, because statistical operations on the two types are computed differently.
  • Constructing a frequency distribution involves several steps. The first step is to determine the range of the data, which is the difference between the largest value and the smallest value. Next, the number of classes is determined, which is an arbitrary choice of the researcher. However, too few classes over-aggregate the data into meaningless categories, and too many classes do not summarize the data enough to be useful. The third step in constructing the frequency distribution is to determine the width of the class interval (see the frequency-distribution and descriptive-measures sketch after this summary list).
  • Dividing the range of values by the number of classes yields the approximate width of the class interval. The class midpoint is the midpoint of a class interval. It is the average of the class endpoints and represents the halfway point of the class interval.
  • Relative frequency is a value computed by dividing an individual frequency by the sum of the frequencies. Relative frequency represents the proportion of total values that is in a given class interval.
  • The cumulative frequency is a running total frequency tally that starts with the first frequency value and adds each ensuing frequency to the total.
  • Two types of graphical depictions are quantitative data graphs and qualitative data graphs.
  • A histogram is a vertical chart in which a line segment connects class endpoints at the value of the frequency. Two vertical lines connect this line segment down to the x-axis, forming a rectangle.
  • Ogives are cumulative frequency polygons. Points on an ogive are plotted at the class endpoints.
  • A pie chart is a circular depiction of data. The amount of each category is represented as a slice of the pie proportionate to the total. The researcher is cautioned in using pie charts because it is sometimes difficult to differentiate the relative sizes of the slices.
  • The bar chart or bar graph uses bars to represent the frequencies of various qualitative categories. The bar chart can be displayed horizontally or vertically.
  • Cross tabulation is a process for producing a two-dimensional table that displays the frequency counts for two variables simultaneously. The scatter plot is a two-dimensional plot of pairs of points from two numerical variables. It is used to graphically determine whether any apparent relationship exists between the two variables.
  • Statistical descriptive measures include measures of central tendency, measures of variability, and measures of shape.
  • Measures of central tendency and measures of variability are computed differently for ungrouped and grouped data. Measures of central tendency are useful in describing data because they communicate information about the more central portions of the data. The most common measures of central tendency are the mode, median, and mean. In addition, percentiles and quartiles are measures of central tendency.
  • The mode is the most frequently occurring value in a set of data. Among other things, the mode is used in business for determining sizes.
  • The median is the middle term in an ordered array of numbers containing an odd number of terms. For an array with an even number of terms, the median is the average of the two middle terms. A median is unaffected by the magnitude of extreme values. This characteristic makes the median a most useful and appropriate measure of location in reporting such things as income, age, and prices of houses.
  • The arithmetic mean is widely used and is usually what researchers are referring to when they use the word mean. The arithmetic mean is the average. The population mean and the sample mean are computed in the same way but are denoted by different symbols. The arithmetic mean is affected by every value and can be inordinately influenced by extreme values.
  • Percentiles divide a set of data into 100 groups. There are 99 percentiles. Quartiles divide data into four groups. The three quartiles are Q1, which is the lower quartile; Q2, which is the middle quartile and equals the median; and Q3, which is the upper quartile.
  • Measures of variability are statistical tools used in combination with measures of central tendency to describe data. Measures of variability provide information about the spread of the data values. These measures include the range, mean absolute deviation, variance, standard deviation, interquartile range, z scores, and coefficient of variation for ungrouped data.
  • One of the most elementary measures of variability is the range. It is the difference between the largest and smallest values. Although the range is easy to compute, it has limited usefulness. The interquartile range is the difference between the third and first quartile. It equals the range of the middle 50% of the data.
  • The mean absolute deviation (MAD) is computed by averaging the absolute values of the deviations from the mean. The mean absolute deviation has limited usage in statistics but is occasionally used in the field of forecasting as a measure of error.
  • Variance is widely used as a tool in statistics but is used little as a stand-alone measure of variability. The variance is the average of the squared deviations about the mean.
  • The square root of the variance is the standard deviation. It also is a widely used tool in statistics, but it is used more often than the variance as a stand-alone measure. The standard deviation is best understood by examining its applications in determining where data are in relation to the mean. The empirical rule and Chebyshev’s theorem are statements about the proportions of data values that are within various numbers of standard deviations from the mean.
  • The empirical rule reveals the percentage of values that are within one, two, or three standard deviations of the mean for a set of data. The empirical rule applies only if the data are in a bell-shaped distribution.
  • Chebyshev’s theorem also delineates the proportion of values that are within a given number of standard deviations from the mean. However, it applies to any distribution.
  • The z score represents the number of standard deviations a value is from the mean for normally distributed data.
  • The coefficient of variation is a ratio of a standard deviation to its mean, given as a percentage. It is especially useful in comparing standard deviations or variances that represent data with different means.
  • Some measures of central tendency and some measures of variability are presented for grouped data. These measures include mean, median, mode, variance, and standard deviation. Generally, these measures are only approximate for grouped data because the values of the actual raw data are unknown.
  • Two measures of shape are skewness and kurtosis. Skewness is the lack of symmetry in a distribution. If a distribution is skewed, it is stretched in one direction or the other. The skewed part of a graph is its long, thin portion.
  • Kurtosis is the degree of peakedness of a distribution. A tall, thin distribution is referred to as leptokurtic. A flat distribution is platykurtic, and a distribution with a more normal peakedness is said to be mesokurtic.
  • A box-and-whisker plot is a graphical depiction of a distribution. The plot is constructed by using the median, the lower quartile, the upper quartile, the smallest value, and the largest value. It can yield information about skewness and outliers.
  • The study of probability addresses ways of assigning probabilities, types of probabilities, and laws of probabilities. Probabilities support the notion of inferential statistics. Using sample data to estimate and test hypotheses about population parameters is done with uncertainty. If samples are taken at random, probabilities can be assigned to outcomes of the inferential process.
  • Three methods of assigning probabilities are (1) the classical method, (2) the relative frequency of occurrence method, and (3) subjective probabilities. The classical method can assign probabilities a priori, or before the experiment takes place. It relies on the laws and rules of probability. The relative frequency of occurrence method assigns probabilities based on historical data or empirically derived data. Subjective probabilities are based on the feelings, knowledge, and experience of the person determining the probability.
  • Certain special types of events necessitate amendments to some of the laws of probability: mutually exclusive events and independent events. Mutually exclusive events are events that cannot occur at the same time, so the probability of their intersection is zero. With independent events, the occurrence of one has no impact or influence on the occurrence of the other. Some experiments, such as those involving coins or dice, naturally produce independent events. Other experiments produce independent events when the experiment is conducted with replacement. If events are independent, the joint probability is computed by multiplying the marginal probabilities, which is a special case of the law of multiplication.
  • Three techniques for counting the possibilities in an experiment are the mn counting rule, the N^n possibilities, and combinations. The mn counting rule is used to determine how many total possible ways an experiment can occur in a series of sequential operations. The N^n formula is applied when sampling is being done with replacement or events are independent. Combinations are used to determine the possibilities when sampling is being done without replacement.
  • Four types of probability are marginal probability, conditional probability, joint probability, and union probability. The general law of addition is used to compute the probability of a union. The general law of multiplication is used to compute joint probabilities. The conditional law is used to compute conditional probabilities.
  • Bayes’ rule is a method that can be used to revise probabilities when new information becomes available; it is a variation of the conditional law. Bayes’ rule takes prior probabilities of events occurring and adjusts or revises those probabilities on the basis of information about what subsequently occurs (see the probability sketch after this summary list).
  • Probability experiments produce random outcomes. A variable that contains the outcomes of a random experiment is called a random variable. Random variables such that the set of all possible values is at most a finite or countably infinite number of possible values are called discrete random variables. Random variables that take on values at all points over a given interval are called continuous random variables. Discrete distributions are constructed from discrete random variables. Continuous distributions are constructed from continuous random variables. Three discrete distributions are the binomial distribution, Poisson distribution, and hypergeometric distribution.
  • The binomial distribution fits experiments when only two mutually exclusive outcomes are possible. In theory, each trial in a binomial experiment must be independent of the other trials. However, if the population size is large enough in relation to the sample size (n < 5% of N), the binomial distribution can be used where applicable in cases where the trials are not independent. The probability of getting the desired outcome (a success) on any one trial is denoted p. The binomial formula is used to determine the probability of obtaining x outcomes in n trials. Binomial distribution problems can be solved more rapidly with the use of binomial tables than by formula.
  • The Poisson distribution usually is used to analyze phenomena that produce rare occurrences over some interval. The only information required to generate a Poisson distribution is the long-run average, which is denoted by lambda (λ). The assumptions are that each occurrence is independent of other occurrences and that the value of lambda remains constant throughout the experiment. Poisson probabilities can be determined with either the Poisson formula or the Poisson tables. The Poisson distribution can be used to approximate binomial distribution problems when n is large (n > 20), p is small, and n·p ≤ 7 (see the discrete-distribution sketch after this summary list).
  • The hypergeometric distribution is a discrete distribution that is usually used for binomial-type experiments when the population is small and finite and sampling is done without replacement. Because using the hypergeometric distribution is a tedious process, using the binomial distribution whenever possible is generally more advantageous.
  • This chapter discussed three different continuous distributions: the uniform distribution, the normal distribution, and the exponential distribution. With continuous distributions, the value of the probability density function does not yield the probability but instead gives the height of the curve at any given point. In fact, with continuous distributions, the probability at any discrete point is .0000. Probabilities are determined over an interval. In each case, the probability is the area under the curve for the interval being considered. In each distribution, the probability or total area under the curve is 1.
  • Probably the simplest of these distributions is the uniform distribution, sometimes referred to as the rectangular distribution. The uniform distribution is determined from a probability density function that contains equal values along some interval between the points a and b. Basically, the height of the curve is the same everywhere between these two points. Probabilities are determined by calculating the portion of the rectangle between the two points a and b that is being considered.
  • The most widely used of all distributions is the normal distribution. Many phenomena are normally distributed, including characteristics of most machine-produced parts, many measurements of the biological and natural environment, and many human characteristics such as height, weight, IQ, and achievement test scores. The normal curve is continuous, symmetrical, unimodal, and asymptotic to the axis; actually, it is a family of curves.
  • The parameters necessary to describe a normal distribution are the mean and the standard deviation. For convenience, data that are being analyzed by the normal curve should be standardized by using the mean and the standard deviation to compute z scores. A z score is the distance that an x value is from the mean, μ, in units of standard deviations. With the z score of an x value, the probability of that value occurring by chance from a given normal distribution can be determined by using a table of z scores and their associated probabilities.
  • The normal distribution can be used to work certain types of binomial distribution problems. Doing so requires converting the n and p values of the binomial distribution to the μ and σ of the normal distribution. When worked by using the normal distribution, the binomial distribution solution is only an approximation. If the values of μ ± 3σ are within a range from 0 to n, the approximation is reasonably accurate. Adjusting for the fact that a discrete distribution problem is being worked by using a continuous distribution requires a correction for continuity. The correction for continuity involves adding .50 to or subtracting .50 from the x value being analyzed. This correction usually improves the normal curve approximation (see the continuous-distribution sketch after this summary list).
  • Another continuous distribution is the exponential distribution. It complements the discrete Poisson distribution. The exponential distribution is used to compute the probabilities of times between random occurrences. The exponential distribution is a family of distributions described by a single parameter, λ. The distribution is skewed to the right and always has its highest value at x = 0.
  • For much business research, successfully conducting a census is virtually impossible and the sample is a feasible alternative. Other reasons for sampling include cost reduction, potential for broadening the scope of the study, and loss reduction when the testing process destroys the product.
  • To take a sample, a population must be identified. Often the researcher cannot obtain an exact roster or list of the population and so must find some way to identify the population as closely as possible. The final list or directory used to represent the population and from which the sample is drawn is called the frame.
  • The two main types of sampling are random and nonrandom. Random sampling occurs when each unit of the population has the same probability of being selected for the sample. Nonrandom sampling is any sampling that is not random. The four main types of random sampling discussed are simple random sampling, stratified sampling, systematic sampling, and cluster (or area) sampling.
  • In simple random sampling, every unit of the population is numbered. A table of random numbers or a random number generator is used to select n units from the population for the sample.
  • Stratified random sampling uses the researcher’s prior knowledge of the population to stratify the population into subgroups. Each subgroup is internally homogeneous but different from the others. Stratified random sampling is an attempt to reduce sampling error and ensure that at least some of each of the subgroups appears in the sample. After the strata are identified, units can be sampled randomly from each stratum. If the proportions of units selected from each subgroup for the sample are the same as the proportions of the subgroups in the population, the process is called proportionate stratified sampling. If not, it is called disproportionate stratified sampling.
  • With systematic sampling, every kth item of the population is sampled until n units have been selected. Systematic sampling is used because of its convenience and ease of administration.
  • Cluster or area sampling involves subdividing the population into non-overlapping clusters or areas. Each cluster or area is a microcosm of the population and is usually heterogeneous within. A sample of clusters is randomly selected from the population. Individual units are then selected randomly from the clusters or areas to get the final sample. Cluster or area sampling is usually done to reduce costs. If a second set of clusters or areas is selected from within the first set, the method is called two-stage sampling.
  • Four types of non-random sampling were discussed: convenience, judgment, quota, and snowball. In convenience sampling, the researcher selects units from the population to be in the sample for convenience. In judgment sampling, units are selected according to the judgment of the researcher. Quota sampling is similar to stratified sampling, with the researcher identifying subclasses or strata. However, the researcher selects units from each stratum by some nonrandom technique until a specified quota from each stratum is filled. With snowball sampling, the researcher obtains additional sample members by asking current sample members for referral information.
  • Sampling error occurs when the sample does not represent the population. With random sampling, sampling error occurs by chance. Nonsampling errors are all other research and analysis errors that occur in a study. They can include recording errors, input errors, missing data, and incorrect definition of the frame.
  • According to the central limit theorem, if a population is normally distributed, the sample means for samples taken from that population also are normally distributed regardless of sample size. The central limit theorem also says that if the sample sizes are large (n ≥ 30), the sample mean is approximately normally distributed regardless of the distribution shape of the population. This theorem is extremely useful because it enables researchers to analyze sample data by using the normal distribution for virtually any type of study in which means are an appropriate statistic, as long as the sample size is large enough. For large sample sizes, sample proportions are likewise approximately normally distributed (see the sampling and central-limit sketch after this summary list).
  • Correlation measures the degree of relatedness of variables. The most well-known measure of correlation is the Pearson product-moment coefficient of correlation, r. This value ranges from −1 to 0 to +1. An r value of +1 is perfect positive correlation and an r value of −1 is perfect negative correlation. Positive correlation means that as one variable increases in value, the other variable tends to increase. Negative correlation means that as one variable increases in value, the other variable tends to decrease. For r values near zero, little or no correlation is present.
  • Regression is a procedure that produces a mathematical model (function) that can be used to predict one variable by other variables. Simple regression is bivariate (two variables) and linear (only a line fit is attempted). Simple regression analysis produces a model that attempts to predict a y variable, referred to as the dependent variable, by an x variable, referred to as the independent variable. The general form of the equation of the simple regression line is the slope-intercept equation of a line. The equation of the simple regression model consists of a slope of the line as a coefficient of x and a y-intercept value as a constant.
  • After the equation of the line has been developed, several statistics are available that can be used to determine how well the line fits the data. Using the historical data values of x, predicted values of y (denoted ŷ) can be calculated by inserting values of x into the regression equation. The predicted values can then be compared to the actual values of y to determine how well the regression equation fits the known data. The difference between a specific y value and its associated predicted y value is called the residual or error of prediction. Examination of the residuals can offer insight into the magnitude of the errors produced by a model. In addition, residual analysis can be used to help determine whether the assumptions underlying the regression analysis have been met (see the regression sketch after this summary list).
  • A single value of error measurement called the standard error of the estimate, se, can be computed. The standard error of the estimate is the standard deviation of error of a model. The value of se can be used as a single guide to the magnitude of the error produced by the regression model as opposed to examining all the residuals.
  • Another widely used statistic for testing the strength of a regression model is r², or the coefficient of determination. The coefficient of determination is the proportion of total variance of the y variable accounted for or predicted by x. The coefficient of determination ranges from 0 to 1. The higher the r², the stronger the predictability of the model.
  • Testing to determine whether the slope of the regression line is different from zero is another way to judge the fit of the regression model to the data.
  • If the population slope of the regression line is not different from zero, the regression model is not adding significant predictability to the dependent variable. A t statistic is used to test the significance of the slope. The overall significance of the regression model can be tested using an F statistic. In simple regression, because only one predictor is present, this test accomplishes the same thing as the t test of the slope, and F = t².
  • One of the most prevalent uses of a regression model is to predict the values of y for given values of x. Recognizing that the predicted value is often not the same as the actual value, a confidence interval has been developed to yield a range within which the mean y value for a given x should fall. A prediction interval for a single y value for a given x value also is specified. This second interval is wider because it allows for the wide diversity of individual values, whereas the confidence interval for the mean y value reflects only the range of average y values for a given x.
  • Time-series data are data that are gathered over a period of time at regular intervals. Developing the equation of a forecasting trend line for time-series data is a special case of simple regression analysis where the time factor is the predictor variable. The time variable can be in units of years, months, weeks, quarters, and others.
  • Multiple regression analysis is a statistical tool in which a mathematical model is developed in an attempt to predict a dependent variable by two or more independent variables or in which at least one predictor is nonlinear. Because doing multiple regression analysis by hand is extremely tedious and time-consuming, it is almost always done on a computer.
  • Residuals, standard error of the estimate, and R² are also standard computer regression output with multiple regression. The coefficient of determination for simple regression models is denoted r², whereas for multiple regression it is R². The interpretation of residuals, standard error of the estimate, and R² in multiple regression is similar to that in simple regression. Because R² can be inflated with non-significant variables in the mix, an adjusted R² is often computed. Unlike R², adjusted R² takes into account the degrees of freedom and the number of observations.
  • One way to establish the validity of a forecast is to examine the forecasting error. The error of a forecast is the difference between the actual value and the forecast value. Computing a value to measure forecasting error can be done in several different ways. This chapter presents mean absolute deviation and mean square error for this task.
  • Regression analysis with either linear or quadratic models can be used to explore trend. Regression trend analysis is a special case of regression analysis in which the dependent variable is the data to be forecast and the independent variable is the time periods numbered consecutively from 1 to k, where k is the number of time periods. For the quadratic model, a second independent variable is constructed by squaring the values of the first independent variable, and both independent variables are included in the analysis.
  • One group of time-series forecasting methods contains smoothing techniques. Among these techniques are naïve models, averaging techniques, and simple exponential smoothing. These techniques perform best when the time-series data are stationary, that is, when they show no significant trend or seasonal effects. Naïve forecasting models are models in which it is assumed that the more recent time periods of data represent the best predictions or forecasts for future outcomes (see the forecasting sketch after this summary list).
  • Simple averages use the average value for some given length of previous time periods to forecast the value for the next period. Moving averages are time period averages that are revised for each time period by including the most recent value(s) in the computation of the average and deleting the value or values that are farthest away from the present time period. A special case of the moving average is the weighted moving average, in which different weights are placed on the values from different time periods.
  • Simple (single) exponential smoothing is a technique in which data from previous time periods are weighted exponentially to forecast the value for the present time period. The forecaster has the option of selecting how much to weight more recent values versus those of previous time periods.
  • Autocorrelation or serial correlation occurs when the error terms from forecasts are correlated over time. In regression analysis, this effect is particularly disturbing because one of the assumptions is that the error terms are independent. One way to test for autocorrelation is to use the Durbin-Watson test. There are a number of methods that attempt to overcome the effects of autocorrelation on the data.
  • Autoregression is a forecasting technique in which time-series data are predicted by independent variables that are lagged versions of the original dependent variable data. A variable that is lagged one period is derived from values of the previous time period. Other variables can be lagged two or more periods.
  • Decision analysis is a branch of quantitative management in which mathematical and statistical approaches are used to assist decision makers in reaching judgments about alternative opportunities. Three types of decisions are (1) decisions made under certainty, (2) decisions made under uncertainty, and (3) decisions made with risk. Several aspects of the decision-making situation are decision alternatives, states of nature, and payoffs. Decision alternatives are the options open to decision makers from which they can choose. States of nature are situations or conditions that arise after the decision has been made over which the decision maker has no control. The payoffs are the gains or losses that the decision maker will reap from various decision alternatives. These three aspects (decision alternatives, states of nature, and payoffs) can be displayed in a decision table or payoff table.
  • Decision making under certainty is the easiest of the three types of decisions to make. In this case, the states of nature are known, and the decision maker merely selects the decision alternative that yields the highest payoff.
  • Decisions are made under uncertainty when the likelihoods of the states of nature occurring are unknown. Four approaches to making decisions under uncertainty are maximax criterion, maximin criterion, Hurwicz criterion, and minimax regret. The maximax criterion is an optimistic approach based on the notion that the best possible outcomes will occur. In this approach, the decision maker selects the maximum possible payoff under each decision alternative and then selects the maximum of these. Thus, the decision maker is selecting the maximum of the maximums.
  • The maximin criterion is a pessimistic approach. The assumption is that the worst case will happen under each decision alternative. The decision maker selects the minimum payoffs under each decision alternative and then picks the maximum of these as the best solution. Thus, the decision maker is selecting the best of the worst cases, or the maximum of the minimums.
  • The Hurwicz criterion is an attempt to give the decision maker an alternative to maximax and maximin that is somewhere between an optimistic and a pessimistic approach. With this approach, decision makers select a value called alpha (α) between 0 and 1 to represent how optimistic they are. The maximum and minimum payoffs for each decision alternative are examined. The alpha weight is applied to the maximum payoff under each decision alternative and 1 − α is applied to the minimum payoff. These two weighted values are combined for each decision alternative, and the maximum of these weighted values is selected.
  • Minimax regret is calculated by examining opportunity loss. An opportunity loss table is constructed by subtracting each payoff from the maximum payoff under each state of nature. This step produces a lost opportunity under each state. The maximum lost opportunity from each decision alternative is determined from the opportunity loss table. The minimum of these values is selected, and the corresponding decision alternative is chosen. In this way, the decision maker has reduced or minimized the regret, or lost opportunity.
  • In decision making with risk, the decision maker has some prior knowledge of the probability of each occurrence of each state of nature. With these probabilities, a weighted payoff referred to as expected monetary value (EMV) can be calculated for each decision alternative. A person who makes decisions based on these EMVs is called an EMVer. The expected monetary value is essentially the average payoff that would occur if the decision process were to be played out over a long period of time with the probabilities holding constant.
  • The expected value of perfect information can be determined by comparing the expected monetary payoff when the states of nature are known with the expected monetary value of the best alternative when they are not. The difference between the two is the expected value of perfect information (see the decision-analysis sketch after this summary list).
  • Utility refers to a decision maker’s propensity to take risks. People who avoid risks are called risk avoiders. People who are prone to take risks are referred to as risk takers. People who use EMV generally fall between these two categories. Utility curves can be sketched to ascertain or depict a decision maker’s tendency toward risk.
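
The sketches below are illustrative only: the data values, parameter choices, and variable names are hypothetical, and each sketch uses only the Python standard library. This first one, the frequency-distribution and descriptive-measures sketch, walks through the frequency-distribution steps (range, number of classes, class width, midpoints, relative and cumulative frequencies) and the ungrouped measures of central tendency and variability, including the z score, the coefficient of variation, and an empirical-rule check.

```python
import math
import statistics as st
from collections import Counter

data = [12, 15, 11, 18, 21, 15, 17, 22, 15, 19, 24, 13,
        16, 20, 14, 23, 18, 16, 21, 17]                 # hypothetical ungrouped data

# Frequency distribution
value_range = max(data) - min(data)                     # step 1: range = largest - smallest
num_classes = 5                                         # step 2: chosen by the researcher
width = math.ceil(value_range / num_classes)            # step 3: approximate class width

lower = min(data)
classes = []
for _ in range(num_classes):
    classes.append((lower, lower + width))              # class endpoints
    lower += width

freqs = [sum(lo <= x < hi for x in data) for lo, hi in classes]
cumulative = 0
for (lo, hi), f in zip(classes, freqs):
    midpoint = (lo + hi) / 2                            # average of the class endpoints
    relative = f / len(data)                            # relative frequency
    cumulative += f                                     # running cumulative frequency
    print(f"{lo:>2}-<{hi:<2}  mid={midpoint:5.1f}  f={f:2}  rel={relative:.2f}  cum={cumulative:2}")

# Ungrouped measures of central tendency and variability
mean = st.mean(data)
median = st.median(data)
mode = Counter(data).most_common(1)[0][0]               # most frequently occurring value
mad = sum(abs(x - mean) for x in data) / len(data)      # mean absolute deviation
variance = st.pvariance(data)                           # average squared deviation (population)
std_dev = st.pstdev(data)                               # square root of the variance
cv = std_dev / mean * 100                               # coefficient of variation, in percent
z_24 = (24 - mean) / std_dev                            # z score: standard deviations from the mean
within_1sd = sum(abs(x - mean) <= std_dev for x in data) / len(data)   # empirical-rule check

print(f"mean={mean:.2f}  median={median}  mode={mode}")
print(f"range={value_range}  MAD={mad:.2f}  variance={variance:.2f}  sd={std_dev:.2f}  CV={cv:.1f}%")
print(f"z(24)={z_24:.2f}; proportion within one sd of the mean: {within_1sd:.0%}")
```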
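
Next, the probability sketch: the four types of probability (marginal, union, joint, conditional) computed from a small hypothetical cross-tabulation, an independence check, and a Bayes' rule revision of hypothetical prior probabilities.

```python
# Hypothetical cross-tabulation counts for two events:
# A = "respondent is under 35", B = "respondent buys online".
counts = {("A", "B"): 30, ("A", "not B"): 20,
          ("not A", "B"): 15, ("not A", "not B"): 35}
N = sum(counts.values())                                 # 100 observations

p_A = (counts[("A", "B")] + counts[("A", "not B")]) / N  # marginal probability P(A)
p_B = (counts[("A", "B")] + counts[("not A", "B")]) / N  # marginal probability P(B)
p_A_and_B = counts[("A", "B")] / N                       # joint probability
p_A_or_B = p_A + p_B - p_A_and_B                         # general law of addition (union)
p_B_given_A = p_A_and_B / p_A                            # conditional law
print(p_A, p_B, p_A_and_B, p_A_or_B, round(p_B_given_A, 3))

# Independent events would satisfy P(A and B) = P(A) * P(B); these counts do not.
print("independent?", abs(p_A_and_B - p_A * p_B) < 1e-9)

# Bayes' rule: revise hypothetical prior probabilities of two suppliers
# after a defective part is observed.
prior = {"S1": 0.65, "S2": 0.35}                         # prior probabilities
p_defect_given = {"S1": 0.02, "S2": 0.05}                # conditional probabilities of a defect
p_defect = sum(prior[s] * p_defect_given[s] for s in prior)
posterior = {s: round(prior[s] * p_defect_given[s] / p_defect, 3) for s in prior}
print("revised (posterior) probabilities:", posterior)
```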
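
The discrete-distribution sketch evaluates the binomial, Poisson, and hypergeometric formulas directly with math.comb and math.exp; the parameter values are hypothetical, and these functions stand in for the tables mentioned above.

```python
import math

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials, success probability p)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """P(exactly x rare occurrences over an interval with long-run average lam)."""
    return lam**x * math.exp(-lam) / math.factorial(x)

def hypergeometric_pmf(x, N, A, n):
    """P(exactly x successes in a sample of n drawn without replacement
    from a population of size N that contains A successes)."""
    return math.comb(A, x) * math.comb(N - A, n - x) / math.comb(N, n)

print(binomial_pmf(2, n=10, p=0.3))                        # exactly 2 successes in 10 trials
print(sum(binomial_pmf(x, 10, 0.3) for x in range(3)))     # cumulative P(x <= 2)
print(poisson_pmf(4, lam=3.2))                             # exactly 4 occurrences when lambda = 3.2

# Poisson approximation of a binomial problem (n large, p small, n*p <= 7)
n, p = 50, 0.06
print(binomial_pmf(3, n, p), poisson_pmf(3, lam=n * p))

print(hypergeometric_pmf(1, N=20, A=6, n=4))               # small finite population, no replacement
```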
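
The continuous-distribution sketch covers the uniform, normal, and exponential distributions, plus the normal approximation to a binomial problem with the ±.50 correction for continuity. The normal cumulative probability is computed with math.erf, and every parameter value is hypothetical.

```python
import math

def uniform_prob(x1, x2, a, b):
    """P(x1 <= X <= x2) for a uniform (rectangular) distribution on [a, b]."""
    return (min(x2, b) - max(x1, a)) / (b - a)

def norm_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative probability of a normal distribution, via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def expon_cdf(x, lam):
    """P(X <= x) for exponential times between random occurrences, with rate lam."""
    return 1 - math.exp(-lam * x)

print(uniform_prob(42, 47, a=40, b=50))                       # 0.5 of the rectangle
print(norm_cdf(600, mu=494, sigma=100) - norm_cdf(400, mu=494, sigma=100))
print(expon_cdf(0.5, lam=1.2))                                # occurrence within 0.5 time units

# Normal approximation to the binomial: n = 60, p = 0.30, P(X >= 25)
n, p = 60, 0.30
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
if 0 <= mu - 3 * sigma and mu + 3 * sigma <= n:               # accuracy check from the summary
    print(1 - norm_cdf(24.5, mu, sigma))                      # 24.5 applies the correction for continuity
```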
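
The sampling and central-limit sketch draws a simple random sample and a systematic sample from a hypothetical frame of 1,000 numbered units, then illustrates the central limit theorem by averaging repeated samples of size 30 taken from a non-normal (uniform) population.

```python
import random
import statistics as st

random.seed(42)                                     # reproducible illustration
frame = list(range(1, 1001))                        # hypothetical frame of 1,000 numbered units

simple = random.sample(frame, k=30)                 # simple random sample of n = 30

k = len(frame) // 30                                # systematic sampling: every kth item
start = random.randrange(k)                         # random starting point
systematic = frame[start::k][:30]
print("simple:", simple[:5], " systematic:", systematic[:5])

# Central limit theorem: the means of many samples (n = 30) from a decidedly
# non-normal population pile up in an approximately normal, much narrower distribution.
population = [random.uniform(0, 100) for _ in range(10_000)]
sample_means = [st.mean(random.sample(population, 30)) for _ in range(2_000)]

print("population mean:", round(st.mean(population), 2),
      " population sd:", round(st.pstdev(population), 2))
print("mean of sample means:", round(st.mean(sample_means), 2),
      " sd of sample means:", round(st.pstdev(sample_means), 2))   # roughly sd / sqrt(30)
```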
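
The regression sketch fits a least-squares line to hypothetical (x, y) pairs and computes the statistics named above: Pearson r, the slope and intercept, residuals, SSE, the standard error of the estimate, r², and the t statistic for the slope (with F = t² in simple regression).

```python
import math

x = [2, 3, 5, 6, 8, 9, 11, 12]                    # hypothetical independent variable
y = [15, 19, 24, 25, 31, 33, 38, 40]              # hypothetical dependent variable
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n
ss_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_xx = sum((xi - mean_x) ** 2 for xi in x)
ss_yy = sum((yi - mean_y) ** 2 for yi in y)

r = ss_xy / math.sqrt(ss_xx * ss_yy)              # Pearson product-moment correlation
b1 = ss_xy / ss_xx                                # slope of the regression line
b0 = mean_y - b1 * mean_x                         # y-intercept

y_hat = [b0 + b1 * xi for xi in x]                # predicted values
residuals = [yi - yh for yi, yh in zip(y, y_hat)] # errors of prediction
sse = sum(e ** 2 for e in residuals)              # sum of squares of error
se = math.sqrt(sse / (n - 2))                     # standard error of the estimate
r_sq = 1 - sse / ss_yy                            # coefficient of determination
t_slope = b1 / (se / math.sqrt(ss_xx))            # t statistic for H0: slope = 0

print(f"r = {r:.3f}   y-hat = {b0:.2f} + {b1:.2f}x   se = {se:.2f}   r^2 = {r_sq:.3f}")
print(f"t for slope = {t_slope:.2f}   (F = t^2 = {t_slope ** 2:.2f} in simple regression)")
```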
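
The forecasting sketch applies the smoothing techniques and error measures to a short, roughly stationary hypothetical series: naïve forecasts, a 3-period moving average, a weighted moving average, simple exponential smoothing, and MAD/MSE.

```python
series = [52, 49, 55, 53, 57, 54, 58, 56, 60, 59]        # hypothetical stationary time series

def mad(actual, forecast):
    """Mean absolute deviation of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(forecast)

def mse(actual, forecast):
    """Mean squared error of the forecast errors."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(forecast)

# Naive model: the forecast for each period is the previous period's actual value.
naive = series[:-1]

# 3-period moving average forecasts for periods 4 onward.
ma3 = [sum(series[i - 3:i]) / 3 for i in range(3, len(series))]

# Weighted moving average with the heaviest (hypothetical) weight on the most recent value.
weights = [0.5, 0.3, 0.2]                                # most recent period first
wma3 = [sum(w * v for w, v in zip(weights, reversed(series[i - 3:i])))
        for i in range(3, len(series))]

# Simple exponential smoothing: F(t+1) = alpha * X(t) + (1 - alpha) * F(t).
alpha = 0.4
ses = [series[0]]                                        # seed the first forecast with the first actual
for x_t in series[:-1]:
    ses.append(alpha * x_t + (1 - alpha) * ses[-1])

print("naive      MAD:", round(mad(series[1:], naive), 2))
print("MA(3)      MAD:", round(mad(series[3:], ma3), 2))
print("WMA(3)     MAD:", round(mad(series[3:], wma3), 2))
print("smoothing  MAD:", round(mad(series, ses), 2), " MSE:", round(mse(series, ses), 2))
```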
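
Finally, the decision-analysis sketch applies the decision criteria to a small hypothetical payoff table: maximax, maximin, the Hurwicz criterion with α = 0.6, minimax regret via an opportunity-loss table, expected monetary value under risk, and the expected value of perfect information.

```python
# Hypothetical payoff table: decision alternatives (rows) by states of nature (columns).
payoffs = {
    "expand":     [150,  60, -40],
    "status quo": [ 80,  50,  10],
    "sell":       [ 40,  40,  40],
}
num_states = 3
probs = [0.3, 0.5, 0.2]                     # prior probabilities of the states (risk case)

# Decision making under uncertainty
maximax = max(payoffs, key=lambda d: max(payoffs[d]))        # best of the best payoffs
maximin = max(payoffs, key=lambda d: min(payoffs[d]))        # best of the worst payoffs
alpha = 0.6                                                  # hypothetical optimism weight
hurwicz = max(payoffs,
              key=lambda d: alpha * max(payoffs[d]) + (1 - alpha) * min(payoffs[d]))

# Minimax regret: build the opportunity-loss table, then minimize the maximum regret.
best_per_state = [max(payoffs[d][j] for d in payoffs) for j in range(num_states)]
regret = {d: [best_per_state[j] - payoffs[d][j] for j in range(num_states)] for d in payoffs}
minimax_regret = min(regret, key=lambda d: max(regret[d]))

# Decision making with risk: expected monetary value (EMV) for each alternative.
emv = {d: sum(p * v for p, v in zip(probs, payoffs[d])) for d in payoffs}
best_emv = max(emv, key=emv.get)

# Expected value of perfect information: expected payoff with perfect knowledge
# of the state minus the best EMV without that knowledge.
ev_with_perfect_info = sum(p * b for p, b in zip(probs, best_per_state))
evpi = ev_with_perfect_info - emv[best_emv]

print("maximax:", maximax, " maximin:", maximin,
      " Hurwicz:", hurwicz, " minimax regret:", minimax_regret)
print("EMVs:", {d: round(v, 1) for d, v in emv.items()}, " best:", best_emv,
      " EVPI:", round(evpi, 1))
```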

Important Keywords for Decision Science

Adjusted R²: Sometimes additional independent variables add no significant information to the regression model, yet R² increases. To overcome this problem, statisticians have developed an adjusted R², which takes into consideration both the additional information each new independent variable brings to the regression model and the changed degrees of freedom of regression.

Autocorrelation: Autocorrelation occurs in data when the error terms of a regression forecasting model are correlated.

Autoregression: Autoregression is a multiple regression technique in which the independent variables are time-lagged versions of the dependent variable.

Averaging models: These are computed by averaging data from several time periods and using the average as the forecast for the next time period.

Bar graph: A graph or chart with two or more categories along one axis and a series of bars, one for each category, along the other axis.

Binomial distribution: As the word binomial indicates, any single trial of a binomial experiment contains only two possible outcomes. These two outcomes are labeled success or failure.

Census: The process of gathering data from the whole population for a given measurement of interest is a Census.

Central limit theorem: According to the central limit theorem, if a population is normally distributed, the sample means for samples taken from that population also are normally distributed regardless of sample size. The central limit theorem also says that if the sample sizes are large (n ≥ 30), the sample mean is approximately normally distributed regardless of the distribution shape of the population.

Cluster (or area) sampling: Cluster (or area) sampling involves dividing the population into nonoverlapping areas, or clusters. Clusters tend to be internally heterogeneous.

Coefficient of determination: The coefficient of determination is the proportion of variability of the dependent variable (y) accounted for or explained by the independent variable (x).

Coefficient of multiple determination (R²): Coefficient of multiple determination represents the proportion of variation of the dependent variable, y, accounted for by the independent variables in the regression model.

Column chart: A vertical bar graph is referred to as a column chart.

Conditional probability: The probability that A will occur given that B has already occurred is the ‘conditional probability of A given B’ and is denoted by P(A|B).

Continuous distributions: Continuous distributions take on values at every point over a given interval. Thus continuous random variables have no gaps or unassumed values.

Continuous random variables: Continuous random variables are gener­ated from experiments in which things are “measured” not “counted.”

Convenience sampling: In convenience sampling, elements for the sample are selected for the convenience of the researcher.

Correction for continuity: This correction ensures that most of the binomial problem’s information is correctly transferred to the normal curve analysis.

Cross tabulation: It is a process of producing a two-dimensional table that displays the frequency counts for two variables simultaneously.

Cycles: Cycles are patterns of highs and lows through which data move over time periods usually of more than a year.

Cyclical effects: These effects are shown by time-series data that extend over a long enough period of time, with enough “history,” to reveal the highs and lows of cycles.

Data: Data are recorded measurements.

Decision alternatives: The various choices or options available to the decision maker in any given problem situation.

Decision analysis: A category of quantitative business techniques particularly targeted at clarifying and enhancing the decision-making process.

Decision making under certainty: A decision-making situation in which the states of nature are known.

Decision making under risk: A decision-making situation in which it is uncertain which states of nature will occur but the probability of each state of nature occurring has been determined.

Decision making under uncertainty: A decision-making situation in which the states of nature that may occur are unknown and the probabil­ity of a state of nature occurring is also unknown.

Decision table: A matrix that displays the decision alternatives, the states of nature, and the payoffs for a particular decision-making problem; also called a payoff table.

Decision trees: A flowchart-like depiction of the decision process that includes the various decision alternatives, the various states of nature, and the payoffs.

Decomposition: A technique for isolating the effects of seasonality in time-series data.

Dependent variable: In simple regression, the variable to be predicted is called the dependent variable and is designated as y.

Descriptive statistics: If a business analyst uses data gathered on a group to describe or reach conclusions about that same group, the statistics are called descriptive statistics.

Discrete distributions: Constructed from discrete random variables.

Discrete random variables: A random variable is a discrete random variable if the set of all possible values is at most a finite or a countably infinite number of possible values.

Disproportionate stratified random sampling: Occurs when the proportions of the strata in the sample are different from the proportions of the strata in the population.

Distribution: A distribution is the arrangement of data by the values of one variable in order, from low to high. This arrangement, and its characteristics such as shape and spread, provide information about the underlying sample.

Durbin-Watson test: Durbin-Watson is a test to determine whether auto­correlation is present in a time-series regression analysis.

EMVer: A person who uses an expected monetary value (EMV) approach to making decisions under risk.

Error of an individual forecast: The difference between the actual value and the forecast of that value.

Event: A subset of the outcomes of a process.

Expected monetary value (EMV): A value of a decision alternative computed by multiplying the probability of each state of nature by the state’s associated payoff and summing these products across the states of nature.

Expected value of perfect information: The difference between the expected monetary payoff that would occur if the decision maker knew which states of nature would occur and the payoff from the best decision alternative when there is no information about the occurrence of the states of nature.

Expected value of sample information: The difference between the expected monetary value with information and the expected monetary value without information.

Experimental probability: The estimated probability of an event; obtained by dividing the number of successful trials by the total number of trials.

Exponential distribution: The exponential distribution is continuous and describes a probability distribution of the times between random occurrences.

Exponential smoothing: This is used to weight data from previous time periods with exponentially decreasing importance in the forecast.

Finite correction factor: For a finite population, a statistical adjustment can be made to the z formula for the sample mean; this adjustment is called the finite correction factor.

First-difference approach: In this approach, each value of X is subtracted from the succeeding time-period value of X, and these “differences” become the new, transformed X variable.

Forecasting error: This error is the difference between the actual value and the forecast of that value.

Forecasting: The art or science of predicting the future.

Frame: The final list or directory used to represent the population and from which the sample is drawn is called the frame.

Frequency distribution: A frequency distribution is a summary of data presented in the form of class intervals and frequencies.

Frequency polygon: It is a graphical display of class frequencies where each class frequency is plotted as a dot at the class midpoint, and the dots are connected by a series of line segments.

Grouped data: Data that have been organized into a frequency distribution are called grouped data.

Histogram: A histogram is a series of contiguous rectangles that represent the frequency of data in given class intervals.

Hurwicz criterion: An approach to decision making in which the maximum and minimum payoffs selected from each decision alternative are used with a weight, α, between 0 and 1 to determine the alternative with the maximum weighted average. The higher the value of α, the more optimistic is the decision maker.

Hypergeometric distribution: The hypergeometric distribution is a discrete probability distribution that applies only to experiments in which the trials are done without replacement.

Independent event: Events A and B are independent if the outcome of event A has no effect on the outcome of event B.

Independent variable: The predictor is called the independent variable, or explanatory variable, and is designated as x.

Inferential statistics: If a researcher gathers data from a sample and uses the statistics generated to reach conclusions about the population from which the sample was taken, the statistics are inferential statistics.

Interquartile range (IQR): The IQR is the difference between the score delineating the 75th percentile and the 25th percentile, the third and first quartiles, respectively.

Intersection: The intersection of two sets is the set of elements that are in both sets.

Interval-level data: A level of data measurement in which the distances between consecutive numbers have meaning and the data are always numerical.

Judgment sampling: In judgment sampling, units are selected according to the judgment of the researcher.

Least squares analysis: Least squares analysis is a process whereby a regression model is developed by producing the minimum sum of the squared error values. This analysis is concerned with minimizing the sum of squares of error in a regression model.

Maximax criterion: An optimistic approach to decision making under uncertainty in which the decision alternative is chosen according to which alternative produces the maximum overall payoff of the maximum payoffs from each alternative.

Maximin criterion: A pessimistic approach to decision making under uncertainty in which the decision alternative is chosen according to which alternative produces the maximum overall payoff among the minimum payoffs from each alternative.

Mean absolute deviation (MAD): Mean absolute deviation is the mean, or average, of the absolute values of the errors.

Mean squared error (MSE): This error is computed by squaring each error (thus creating a positive number) and averaging the squared errors.

Mean, or expected value: The mean or expected value of a discrete distribution is the long-run average of occurrences.

Mean: The arithmetic mean is simply the average: add up all the scores and divide by the number of observations.

Measurement: A measurement is when a standard process is used to assign numbers to particular attributes or characteristics of a variable.

Median: The median is defined as the “middle score.” It is the observed score (or an extrapolation of observed scores) that splits the distribution in half so that 50% of the remaining scores are less than the median and 50% of the remaining scores exceed the median.

Minimax regret: A decision-making strategy in which the decision maker determines the lost opportunity for each decision alternative and selects the decision alternative with the minimum of lost opportunity or regret.

Mode: The mode is the most frequent score in the set. Multimodal refers to a distribution with more than one mode; bimodal refers to a distribution with 2 modes.

Moving average: Moving average is an average that is updated or recomputed for every new time period being considered.

Multiple regression: Multiple regression is a regression analysis with two or more independent variables or with at least one nonlinear predictor.

Mutually exclusive event: Events A and B are mutually exclusive if the occurrence of event A implies event B can’t occur.

Naïve forecasting methods: These are simple models in which it is assumed that the more recent time periods of data represent the best predictions or forecasts for future outcomes.

Nominal-level data: The lowest level of data measurement is the nominal level. Numbers representing nominal-level data can be used only to classify or categorize.

Nonrandom sampling techniques: Sampling techniques used to select elements from the population by any mechanism that does not involve a random selection process are called nonrandom sampling techniques.

Nonrandom sampling: Nonrandom sampling is any sampling that is not random. In this method of sampling, every unit of the population does not have the same probability of being selected.

Nonsampling errors: All errors other than sampling errors are nonsampling errors. The many possible nonsampling errors include missing data, recording errors, input processing errors, and analysis errors.

Normal distribution: Normal distribution is a function that represents the distribution of many random variables as a symmetrical bell-shaped graph curve.

Ogive: An ogive is a cumulative frequency polygon.

Opportunity loss table: A decision table constructed by subtracting all payoffs for a given state of nature from the maximum payoff for that state of nature and doing this for all states of nature; displays the lost opportunities or regret that would occur for a given decision alternative if that particular state of nature occurred.

Ordinal-level data: This measurement is higher than the nominal level. In addition to the nominal-level capabilities, ordinal-level measurement can be used to rank or order people or objects.

Outcomes: The possible results of an experiment are called outcomes.

Outliers: Outliers are data points that lie apart from the rest of the points.

Parameter: A descriptive measure of the population is called a parameter.

Partial regression coefficient: Partial regression coefficient of an independent variable represents the increase that will occur in the value of the dependent variable from a one-unit increase in that independent variable if all other independent variables are held constant.

Payoff table: A matrix that displays the decision alternatives, the states of nature, and the payoffs for a particular decision-making problem; also called a decision table.

Payoffs: The benefits or rewards that result from selecting a particular decision alternative.

Percentile: The pth percentile is the score such that p percent of the observations are at or below that score.

Pie chart: A pie chart is a circular depiction of data where the area of the whole pie represents 100% of the data and slices of the pie represent a percentage breakdown of the sublevels.

Poisson distribution: Focuses on the number of discrete occurrences over some interval or continuum. The average of the Poisson distribution is denoted by lambda (λ).

Population variance: The population variance is defined as the average squared deviation from the mean and in many statistical procedures is also called a mean square or MS.

Population: A population is a collection of persons, objects, or items of interest.

Probabilistic model: Probabilistic model is one that includes an error term that allows the y values to vary for any given value of x.

Probability: A measure of the likeliness that an event will happen.

Proportionate stratified random sampling: Occurs when the percentage of the sample taken from each stratum is proportionate to the percentage that each stratum represents within the whole population.

Quota sampling: Quota sampling is similar to stratified sampling, with the researcher identifying subclasses or strata. However, the researcher selects units from each stratum by some nonrandom technique until a specified quota from each stratum is filled.

Random experiment: It is an experiment whose outcome cannot be predicted with certainty before the experiment is run.

Random sampling: Random sampling occurs when each unit of the population has the same probability of being selected for the sample.

Random variable: A variable that contains the outcomes of a random experiment is called a random variable.

Range: It is the difference between the largest and smallest numbers in the data set.

Ratio-level data: This measurement is the highest level of data measurement. Ratio data have the same properties as interval data, but ratio data have an absolute zero, and the ratio of two numbers is meaningful.

Rectangular distribution: Rectangular distribution is also known as uniform distribution.

Regression analysis: Regression analysis is the process of constructing a mathematical model or function that can be used to predict or determine one variable by another variable or other variables.

Residual: The residual of the regression model is the difference between the y value and the predicted y value.

Response plane: In a multiple regression model with two independent first-order variables, the response surface is a response plane which is fit in a three-dimensional space.

Response surface: Response surface explores the relationships between independent and dependent variables.

Response variable: Response variable is a variable whose value is predicted in a regression model.

Risk avoider: A decision maker who avoids risk whenever possible and is willing to drop out of a game when given the chance even when the payoff is less than the expected monetary value.

Risk taker: A decision maker who enjoys taking risk and will not drop out of a game unless the payoff is more than the expected monetary value.

Sample proportion: Sample proportion is computed by dividing the frequency with which a given characteristic occurs in a sample by the number of items in the sample. This method is used when the research yields countable items.

Sample: A sample is a portion of the whole and, if properly taken, is representative of the whole.

Sampling error: Sampling error occurs when the sample is not representative of the population.

Sampling: Sampling is a method of selecting a subset of a population that shares the properties of the population itself.

Scatter plot: Scatter plot is a plot in which the values of two variables are plotted along two axes, the pattern of the resulting points revealing any correlation present. It is also useful in regression analysis.

Serial correlation: This correlation occurs in data when the error terms of a regression forecasting model are correlated.

Set: A set is a well-defined collection of objects.

Simple average model: With this model, the forecast for time period t is the average of the values for a given number of previous time periods.

Simple average: Simple average is the total value of all the observations divided by the number of observations.

Simple random sampling: In simple random sampling, every unit of the population is numbered. A table of random numbers or a random number generator is used to select n units from the population for the sample.

Simple regression: The most elementary regression model is called simple regression, or bivariate regression; it involves two variables, one of which is predicted by the other.

Skew: When there are more scores toward one end of the distribution than the other, this results in skew.

Smoothing techniques: These techniques are used to produce forecasts based on “smoothing out” the irregular fluctuation effects in the time-series data.

Snowball sampling: In snowball sampling, the researcher obtains additional sample members by asking current sample members for referral information.

Standard deviation: The standard deviation of a distribution is the square root of its variance; it indicates how far, on average, individual scores deviate from the distribution’s mean.

Standard error of the estimate (se): The standard error of the estimate, denoted se, is a standard deviation of the error of the regression model.

Standardized normal distribution: A standard normal distribution is a normal distribution with mean 0 and standard deviation 1. Areas under this curve can be found using a standard normal table.

States of nature: The occurrences of nature that can happen after a decision has been made that can affect the outcome of the decision and over which the decision maker has little or no control.

Stationary: Time-series data that contain no trend, cyclical, or seasonal effects are said to be stationary.

Statistic: A descriptive measure of a sample is called a statistic.

Stratified random sampling: In stratified random sampling the population is divided into non-overlapping subpopulations called strata. The researcher then extracts a random sample from each of the subpopulations (strata).

Subset: Set A is a subset of set B if all of the elements of set A are contained in set B.

Sum of squares of error (SSE): Sum of square of error is the sum of the squared residuals in a regression model.

Sum of squares: The total of the residuals squared is called the sum of squares of error (SSE).

Systematic sampling: With systematic sampling, every kth item is selected to produce a sample of size n from a population of size N.

Time-series data: These are data that have been gathered at regular intervals over a period of time.

Tree diagram: A tree diagram displays all the possible outcomes of an event.

Trend: Trend is the long-term general direction of data.

Two-stage sampling: When clusters are too large and a second set of clusters is taken from within each original cluster, the technique is called two-stage sampling.

Ungrouped data: Raw data, or data that have not been summarized in any way.

Uniform distribution: It is sometimes referred to as the rectangular distribution. The uniform distribution is determined from a probability density function that contains equal values along some interval between the points a and b.

Union: The set formed by combining the elements of two or more sets.

Variable: Variable is a characteristic of any entity being studied that is capable of taking on different values.

Variance: Variance is the average of the squared deviations of scores about the mean of a distribution.

Venn diagram: It is a graphic means of showing intersection and union of sets by representing them as bounded regions.

Weighted moving average: This is a moving average in which some time periods are weighted differently than others.

z distribution: The z distribution is a normal distribution with a mean of 0 and a standard deviation of 1.

z score: A z score is the number of standard deviations that a value, x, is above or below the mean. If the value of x is less than the mean, the z score is negative; if the value of x is more than the mean, the z score is positive.

