Course Name: PAD 627 – Quantitative Methods in Public Administration – Week Three
Voiceover Preference: Male
Specific Contractor (If Applicable):
Slide 1 | Welcome to the Week Three lecture for PAD 627, Quantitative Methods in Public Administration. |
Slide 2 | This is the main navigational menu. This week you will be covering Estimation, Hypothesis Writing, One Sample Hypotheses, and Two Sample Hypotheses. Click on Estimation to begin. |
Slide 3 | Use of the z and t distributions to analyze research results. Note: Formulas are shown for your information only. You will solve problems using StatDisk. Click here to view when to use z and when to use t. |
Slide 4 | A point estimate is a single value (statistic) used to estimate a population value (parameter). Since sample information is only a subset of all the information in a population, a point estimate from a sample (like X-bar) probably won’t be identical to its corresponding population parameter. A confidence interval is a range of values around the sample statistic within which the population parameter is likely to occur at a specified probability called the level of confidence. |
Slide 5 | The level of confidence (say, 95%) states that we are 95% confident that the interval around a sample mean contains the population mean. If we drew an unlimited number of samples and built the same kind of interval around each sample mean, 95% of those intervals would contain the population mean. The other 5%, although built from samples drawn from the same population, would not. We can never be 100% confident in our research results. This common-sense conclusion reflects the shortcomings of human knowledge even with the best of research. |
Slide 6 | The factors that determine the width of a confidence interval are: The sample size, n. The variability in the population, given by the population standard deviation or estimated by the sample standard deviation. The desired level of confidence, which sets the value of z or t. |
Slide 7 | If the population standard deviation is known, then use the z distribution regardless of the size of the sample or shape of the population distribution. |
Slide 8 | If the population standard deviation is not known but the sample size is n = 30 or larger, then use the z distribution with the sample standard deviation, s, in place of the population standard deviation. |
Slide 9 | The 3 issues to consider are: Is the population standard deviation known? Is the sample size 30 or larger? Is the population normally distributed? |
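The three questions above can be sketched as a small decision helper. This is only an illustration in Python, not part of StatDisk; the function name `choose_distribution` and the "undetermined" result are assumptions added here, since the slides do not say what to do with a small sample from a non-normal population when the population standard deviation is unknown.

```python
def choose_distribution(sigma_known: bool, n: int, population_normal: bool) -> str:
    """Return 'z', 't', or 'undetermined' based on the three questions."""
    if sigma_known:
        return "z"  # population standard deviation is known
    if n >= 30:
        return "z"  # large sample: use z with s in place of sigma
    if population_normal:
        return "t"  # small sample, sigma unknown, normal population
    # Assumption: this case is not covered by the slides.
    return "undetermined"


print(choose_distribution(sigma_known=False, n=49, population_normal=False))  # z
print(choose_distribution(sigma_known=False, n=6, population_normal=True))    # t
```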
Slide 10 | |
Slide 11 | An interval estimate states the range within which a population parameter probably lies. The interval within which a population parameter is expected to occur is called a confidence interval. The two confidence levels used most extensively are 95% and 99%. Click here to view a Confidence Interval Table. |
Slide 12 | For a 95% confidence interval, 95% of the similarly constructed intervals will contain the parameter being estimated. Also, 95% of the sample means for a specified sample size will lie within 1.960 standard deviations of the sampling distribution from the population mean for the z-distribution. For the 99% confidence interval, 99% of the sample means for a specified sample size will lie within 2.576 standard deviations of the sampling distribution from the population mean for the z-distribution. Once you have finished viewing this example, click on the return button in the bottom right-hand corner. |
Slide 13 | The standard error of the sample mean is the standard deviation of the sampling distribution of sample means. When the population standard deviation is known, the standard error is computed by: |
Slide 14 | If the population standard deviation is not known, the standard deviation of the sample, s, can be used to approximate it when the sample size is large or when the population from which the sample is randomly drawn is known to be normally distributed. |
Slide 15 | The standard error is smaller when the sampling distribution of the sample means fits more tightly around its mean (which, by the Central Limit Theorem (CLT), is also the population mean). This increases the accuracy of the interval estimate. A smaller interval would result, or a higher level of confidence could be used with the same size interval. The standard error can only be reduced by either reducing the standard deviation of the sample (stratified sampling can help here) or increasing the sample size. |
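To see how sample size affects the standard error, here is a minimal Python sketch. It assumes the usual formula, standard error = standard deviation divided by the square root of n; the standard deviation of 4.0 and the sample sizes are illustrative values chosen here, not figures from the lecture.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)


# Illustrative values only: each fourfold increase in n halves the standard error.
for n in (25, 100, 400):
    print(n, round(standard_error(4.0, n), 3))
```

Because n sits under a square root, the sample must be four times as large to cut the standard error in half, which is one reason larger samples become costly.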
Slide 16 | |
Slide 17 | In general, when the sample size is large, a confidence interval for the mean is computed from the z-distribution, using s in place of the population standard deviation. |
Slide 18 | The Dean of the Business School wants to estimate the mean number of hours worked per week by students. A sample of 49 students showed a mean of 24 hours with a standard deviation of 4 hours. What is the population mean? The value of the population mean is not known. Our best estimate of this value is the sample mean of 24 hours. This value is called a point estimate. Find the 95 percent confidence interval for the population mean. The confidence interval is 22.88 to 25.12 hours. About 95 percent of similarly constructed intervals include the population parameter, so we are 95% confident that the population mean is between about 23 and 25 hours. |
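The interval in this example can be reproduced with a short Python sketch. It assumes the large-sample form, sample mean plus or minus z times s divided by the square root of n; the function name `z_confidence_interval` is hypothetical, and in the course itself you would use StatDisk.

```python
import math
from statistics import NormalDist


def z_confidence_interval(mean, sd, n, confidence=0.95):
    """Large-sample confidence interval for a mean using the z-distribution."""
    # Two-sided critical value, about 1.960 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Business School example: n = 49, sample mean = 24 hours, s = 4 hours.
low, high = z_confidence_interval(24, 4, 49)
print(round(low, 2), round(high, 2))
```

Rounded to two decimals, this gives the same 22.88 to 25.12 hours reported on the slide.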
Slide 19 | The head basketball coach at Fresno State wants to estimate the mean height of 18-year-old men in Fresno, CA. A sample of six students showed these heights in cm: 167, 170, 172, 174, 177, and 181. Calculate the sample mean and sample standard deviation. What is the population mean? The value of the population mean is not known. Our best estimate of this value is the sample mean. This value is called a point estimate. The t-distribution will be used because the population standard deviation is unknown, the sample size is small (less than n = 30), and the population characteristic of height is normally distributed in adult men. With a sample size of six, the degrees of freedom is n – 1, or df = 5. From Appendix B.2 (the t-distribution table), find the column for the Confidence Interval, 95%. Go to the row under “df” (for degrees of freedom) of 5. The value for t is 2.571 with a level of confidence of 95% and a sample size of six (which is 5 degrees of freedom). Find the 95 percent confidence interval for the population mean. The confidence interval is 168.2423 to 178.7577 cm. About 95 percent of similarly constructed intervals include the population parameter, so we are 95% confident that the population mean is between 168 and 179 cm. |
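The height example can be checked with a minimal Python sketch. It takes the t value of 2.571 from the table lookup described above rather than computing it, and it assumes the small-sample form, sample mean plus or minus t times s divided by the square root of n; the variable names are illustrative.

```python
import math
from statistics import mean, stdev

heights_cm = [167, 170, 172, 174, 177, 181]

n = len(heights_cm)
x_bar = mean(heights_cm)   # sample mean
s = stdev(heights_cm)      # sample standard deviation (divides by n - 1)
t_table = 2.571            # table value for 95% confidence, df = 5

margin = t_table * s / math.sqrt(n)
low, high = x_bar - margin, x_bar + margin
print(round(x_bar, 1), round(s, 2), round(low, 2), round(high, 2))
```

The sample mean is 173.5 cm and the sample standard deviation is about 5.01 cm. The endpoints agree with the slide to two decimals; the small difference in later decimals comes from using the rounded table value 2.571 instead of an unrounded t value.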
Slide 20 | There are 3 factors that determine the size of a sample, none of which has any direct relationship to the size of the population. They are: The degree of confidence selected. The maximum allowable error. The variation in the population. |
Slide 21 | To find the sample size, n, for a variable, use the formula shown, where: E is the maximum allowable error, z is the z-value corresponding to the selected level of confidence, and s is the sample standard deviation (the population standard deviation, σ, can be used instead if it is available). |
Slide 22 | A consumer group would like to estimate the mean monthly electricity charge for a single family house in July within $5 using a 99 percent level of confidence. Based on similar studies the standard deviation is estimated to be $20.00. How large a sample is required? |
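A minimal Python sketch of this calculation follows. It assumes the standard sample-size formula for a mean, n = (z × s / E) squared, rounded up to the next whole observation, and uses z = 2.576 for 99% confidence; the function name `required_sample_size` is hypothetical.

```python
import math


def required_sample_size(z: float, sd: float, max_error: float) -> int:
    """Smallest whole n with z * sd / sqrt(n) no larger than max_error."""
    return math.ceil((z * sd / max_error) ** 2)


# Electricity example: 99% confidence (z = 2.576), s = $20, E = $5.
n = required_sample_size(2.576, 20.0, 5.0)
print(n)  # 107
```

The unrounded result is about 106.17, and it is always rounded up, because rounding down would leave the error slightly larger than the $5 allowed.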
Slide 23 | In practice, there are 3 factors that influence the size of a sample. They are: The need for n to be greater than or equal to 30, so that z can be used when it’s uncertain whether the population is normally distributed. The need for a reduced standard error when greater accuracy is necessary. The increased cost in time, money, and complexity when the sample size is larger. |
Slide 24 | |
Slide 25 | Hypothesis Writing. The trick is to write the alternate (research) hypothesis first because H1 will reflect the wording in the problem. Then write the null hypothesis, because H0 will have the opposite math operator from H1. Remember that H1, the research hypothesis, will always be a statement of inequality (because its Rejection Region lies outside the Critical Value), while H0, the null hypothesis, will always include a statement of equality (because its non-rejection region includes the Critical Value). |
Slide 26 | Here’s an Example. |
Slide 27 | Here’s a Second Example. |
Slide 28 | One Sample Hypotheses. What is a Hypothesis? A Hypothesis is a statement about the value of a population parameter developed for the purpose of testing. It expresses a belief that the mean of a sampling distribution is a certain value as opposed to being another value. Examples of hypotheses made about a population parameter are: The mean monthly income for all systems analysts is $3,625. Twenty percent of all customers at Bovine’s Chop House return for another meal within a month. |
Slide 29 | Hypothesis testing is a procedure, based on sample evidence and probability theory, used to determine whether the null hypothesis is reasonably likely and should not be rejected, or is reasonably unlikely and therefore should be rejected. A Null Hypothesis can never be accepted because the Level of Confidence is never 100%. We can only reject or fail to reject it. |
Slide 30 | Click to reveal the steps in Hypothesis testing. |
Slide 31 | Null Hypothesis: A statement about the value of a population parameter. It’s the mean of the sampling distribution where z or t equals zero. Alternative Hypothesis: A statement that is preferred if the sample data provide evidence that the null hypothesis is false. Level of Significance: The probability of rejecting the null hypothesis when it is actually true. Alpha is the symbol. Alpha = 1 – Confidence Level. Type I Error: Rejecting the null hypothesis when it is actually true. The probability of a Type I Error = Alpha = LOS (Level of Significance). Type II Error: Failing to reject the null hypothesis when it is actually false. Beta is the symbol for its probability. Power: The Power of a statistical test is 1 – Beta. Power is the ability to reject a false null hypothesis. Power is increased and Beta decreased by reducing standard error. Test Statistic: A value, determined from sample information, used to determine whether or not to reject the null hypothesis. Examples of test statistics are z and t. Critical Value: The dividing point between the region where the alternative hypothesis is preferred and the region where the null hypothesis is not rejected. The Critical Value is part of the null hypothesis region. |
Slide 32 | A test is two-tailed when no direction is specified in the alternate hypothesis, such as the following: |
Slide 33 | The Rejection Region table is used to find the boundaries for the null hypothesis, at a given Level of Significance, for example 0.05, for the two-tailed test. |
Slide 34 | Here is a graph of the Sampling Distribution for the Statistic Z for a Two-Tailed test, .05 Level of Significance. |
Slide 35 | A test is one-tailed when the alternate hypothesis states a direction, such as the following: |
Slide 36 | The Rejection Region table is used to find the boundary for the null hypothesis, at a given Level of Significance, for example 0.05, for the one-tailed test. |
Slide 37 | Here is a graph of the Sampling Distribution for the Statistic Z for a One-Tailed test, .05 Level of Significance. |
Slide 38 | When testing for the population mean when the population standard deviation is known, the test statistic is given by the following: |
Slide 39 | Information about a population can come from these sources: the population is relatively small and the parameters are known; or long-term historical experience can substitute for population information; or manufacturing or design/engineering specifications. |
Slide 40 | EXAMPLE 1. The processors of Fries’ Catsup indicate on the label that the bottle contains 16 ounces of catsup. The standard deviation of the process is 0.5 ounces. A sample of 36 bottles from last hour’s production revealed a mean weight of 16.12 ounces per bottle. At the 0.05 significance level is the process out of control? That is, can we conclude that the mean amount per bottle is different from 16 ounces? |
Slide 41 | Let’s go through the Hypothesis Testing steps. |
Slide 42 | EXAMPLE 2. Roder’s Discount Store chain issues its own credit card. Lisa, the credit manager, wants to find out if the mean monthly unpaid balance is more than $400. The level of significance is set at .05. A random check of 172 unpaid balances revealed the sample mean to be $407 and the sample standard deviation to be $38. Should Lisa conclude that the population mean is greater than $400, or is it reasonable to assume that the difference of $7 ($407-$400) is due to chance? Again, let’s go through the Hypothesis Testing steps. |
Slide 43 | |
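The two z-test examples above can be checked with a short Python sketch. It assumes the usual one-sample test statistic, z = (sample mean minus hypothesized mean) divided by (standard deviation over the square root of n), and for Example 2 it uses s in place of the population standard deviation because the sample is large. The function name `one_sample_z_test` and its `tail` argument are hypothetical, not StatDisk features.

```python
import math
from statistics import NormalDist


def one_sample_z_test(x_bar, mu0, sd, n, alpha, tail):
    """Return (z, critical value, reject H0?). tail: 'two', 'right', or 'left'."""
    z = (x_bar - mu0) / (sd / math.sqrt(n))
    if tail == "two":
        z_crit = NormalDist().inv_cdf(1 - alpha / 2)
        reject = abs(z) > z_crit
    elif tail == "right":
        z_crit = NormalDist().inv_cdf(1 - alpha)
        reject = z > z_crit
    else:
        z_crit = NormalDist().inv_cdf(alpha)
        reject = z < z_crit
    return z, z_crit, reject


# Example 1 (Fries' Catsup): H1 says the mean differs from 16, so two-tailed.
z1, crit1, reject1 = one_sample_z_test(16.12, 16.0, 0.5, 36, 0.05, "two")
# Example 2 (Roder's): H1 says the mean is more than 400, so right-tailed.
z2, crit2, reject2 = one_sample_z_test(407, 400, 38, 172, 0.05, "right")

print(round(z1, 2), round(crit1, 3), reject1)
print(round(z2, 2), round(crit2, 3), reject2)
```

Under these assumptions, Example 1 gives z of about 1.44, which is inside plus or minus 1.960, so the null hypothesis is not rejected. Example 2 gives z of about 2.42, which is beyond 1.645, so the null hypothesis is rejected.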
Slide 44 | EXAMPLE 3. The current rate for producing 5-amp fuses at Neary Electric Co. is 250 per hour. A new, high quality machine has been purchased and installed that, according to the supplier, will increase the production rate. A sample of 10 randomly selected hours from last month revealed the mean hourly production on the new machine was 256 units, with a sample standard deviation of 6 per hour. At the 0.05 significance level can Neary conclude that the new machine is faster? |
Slide 45 | Here is information on how to read a t-table before we move onto Step 4. |
Slide 46 | Again, read down the column ‘Proportion in One Tail 0.05’. Degrees of Freedom (df) is n – 1, and for this example, with n = 10, df will equal 9. Find the row for 9 under ‘df’. Where the column for Proportion in One Tail 0.05 meets the row for df of 9, you will see the t value of 1.833, which is the Critical Value for Step 4. |
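For Example 3, the comparison against this critical value can be sketched in Python. The sketch assumes the usual one-sample t statistic, t = (sample mean minus hypothesized mean) divided by (s over the square root of n), and takes the critical value 1.833 from the table lookup above; the variable names are illustrative.

```python
import math

# Neary Electric example: one-tailed test that the new machine is faster.
mu0 = 250      # current production rate per hour (null hypothesis value)
x_bar = 256    # sample mean on the new machine
s = 6          # sample standard deviation
n = 10         # hours sampled, so df = 9
t_crit = 1.833 # table value: one tail 0.05, df = 9

t_stat = (x_bar - mu0) / (s / math.sqrt(n))
reject_h0 = t_stat > t_crit
print(round(t_stat, 3), reject_h0)
```

The test statistic is about 3.162, which is larger than 1.833, so under these assumptions the null hypothesis is rejected in favor of the new machine being faster.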
Slide 47 | |
Slide 48 | For two sample hypotheses, there are independent samples and dependent samples involved. Independent samples are samples that are not related. Dependent samples are samples that are paired on a variable or related in a before/after design. For example: If you wished to buy a car you might look at the same car at two different dealerships and compare the prices. If you wished to measure the effectiveness of a new diet you would weigh each dieter at the start and at the finish of their program to see how much weight, if any, each of them lost. |
Slide 49 | Independent samples are usually larger and generally use the z statistic. Because the samples are not related, they usually have more variation and a somewhat larger standard error. Dependent samples are usually smaller and generally use the t statistic. Because the samples are related, they tend to have a smaller standard error because of less variation within their samples. |
Slide 50 | |
Slide 51 | Example. An independent testing agency is comparing the daily rental cost for renting a compact car from Hertz and Avis. A random sample of eight cities revealed the following information. At the 0.05 significance level, can the testing agency conclude that there is a difference in the rental rates charged by Hertz and Avis? |
Slide 52 | |
Slide 53 | |
Slide 54 | |
Slide 55 | |
Slide 56 | For independent samples, we start with 2 sets of sample data, each from different populations. If the populations are similar regarding a specified variable, then the sample means will be about the same. So we wish to know whether the distribution of the differences in sample means has a mean of 0. If both samples contain at least 30 observations, we use z as the test statistic. |
Slide 57 | No assumptions about the shape of the two populations are required. The samples are from independent populations. Independent samples are samples that are not related in any way. Here’s the formula for computing the value of z. |
Slide 58 | Example 1: Two cities, Bradford and Kane, are separated only by the Conewango River. There is competition between the two cities. The local paper recently reported that the mean household income in Bradford is $38,000 with a standard deviation of $6,000 for a sample of 40 households. The same article reported the mean income in Kane is $35,000 with a standard deviation of $7,000 for a sample of 35 households. At the .01 significance level, can we conclude the mean income in Bradford is more than in Kane? |
Slide 59 | The decision is not to reject the null hypothesis because the test statistic, z = 1.9781 is not larger than the critical value of z = 2.326 at the .01 LOS. |
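The test statistic in this decision can be reproduced with a minimal Python sketch. It assumes the usual two-sample z formula for independent samples, the difference in sample means divided by the square root of (s1 squared over n1 plus s2 squared over n2); the function name `two_sample_z` is hypothetical.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for the difference between two independent sample means."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


# Bradford is sample one, Kane is sample two; H1: Bradford mean > Kane mean.
z = two_sample_z(38000, 6000, 40, 35000, 7000, 35)
z_crit = NormalDist().inv_cdf(1 - 0.01)  # one-tailed, .01 level of significance
print(round(z, 4), round(z_crit, 3), z > z_crit)
```

This gives z of about 1.9781 against a critical value of about 2.326, matching the decision not to reject the null hypothesis.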
Slide 60 | The t distribution is used for the test statistic if one or both of the samples have fewer than 30 observations. The required assumptions are: 1. Both populations must follow the normal distribution. 2. The populations must have equal standard deviations. 3. The samples are from independent populations. |
Slide 61 | Finding the value of the test statistic requires two steps. 1. Pool the sample standard deviations. 2. Determine the value of t from the following formula. |
Slide 62 | Example 2: A recent EPA study compared the highway fuel economy of domestic and imported passenger cars. A sample of 15 domestic cars revealed a mean of 33.7 mpg with a standard deviation of 2.4 mpg. A sample of 12 imported cars revealed a mean of 35.7 mpg with a standard deviation of 3.9 mpg. At the .05 significance level, can the EPA conclude that the average mpg of the domestic cars is lower? The domestic cars will be sample one and the imported cars will be sample two. |
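The two steps for this example, pooling and then computing t, can be sketched in Python. The sketch assumes the standard pooled-variance formulas for two independent small samples. The critical value of 1.708 (one tail 0.05, df = 25) is taken from a standard t table and is not stated in the lecture text, so treat it as an assumption; the variable names are illustrative.

```python
import math

# Sample one: domestic cars. Sample two: imported cars.
n1, mean1, s1 = 15, 33.7, 2.4
n2, mean2, s2 = 12, 35.7, 3.9

# Step 1: pool the sample variances, weighting each by its degrees of freedom.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df

# Step 2: compute t for the difference in sample means.
t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Left-tailed test (H1: domestic mean is lower), so reject if t < -1.708.
t_crit = -1.708  # assumption: standard t-table value for df = 25
reject_h0 = t_stat < t_crit
print(df, round(pooled_var, 3), round(t_stat, 3), reject_h0)
```

Under these assumptions the pooled variance is about 9.918 and t is about -1.640, which does not fall below -1.708, so the null hypothesis would not be rejected at the .05 level.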
Slide 63 | |
Slide 64 | Check Your Understanding. Answer the question below. |
Slide 65 | You have completed the Week Three Interactive Presentation. Please return to Week Three in Blackboard to continue the curriculum for Week Three. |