Statistics Assignment – Inferential Statistics and Hypothesis Testing
General Guidelines
In this graded assignment, you will answer questions based on the concepts that you have learnt in Inferential Statistics and Hypothesis Testing. This is a subjective assignment, so you are required to write your answers and submit them in a PDF file. For normal text, you can use MS Word or any similar software that can convert documents to the .pdf format. For equations, figures, etc., you can write or draw them on a blank sheet of paper, photograph them, and insert the images into the same Word document.
Please limit your answers to 200-300 words per question. While calculating values, ensure you write all the necessary steps and formulae. Also, use the correct terminology to present the solution.
The final submission will be in one PDF file. A sample PDF file that illustrates the submission format is given below. You are advised to go through it at least once to understand the solution methodology that you are expected to follow.
Note: The solution files will be tested using automatic plagiarism checkers, and heavy penalties will apply if plagiarism is detected. Therefore, answer the questions in your own words and avoid copy-pasting answers from online resources. If you include photographs, make sure that your writing is legible and easy to understand. You may lose marks if an image is unclear or the writing is illegible, so please be careful.
Problem Statement
Comprehension
The pharmaceutical company Sun Pharma is manufacturing a new batch of painkiller drugs, which are due for testing. Around 80,000 new products are created and need to be tested for their time of effect (which is measured as the time taken for the drug to completely cure the pain), as well as the quality assurance (which tells you whether the drug was able to do a satisfactory job or not).
Question 1:
The quality assurance checks on the previous batches of drugs found that a drug is 4 times more likely to produce a satisfactory result than not.
Given a small sample of 10 drugs, you are required to find the theoretical probability that at most 3 drugs are not able to do a satisfactory job.
a.) Propose the type of probability distribution that would accurately portray the above scenario, and list out the three conditions that this distribution follows.
b.) Calculate the required probability.
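As a way to check the arithmetic in part b.), the sketch below sums binomial probabilities directly. It assumes that "4 times more likely" translates to P(satisfactory) = 0.8 and P(not satisfactory) = 0.2, and that the 10 drugs are independent trials; the helper name `binomial_cdf` is illustrative only and not part of the assignment.

```python
from math import comb

def binomial_cdf(k_max, n, p):
    """Return P(X <= k_max) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_max + 1))

# Assumption: satisfactory : unsatisfactory odds of 4 : 1,
# so P(unsatisfactory) = 1 / (4 + 1) = 0.2.
p_unsatisfactory = 0.2
n_drugs = 10

# Probability that at most 3 of the 10 drugs are unsatisfactory.
prob_at_most_3 = binomial_cdf(3, n_drugs, p_unsatisfactory)
print(round(prob_at_most_3, 4))  # 0.8791
```

The written answer should still show the individual terms P(X = 0) through P(X = 3) and the formula used, as required by the guidelines.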
Question 2:
For the effectiveness test, a sample of 100 drugs was taken. The mean time of effect was 207 seconds, with a standard deviation of 65 seconds. Using this information, you are required to estimate the range in which the population mean might lie at a 95% confidence level.
a.) Discuss the main methodology you will use to approach this problem. State all the properties of the required method. Limit your answer to 150 words.
b.) Find the required range.
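One possible numerical check for part b.) is sketched below. It assumes the sample is large enough (n = 100) for the Central Limit Theorem to justify a normal-based interval, and it uses the sample standard deviation as an estimate of the population standard deviation; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample data given in the question.
n, sample_mean, sample_sd = 100, 207.0, 65.0
confidence = 0.95

# Two-sided critical value z* (about 1.96 for 95% confidence).
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Standard error of the mean and margin of error.
standard_error = sample_sd / sqrt(n)
margin = z_star * standard_error

ci_low, ci_high = sample_mean - margin, sample_mean + margin
print(f"{ci_low:.2f} to {ci_high:.2f} seconds")  # 194.26 to 219.74 seconds
```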
Question 3:
a) The painkiller drug needs to have a time of effect of at most 200 seconds to be considered as having done a satisfactory job. Given the same sample data (size, mean, and standard deviation) as in the previous question, test the claim that the newer batch produces a satisfactory result and passes the quality assurance test. Use 2 hypothesis testing methods to make your decision. Take the significance level as 5%. Clearly specify the hypotheses, the calculated test statistics, and the final decision that should be made for each method.
b) You know that two types of errors can occur during hypothesis testing, namely Type-I and Type-II errors, whose probabilities are denoted by α and β respectively. For the current sample conditions (sample size, mean, and standard deviation), the values of α and β come out to be 0.05 and 0.45 respectively.
Now, a different sampling procedure (with a different sample size, mean, and standard deviation) is proposed so that when the same hypothesis test is conducted, the values of α and β are controlled at 0.15 each. Explain under what conditions either procedure would be preferred over the other, i.e. give an example of a situation where conducting a hypothesis test with α and β at 0.05 and 0.45 respectively would be preferred over having them both at 0.15. Similarly, give an example of the reverse scenario: a situation where conducting the hypothesis test with both α and β fixed at 0.15 would be preferred over having them at 0.05 and 0.45 respectively. Also, provide suitable reasons for your choice. (Assume that only the values of α and β mentioned above are provided to you and no other information is available.)
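For part a), the two decision rules can be checked numerically with a sketch like the one below. It assumes the critical value method and the p-value method are the two methods intended, frames the hypotheses as H0: μ ≤ 200 against H1: μ > 200 (an upper-tailed test), and uses a z-test because the sample size is large. These framing choices are assumptions, not something the question states outright.

```python
from math import sqrt
from statistics import NormalDist

# Sample data from Question 2 and the threshold from Question 3.
n, sample_mean, sample_sd = 100, 207.0, 65.0
mu_0, alpha = 200.0, 0.05

# Test statistic under H0: mu <= 200 (assumed framing, upper-tailed test).
z_stat = (sample_mean - mu_0) / (sample_sd / sqrt(n))

# Method 1: critical value. Reject H0 if z_stat exceeds the upper-tail cutoff.
z_critical = NormalDist().inv_cdf(1 - alpha)
reject_by_critical_value = z_stat > z_critical

# Method 2: p-value. Reject H0 if the upper-tail probability is below alpha.
p_value = 1 - NormalDist().cdf(z_stat)
reject_by_p_value = p_value < alpha

print(f"z = {z_stat:.4f}, critical z = {z_critical:.4f}, p-value = {p_value:.4f}")
print("Reject H0:", reject_by_critical_value, reject_by_p_value)
```

Both rules should always agree with each other, since they are two views of the same test; a disagreement would indicate a calculation error.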
Question 4:
Now, once the batch has passed all the quality tests and is ready to be launched in the market, the marketing team needs to plan an effective online ad campaign to attract new customers. Two taglines were proposed for the campaign, and the team is currently divided on which option to use.
Explain why and how A/B testing can be used to decide which option is more effective. Give a stepwise procedure for the test that needs to be conducted.
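The analysis step of such an A/B test is often a two-proportion z-test on conversion (for example, click-through) rates. The sketch below shows that step only. The visitor and click counts are entirely hypothetical placeholders, and the function name `two_proportion_ztest` is illustrative; the assignment does not prescribe a specific metric or test.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for H0: rate_a == rate_b, using the pooled proportion."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: visitors randomly split between tagline A and tagline B.
clicks_a, visitors_a = 120, 2400
clicks_b, visitors_b = 156, 2400

z, p_value = two_proportion_ztest(clicks_a, visitors_a, clicks_b, visitors_b)
alpha = 0.05
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("Significant difference" if p_value < alpha else "No significant difference")
```

In practice, the significance level and the required sample size would be fixed before the campaign starts, and the random split of visitors is what allows the difference to be attributed to the tagline.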
Assignment Rubrics
Rubric
Below are the weightages for the different questions present in the assignment.
Criterion | Meets expectations | Does not meet expectations |
Question 1 (~15%) | The correct distribution is identified and all the assumptions that led to the decision of choosing the method/probability distribution are mentioned. The solution follows a coherent structure. All the numerical values that are calculated are correct and are derived using the right procedure. The necessary steps are mentioned with correct notations. The final answer matches the correct answer including the intermediate values if any. | The correct distribution is not identified and all the assumptions that led to the decision of choosing the method/probability distribution aren’t mentioned. The solution contains an incoherent structure. Values are chosen at random and the solution is not explained clearly. Random notations or no notations are used. The final answer doesn’t match the correct answer. |
Question 2 (~25%) | The correct methodology is identified and all the assumptions that led to the decision of choosing the method/probability distribution are mentioned. The solution follows a coherent structure. All the numerical values that are calculated are correct and are derived using the right procedure. The necessary steps are mentioned with correct notations. The final answer matches the correct answer. | The correct methodology is not identified and all the assumptions that led to the decision of choosing the method/probability distribution aren’t mentioned. The solution contains an incoherent structure. Values are chosen at random and the solution is not explained clearly. Random notations or no notations are used. The final answer doesn’t match the correct answer. |
Question 3 (~45%) | The correct hypotheses are identified and all the assumptions that led to the decision of choosing them are mentioned. The solution/hypothesis testing procedure follows a coherent structure. All the numerical values that are calculated are correct and are derived using the right procedure. The necessary steps are mentioned with correct notations. The final answer matches the correct answer. The correct conditions under which the given situations are acceptable are mentioned with proper reasoning. | The correct hypotheses aren’t identified and all the assumptions that led to the decision of choosing them aren’t mentioned. The solution contains an incoherent structure. Values are chosen at random and the solution is not explained clearly. Random notations or no notations are used. The final answer doesn’t match the correct answer. The correct conditions in which the given situations are acceptable aren’t mentioned. Also, the reasoning is vague. |
Question 4 (~15%) | A proper explanation is provided with a clear step-wise procedure. | An improper explanation is given with an incoherent procedure. |