Overall Instructions.
- Use a font color other than black for your inputs and answers.
Question 1 (2 points)
In “Why Experiments”, we discussed cases in which correlation in the data does not imply causation. In fact, a specific data pattern (e.g., a correlation) can often be supported by multiple underlying stories. An important part of the data scientist's or quant marketer’s job is to pin down which story best explains all the patterns found in the data. As a manager, as your experience with data grows, guessing what is driving a data pattern will become second nature. You will automatically question the credibility of a causal claim when it is based on correlation alone.
The following examples will train you to think about possible explanations for a given data pattern. In each case, a data pattern is shown and one (causal) claim is provided. If you find that the explanation provided is not the only explanation, describe one other possible explanation/story that could drive such a data pattern. If you believe the claim is the only explanation for the data pattern, briefly discuss why you support it. You are encouraged to think outside the box about the potential causes of the data pattern, as long as they are sensible. Grading will be based on whether the explanation you give can produce the data pattern observed.
Optional background reading: “Correlation is not causation: why the confusion of these concepts has profound implications, from healthcare to business management.”
Data Pattern: (This is an example. No grade is assigned.) Conversion rates are higher for longer sales calls.
Claim: This means keeping a longer conversation with the customers is a good strategy to get them to buy, so the telemarketing salespeople should push for longer conversation time if possible.
Your explanation: Customers who are more interested in purchasing the products stay longer on the line. This is why customers who eventually purchased have a higher average call time than customers who did not purchase.
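The worked example above can be illustrated with a small simulation. This is only a sketch under assumed numbers: the latent "interest" variable, the call-length formula, and the purchase probabilities are all hypothetical and chosen for illustration. Call length has no causal effect on purchase in this simulation, yet longer calls still show a higher conversion rate because interest drives both.

```python
import random

def simulate_calls(n=20000, seed=0):
    """Simulate sales calls where purchase depends ONLY on latent interest."""
    rng = random.Random(seed)
    calls = []
    for _ in range(n):
        interest = rng.random()  # hypothetical latent interest in [0, 1]
        # Interested customers stay on the line longer (assumed relationship).
        minutes = 2 + 10 * interest + rng.gauss(0, 1)
        # Purchase probability depends on interest, never on call length.
        bought = rng.random() < 0.05 + 0.4 * interest
        calls.append((minutes, bought))
    return calls

def conversion_rate(calls):
    """Share of calls that ended in a purchase."""
    return sum(bought for _, bought in calls) / len(calls)

calls = simulate_calls()
short_calls = [c for c in calls if c[0] < 7]
long_calls = [c for c in calls if c[0] >= 7]

print("short calls:", round(conversion_rate(short_calls), 3))
print("long calls: ", round(conversion_rate(long_calls), 3))
```

In this setup, forcing salespeople to lengthen calls would not change the purchase probability, even though the correlation between call length and conversion is clearly positive.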
Data Pattern 1 (0.5 points): An automaker runs a display ad campaign for a new car model and summarizes consumers' purchase conversion rates by how they reacted to the ads.
The data shows that consumers who did not click on any of the carmaker’s ads have a <1% conversion rate into purchasing the new model. Among consumers who clicked on the dealer ads alone, the conversion rate is 3%; among consumers who clicked on the carmaker ads alone, the conversion rate is 5%; and among consumers who clicked on both dealer and carmaker ads, the conversion rate is 14%.
Claim: The data pattern suggests that (claim 1) carmaker ads are more effective than dealer ads, and (claim 2) combining carmaker ads and dealer ads is even more effective in persuading consumers.
Your explanation:
Data Pattern 2 (0.5 points): OperaXYZ usually holds its new music season ticket sales event in May. To prepare for the May sales event for OperaXYZ’s 50th anniversary, its marketing department has redesigned its ticket sales pages. In June, the marketing department looked at the numbers and found that the conversion rate from site visit to purchase actually dropped in May compared with the previous year.
Claim: The marketing department concludes that the new design is a failure and has led to worse conversion rates.
Your explanation:
Data Pattern 3 (0.5 points): BestHome is a national retailer that sells electronic home products, and it has physical stores across the country. It is planning to launch shop-in-shops (SIS) with Apple for Apple home products. The SIS model dedicates space in BestHome shops to Apple and allows Apple to design the space to resemble an Apple store. BestHome hopes that the Apple SIS will not only increase Apple sales but also attract more Apple fans, who may end up buying other products in the store. BestHome has launched SIS in the 50 stores where Apple products have the highest sales. After six months, BestHome compares sales data between stores with Apple SIS and those without. The data shows that 1) Apple product sales are higher in stores with Apple SIS than in those without; 2) non-Apple product sales are higher in stores with Apple SIS than in those without.
Claim: BestHome concludes that the Apple SIS model both increases Apple product sales and has a positive spillover effect on the sales of non-Apple products.
Your explanation:
Data Pattern 4 (0.5 points): GEE is a home electronics company. It has four generations of Smart Freezer products. It has recently developed the fifth generation and marketed the new product heavily. At the same time, the fifth generation is priced on average $500 higher than the fourth-generation Smart Freezers. The data shows that first-month sales of the fifth-generation Smart Freezer are higher than those of the fourth generation.
Claim: GEE concludes that paired with advertising, higher price tags send a signal for a higher quality of the new generation of products, so when advertising is done correctly, a higher price may lead to higher demand.
Your explanation:
Question 2 (4 points)
The purpose of this question is to have you practice designing a randomized controlled experiment to solve marketing challenges. Read the four cases in the PDF document “Dare to Experiment: The Scientific Approach to Consumer Behavior.” Answer the questions at the end of each case and the additional questions listed below. Your answers will be graded based on completeness and correctness.
Case 1: “Bring Back the Billboard”
In the space below, answer the two questions asked at the end of this case. In answering the second question, you do not need to propose new methods; focus on what needs to be changed. (0.4 points)
Suppose Marek Tkacik, VP of marketing, decides to use a randomized experiment to test which image works better. Design a randomized experiment that can help solve this problem and write down its details (including but not limited to the following: 1. Sample: random sample of what…, 2. What randomization unit… parameters for randomization, 3. Treatment / Variant…, 4. Outcome Metrics to track…, 5. Statistical power of Outcome Metrics… binary vs. continuous, variance, 6. Guardrail Metrics). (0.6 points)
List one advantage and one concern of using a randomized controlled experiment in this particular case. (0.4 points)
Case 2: “Oh Where, Oh Where Have All the Donations Gone?”
In the space below, answer the two questions asked at the end of this case. To be more specific, for the first question, briefly discuss whether the goal Lozanda sets is well defined as the goal of a randomized controlled experiment. If not, in your answer to the second question, try raising an alternative goal and write a design plan for an experiment. If you believe Lozanda’s goal is appropriate, also write down your design plan for an experiment. (0.8 points)
Case 3: “Not So Retiring”
Answer the Case 3 questions in the space below. (0.9 points)
Case 4: “The Problem and the Experiment, or the Experiment and the Problem”
Answer the Case 4 questions in the space below. (0.9 points)
Question 3 Evaluating Advertising Effects with Randomized Experiments (1 point)
We will be using the following case. The purpose of this question is to familiarize you with its setting.
Background
RestaurantGate (RG) is a national restaurant review website. RG sells search advertising packages to restaurants. Restaurants pay $100 per month for the advertising package and are featured in the search advertising slots (the top position on the search result page). RG would like to evaluate internally whether restaurants get more page views and leads by purchasing the advertising package. To this end, RG decided to run an experiment.
The Experiment Design
In this experiment, RG randomly selects a sample of 20,000 restaurants that are not actively advertising on its platform. The restaurant sample is representative of all restaurants in the US. Half of the restaurants in this sample are selected into the treated group. The treatment is the standard search advertising package that RG sells. The experiment will be run for a month.
During the experiment, each treated restaurant will receive an advertising package for free, and each control restaurant will not receive any advertising. None of the restaurants are informed about the experiment, and RG expects the restaurants to run their businesses as usual. RG will track three outcomes for each restaurant: its monthly page views (each restaurant has its own page on RG, just like a restaurant’s page on Yelp or Tripadvisor), the number of calls made to the restaurant from its RG page, and the number of reservations made at the restaurant through RG. RG wants to assess the effect of the paid search advertising package on these outcomes.
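The design described above can be sketched in a few lines of code. This is a minimal illustration, not RG's actual procedure: it assumes that the half of the sample placed in the treated group is chosen at random, and the function names, the integer restaurant IDs, and the seed are all hypothetical.

```python
import random
from statistics import mean

def assign_treatment(restaurant_ids, seed=42):
    """Randomly place half of the restaurants in the treated group.

    Returns a dict mapping restaurant id -> True (treated) or False (control).
    """
    rng = random.Random(seed)
    shuffled = list(restaurant_ids)
    rng.shuffle(shuffled)
    treated = set(shuffled[: len(shuffled) // 2])
    return {rid: (rid in treated) for rid in restaurant_ids}

def difference_in_means(outcomes, assignment):
    """Estimate the average treatment effect on one outcome.

    outcomes: dict mapping restaurant id -> outcome value (e.g. page views).
    """
    treated = [y for rid, y in outcomes.items() if assignment[rid]]
    control = [y for rid, y in outcomes.items() if not assignment[rid]]
    return mean(treated) - mean(control)

# Hypothetical IDs standing in for the 20,000 sampled restaurants.
restaurant_ids = range(20000)
assignment = assign_treatment(restaurant_ids)
n_treated = sum(assignment.values())
print("treated:", n_treated, "control:", len(assignment) - n_treated)
```

Once the month is over, difference_in_means would be called once per tracked outcome (page views, calls, reservations), using the observed values rather than any simulated data.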
Discussion
(a) List one reason you can think of that motivates RG to select a sample of restaurants “that are not actively advertising on its platform.” (0.3 points)
(b) If you are the manager overseeing the experiment, are there any factors that worry you as threats to a valid/correct experiment? List one of them. (0.4 points)
(c) For RG, one important factor that segments the restaurant market is whether a restaurant is a chain restaurant. When designing the experiment, RG intends to test the effect of advertising for chain restaurants and independent restaurants separately. The best practice to achieve this is to split the sample into chain restaurants and independent restaurants first, then, within each subsample, randomly allocate restaurants to the treated and control groups. You can evaluate advertising effectiveness by comparing outcomes between the treated and control groups within each subsample. Based on your intuition, how will search advertising effects differ between chain and independent restaurants? Briefly explain. (0.3 points)
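The split-then-randomize procedure described in (c) is often called stratified (or blocked) randomization. A minimal sketch follows. The function names, the seed, and the 6,000 / 14,000 split between chain and independent restaurants are hypothetical and used only for illustration; the case does not state the actual split.

```python
import random
from collections import defaultdict
from statistics import mean

def stratified_assignment(restaurants, seed=7):
    """restaurants: list of (restaurant_id, is_chain) pairs.

    Splits the sample by is_chain, then randomizes within each subsample.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rid, is_chain in restaurants:
        strata[is_chain].append(rid)

    assignment = {}
    for is_chain, ids in strata.items():
        rng.shuffle(ids)
        half = len(ids) // 2  # first half of the shuffled stratum is treated
        for i, rid in enumerate(ids):
            assignment[rid] = {"is_chain": is_chain, "treated": i < half}
    return assignment

def effect_by_stratum(outcomes, assignment):
    """Treated-minus-control difference in means, computed per stratum."""
    groups = defaultdict(lambda: {True: [], False: []})
    for rid, info in assignment.items():
        groups[info["is_chain"]][info["treated"]].append(outcomes[rid])
    return {s: mean(g[True]) - mean(g[False]) for s, g in groups.items()}

# Hypothetical sample: ids 0-5999 are chains, the rest are independents.
restaurants = [(i, i < 6000) for i in range(20000)]
assignment = stratified_assignment(restaurants)
chains_treated = sum(a["treated"] for a in assignment.values() if a["is_chain"])
indep_treated = sum(a["treated"] for a in assignment.values() if not a["is_chain"])
print("treated chains:", chains_treated, "treated independents:", indep_treated)
```

Randomizing within each subsample guarantees that both chain and independent restaurants are evenly divided between treated and control groups, which a single randomization over the pooled sample would achieve only approximately.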