Matthew Lawrence Christy
UXD/HCI Student / Game Designer

The DoorDashers' Stories

A/B research methodologies

Introduction

DoorDash has been a lifeline for both restaurants and drivers since the start of the COVID-19 pandemic. The increase in online orders has been helping restaurants keep their doors open, while the compensation and tips received from fulfilling these orders have been helping millions of Dashers facing economic hardships.

We predict that giving users a way to view Dashers' personal stories will increase both the likelihood of tipping and the tip amount, because users will gain an empathetic view of Dashers' hardships during the COVID-19 pandemic. We will know this is true if we see an increase in the number of tips and in tip dollar amounts.

Methods

We propose conducting an A/B test on two user Test Cells: Test Cell B would receive the Dasher story update (an option to view their Dasher's story and tip), and Test Cell A, the control Test Cell, would continue using the current DoorDash interface, which gives the user the option of tipping without viewing a story.

Upon opening the DoorDash app, users will be randomly assigned to either Test Cell A (control) or Test Cell B (test). Test Cell A will receive access to the current DoorDash app, while Test Cell B will receive access to DoorDash with the following change implemented:

The Delivery Status screen of DoorDash will be modified to display a personalized quote about how COVID-19 may have affected the user's Dasher. A "View Story" option will be located next to the quote so that the user may tap it to view the Dasher's story page, where they may reconsider adding a tip or increase their existing tip amount.

For users in Test Cell B, we want to answer two questions:

○      Do they tap the "View Story" option and reconsider tipping?
○      Do they increase the amount of their existing tip?
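
The random assignment described above could be sketched as follows. This is a minimal illustration, not DoorDash's or any testing tool's actual mechanism: it swaps true randomization for a deterministic hash of the user ID so that a returning user always lands in the same Test Cell. The function name `assign_test_cell` and the experiment label `"dasher-story"` are hypothetical.

```python
import hashlib


def assign_test_cell(user_id: str, experiment: str = "dasher-story") -> str:
    """Return "A" (control) or "B" (story variant) for a user.

    Assumption: assignment is derived from a hash of the user ID rather than
    a fresh random draw, so the same user sees the same interface on every
    app launch during the experiment.
    """
    # Salt the hash with the (hypothetical) experiment name so that other
    # experiments would split the same users differently.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # The parity of the first byte gives an approximately 50/50 split.
    return "B" if digest[0] % 2 else "A"
```

Because the result depends only on the user ID and the experiment label, the split can be reproduced later when the collected data is analyzed.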

Current Tipping UX
Proposed change to the UX
Data Collection Plan

We have decided to run our A/B tests on Fridays, Saturdays, and Sundays from 5 pm until 3 am for four weekends (approximately one month), as DoorDash analytics suggest that Friday, Saturday, and Sunday evenings are the most common times for people to order takeout food. The test and the control Test Cells will consist of 250 randomly selected DoorDash users. The data to be collected and the metrics used for analysis are explained below.

  • Story views and the number of tips and tip amounts received are appropriate metrics to accomplish the objectives of this A/B test. These quantitative data will be automatically collected utilizing a dedicated A/B testing tool with integrated statistical analysis functionality (e.g., Optimizely); this is a form of remote, unmoderated observation. 

  • The following metrics will be collected from Test Cell B: 

    • Primary metric: Whether or not a user used the "view story" button to view Dashers' personal messages.   

    • Secondary metric: The number of tips received after users have viewed Dashers' personal stories.

    • Secondary metric: The dollar amount of the tip received after users have viewed Dashers' personal stories.

  • The control Test Cell A will continue to collect data on whether or not the user tipped and the amount for comparison.

Data Analysis Plan

There will be two categorical variables: the first is whether or not the user viewed a Dasher's story (Test Cell B only), and the second is whether or not a tip was left (both Test Cells A and B). There is one continuous numerical variable: the tip amount (both Test Cells A and B). All data variables are quantitative.
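
To make these variables concrete, one collected observation could be represented as in the sketch below. The class name `OrderObservation`, its field names, and the helper `tip_rate` are hypothetical and chosen for illustration only; they are not part of any A/B testing tool.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class OrderObservation:
    test_cell: str                # "A" (control) or "B" (story variant)
    viewed_story: Optional[bool]  # categorical; None in Test Cell A, which has no story
    tipped: bool                  # categorical; True if a tip was left
    tip_amount: float             # continuous, in dollars; 0.0 when no tip was left


def tip_rate(observations: Iterable[OrderObservation]) -> float:
    """Proportion of the given observations in which a tip was left."""
    observations = list(observations)
    if not observations:
        raise ValueError("at least one observation is required")
    return sum(o.tipped for o in observations) / len(observations)
```

Keeping `viewed_story` empty for Test Cell A makes it explicit that the story option never existed for the control users.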

We will begin by analyzing the conversions in Test Cell B. The data of the users who did not view the story will be discarded. We will then compare the conversions (those in Test Cell B who tipped after viewing the story) to the number of users who left tips in Test Cell A. This will show whether users who read the story are more likely to leave a tip, and whether their tip amounts differ, compared with users who did not have the option to read the story. This will be done in two tests:

  • The first test will be a proportion test to compare the proportion of users who tipped after reading the story in Test Cell B with the proportion of users who tipped in Test Cell A.

  • The second test will be a t-test with an alpha of 0.05 to compare the tip amounts of users who tipped in Test Cell B with the tip amounts of users who tipped in Test Cell A.
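
The two tests above could be sketched as follows, using only the Python standard library. Two assumptions are ours and not part of the plan: the proportion test is implemented as a pooled two-proportion z-test, and the t-test is implemented as Welch's unequal-variance variant. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_proportion_z_test(tipped_b, n_b, tipped_a, n_a):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_b = tipped_b / n_b
    p_a = tipped_a / n_a
    # Pooled tip rate under the null hypothesis that both cells tip equally.
    pooled = (tipped_a + tipped_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def welch_t_statistic(tips_b, tips_a):
    """Welch's t statistic and degrees of freedom for two lists of tip amounts.

    Assumption: unequal variances between the cells. The p-value is not
    computed here because the standard library has no t distribution.
    """
    # Squared standard errors of each sample mean.
    se2_b = variance(tips_b) / len(tips_b)
    se2_a = variance(tips_a) / len(tips_a)
    t = (mean(tips_b) - mean(tips_a)) / sqrt(se2_b + se2_a)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_b + se2_a) ** 2 / (
        se2_b ** 2 / (len(tips_b) - 1) + se2_a ** 2 / (len(tips_a) - 1)
    )
    return t, df
```

A p-value below the 0.05 alpha from the proportion test would indicate a difference in tipping likelihood between the cells. For the tip amounts, the t statistic and degrees of freedom would be compared against the t distribution, for example with SciPy's `scipy.stats.ttest_ind(tips_b, tips_a, equal_var=False)` if that library is available.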

We expect that the results of the data collection and the analysis from the two tests will support our hypothesis that users who can view and empathize with a Dasher's personal story will be more likely to leave a tip, and to leave a larger one.

Discussion

Interpreting the results

Success: If the analysis of the A/B test results supports our hypothesis and the experiment succeeds, a secondary experiment should be conducted to verify the first experiment's results, as far as our budgetary limits allow. Edge cases might include demographics such as economic status, cultural viewpoints, etc. Should the second experiment yield the same results, a rollout of the feature would be recommended. We hope that the rollout will not affect users who aren't able to tip but will persuade those who can.

Failure: If the experiment fails, an inherent cultural viewpoint regarding tipping may have influenced the results. Cultural views on tipping can vary with economic circumstances and geographical location. Surveys sent to the test cell could help collect data to determine why the experiment failed, which could lead to a redesign of the experiment. For example, a second experiment that includes the geographical demographics of tippers in both test cells should be considered. This may yield data that would indicate the type of person ordering, e.g., delivery to an area with a high student population, such as a university.

Design Implications to Consider: 

The stories shared by the Dashers can also come across as pretentious and, in a sensitive environment, might trigger the user to not tip instead of having a positive impact.

The risk of a "slight nudge" becoming a dark pattern remains a problem in the service app industry, particularly with features that appeal to users' empathy. The direction of the results can take a risky turn. What seems like a harmless narration of personal struggle can inadvertently be viewed as a ploy to pressure users.

Potential Pitfalls: 

This experiment may not include all the factors that impact tipping habits. To tackle this problem, an experiment on factors that influence tipping in DoorDash can be a proposed experiment '0' for this research. Further exploration of what empathy looks like and how it can be quantified amongst online users can help us understand users and their habits. 

Another consideration is that the "View Story" option might be enticing at first, but users may develop an aversion to tapping the option out of fear of influence, hence negating the entire point of the design change.

Conclusion

While there is some research in the field of online food delivery (OFD), the factors that determine tipping behavior in this industry are yet to be determined. Our research aims to determine whether sharing personal stories affects tipping behavior, but what this suggests about empathy towards delivery persons/agents remains unquantified. However, we hope that testing our hypothesis will help quantify this behavior to some extent and help establish a relation between tipping behavior and emotional incentive.