a. Revised Experiment Goals
Sharing: Is sharing a piece of news easy for users, as measured by completion time and number of mistakes compared to a similar real-world application?
Satisfaction: How satisfied are users with the overall experience?
We have decided to delve deeper into the experimental goal that focuses on sharing, setting aside the moderation and filtering goals mentioned in a previous blog post, because sharing was a major sticking point for all participants during Milestone II. We hope to run an experiment that will help us determine what is required to make sharing an intuitive and well-received feature. Comparing our sharing/posting feature against a successful real-world product with similar functionality should provide adequate feedback on whether our system is successful. We will gauge satisfaction via a questionnaire that addresses the seven facets of the user experience honeycomb.
b. Experiment Method
Participants:
We plan on recruiting 10 participants from our social circles. We chose this approach because our experiment has no strict participant requirements; the expected user population for our system varies greatly. The ideal candidates will have little to no experience with Reddit’s mobile application, which we can verify with an intake survey. This matters because we will be comparing our system against Reddit’s mobile application, and participants already experienced with it would put our system at a disadvantage during the experiment.
Conditions:
For the sharing goal we will use time as a measure of how easy sharing is (a faster time indicating a more intuitive sharing interface), and we will count the number of errors made when the user tries to post an article. This performance data will be collected for an identical task on both a real-world product (Reddit’s mobile application) and our application.
Tasks:
The participant will be asked to explore each application for 30 seconds. After each exploration period, the participant will be asked to share a predetermined news story by posting it through that application.
Design:
Our experiment will follow a basic two-condition design. Because each participant completes the same task on both interfaces, this is a within-subjects (repeated-measures) comparison between the performance data from Reddit’s mobile application and our system design.
Procedure:
- The interview should be conducted by two interviewers involving one participant at a time.
- We will first ask the participant to sign a consent form, and brief the participant on the experiment that will follow.
- A camera will be set up on a tripod, without a person behind the lens. This is a deliberate decision to minimize the awkwardness of introducing a camera to the experiment.
- The interviewer will follow the script (4.c) as closely as possible, in order to provide all the participants with a similar experience.
- The participants will be asked to complete a pre-questionnaire (4.c) that will gauge demographic information and, more importantly, previous knowledge of similar systems.
- A printed out sheet of tasks will be read aloud and then handed to the participants. This will be done in order to avoid having to remind participants of the task.
- Half of the subject pool will be asked to explore Reddit first, and the other half will be asked to explore our system first. This is done in order to avoid bias that may be introduced by having learned the tasks, considering identical tasks will be completed on both applications.
- Interviewer 1 will observe and take free-hand notes, while interviewer 2 fills out the coding sheet (4.c). We have decided to use two interviewers because a single interviewer may be unable to capture all the small details of the participant’s interaction with the system. We also understand that not all interviewers are alike, and some may be more capable of collecting useful data. We hope to minimize lost data by having two interviewers and recording video of the interaction.
- Next, the participant will be asked to explore the news feed in the first application for 30 seconds. The participant will then be asked to complete the first task: posting a predetermined story to the system. This interaction will be timed, and an error count will be recorded.
- After a quick 1-minute break to regroup, the participant will be asked to do the same task on the second system: a 30-second exploration period, followed by posting a predetermined article on the application.
- Finally the participant will be asked to complete a post-questionnaire (4.c). This questionnaire will determine general satisfaction levels according to the user experience honeycomb and any other feedback the participants have.
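The counterbalancing step above (half the pool starts with Reddit, the other half with our system) can be sketched as follows. The participant IDs, condition labels, and use of a seeded shuffle are illustrative assumptions, not part of the actual protocol:

```python
import random

def assign_orders(participant_ids, seed=None):
    """Counterbalance condition order: half of the pool explores Reddit
    first, the other half explores Mr.JDK's prototype first."""
    ids = list(participant_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)  # randomize which participants land in which half
    half = len(ids) // 2
    orders = {}
    for pid in ids[:half]:
        orders[pid] = ["Reddit", "Mr.JDK"]
    for pid in ids[half:]:
        orders[pid] = ["Mr.JDK", "Reddit"]
    return orders

# Example: a 10-participant pool, P1..P10 (hypothetical IDs)
orders = assign_orders([f"P{i}" for i in range(1, 11)], seed=42)
```

With an even pool of 10 this yields exactly five participants per starting condition, which is what the procedure requires to balance any learning effect.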
Apparatus:
- A camera will be set up on a tripod in order to record the participants’ interactions with the application for a later, deeper analysis.
- To keep conditions consistent between participants, the participant will be provided with an iPhone that has both applications pre-installed. The experiment will take place in an informal setting the participant is comfortable in (e.g., the participant’s work area), in order to emulate the location where this application may actually be used. We feel that conducting the experiment in a lab setting would be unnatural.
- The script will be printed out and available to the interviewer if needed. The tasks will also be printed out, as the participant may need to reference them during the experiment.
- A laptop will be provided, as it will be needed in order to complete both the pre and post questionnaire.
Independent Variable:
Types of interface
- Mr.JDK’s prototype
- Reddit’s mobile application
In this experiment, we will compare user performance on a specified task in Mr.JDK’s mobile application against performance on the same task in Reddit’s mobile application.
Dependent Variables:
Participants’ satisfaction level:
- During the post-experiment interview, we will ask participants to mark their level of satisfaction on the Like-Chart, which allows them to choose an answer on a continuum from 0 to 100.
Time of completing the given task:
- We will use a stopwatch to keep track of duration.
Number of errors:
- We will mark each error made by the participant while performing the task, and total them at the time of completion.
- Definition of error: the participant taps the wrong button while completing the task.
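A minimal way to record these three dependent measures per trial is a small record type. The field names and example values below are our own illustration, not taken from the actual coding sheet:

```python
from dataclasses import dataclass

@dataclass
class Trial:
    """One participant's attempt at the sharing task on one interface.
    Field names are illustrative, not from the real coding sheet."""
    participant: str
    interface: str     # "Mr.JDK" or "Reddit"
    seconds: float     # stopwatch time to complete the sharing task
    errors: int        # wrong-button taps made during the task
    satisfaction: int  # 0-100 mark from the post-questionnaire

def mean_time(trials, interface):
    """Mean completion time for one interface across all recorded trials."""
    times = [t.seconds for t in trials if t.interface == interface]
    return sum(times) / len(times)

# Invented example records for one participant:
trials = [
    Trial("P1", "Mr.JDK", 42.0, 1, 80),
    Trial("P1", "Reddit", 61.5, 3, 55),
]
```

Keeping one record per (participant, interface) pair makes the later per-condition comparison straightforward.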
Hypothesis:
H1:
Sharing/posting news stories on Mr.JDK’s prototype leads to faster user performance (measured as time to complete the task) compared to Reddit’s mobile application.
H2:
Sharing/posting news stories on Mr.JDK’s prototype will be less error-prone than sharing/posting news stories on Reddit.
H3:
Using Mr.JDK’s prototype generates a higher satisfaction level than using Reddit.
Planned statistical analysis:
We are measuring three variables in this experiment: number of errors, time of completion, and participants’ satisfaction level in numerical form. We plan to use both a t-test and ANOVA, giving us two angles on the collected data.
- ANOVA: ANOVA allows us to compare means across two or more levels of a factor and determine whether there is a difference between the sample groups. In our design the single factor is the interface, with two levels (Mr.JDK’s prototype and Reddit’s mobile application). With only two levels ANOVA reaches the same conclusion as the t-test, but it will extend naturally if we later add conditions or factors and want to examine the interactions between them.
- T-test: We plan to use a two-tailed t-test to establish our confidence in the difference we expect to find between the two sample means. Because each participant completes the task under both conditions, a paired t-test is the appropriate variant for the error counts, completion times, and satisfaction levels. To determine whether Mr.JDK’s prototype outperforms Reddit’s mobile application, we need to know whether the difference between the means is significant, and the t-test provides a valid answer to that question.
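The planned analysis can be sketched with SciPy on placeholder completion times (every number below is invented for illustration, not a real result). With exactly two conditions a one-way ANOVA is equivalent to the unpaired t-test (F equals t squared); since each participant uses both interfaces, the paired variant is also shown:

```python
from scipy import stats

# Placeholder task-completion times in seconds (invented for illustration),
# one value per participant, same participant order in both lists.
mrjdk_times  = [38.2, 45.1, 40.7, 52.3, 36.9, 41.5, 47.8, 39.4, 44.0, 42.6]
reddit_times = [55.4, 60.2, 49.8, 66.1, 51.3, 58.7, 63.0, 54.2, 59.5, 57.1]

# Unpaired two-tailed t-test (treats the two samples as independent).
t_unpaired, p_unpaired = stats.ttest_ind(mrjdk_times, reddit_times)

# Paired t-test: fits a design where the same participant produces one
# value in each condition, as in this counterbalanced procedure.
t_paired, p_paired = stats.ttest_rel(mrjdk_times, reddit_times)

# One-way ANOVA across the two conditions; with two groups this is
# equivalent to the unpaired t-test (F == t**2).
f_stat, p_anova = stats.f_oneway(mrjdk_times, reddit_times)
```

The same three calls apply unchanged to the error counts and the 0-100 satisfaction marks; only the input lists differ.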
Expected limitations of the planned experiment:
Due to time and financial restrictions, we are only able to recruit 10 participants to evaluate the quality of our medium-fidelity prototype. The small sample may lead to a less significant difference between the two sets of measurements. As designers, we should keep in mind that statistical insignificance does not imply that a difference is unimportant; we can counterbalance this by taking even small differences into account. The fact that we drill down into one aspect of this application, sharing, could also lead to a higher error rate, due to the lack of horizontal breadth and the vertical depth in most parts of Mr.JDK’s design.