Milestone III – Blog Update #4
Blog Update #4a – Revised goal(s) of experiment:
- On the level of content curation, the goal is to evaluate how our app compares to Google Maps, TripAdvisor, and other online travel resources. The comparison is in terms of the satisfaction and frustration users feel while completing a task, as well as the time taken to complete that task on our app versus the other tools.
- More specifically: in what ways is our app better or worse than the resources currently available for travel planning?
Blog Update #4b – Experiment method: detailed description of the following components:
- Participants – For our experiment, we plan to have a minimum of 5 participants. These participants should range from non-computer-science students, who are more likely to believe that the concierge bot is a true AI system, to computer science students, who may be more skeptical of the technology. The participants should also have travel experience, since they are the people who would actually use our system. To recruit participants, we plan to send out the questionnaire we used for our field study, which asks general questions about whether a person has travelled. Those with travel experience will be recruited for the experiment.
- Conditions – For our experiment, we plan to ask participants to complete a task using our interface and to complete the same task using Google and the online resources reachable through it. Although the task itself will be the same (e.g. find a museum in the city), the city will be Tokyo for our interface and Seoul for Google. This ensures that the two tasks are independent of each other and that the participant cannot carry knowledge from the first task over to the second. Furthermore, we plan to randomize whether the participant tests our interface first or Google first, again to keep the results as unbiased and independent as possible. We chose Tokyo and Seoul because they are comparable cities, which keeps the two tasks consistent while still producing independent results.
- Tasks – As mentioned, we are comparing our interface with Google (and online resources reachable through Google, such as TripAdvisor or Google Maps) with regard to collecting information about a travel destination. More specifically, the participant will be given a task matched to the interface they are using. When using our interface, the participant will be asked to find a museum to visit in Tokyo; when using Google, they will be asked to find a museum in Seoul. The task is focused and specific enough to direct and guide the participant while producing useful data, yet gives them full freedom to show us their process for selecting a museum in a new city.
- Design – Regarding our experimental design, participants will be given a phone on which to complete the required tasks. For this experiment, the phone will provide access to Google and will also have our prototype installed. Depending on whether the participant is using our interface or Google, they will need to find a museum to visit in Tokyo or Seoul, respectively. As mentioned, the order of the two tasks will be randomized. Because our app features a chat-bot, we will be on the other end of the chat answering any questions the participant asks in order to complete the task. The level of frustration, the level of satisfaction, and the time taken to complete the task will be measured and compared across both interfaces. We plan to time the process and, through observation, note any difficulties the participant has in either case.
- Procedure – In randomized order, the participant will complete either task 1 or task 2 first (see the sketch after this list for how the order could be assigned)
- Task 1: Participant is given the task of searching for a museum in Tokyo using our interface
- Task 2: Participant is given the task of searching for a museum in Seoul using Google
- Participant is asked interview questions regarding frustration levels and overall satisfaction with both processes
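A minimal sketch of how the order randomization could be carried out; the condition labels and the use of Python's random module are our own illustration, not part of the prototype.

```python
import random

# Two conditions: task 1 (our prototype, Tokyo) and task 2 (Google, Seoul).
CONDITIONS = ["prototype_tokyo", "google_seoul"]

def assign_order() -> list[str]:
    """Randomly decide which condition a participant completes first."""
    order = CONDITIONS.copy()
    random.shuffle(order)
    return order

# Example: assigned orders for 5 participants.
for participant in range(1, 6):
    print(f"P{participant}: {assign_order()}")
```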
- Apparatus –
- Phone with our prototype installed and access to Google (including other online resources)
- Remote person to answer the participant's chat questions
- Interview Questions
- Script to run an identical process for all participants
- Independent and dependent variables – The data collected for statistical analysis will come from the post-experiment interview questions, which will rate confusion, frustration, and preference between the two interfaces. We will also time task completion on both interfaces. The dependent variables will be the time spent on task completion, user satisfaction ratings and preference, and the number and severity of points of confusion encountered. The independent variable will be the interface type (our prototype, or Google with access to other online resources). Individual differences in experience with trip-planning tools will be offset by randomizing, within participants, the order in which the two interfaces are used.
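A sketch of the record we might log for each trial, with hypothetical field names; it simply groups the independent variable (interface) with the dependent variables listed above.

```python
from dataclasses import dataclass

@dataclass
class TrialRecord:
    """One row per participant per interface (hypothetical field names)."""
    participant_id: int
    interface: str            # independent variable: "prototype" or "google"
    order_position: int       # 1 if this interface was used first, 2 otherwise
    time_seconds: float       # dependent variable: time to complete the task
    satisfaction_rating: int  # dependent variable: e.g. 1-5 from the interview
    confusion_points: int     # dependent variable: count of observed difficulties

# Example record for one trial (placeholder values).
example = TrialRecord(participant_id=1, interface="prototype",
                      order_position=1, time_seconds=142.0,
                      satisfaction_rating=4, confusion_points=1)
```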
- Hypotheses – stated in terms of the independent and dependent variables:
H0 – There is no significant difference between the two interfaces in terms of time spent, satisfaction/preference, or points of confusion for users
H1 – Users will be more satisfied with the prototype and prefer it over Google (including other online resources) for carrying out the task
H2 – Users will experience significantly fewer points of confusion when using the prototype than when using Google (including other online resources)
H3 – Users will spend significantly less time completing the task with the prototype than with Google (including other online resources)
- Planned statistical analyses
We plan to run t-tests on the collected data to test the hypotheses. The data used will differ depending on which hypothesis we are testing. Data will be gathered from the post-experiment interview questions and the recorded task times.
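A minimal sketch of one such analysis, assuming a within-subjects comparison of time on task (each participant contributes one time per interface, so a paired t-test applies); the numbers below are placeholders, not collected data, and the same pattern would apply to the satisfaction ratings and confusion counts.

```python
from scipy import stats

# Placeholder times (seconds), one value per participant per interface.
prototype_times = [140.0, 155.0, 130.0, 162.0, 148.0]
google_times    = [170.0, 150.0, 165.0, 180.0, 160.0]

# Paired t-test, since each participant completes the task on both interfaces.
result = stats.ttest_rel(prototype_times, google_times)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")
```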
- Expected limitations of the planned experiment
Our prototype features a map and a chat-bot. The map is hard-coded and cannot be updated dynamically in response to user requests. Although we plan to cover the cases necessary for the user to complete the task, the lack of dynamic mapping may limit the information a user can request or receive from the app and, therefore, our results. Another limitation is that the chat-bot is not the intelligent AI system we are letting the user believe it is, but rather a human. Given this, the responses the user receives may be delayed and may not be as thorough as those of a properly implemented AI system. Also, most of the data collected for statistical analysis will come from the post-experiment interview questions, so there is a chance they may not fully quantify certain aspects of our interface.
Blog Update #4c – Supplemental experiment materials: