The Collaborative Video Project
Developing a better platform for educational video
The following video details our medium fidelity prototype. For more details regarding the rationale of the design, see Update 5a.
https://www.youtube.com/watch?v=AV7z517uYQg
Our prototype was built using HTML and JavaScript. Originally, our group decided to use Axure as our main prototyping tool, but we changed course when we realized there would be too many limitations to using Axure alone, especially regarding video functionality. When interacting with the prototype, users can add and scroll through both comments and timestamped annotations, watch an embedded video with standard YouTube player controls, and select hyperlinked video segments to jump to different portions of the tutorial.
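To give a concrete sense of how this could be wired together, the sketch below embeds the video with the YouTube IFrame Player API and hooks up hyperlinked segment links. It is a minimal illustration rather than our actual prototype code; the element id, the segment-link class, and the data attribute are assumed names.

```js
// Minimal sketch, not our actual prototype code: embed the video with the
// YouTube IFrame Player API and wire up hyperlinked segment links.
// Assumes the API script (https://www.youtube.com/iframe_api) is loaded and
// that the page contains a <div id="player"> plus links of the form
// <a class="segment-link" data-start-time="45">...</a>.

let player;

// The IFrame API calls this global function once it has finished loading.
function onYouTubeIframeAPIReady() {
  player = new YT.Player('player', {
    videoId: 'scU4wbNrDHg',          // e.g. Video 1 in our study
    playerVars: { controls: 1 },     // keep the standard YouTube controls
    events: { onReady: onPlayerReady }
  });
}

function onPlayerReady() {
  // Clicking a segment link seeks the player to that segment's start time.
  document.querySelectorAll('.segment-link').forEach(link => {
    link.addEventListener('click', () => {
      player.seekTo(Number(link.dataset.startTime), true);
    });
  });
}
```

Keeping the standard player controls means participants interact with the same playback controls they would see on YouTube.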
We learned from our field study that participants often used these functions while watching an educational video, so it was important to include them in order to thoroughly test Task 1 (completing an entire task described in a video). The prototype also contains all the functionality we plan to test in Tasks 2 and 3 (finding specific annotations).
There are both vertical and horizontal aspects of our design. Comments and timestamped annotations can be added with the prototype. However, when a user adds a timestamped annotation, it does not actually capture the current playing time of the video, making this functionality horizontal. We decided this is acceptable, since the ability to add comments and annotations is no longer being tested as one of the tasks in our experiment (we initially planned to test the process of adding comments/annotations, but dropped it due to time constraints and experiment complexity). Furthermore, the video segments have been decided in advance by our team. They are not generated from users' access patterns of the video or from video cues (e.g. screen transitions, long pauses), as they ideally would be in a fully functioning design. We also decided it was important for video annotations to update automatically when a new segment is reached. We wanted users to see that annotations are specific to each segment, and to observe whether their interaction with these segments affected their ability to complete the tasks in the experiment.
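As a rough sketch of this segment-aware behaviour, assuming hand-picked segment boundaries and a hypothetical renderAnnotations helper, the annotation panel could simply poll the player's current time and refresh whenever playback crosses into a new segment:

```js
// Sketch of the segment-aware annotation panel; segment boundaries were
// chosen by hand in advance, and renderAnnotations is a hypothetical helper
// that redraws the annotation list.
const segments = [
  { start: 0,   end: 45,  annotations: [/* ... */] },
  { start: 45,  end: 120, annotations: [/* ... */] },
  { start: 120, end: 300, annotations: [/* ... */] }
];

let currentSegment = null;

// Twice a second, check where playback is and refresh the panel whenever a
// new segment is reached.
setInterval(() => {
  const t = player.getCurrentTime();
  const segment = segments.find(s => t >= s.start && t < s.end);
  if (segment && segment !== currentSegment) {
    currentSegment = segment;
    renderAnnotations(segment.annotations);
  }
}, 500);
```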
It was also important that our prototype have a reasonably professional appearance. Since users will be tested on both our interface and YouTube, we did not want them to perceive our design as less serious or professional and judge it on appearance alone.
This is the questionnaire that will be given to participants after they have completed all tasks in the experiment on both interfaces:
The following is the consent form the participants will be asked to review and sign before participating in the study:
Participants:
We plan to draw our participants from the general population. The inclusion criteria are as follows:
We plan to recruit participants by word of mouth (convenience sampling) and a call for participants at UBC. We expect to recruit and run the experiment on 8 participants, since our counterbalancing scheme has 4 combinations.
Conditions:
In this experiment, we will compare users' performance on our prototype with their performance on YouTube. We will examine how quickly users can perform tasks on both interfaces, including how quickly they can find annotations (timestamped user remarks) and complete an entire task described in a video tutorial. In addition, we will look at users' willingness to use each system and their overall preference.
Tasks:
On each video and interface, participants will be asked to perform the following tasks in the given order:
Design:
To test the speed of finding annotations, we will use a 2×2 (annotation visibility × interface type) within-subjects factorial design. Annotation visibility has 2 levels, high and low, and interface type has 2 levels: YouTube (presented to participants as System Red) and our interface (System Blue). High visibility means the annotation sits in an immediately visible position in the annotation list on our system, with no scrolling required; low visibility means the annotation can only be found by scrolling through the list. To test the time taken to complete an entire task described in a video, we will use a t-test to compare the two interface types.
We will use a counterbalancing method to eliminate order effects. Participants will interact with both interfaces, each with a different video. For example, a user might be assigned the first video on our system followed by the second video on YouTube. There are four possible combinations, displayed in the table below (a sketch of how participants could be mapped onto these combinations follows the table):
Table 1: Counterbalancing method for our experiment
Combination | First Scenario | Second Scenario |
1 | YouTube, Video 1 | Our System, Video 2 |
2 | YouTube, Video 2 | Our System, Video 1 |
3 | Our System, Video 1 | YouTube, Video 2 |
4 | Our System, Video 2 | YouTube, Video 1 |
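Purely as an illustration of the assignment, the 8 participants could be mapped onto the four combinations in Table 1 as follows; the function and data shape are hypothetical and not part of our experiment materials.

```js
// Illustrative only: map participant numbers (1..8) onto the four
// counterbalancing combinations from Table 1, so each combination is used
// exactly twice.
const combinations = [
  { first: 'YouTube, Video 1',    second: 'Our System, Video 2' },
  { first: 'YouTube, Video 2',    second: 'Our System, Video 1' },
  { first: 'Our System, Video 1', second: 'YouTube, Video 2'    },
  { first: 'Our System, Video 2', second: 'YouTube, Video 1'    }
];

function assignCombination(participantNumber) {
  return combinations[(participantNumber - 1) % combinations.length];
}

// Example: participant 5 gets combination 1 (YouTube, Video 1 first).
```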
We plan to counterbalance this way because a user cannot watch the same video twice due to learning effects: after completing a tutorial, a user becomes familiar with the steps and anticipates what should be done next, biasing our results. Thus, we are using two different knot-tying videos, chosen based on:
The videos chosen are as follows:
(Video 1) How to Tie the Celtic Tree of Life Knot by TIAT: https://www.youtube.com/watch?v=scU4wbNrDHg
(Video 2) How to Tie a Big Celtic Heart Knot by TIAT: https://www.youtube.com/watch?v=tfPTJdCKzVw
For the YouTube interface, the participant will be directed to the corresponding video hosted on YouTube. For our developed interface, the participant will interact with the prototype on a local machine. The comments (non-timestamped remarks) and annotations for our system will be imported from the YouTube page hosting the same video. Each imported remark will be randomly assigned in our system to be either a comment or an annotation, with a 50% chance of each; we chose 50% because there is no precedent for a system like this that could provide more accurate data on how comments and annotations would be distributed. Similarly, we assume that annotation timestamps are uniformly distributed across the length of the video. To make a fair comparison between the two interfaces, all comments will be sorted from most recent to least recent. For our developed interface, the video will be segmented manually beforehand, at places where the video pauses visually or audibly for more than one second or where it transitions in some way (e.g. a screen transition).
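The snippet below is a rough sketch of this import step under the assumptions above: each imported remark is flipped to a comment or an annotation with equal probability, annotations get a timestamp drawn uniformly over the video's length, and both lists are sorted newest first. The data shape (including the publishedAt field) is assumed rather than taken from our prototype.

```js
// Rough sketch of preparing our system's remarks from the YouTube comments
// of the same video. The data shape is assumed: each imported remark has a
// text field and a numeric publishedAt timestamp.
function prepareRemarks(importedComments, videoDuration) {
  const comments = [];
  const annotations = [];

  for (const remark of importedComments) {
    if (Math.random() < 0.5) {
      // Annotations need a timestamp; assume it is uniformly distributed
      // over the length of the video.
      annotations.push({ ...remark, time: Math.random() * videoDuration });
    } else {
      comments.push(remark);
    }
  }

  // For a fair comparison with YouTube, sort both lists newest first.
  const newestFirst = (a, b) => b.publishedAt - a.publishedAt;
  comments.sort(newestFirst);
  annotations.sort(newestFirst);

  return { comments, annotations };
}
```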
Procedure:
Apparatus:
Hypotheses:
Speed:
H1. Finding a specified annotation is faster using our system than YouTube for high-visibility annotations.
H2. Finding a specified annotation is no slower using our system than YouTube for low-visibility annotations.
H3. Completing an entire task prescribed in a video is no slower on our system than on YouTube.
User Preference:
H4. Users will prefer our system's comment and annotation system over YouTube's.
H5. Users will not have a preference towards either system overall.
Priority of Hypotheses:
Planned Analysis:
For our statistical analysis, we will use a two-factor ANOVA (2 interface types × 2 annotation visibilities) on the time it takes participants to find specific annotations on our system compared to YouTube's interface. A two-tailed paired t-test will also be used to compare the completion time of an entire task between the two interfaces. To measure users' preference of interface type, we will compute descriptive statistics on the Likert-scale data collected from each participant's questionnaire.
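For reference, a minimal sketch of how the paired t statistic for task completion times could be computed by hand is shown below; the two-factor ANOVA would be run in a statistics package, and this function is illustrative rather than part of our analysis scripts.

```js
// Illustrative paired t statistic for whole-task completion times, with one
// pair of times (our system, YouTube) per participant.
function pairedTStatistic(timesOurSystem, timesYouTube) {
  const n = timesOurSystem.length;
  const diffs = timesOurSystem.map((t, i) => t - timesYouTube[i]);
  const mean = diffs.reduce((sum, d) => sum + d, 0) / n;
  const variance =
    diffs.reduce((sum, d) => sum + (d - mean) ** 2, 0) / (n - 1);
  return mean / Math.sqrt(variance / n);  // compare to t with n - 1 df
}
```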
Expected Limitations:
There are various issues that we expect to be limitations in our experiment, including:
Our revised experiment goals are to:
The list of potential goals for our evaluation is as follows:
Ranking of Importance:
3,2,1
Item 3 is most important because people need to be willing to use the system in order for it to be a useful system to develop.
Item 2 is next most important because efficiency is related to how usable the interface will be: if tasks take too long to perform, the interface will suffer.
Item 1 is least important because, although users may hesitate at certain steps, the interface is meant to be forgiving and the consequences of hesitating or making an error are minimal.
Ranking of Testability:
2,3,1
Item 2 is the easiest to test because we can measure completion times and compare them against threshold times for completing a task.
Item 3 is more difficult because we are relying on users' opinions, which may not always be completely truthful. However, if we collect their opinions using a Likert scale, the results will be easier to analyze.
Item 1 is the most difficult because it could be hard to determine when a user hesitates.
Based on these rankings, we have decided to focus on goals 2 and 3 for our evaluation, especially since goal 1 ranked lowest in both importance and testability.
Our task examples have not changed. They remain the same as in Update 2b.
Through performing our cognitive walkthrough, we learned that some of the terminology used in our paper prototype was confusing to the user. It was unclear what “annotations” signified and how they differed from comments. However, once annotations were explained, the user was easily able to realize how to add a timestamped annotation. The user also noted that the relationship between annotations and comments should be made more clear, both conceptually and in terms of the visual layout of the design. It was suggested that a timeline displaying all of the annotations on a video may be more useful and understandable when watching a video. A positive aspect of the prototype discussed in the walkthrough was that it was straightforward to add comments because of the interface’s similarity to YouTube.
Some specific findings that we found for each task example are as follows:
TE1:
The annotations/comments section of the design addresses the user's problem of not having an easy way to find information about a confusing part of a video. The walkthrough showed us that users could easily deduce how to make both annotated and regular comments, or start a discussion about something unclear to them. One problem that arose for this task example was the terminology used in the prototype design: the word "annotations" did not clearly convey its meaning to the user.
TE2:
Task example 2 was well supported by our paper prototype. Our prototype granted users the ability to make annotated comments with specific timestamps for a certain part of the video. Additionally, the comments were displayed in a very visible manner and users watching the video could easily see remarks made by other users at that particular timestamp. However, it was unclear how a user could add a video link to a comment as a supplement.
TE3:
For task example 3, we supported a user being able to annotate a video with what he/she may find useful to clarify the content. However, as previously mentioned, there was confusion with the difference between comments and annotations.
The purpose of our paper prototype and cognitive walkthrough was to develop a high-level understanding of the interactions between important individual functions derived from our field study. After determining these functions from our field study, we wanted to explore how users would interact with these functions and how these functions would interact with each other when integrated into a single design. An illustration of the prototype can be seen in Update 3a.
We have chosen to support all of our task examples, which can be found in our previous blog post, Update 2b. Our task examples were supported in the design by allowing users to add annotations at specific parts of a video, or to leave a general comment. For task example 1, a user might add a link to a video aid in their annotation, to help other users clarify a specific idea in a tutorial. For task examples 2 and 3, a user can personally make an annotation for a certain timestamp on the video to note errors or suggest alternatives, applicable to both tutorial or informational videos.
There were several key design decisions that influenced our paper prototype. Firstly, we wanted to use a familiar video website interface to create a positive transfer effect and make the interface intuitive for new users; for this, we used the YouTube interface as a basis due to its popularity. Another important decision was to keep the interface simple, without overwhelming the user with too much information at once. Related to this, we wanted the interface to stay consistent across different contexts, whether a more casual tutorial context as in task example 1 or a more formal educational context as in task example 3. Consequently, we did not include video transcripts in the design, because they are not strictly necessary for a casual educational video; instead, we focused on the commonality across these different contexts for this prototype. Additionally, we incorporated the concept of segmenting the video into sections automatically based on user behaviour (e.g. where viewers stop or rewind) and video content (e.g. transition points, audio pauses), which would help with the process of following along with a video. Since this is a low-fidelity prototype, deciding how to do this was not in the scope of our design, but it will be an important consideration in the future.
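Although deciding how to segment automatically was out of scope, the toy sketch below illustrates the general idea of turning cue times (detected pauses, transition points, places where many viewers stop or rewind) into segments; every name and threshold here is hypothetical.

```js
// Toy illustration: merge candidate cue times into segment boundaries,
// ignoring cues that fall within minGap seconds of the previous boundary.
function cuesToSegments(cueTimes, videoDuration, minGap = 5) {
  const boundaries = [0];
  for (const t of [...cueTimes].sort((a, b) => a - b)) {
    if (t - boundaries[boundaries.length - 1] >= minGap) {
      boundaries.push(t);
    }
  }
  boundaries.push(videoDuration);

  // Pair consecutive boundaries into { start, end } segments.
  return boundaries.slice(0, -1).map((start, i) => ({
    start,
    end: boundaries[i + 1]
  }));
}
```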
The following video illustrates the low-fidelity paper prototype we developed. For more details regarding the design of the paper prototype, see Update 3b.
https://youtu.be/0s-CAOaPMt8