Home | | Résumé | | 4serendipity Web Search Utility | | Some of my pages | | Links | |||||||||||
|
|||||||||||||||
|
Usability Test Report: WebBrain Online Search Directory Andrew Stevens and Caren Spencer-Smith August 10, 2000
Table of Contents: I. Overview A. Introduction B. Background C. Test Objectives II. Procedure A. Methods III. Results A. Summary B. Efficiency C. Effectiveness D. Satisfaction IV. Discussion A. Interface Findings B. Search Directory Findings C. Recommendations V. Appendix Table 1: Efficiency Results Figure 1: Efficiency Results Figure 2: Efficiency Mean Times Figure 3: Efficiency Standard Deviation Table 2: Effectiveness Results Figure 4: Effectiveness Results Table 3: Successful Task Completion Table 4: Post-evaluation Survey Table 5: Post-evaluation Comments Data Records I. Overview A. Introduction WebBrain is a dynamic graphical representation of the Open Directory Project (ODP) web-based search directory. The interface is a horizontally split screen where the upper pane displays the search directory that dynamically centers around a user's selection of a directory topic; related information is displayed in a thesaurus-like structure. When a user selects a topic containing website links (as opposed to exclusively subdirectories), those links are displayed in the lower part of the display. A standard search engine box, capable of driving both the visual display and list of resulting page links, separates the two halves. B. Background The problems of the system relate to the navigation and effort associated with the WebBrain interface, as well as the ability of the ODP directory structure to adequately classify information. Thus, a test of this interface both reflects a user's ability to manipulate and navigate the interface, as well as a user's ability to correctly interpret a directory's potential content. We speculate that WebBrain’s unconventional interface and the use of a directory whose hierarchy is often quite ambiguous could produce significant usability problems and inefficiencies. More specifically, we believe that some of this site’s characteristics (dual pane interface, its visual hierarchy representation, and dependence on scrolling) are inherently flawed. C. Test Objectives
II. Procedure A. Methods After completion of a pre-evaluation survey, each of five subjects were asked to enter the testing room of the SLIS Usability Lab, listen to the introduction script, and asked to sequentially complete six tasks. Subjects were asked to limit their search interaction to the graphical portion of the directory during this evaluation, but told they could access other features, including online help, at any time. Results were evaluated on the basis of time taken to complete each task, the degree to which the task was completed, and the total number of "clicks" made by each test subject for each task. Participants were asked to use the "think aloud" protocol during testing. Qualitative expressions were recorded, but not quantified. A post-evaluation survey was used to measure satisfaction and solicit written comments. B. Test Environment User testing was conducted in the SLIS Usability Lab, using the PC computer and the MS Internet Explorer web browser. The lab design follows the classic usability testing lab design with separate testing and observation rooms. A wall-mounted camera was used to record the user's facial expressions, while a microphone was used to record comments they made during the testing procedure. Synchronized data capture of the screen (using a scan converter) was used to record mouse movement and target selection. Tasks were presented on 3x5 cards, and progressed from “easy” to “difficult.” Test subjects were asked to complete a post-evaluation survey immediately following the last task III. Results A. Summary Results were based on the ability of participants to complete all or part of the tasks, the time taken to complete each task as related to previously established benchmarks, satisfaction measures obtained using a post-evaluation survey, and qualitative expressions recorded during testing and as part of the post-evaluation survey. Tasks given to each subject were to find links to the following websites: 1. CNN (Cable News Network (1 level down, scrolling in website window)2. A History of Traditional Games (1 level down, no scrolling required)3. Ultramagic Balloons (3 levels down, no scrolling reqired)4. Quotation Ring (3 levels, no scrolling required)5. Nutrition Remedies for Allergies (5 levels, scrolling in website window required)6. Inner City Handball Association (2 levels, lateral scrolling through category display window required)The results for task 4 indicate a possible problem with task wording rather than user interaction with the interface. User 2 was successful in selecting the appropriate third-level link and completing the task; all other users stopped at an earlier page and selected a website link that was very similar to the target. Although this task was included in result summaries, its validity is questionable. Where applicable, data including and excluding this measure were given. Test subjects were able to select the first category link correctly from the home page 86% of the time, indicating a good organization of the directory’s top level. Subjects, however, experienced difficulty in making selections from subsequent levels, expressed by either wrong choice of category topic or extended time spent on a page reading through category links. Test subjects both demonstrated and expressed difficulty and confusion interpreting the data presented through the interface. Specific problems included the ability to shift focus from one frame to another, subject recognition the significance of the category positioning in the category display, and recognition of scroll bars in the category display window. B. Efficiency Efficiency was based on the time taken to complete each task. Benchmarks for the easy tasks (1 and 2) were: excellent = 15 sec, OK = 16-60 sec, poor = 61-179 sec, and bad = 180 sec. Benchmarks for the difficult tasks (3 through 6) were: excellent = 60 sec, OK = 61-179 sec, poor = 180-539 sec, and bad = 540 sec. (see Table 1, and Figures 1 through 3 in Appendix). Learning was not directly demonstrated by a consistent efficiency improvement for any test subject. Overall, 33% of the recorded times were excellent, 27% of the times were OK, 27% of the times were poor, and 13% of the times were bad. Again, although efficiency for task 4 was excellent or OK for all users, the validity is questionable due to task wording. C. Effectiveness Effectiveness was based on the number of clicks made by a test subject to complete a task, as well as the ability of a test subject to complete all or part of the task. Benchmarks for effectiveness were rated from excellent (the fewest number of clicks required for task completion) to poor (just over twice that number) (see Table 2, and Figures 4 and 5 in Appendix). Overall, 43% of the effectiveness scores were excellent, 17% were OK, and 40% were poor. Excluding data from Task 4, where 80% of the test subjects achieved excellent scores but failed to complete the task, the results are: 36% excellent, 20% OK, and 44% poor. Task completion was also documented (see Table 3 in Appendix). Overall, 60% of the tasks were completed and 40% were not. Of those tasks not completed, the following results were noted: Users 4 and 5 both selected the correct category link from the home page (and each selected the correct page several times) but failed to find the target resource in the website window. Completion of the task required scrolling down in the website window. User 3 completed 33% of task 3. Users 1, 3, 4, and 5 completed 67% of task 4; task wording was a problem. Users 1 and 3 completed 20% of task 5, users 2 and 4 completed 60%, and user 5 completed 80%. All users correctly selected the top-level category and then incorrectly selected the second level category (although users 2, 4, and 5 were able to recover from this to varying degrees). D. Satisfaction Satisfaction was measured using a 20-question survey based on WAMMI, a previously validated instrument (see Table 4 in the Appendix). The 5-point scale of each question ranged from Strongly Agree to Strongly Disagree. Although test subjects were neutral in over half of their responses to the post-evaluation survey, several points are evident from the remaining responses: Users strongly agreed (1) that:
User's strongly disagreed (5) that:
As part of the post-evaluation survey, each user was asked for the most negative and positive aspects of the WebBrain search directory (see Table 5 in the Appendix). IV. Discussion A. Interface Findings 1. Test subjects did not recognize the significance of the category positioning in the category display. Supported by comments made by the users during testing and observed use of browser navigation for navigation within the directory structure. Some users demonstrated and commented on their understanding of the Parent-Sibling relationship, and used it for navigation. Comments related to other category positions, when made, were incorrect. "Oh, here it is...but that is totally by serendipity." "Oh, so these are the top links and these are bottom links. I don't know, I think still it's confusing. The thing that I really liked in the beginning is now confusing." "OK. I'm getting confused. This thing is moving me around in a way that is kind of disorienting. It's moving me from concepts, I realize." "This is a nightmare." "Actually, I find the navigation kind of difficult on here, as far as the sequential type of navigation." 2. Subjects often failed to notice target resource’s existence in the web site window. Supported by user comments and observed use of the interface. User 4 failed to notice the links in the website window, as stated in a comment, until 2 min 18 sec into the task. "I wonder where you actually get to a link to something.... Oh, oh wait; it's pulling them up down here." "I think it's confusing because I'm looking up, and then I realize something is down there." "See, this is why I'm kind of confused because when I clicked on games, on the last task, it pulled up things down here, and this doesn't have things down there, so I really wasn't paying much attention." 3. Subjects did not access any assistance resources and were dependent upon browser functions for navigation. Supported by observation. 4. Subjects sometimes failed to notice scrolls available for the different category positions within the category link window Supported by user comments and observations. Most users verbally identified the category scrolls during the third task or later. User 4 did not notice category scroll bars until last task. “Boy, I hope I didn’t need these before now. I probably did.” “Red navigation arrows are too subtle.” 5. Subjects were delayed by number of topics in category display window Supported by user comments and observations, as well as efficiency ratings. "I can't believe it took me that long." "There are too many options, I was 'overwhelmed' with links, categories and silly animated graphics. Give me Yahoo instead any day." B. Search Directory Findings 1. The partial/incomplete display of categories' hierarchical position in the category display window resulted in users getting “lost” in the directory structure. Supported by verbal comments, effectiveness ratings, and suggestions included with post-evaluation survey "Too many options one-level down." "Show a navigation path-I got lost" "Actually, you know, the search engine looks pretty good, I mean, as far as the specificity, if that's the word, with which it's pulling things up; I can't find...I don't know where I'm going." 2. The ODP directory structure is sometimes ambiguous and redundant. Supported by verbal and post-evaluation comments, as well as effectiveness ratings. Users were often confused to find the same category under different directory listings (News as a main heading and News under Television). "Too many options one-level down." "As with all pre-coordinated systems, it is difficult to determine where fringe topics will be located (handball, balloons). It seems as if these should be connected via sports & entertainment-but if this was the case, why couldn't I find the allergy site? It should have been cross-listed as well." "I thought it would be here.... It seems like it would be there." C. Recommendations 1. Use simple text labels to highlight the significance of each category type within the category display window and category scrolls (addresses Interface Findings 1 and 4). 2. Purpose of each window should be highlighted in some way, preferably on the home page (addresses Interface Findings 1 and 2) 3. Dynamic resizing of website window to assist focusing user attention on either the category link or website window at any given time (addresses Interface Finding 2). 4. Link text labels to a “Help” section; provide a specifically labeled start screen or blatantly labeled “Help” link (addresses Interface Finding 3, but also others). 5. Restructure directory to reduce number of category links in any given category position, or consider using a directory structure better suited for the interface (addresses Search Directory Findings 1 and 2). V. Appendices Table 1: Efficiency Results
Figure 1: Efficiency Results Figure 2: Efficiency Mean Times Mean Time to Task Completion Figure 3: Efficiency Standard Deviation Table 2: Effectiveness Results
Figure 4: Effectiveness Results Figure 5: Effectiveness Mean Clicks Table 3: Successful Task Completion
Table 4: Post-Evaluation Survey
Table 5: Post-evaluation Comments Negative aspects of system:
Positive aspects of system:
|