Tuesday, March 27, 2012


An examination of four user-based software evaluation methods 
Ron Henderson,  John Podd,  Mike Smith,  and Hugo Varela-Alvarez

Since the focus of the last few class periods has been evaluation, I decided to go ahead and read a paper on different evaluation methods rather than yet another hand-tracking algorithm.  The paper was written in 1995, but it focuses on evaluation methods that are still used today (data logging, questionnaires, interviews, and verbal protocol analysis).

For their study, the authors recruited 148 people and had different groups of them apply each of the four evaluation methods while testing one of three pieces of software (a spreadsheet, a word processor, or a database).  The subjects used the software and then applied their assigned evaluation method.

Data Logging: internal software logged keystrokes with timestamps, to be examined after the test.
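The paper used the software's own internal logging, but the general idea is simple. Here's a minimal sketch of timestamped event logging (the log format and event names are my own invention, not from the paper):

```python
import json
import time

class EventLogger:
    """Minimal timestamped event logger, in the spirit of the paper's
    internal keystroke logging (format and names are hypothetical)."""

    def __init__(self, path):
        self.log = open(path, "a")
        self.start = time.time()

    def record(self, event, detail=""):
        # Store elapsed time so events can be lined up post hoc.
        entry = {"t": round(time.time() - self.start, 3),
                 "event": event,
                 "detail": detail}
        self.log.write(json.dumps(entry) + "\n")

    def close(self):
        self.log.close()

# Usage: logger = EventLogger("session.log"); logger.record("keypress", "F1")
```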

Questionnaire: used a 7-point scale with 'not applicable' and 'don't understand' options.  Questions covered topics such as:

  • Program self-descriptiveness
  • User control of the program
  • Ease of learning the program
  • Completeness of the program
  • Correspondence with user expectations
  • Flexibility in task handling
  • Fault tolerance
  • Formatting
Open-ended questions followed, asking about specific problems and inviting comments and suggestions.
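Those 'not applicable' and 'don't understand' options mean the analysis has to drop such responses before averaging. A toy sketch of how that scoring might look (entirely my own illustration; the paper doesn't give a scoring procedure):

```python
NA, DONT_UNDERSTAND = "NA", "DU"  # hypothetical sentinel values

def mean_rating(responses):
    """Average the 1-7 ratings, ignoring 'not applicable' and
    'don't understand' responses (illustrative scoring only)."""
    numeric = [r for r in responses if isinstance(r, int)]
    return sum(numeric) / len(numeric) if numeric else None

# One question's responses across subjects:
print(mean_rating([5, 7, NA, 4, DONT_UNDERSTAND, 6]))  # -> 5.5
```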

Interview: a semi-structured format of scripted questions, with follow-up on unique interviewee comments.

Verbal Protocol: users were videotaped while they evaluated the software, then asked to 'think aloud' as they watched the tapes play back.

Conclusions:

Data logging is nice because it's about as objective as you can get; however, the logs are tedious to analyze.

Questionnaires can give vague results if each question isn't worded very precisely, and it's difficult to write questionnaires that everybody will understand completely.

Interviews are good for getting relevant information quickly, but are subject to the problem of memory decay.

The verbal protocol method tends to be good at finding problem areas because the playback reminds users of the moments when they were having trouble with a particular exercise.  However, it's very time-consuming.

The authors note that combining these methods will most likely give the best results, since each adds a unique contribution.  They caution, though, that multiple methods bring diminishing returns, so blindly adding more methods is not the best approach.




Thursday, March 22, 2012

Guidelines for Multimodal User Interface Design

Leah M. Reeves, Jennifer Lai, James A. Larson, Sharon Oviatt, T. S. Balaji, Stéphanie Buisine, Penny Collings, Phil Cohen, Ben Kraal, Jean-Claude Martin, Michael McTear, TV Raman, Kay M. Stanney, Hui Su, and Qian Ying Wang


Communications of the ACM

Since I'm doing a large amount of the UI programming for our project, I thought I'd make a bit of a topic switch and read up on some UI research.  The paper I read focused on multimodal UI design.



According to the paper, there are six major categories of guidelines.  These are:

  • Requirements Specification
    • Design for a broad range of users
    • Privacy/Security Issues
  • Designing Multimodal Input and Output
    • Maximize human cognitive/physical abilities
    • Integrate input methods in a way compatible with user preference/system functionality/context
  • Adaptivity
    • Adapt to the needs of your users (Ex: Gesture input!)
  • Consistency
    • Make it look consistent, use common features
  • Feedback
    • Users should be aware of which inputs are available
    • Users should be notified of alternative interaction options
  • Error Prevention/Handling
    • Provide clearly marked exits from tasks
    • Allow undoing of commands
    • If an error occurs, permit users to switch to a different modality (a rough sketch follows this list)
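To make that last guideline concrete, here's a tiny sketch of letting a user fall back to another input modality after repeated recognition failures (the modality names, retry threshold, and recognizer interface are all made up for illustration):

```python
def get_command(recognizers, max_failures=2):
    """Try each input modality in the user's preferred order, falling back
    to the next one after repeated recognition failures (illustrative only).

    `recognizers` is a list of (modality_name, recognize_fn) pairs, where
    recognize_fn returns a command or None on failure."""
    for modality, recognize in recognizers:
        failures = 0
        while failures < max_failures:
            result = recognize()
            if result is not None:
                return modality, result
            failures += 1
            print(f"{modality} input not understood, try again "
                  f"({failures}/{max_failures})")
        print(f"Switching from {modality} to the next available modality.")
    return None, None  # clearly marked exit: give up gracefully

# Usage (hypothetical recognizers):
# command = get_command([("speech", recognize_speech),
#                        ("gesture", recognize_gesture)])
```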
The authors do note that more research is needed to get a better grasp of which combinations of input and output methods are most intuitive and effective, since the population these decisions affect is so broad.  They also say that new techniques for error handling and adaptivity should be explored.

These guidelines will be useful to keep in mind as we create the interface for our project, especially since the Kinect is multimodal.




Source: http://delivery.acm.org/10.1145/970000/962106/p57-reeves.pdf

Thursday, March 8, 2012

Manipulator and object tracking for in-hand 3D object modeling
Michael Krainin, Peter Henry, Xiaofeng Ren and Dieter Fox
The International Journal of Robotics Research 30: 1311 (2011); originally published online 7 July 2011

In this paper, the authors use a PrimeSense depth camera (functionally the same as a Kinect, providing RGB and depth information) in an algorithm that allows a robot to create a 3D model of an unknown object.

The process relies on finding the correct alignment between the object and the robot in each sensor frame.  Prior work tracked only the manipulator or only the object being modeled.  By combining RGB-D data and encoder data from the robot's manipulator with tracking of the object itself, the authors achieve a much more accurate alignment.

Several steps are taken in the algorithm to ensure accuracy for the model.  Most are somewhat complicated but I'll try to sum them up here in a few words.
  • Kalman filtering helps 'maintain temporal consistency' between input frames and also provides estimates of uncertainty, by keeping track of the manipulator's joint angles and the rotation of the object, among other things.
  • Articulated Iterative Closest Point (ICP) tracking is used to estimate 'joint angles by attempting to minimize an error function over two point clouds' (the rigid ICP step it builds on is sketched after this list).
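The paper's articulated ICP variant optimizes over joint angles, which is well beyond a blog post, but the rigid point-to-point ICP step it builds on looks roughly like this (a bare-bones NumPy sketch, not the authors' implementation):

```python
import numpy as np

def icp_step(src, dst):
    """One rigid point-to-point ICP iteration: match each source point to
    its nearest destination point, then solve for the best-fit rotation
    and translation via SVD (Kabsch).  A sketch of the generic rigid step,
    not the paper's articulated variant."""
    # Nearest-neighbor correspondences (brute force for clarity).
    dists = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=2)
    matched = dst[np.argmin(dists, axis=1)]

    # Best-fit rigid transform from the cross-covariance of centered clouds.
    src_c, dst_c = src.mean(axis=0), matched.mean(axis=0)
    H = (src - src_c).T @ (matched - dst_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:   # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return src @ R.T + t       # source points moved toward the destination
```

Iterating this step until the error stops shrinking gives basic ICP; the articulated version instead minimizes the error over the manipulator's joint angles.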
For the actual modeling:
  • 'Surfels' are used as the model representation because they make it easy to add points, remove superfluous points, deal with occlusion, and update the model.
  • Loop closure connects pieces of the model.  This involves 'maintaining a graph whose nodes are a subset of the surfels in the object model'.  An edge indicates that both of its nodes were visible in the same frame, which allows connected components to be computed (a generic sketch follows this list).
  • Object re-grasping is exactly what it sounds like: since some parts of the object will be occluded by the object itself or by the manipulator, the robot has to put the object down and pick it back up in a different orientation so it can view the object from multiple angles.
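The connected-components bookkeeping behind loop closure is standard graph traversal. A generic sketch (my own illustration; the integer node IDs stand in for surfels, and an edge means 'both visible in the same frame'):

```python
from collections import defaultdict

def connected_components(nodes, edges):
    """Group graph nodes into connected components via traversal
    (a generic illustration, not the paper's surfel-graph code)."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, components = set(), []
    for n in nodes:
        if n in seen:
            continue
        comp, frontier = set(), [n]
        while frontier:
            cur = frontier.pop()
            if cur in seen:
                continue
            seen.add(cur)
            comp.add(cur)
            frontier.extend(adj[cur] - seen)
        components.append(comp)
    return components

print(connected_components([1, 2, 3, 4], [(1, 2), (3, 4)]))  # [{1, 2}, {3, 4}]
```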



While this doesn't really influence our work that much, it was pretty interesting to read.