Blog on paper: Specifying Gestures by Example
Summary:
Although gesture-based interfaces are an appealing form of human-computer interaction, they have been slow to develop because gestures are hard to design and gesture recognizers are complex to hand-code. This paper therefore presents GRANDMA (Gesture Recognizers Automated in a Novel Direct Manipulation Architecture), a toolkit with which gesture recognizers are created automatically from example gestures, removing the need for complicated hand coding.
Each recognizer is trained quickly from a small number of examples of each gesture. The attributes of these gestures should be meaningful, and they are what the gesture classifier operates on.
The paper then describes GDP, a gesture-based drawing program built with GRANDMA. All gestures in GDP are associated with a hierarchy of view classes, as shown in Figure 1.
Figure 1. GDP view classes and associated gesture sets
Adding a new gesture takes two steps:
(1) Create a new gesture handler and associate it with the GraphicObjectView class from Figure 1. The gesture handler window is shown in Figure 2.
Figure 2. Four new gesture classes, gc6–gc9
(2) Enter training examples for each new gesture class. Figure 3 shows the process of entering seven training examples for the "delete" gesture class (gc9). Typically, 15 training examples per gesture class is adequate.
Figure 3. Seven training examples for the "delete" gesture
After showing how to add new gestures, the paper turns to gesture recognition. The essential problem is: given an input gesture g, determine to which of a set of C gesture classes g belongs.
Each gesture is represented by an array of sample points:
g_p = (x_p, y_p, t_p), 0 ≤ p < P
where P is the total number of sample points and t_p is the time at which point p was drawn.
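As a minimal Python sketch (the type names are mine, not the paper's), this representation might look like:

```python
from typing import List, NamedTuple

class SamplePoint(NamedTuple):
    """One sampled mouse point: position plus timestamp."""
    x: float
    y: float
    t: float  # time at which this point was drawn

# A gesture is the ordered array of its P sample points.
Gesture = List[SamplePoint]
```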
Features are the basis for classifying gestures. The paper gives three principles for selecting them:
(1) They should be incrementally computable in constant time per input point.
(2) They should be meaningful, so that they can be used in gesture semantics as well as for recognition.
(3) There should be enough features to differentiate all gestures, but not too many.
The following 13 features, shown in Figure 4, are a good selection for identifying strokes:
A few notable points:
(1) cos(α) and sin(α) are used as features instead of α itself, to avoid the discontinuity as the angle passes through 2π and wraps to 0.
(2) The initial-angle features are computed from the first and third mouse points, because the result is generally less noisy than when computed from the first two points. Nowadays, with higher sampling precision, one might instead measure from the first and fifth points.
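To make this concrete, here is a rough sketch of a few of the 13 features (the initial-angle pair, the endpoint distance, and the total stroke length), continuing the SamplePoint/Gesture sketch above. The function names and the choice of which features to show are mine; see the paper for the full set:

```python
import math

def initial_angle(g: Gesture):
    """cos(alpha) and sin(alpha): direction from the first to the third
    point, which is generally less noisy than using the first two."""
    dx, dy = g[2].x - g[0].x, g[2].y - g[0].y
    d = math.hypot(dx, dy)
    if d == 0:
        return 1.0, 0.0  # degenerate stroke: no movement yet
    return dx / d, dy / d

def endpoint_distance(g: Gesture) -> float:
    """Distance between the first and last sample points."""
    return math.hypot(g[-1].x - g[0].x, g[-1].y - g[0].y)

def stroke_length(g: Gesture) -> float:
    """Total path length; incrementally computable in constant time
    per new input point by adding the length of the latest segment."""
    return sum(math.hypot(g[p + 1].x - g[p].x, g[p + 1].y - g[p].y)
               for p in range(len(g) - 1))

def feature_vector(g: Gesture) -> list:
    """Assemble a (partial) feature vector f for classification."""
    cos_a, sin_a = initial_angle(g)
    return [cos_a, sin_a, endpoint_distance(g), stroke_length(g)]
```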
Gesture classification formula: for each class c, a linear evaluation of the feature vector is computed,
v_c = w_c0 + Σ_{i=1..F} w_ci · f_i,
and g is classified as the class c for which v_c is largest.
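In code this step is a per-class dot product followed by an argmax. A minimal sketch, assuming weights maps each class c to its pair (w_c0, list of w_ci):

```python
def classify(f, weights):
    """Return the class whose linear evaluation v_c is largest,
    together with its score."""
    best_class, best_v = None, float("-inf")
    for c, (w0, w) in weights.items():
        v = w0 + sum(wi * fi for wi, fi in zip(w, f))
        if v > best_v:
            best_class, best_v = c, v
    return best_class, best_v
```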
Training:
The purpose of training is to determine the weights w_c0 and w_ci (for each class c and feature i) from the example gestures.
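The paper computes these weights with a standard linear discriminant: a mean feature vector per class and a covariance matrix averaged over all classes. A numpy sketch under those assumptions (variable names mine):

```python
import numpy as np

def train(examples):
    """examples maps each class c to an (N_c, F) array of feature
    vectors, one row per training gesture. Returns {c: (w_c0, w_c)}
    suitable for the classify() sketch above."""
    means = {c: X.mean(axis=0) for c, X in examples.items()}
    F = next(iter(means.values())).shape[0]
    cov = np.zeros((F, F))
    dof = 0
    for c, X in examples.items():
        D = X - means[c]          # deviations from the class mean
        cov += D.T @ D
        dof += len(X) - 1
    cov /= dof                    # pooled covariance across classes
    inv = np.linalg.pinv(cov)     # pseudo-inverse guards against singularity
    weights = {}
    for c, mu in means.items():
        w = inv @ mu
        weights[c] = (-0.5 * (w @ mu), w)
    return weights
```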
Rejection:
The purpose of rejection is to reject ambiguous gestures and outliers. A gesture is rejected when the classifier's estimate of the probability that its choice is correct falls below a threshold, or when the gesture is too far (in Mahalanobis distance) from the mean of its chosen class.
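A sketch of the probability-based part of this test; the 0.95 threshold here is an assumed value, not taken from the paper:

```python
import math

def accept(v_scores, chosen, threshold=0.95):
    """v_scores maps each class to its evaluation v_c; 'chosen' is the
    argmax class. The probability that the choice is correct is
    estimated as 1 / sum_j exp(v_j - v_chosen); below the threshold
    the gesture is rejected as ambiguous."""
    p = 1.0 / sum(math.exp(v - v_scores[chosen]) for v in v_scores.values())
    return p >= threshold
```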
Eager recognition and multi-finger recognition:
Eager recognition: recognizing a gesture as soon as it is unambiguous, without waiting for the stroke to end. Here, classification is attempted on every mouse point.
Multi-finger recognition: recognizing gestures made with multiple fingers simultaneously. By treating multi-finger input as multi-path data, the single-stroke recognition algorithm can be applied to each path individually and the results combined to classify the multi-path gesture.
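A rough sketch of that idea, reusing the single-stroke pieces above; mapping the tuple of per-path classes to a multi-path gesture via a lookup table is my illustration, not necessarily the paper's exact combination scheme:

```python
def classify_multipath(paths, weights, combo_table):
    """Classify each finger's path with the single-stroke classifier,
    then combine the per-path results. combo_table is an assumed
    mapping, e.g. {('tap', 'tap'): 'two-finger-tap'}."""
    per_path = tuple(classify(feature_vector(p), weights)[0] for p in paths)
    return combo_table.get(per_path)  # None if the combination is unknown
```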
Bibliography:
Dean Rubine. Specifying Gestures by Example. In SIGGRAPH '91: Proceedings of the 18th Annual Conference on Computer Graphics and Interactive Techniques, pages 329–337. ACM, New York, NY, USA, 1991. http://dl.acm.org/citation.cfm?id=122753