Sketch Recognition Algorithms for
Comparing Complex and Unpredictable Shapes
This paper introduces a Mechanix---a computer-assisted
tutoring system for engineering students. It can recognize freehand sketches
and provide instant feedbacks. Mechanix will release the burden of grading so
that instructors can assign more free response questions in homework. Quite
interesting!
Firstly, let me clarify the concept of “body” in this paper.
Any stroke or set of strokes that forms a closed shape as a possible “body”. To
correctly recognize the “bodies”, we need the following steps:
Body Identification:
To check whether a collection of segments form a “body”
(closed shape), we just pick a random starting point from one of the segments.
For the other end of the segment, we try to find out the closest end point of
another segment in the collection. If the distance between this two endpoints
are within the threshold of 9% of the total length of the strokes (segments in
the collection), then we predict the two strokes connect with each other. This
simple process repeats again and again until the end point in the last segment
connects back to the starting point of the first segment. If it is, we predict
the collection of segments as a “body”.
Body Comparison:
One big difference between “body comparison” and
“instance-based recognition” is only one template will be used to generate a
binary “match/no match” classification.
Three key metrics to determine similarity of a body and a
template are Hausdorff distance, modified Hausdorff distance and Tanimoto
coefficient.
Hausdorff distance:
PA is the vector
of points in template. PB is the vector of points in “body”.
Hausdorff distance is to calculate the maximum of the maximum D from A to B and
B to A.
Modified Hausdorff distance:
It is the average of the averages of DA and DB.
To determine whether the “body” matches with the reference
(template), the authors introduces a “match probability”:
where 20 is the half of the bounding box size.
Tanimoto coefficient:
where nAB is the number of overlapping pixels. nA,nB
are the total number of pixels in A and B respectively.
For easier computation, we usually use a lax definition of
overlapping. That means we deal with points rather than pixels any more.
where nAB is the number of overlapping points
from A to B (Here overlapping means the distance between the two points is no
more than 10% of the bounding box size)
T(A,B) represents the overall percentage of overlap between
two shapes.
About Trusses:
Trusses are important structure for engineers because of
their comprehensive applications, such as bridges, airplane wings, building
supporting roofs, etc.
Trusses identification:
We
first use the PaleoSketch algorithm to preprocess each stroke and identify all
of the line segments. We use all of the line segments in a diagram to build a
graph with each segment as a node, and edges between segments that intersect.
Once we have built the connection graph, we consider each edge AB as a
potentially shared edge. We remove the AB edge from the graph and run a
breadth-first search (BFS) to search for another path from A to B. If we do not
find a path, then AB is not even part of a polygon, and so can’t be a shared
edge. If we do find a path then AB is part of a polygon, and we remove all of
the edges in that path from the graph and use BFS to search for a third path
from A to B (AB being the first path). If we do not find a path, then AB is not
a shared edge. If we do find a path, then the two paths we found with BFS each
represent a polygon sharing the AB edge.
Trusses Comparison:
When
comparing a student’s truss to the reference solution, we consider both the
sketch and the adjacency graph. It is important that the student’s submission
has the exact same graphical structure as the reference solution.
Bibliography:
http://dl.acm.org/citation.cfm?id=2283803&bnc=1
没有评论:
发表评论