Sketch Recognition Algorithms for Comparing Complex and Unpredictable Shapes

This paper introduces a Mechanix---a computer-assisted tutoring system for engineering students. It can recognize freehand sketches and provide instant feedbacks. Mechanix will release the burden of grading so that instructors can assign more free response questions in homework. Quite interesting!

Firstly, let me clarify the concept of “body” in this paper. Any stroke or set of strokes that forms a closed shape as a possible “body”. To correctly recognize the “bodies”, we need the following steps:

Body Identification:

To check whether a collection of segments form a “body” (closed shape), we just pick a random starting point from one of the segments. For the other end of the segment, we try to find out the closest end point of another segment in the collection. If the distance between this two endpoints are within the threshold of 9% of the total length of the strokes (segments in the collection), then we predict the two strokes connect with each other. This simple process repeats again and again until the end point in the last segment connects back to the starting point of the first segment. If it is, we predict the collection of segments as a “body”.

Body Comparison:

One big difference between “body comparison” and “instance-based recognition” is only one template will be used to generate a binary “match/no match” classification.

Three key metrics to determine similarity of a body and a template are Hausdorff distance, modified Hausdorff distance and Tanimoto coefficient.

Hausdorff distance:

P_A is the vector of points in template. P_B is the vector of points in “body”. Hausdorff distance is to calculate the maximum of the maximum D from A to B and B to A.

Modified Hausdorff distance:

It is the average of the averages of D_A and D_B.

To determine whether the “body” matches with the reference (template), the authors introduces a “match probability”:

where 20 is the half of the bounding box size.

Tanimoto coefficient:

where n_AB is the number of overlapping pixels. n_A,n_B are the total number of pixels in A and B respectively.

For easier computation, we usually use a lax definition of overlapping. That means we deal with points rather than pixels any more.

where n_AB is the number of overlapping points from A to B (Here overlapping means the distance between the two points is no more than 10% of the bounding box size)

T(A,B) represents the overall percentage of overlap between two shapes.

About Trusses:

Trusses are important structure for engineers because of their comprehensive applications, such as bridges, airplane wings, building supporting roofs, etc.

Trusses identification:

We first use the PaleoSketch algorithm to preprocess each stroke and identify all of the line segments. We use all of the line segments in a diagram to build a graph with each segment as a node, and edges between segments that intersect. Once we have built the connection graph, we consider each edge AB as a potentially shared edge. We remove the AB edge from the graph and run a breadth-first search (BFS) to search for another path from A to B. If we do not find a path, then AB is not even part of a polygon, and so can’t be a shared edge. If we do find a path then AB is part of a polygon, and we remove all of the edges in that path from the graph and use BFS to search for a third path from A to B (AB being the first path). If we do not find a path, then AB is not a shared edge. If we do find a path, then the two paths we found with BFS each represent a polygon sharing the AB edge.

Trusses Comparison:

When comparing a student’s truss to the reference solution, we consider both the sketch and the adjacency graph. It is important that the student’s submission has the exact same graphical structure as the reference solution.

Bibliography:

http://dl.acm.org/citation.cfm?id=2283803&bnc=1

urjnasw xkfjjkn

urjnasw xkfjjkn's hot blog

2013年4月9日星期二