urjnasw xkfjjkn's hot blog

2013年4月9日星期二

Sketch Recognition Algorithms for Comparing Complex and Unpredictable Shapes


Sketch Recognition Algorithms for Comparing Complex and Unpredictable Shapes

This paper introduces a Mechanix---a computer-assisted tutoring system for engineering students. It can recognize freehand sketches and provide instant feedbacks. Mechanix will release the burden of grading so that instructors can assign more free response questions in homework. Quite interesting!
Firstly, let me clarify the concept of “body” in this paper. Any stroke or set of strokes that forms a closed shape as a possible “body”. To correctly recognize the “bodies”, we need the following steps:

Body Identification:

To check whether a collection of segments form a “body” (closed shape), we just pick a random starting point from one of the segments. For the other end of the segment, we try to find out the closest end point of another segment in the collection. If the distance between this two endpoints are within the threshold of 9% of the total length of the strokes (segments in the collection), then we predict the two strokes connect with each other. This simple process repeats again and again until the end point in the last segment connects back to the starting point of the first segment. If it is, we predict the collection of segments as a “body”.

Body Comparison:

One big difference between “body comparison” and “instance-based recognition” is only one template will be used to generate a binary “match/no match” classification.
Three key metrics to determine similarity of a body and a template are Hausdorff distance, modified Hausdorff distance and Tanimoto coefficient.

Hausdorff distance:
PA is the vector of points in template. PB is the vector of points in “body”. Hausdorff distance is to calculate the maximum of the maximum D from A to B and B to A.

Modified Hausdorff distance:



It is the average of the averages of DA and DB.
To determine whether the “body” matches with the reference (template), the authors introduces a “match probability”:

where 20 is the half of the bounding box size.

Tanimoto coefficient:


where nAB is the number of overlapping pixels. nA,nB are the total number of pixels in A and B respectively.
For easier computation, we usually use a lax definition of overlapping. That means we deal with points rather than pixels any more.
where nAB is the number of overlapping points from A to B (Here overlapping means the distance between the two points is no more than 10% of the bounding box size)
T(A,B) represents the overall percentage of overlap between two shapes.

About Trusses:

Trusses are important structure for engineers because of their comprehensive applications, such as bridges, airplane wings, building supporting roofs, etc.

Trusses identification:

We first use the PaleoSketch algorithm to preprocess each stroke and identify all of the line segments. We use all of the line segments in a diagram to build a graph with each segment as a node, and edges between segments that intersect. Once we have built the connection graph, we consider each edge AB as a potentially shared edge. We remove the AB edge from the graph and run a breadth-first search (BFS) to search for another path from A to B. If we do not find a path, then AB is not even part of a polygon, and so can’t be a shared edge. If we do find a path then AB is part of a polygon, and we remove all of the edges in that path from the graph and use BFS to search for a third path from A to B (AB being the first path). If we do not find a path, then AB is not a shared edge. If we do find a path, then the two paths we found with BFS each represent a polygon sharing the AB edge.

Trusses Comparison:

When comparing a student’s truss to the reference solution, we consider both the sketch and the adjacency graph. It is important that the student’s submission has the exact same graphical structure as the reference solution.

Bibliography:

http://dl.acm.org/citation.cfm?id=2283803&bnc=1





A Domain_Independent System for Sketch Recognition



A Domain_Independent System for Sketch Recognition

This paper primarily introduces a domain-independent sketch recognition system, which uses direction graph and curvature graph to identify primitive shapes, uses feature-area verification for double check and post-process to complete hierarchical output.
Let’s start from imprecise stroke approximation:
Initial stroke approximation is basically done with the help of direction graph and curvature graph.
These two graphs are obtained by analyzing the sampling points on the original stroke.

where dn is the direction between point (n+1) and point n. cn is the curvature between point (n-k) and (n+k). The nominator is the total direction change between point (n-k) and point (n+k), and the denominator is the path length between these two points.
Then Direction graph and Curvature graph can be obtained:
For example,


For the Star Stroke, there are four times change of direction. So basically, there are four big changes in direction graph.
Feature area is used for stroke verification. It works like this:
The shaded area is the difference between original stroke and reference stroke. When the difference is very small, it basically indicates that the reference shape is exactly the one indicated by the original stroke.
From now on, we will dive into specific cases on how to proceed with primitive shape detection:

For vertex detection:

When an input stroke cannot be approximated with any kind of primitive shape, we will divide the stroke into two sub-strokes. The dividing point will be the point with highest curvature value in current stroke.

For line detection:

Generally, there are two ways for line detection. The first one is to check whether the direction curve can be fitted with a horizontal line in direction graph. The second one is to check whether the actual coordinates of the stroke can be fitted by a line. Then we can use feature-area verification for double check, just as shown in figure 3(a)

For curve detection:

If a line with non-zero slope shows in direction graph, the stroke is likely to be a curve. If the slope is close to 2*pi/N, then it is likely to be a circle. If succeeds, we hypothesize a candidate circle which takes the center of the stroke’s bounding box as its own center and the mean distance between the center and each stroke point as its radius. If the difference between the stroke and candidate circle is within the reasonable threshold, we can identify the stroke as a circle.

For arc detection:

The algorithm for arc recognition is devised as follows:
(1) Check whether the direction curve can be approximated with a line, and abort if fails.
(2) Find the perpendicular bisector, denoted as M, of the first and last stroke points, calculate the intersection point A between line M and the stroke path, and then find the circle on which the first and last stroke points as well as point A reside. The circle’s center is taken as the center of
the candidate arc.
(3) Take the mean distance from each stroke point to the center as the radius.
(4) Compute the ratio between the corresponding feature area against the candidate arc and that arc’s standard area. Check whether the ratio is less than a predefined threshold, and abort if fails.
Handling self-intersection Strokes
Given a stroke which cannot be approximated by any of our predefined primitive shapes, if it has no self-intersection point, then follow the original procedure; otherwise, make two copies of that stroke, divide them using different strategies, i.e. from self-intersection points versus from the point with maximum curvature, then compare respective recognition result and choose the better one.

Post-Process:

In this stage, we need to do simple relation retrieval, cleanup and basic object recognition. Then everything is done, we can output our sketch recognition result.

Bibliography:

http://dl.acm.org/citation.cfm?id=604499