Rectangle 27 0

c Implementing a brute force algorithm for detecting a self intersecting polygon?

class CheckPolygon2
    // internal supporting classes
    class endpointEntry
        public double XValue;
        public bool isHi;
        public Line2D line;
        public double hi;
        public double lo;
        public endpointEntry fLink;
        public endpointEntry bLink;
    class endpointSorter : IComparer<endpointEntry>
        public int Compare(endpointEntry c1, endpointEntry c2)
            // sort values on XValue, descending
            if (c1.XValue > c2.XValue) { return -1; }
            else if (c1.XValue < c2.XValue) { return 1; }
            else // must be equal, make sure hi's sort before lo's
                if (c1.isHi && !c2.isHi) { return -1; }
                else if (!c1.isHi && c2.isHi) { return 1; }
                else { return 0; }

    public bool CheckForCrossing(List<Line2D> lines)
        List<endpointEntry> pts = new List<endpointEntry>(2 * lines.Count);

        // Make endpoint objects from the lines so that we can sort all of the
        // lines endpoints.
        foreach (Line2D lin in lines)
            // make the endpoint objects for this line
            endpointEntry hi, lo;
            if (lin.P1.X < lin.P2.X)
                hi = new endpointEntry() { XValue = lin.P2.X, isHi = true, line = lin, hi = lin.P2.X, lo = lin.P1.X };
                lo = new endpointEntry() { XValue = lin.P1.X, isHi = false, line = lin, hi = lin.P1.X, lo = lin.P2.X };
                hi = new endpointEntry() { XValue = lin.P1.X, isHi = true, line = lin, hi = lin.P1.X, lo = lin.P2.X };
                lo = new endpointEntry() { XValue = lin.P2.X, isHi = false, line = lin, hi = lin.P2.X, lo = lin.P1.X };
            // add them to the sort-list

        // sort the list
        pts.Sort(new endpointSorter());

        // sort the endpoint forward and backward links
        endpointEntry prev = null;
        foreach (endpointEntry pt in pts)
            if (prev != null)
                pt.bLink = prev;
                prev.fLink = pt;
            prev = pt;

        // NOW, we are ready to look for intersecting lines
        foreach (endpointEntry pt in pts)
            // for every Hi endpoint ...
            if (pt.isHi)
                // check every other line whose X-range is either wholly 
                // contained within our own, or that overlaps the high 
                // part of ours.  The other two cases of overlap (overlaps 
                // our low end, or wholly contains us) is covered by hi 
                // points above that scan down to check us.

                // scan down for each lo-endpoint below us checking each's 
                // line for intersection until we pass our own lo-X value
                for (endpointEntry pLo = pt.fLink; (pLo != null) && (pLo.XValue >= pt.lo); pLo = pLo.fLink)
                    // is this a lo-endpoint?
                    if (!pLo.isHi)
                        // check its line for intersection
                        if (pt.line.intersectsLine(pLo.line))
                            return true;

        return false;
for (endpointEntry pLo = pt.fLink; (pLo != null) && (pLo.XValue >= pt.lo); pLo = pLo.fLink)
  • (A) Inner Loop through the list, using a second iterator (j), starting at i and ending when it passed below pt.Lo.
  • Fill up the list with endPoints
  • Outer Loop through the list in descending order, using an index iterator (i), instead of a foreach enumerator
  • Sort the List by XValue

@EvanParsons How did the testing go? Did this algorithm work out for you?

Breaking this down, the old-style for loop specifier has three parts: initialization, condition, and increment-decrement. The initialization expression, endpointEntry pLo = pt.fLink; initializes pLo with the forward Link of the current point in the list. That is, the next point in the list, in descending sorted order.

Finally, it loops after testing the loop condition (pLo != null) && (pLo.XValue >= pt.lo) which loops so long as the new point isn't null (which would mean that we were at the end of the list) and so long as the new point's XValue is still greater than or equal to the low X value of the outer loop's current point. This second condition insures that the inner loop only looks at lines that are overlapping the line of the outer loop's endpoint.

I am not certain what the true execution complexity of this algorithm is, but I suspect that for most non-pathological polygons it will be close to O(n*SQRT(n)) which should be fast enough.

I'm 99% sure it's doing what I want. The only issue is that I don't understand it completely yet (although I have a fairly good idea what it does), I'm going to dedicate some time tomorrow and make sure I understand it. I made some adjustments to my lineSegment intersection method and a few other areas and got the time down to 8:47. This is basically as fast as my original implementation of the Hoey-Shamos algorithm. Thanks a bunch!

It takes a total of 15 minutes to run all 700,000 polygons! That's extremely good. I just need to check if it doesn't miss out on any.

Okay, so I think I generally understand what's going on. But what's going on in the inner loop? I've never seen a for loop like that before.

So specifically what we do is to define a custom class (endpointEntry) that represents the High or Low X values of the line's two points. These endpoints are all put into the same List structure and then sorted based on their X values.

That I think would be much simpler. I can post a simplified version like that, if you want.

The Inner Loop simply scans the endPoints list in the same sorted order as the outer loop. But it will start scanning from where the outer loop from where the outer loop currently is in the list (which is the hiX point of some line), and will only scan until the endPoint values drop below the corresponding loX value of that same line.

The O(n^2) comes from the nested loops in the brute-force algorithm that are each bounded by n, making it O(n*n). The simplest way to improve this would be to find some way to reduce the inner loop so that it is not bound by or dependent on n. So what we need to find is some way to order or re-order the inner list of lines to check the outer line against so that only a part of the full list needs to be scanned.

The advantage of this approach is that these X ranges can be used to order the endpoints of the lines which can in turn be used as the starting and stopping points for which lines to check against for intersection.

The approach that I am taking takes advantage of the fact that if two line segments intersect, then the range of their X values must overlap each other. Mind you, this doesn't mean that they do intersect, but if their X ranges don't overlap, then they cannot be intersecting so theres no need to check them against each other. (this is true of the Y value ranges also, but you can only leverage one dimension at a time).

Then the body of the inner loop gets executed. Then the increment-decrement pLo = pLo.fLink gets applied, which simply sets the inner loop's current point (pLo) to the next lower point using it's forward-link (pLo.fLink), thus advancing the loop.

Then we implement an outer loop where we scan the entire list just as in the brute-force algorithm. However our inner loop is considerably different. Instead of re-scanning the entire list for lines to check for intersection, we rather start scanning down the sorted endpoint list from the high X value endpoint of the outer loop's line and end it when we pass below that same line's low X value endpoint. In this way, we only check the lines whose range of X values overlap the outer loop's line.

What is clearer to me now, is that I probably could have gotten around this whole fLink-bLink clumsiness by treating the endPoints List as an array instead:

What's tricky here, is that this cannot be done with an Enumerator (the foreach( pts) of the outer loop) because there's no way to enumerate a sublist of a list, nor to start the enumeration based on another enumerations current position. So instead what I did was to use the Forward and Backward Links (fLink and bLink) properties to make a doubly-linked list structure that retains the sorted order of the list, but that I can incrementally scan without enumerating the list:

Your current Brute-Force algorithm is O(n^2). For just your two 70,000 line polygons that's some factor of almost 10 Billion operations, to say nothing of the 700,000 other polygons. Obviously, no amount of mere code optimization is going to be enough, so you need some kind algorithmic optimization that can lower that O(n^2) without being unduly complicated.