CIS677: Notes for lecture 3, Line segment intersections

CIS677
Notes for lecture 3:
Line segment intersections

Outline

Line segment intersection
Some approaches
Sweep algorithm
Planar graph facts
A randomized incremental algorithm
- probability facts
- analysis

Line segment intersection

The problem:

given a set S of line segments, find all pairs in the set that intersect.

Note: when all segments meet at a point,
the answer size in this formulation is Omega(n²).

applications

circuit design
clipping, windowing, "tesselating curved geometries"
map overlay

Some approaches

brute force
bounding box intersection using range queries structures

bucketing

divide the plane up by a uniform grid
distribute segments to squares in the grid
find intersections within each square
can be quite fast for evenly distributed data
has some advantages for very large data sets: fewer page faults
- problems: bucket size
- too big, and all in a bucket
- too small, and too many buckets with nothing.

projection

project segments to x-axis
find all pairs of overlapping projection intervals.
check if real intersections

finding all pairs of overlapping projections:

walk along real line,
keep track of set C of intervals containing current point
this set changes only at endpoints
at the left endpoint of interval J, report meetings of J with C and add to C.
at the right endpoint, delete J from C.

requires sorting endpoints, considering each one by one.
maintenance of set: use list: O(n²) use binary tree: O(n log n)

All of these are Omega(n²) worst case, even when A=0.

Sweep line:

sweep a vertical line through the segments
maintain set of segments that the line meets,
just like projection method

That is, view the x-coordinate as "time": something true for the vertical line x=t will be said to be true at time t.

The y-coordinate of a segment at time t is the the y-coordinate of the intersection of the segment with the line x=t.

A segment is active for sweep lines that meet it, and so segment p--q is active from time p_x to time q_x.

The sweep algorithm keeps track of the order of the active segments along the sweep line, in an ordered list: that is, segments at each time t are sorted by y-coordinate.

Call that ordered list the status.

*e, c, d, and b are active and in the status in that order*

The status changes only when:

a segment becomes active,
a segment becomes inactive, or
when two segments meet.

So the algorithm keeps a schedule of these events, and does operations on the status to account for changes when the events occur.

When a segment becomes active, it is inserted into the status.
When a segment becomes inactive, it is deleted from the status.
When two segments meet, their order in the status reverses.

What does it mean to "schedule" an event?

For the projection algorithm, the endpoints could be sorted,
but intersection events are unknown beforehand.
SO, use a heap of events, so that the "next" event,
at smallest next time, can be found.

Where does a segment go in the status?

if the segment is added at time t,
compare its y-coordinate at time t with the
y-coordinates of the active segments at time t.

Plainly, only segments that are active at the same time can meet; but even more is true:

Key observation:

At any time t, suppose the next upcoming event, at time t', is an intersection of segments a and b; then all segments between a and b along the sweep line also meet a and b at time t'.
This implies: it's enough to schedule intersection events involving segments that are adjacent along the sweep line.
Proof:
c is active at time t
==> c active at t'
Ordering doesn't change between events,
so c still between a and b at time t'.

When do segments become adjacent along the sweep line?

...at an insertion, deletion, or intersection event.
when inserting segment a: check if a meets its neighbors in status
when deleting segment a: check if the neighbors of a above and below meet.
at intersection event, where a was above b:
check if a meets b's old neighbor below, and
b meets a's old neighbor above.

Schedule any such events found.

Keeping track of adjacencies. need to:

insert and delete entries in status;
find place of new segment in status
a dynamic ordered list is needed -->use a balanced binary tree, or skip list

Time

O(log n) to put event in heap, to get event from heap O(log n) to put segment in status, to delete it, to find neighbors

There are 2n+A events, so O(n+A)log n time needed.

Space

The status has no more than n segments; for suitable random data, O(sqrt n) on average.

What about the heap?

2n endpoint events scheduled;
At most A intersection events scheduled, so O(n+A) space.

Even better is possible by changing the algorithm:

As described, an intersection event is scheduled when
two segments become adjacent in the status;
they may then become nonadjacent.
However, it suffices to have events scheduled
only for currently adjacent segments.
-->when two segments become nonadjacent,
remove their intersection event.

Degeneracy

What about vertical segments?

could just special-case it.
or: use lexicographic tie-breaking, so
if events (x,y) and (x,y') have y<=y', consider (x,y) first.
equivalent to rotating the coordinate system
an infinitesimal amount. a vertical segment's events occur from bottom endpoint,
through intersections with segment on the status,
to its top endpoint.

What about multiple segments meeting at the same point?

multiple start endpoints: use order of segments a tiny moment later,
that is, look at rotation order about common endpoint

multiple segments meeting at an intersection: delete from status, and reinsert

Robustness

When are two event points actually equal?
One approach: round endpoints to integers, use exact arithmetic.
This only works once:
what happens if the output is then used as input somewhere else?
snapping helps this:

all segments initially have endpoints on an integer grid;
when two segments meet, the intersection point is "snapped"
to the nearest gridpoint
this results in polygonal lines approximating the input segments
segments passing near gridpoints with intersections
are also snapped
This introduces no new intersections

A heap is unnecessary (if you really only want intersections)

The relative ordering of crossing events doesn't matter:
between two endpoint events, crossing events can be processed
in any order
-->so a sorted array of endpoints, each with a list
of crossing events, will work.
proof sketch:

between time t and t', suppose no endpoint events occur
then two segments are in different order at t than at t'
iff they cross in between.
Adjacent segments in status can be checked for ordering
So processing is like sorting status into order at time t',
using swaps of adjacent segments
Any order of swapping will do.

A randomized incremental algorithm

First, some preliminaries:

Planar graphs:

are graphs:
- vertices V
- edges E
planar embedding or drawing:
- vertices are points in the plane
- for each edge {v,w}, a curve between v and w
- no edges cross
Fact: there is always a planar straight-line embedding
plane graph: a particular straight-line drawing of a graph
faces
- regions induced by drawing
- more formally: equivalence classes under the relation
- a == b when there is a curve between a and b not crossing
- any edge in the drawing
Euler's relation:
- with v vertices, e edges, f faces, have
- v-e+f>=2
- proof sketch:
  true for single edge or node;
  maintained under adding an edge
e = O(v), since:
- each edge bounds <= 2 faces, each face has >=3 bounding edges
- --> f <= 2e/3
- e <= v+f-2 <= v+2e/3-2, so e <= 3v-6.

This implies O(n+I)log n bound for sweep, where I is the number of intersections, not number of intersecting pairs.

Representation of plane graphs

need a representation so that:
for each edge, can find its endpoints
for each face, can find its edges in order
for each vertex, can find its incident edges in order
When faces are convex, a triangulation will do:
put a new vertex in each face,
connect with vertices around face
represent triangulation so that
each triangle knows its vertices and neighboring triangles
For convex subdivisions, this is very close to doubly-connected edge lists, described in the text.

A randomized incremental algorithm

slightly faster asymptotically
leads to
a planar point location data structure,
a near-optimum simple polygon triangulation algorithm
a divide-and-conquer approach

The algorithm builds the

Trapezoidal diagram of the segments:

...the subdivision induced by the segments, together with
visibility edges, for endpoints and crossing points:

an edge from the point to the point on the
nearest segment directly above (or nearest below)
the edge could be a ray to infinity, if nothing hit
such visibilities are easily found during sweep

each face is a trapezoid (in a generalized, possibly-infinity sense)

The algorithm, in outline:

	order the segments by a random permutation p₁...p_n
	for i=1..n,
		maintain TD(R_i), where R_i=={p₁..p_i}

To compute TD(R_i+1) from TD(R_i):
must determine effect of segment p_i+1:

find trapezoids meeting p_i+1
split some traps with new visibility edges
merge some traps: some visibility edges are blocked by p_i+1

Given that the trapezoid T containing the left endpoint of p_i+1 is known,
the remaining trapezoids meeting p_i+1
can be found by traversing the diagram along p_i+1
we will show bounds on the work for the traversal
The work to update is linear in the number of traps meeting p_i+1

So, somewhat like the randomized incremental algorithm for 2d planar convex hull, there are basically three problems:

Search: find the trap containing an endpoint;
Traverse: walk the diagram along the new segment;
Update: delete and create trapezoids

A few facts about probability:

a random permutation has each of the n! permutations equally likely;
each R_i is a random subset: each of the {n choose i}==n!/i!(n-i)! subsets is equally likely to be R_i
for fixed j with 1<=j<=i, p_j is a random element of R_i.

To make a random permutation, make an array Z with Z[i]=i for i=1..n, and

	for i = n downto 1
		pick a random number j with j <= i;
		swap Z[i] and Z[j]

Linearity of expectation

If X and Y are random variables taking integer values,
then the expected value of X+Y is
sum_x,y (x+y)Prob(X=x,Y=y)

== sum_x,y x Prob(X=x,Y=y) + sum_x,y y Prob{X=x, Y=y}
== sum_x x sum_y Prob(X=x, Y=y) + ...
== sum_x x Prob(X=x) + ...
== EX + EY

That is, E(X+Y) = EX + EY
...even when X and Y are dependent

Back to trapezoidal diagrams:

Search

One approach:
For each left endpoint of an unadded segment, maintain the trapezoid of TD(R_i) containing it.

Search analysis

For a given point p,
the work to update the trapezoid containing p is O(1), when that trapezoid changes.

How many times does a change of containing trapezoid occur?
We'll consider the probability that the containing trapezoid changes when p_i is added.

For this analysis, we need this

Observation: Any trapezoid T in TD(R_i) "determined" by no more than 4 segments: that is, T is in TD(S') for some S' subset S of size 4.

Call the segments in S' incident to T.

Moreover,

if T is in TD(S'), and
S' is a subset of R_i, and
no segment of S meeting T is in R_i,
then T is in TD(R_i).

If a change occurs for p when p_i is added,
then p_i is incident to the new containing trapezoid.

Put another way, let T be the trapezoid of TD(R_i) containing p, determined by some set S' of 4 segments of R_i.

Then a change occurs at step i iff p_i is a member of S'.

For a given set R_i,
p_i is a random element of R_i, and so
the probability that this occurs is 4/i.

Let J_i be a random variable,

J_i = 1 if the trapezoid containing p changes at step i,
J_i = 0 otherwise.

Then the expected work for p is the expected value of

sum_i J_i = sum_i EJ_i (by linearity of expectation)
=sum_i 1/i = O(log n).

Lemma 1. The expected work for search is O(n log n).

Before considering the update cost, we need to look at:

The expected complexity of TD(R_i):

The number of edges and faces is O(v),
so what is v?

The number of segment endpoints is 2i
Let A_i be the number of intersection points of R_i
The number of endpoints of visibility edges is <=2(2i+A_i)
So v<=3(2i+A_i)

What is the expected value of A_i?

Let p be an intersection point of two segments of S.
The probability that p becomes an intersection point of R_i is the probability that its two incident segments are in R_i, which is

{n-2 choose i-2} / {n choose i} == {i choose 2} / {n choose 2}, or about i²/n²

Let J_p be a random variable that is

1 when p is an intersection point of R_i
0 otherwise

Then the expected value EJ_p = i(i-1)/n(n-1),
and the expected number of intersection points,
by linearity of expectation, is
E sum_p J_p = sum_p EJ_p = Ai(i-1)/n(n-1).
(remember A = number of intersecting pairs of segments)

Lemma 2: The expected number of vertices of TD(R_i) is O(i+Ai²/n²).

Update cost:

Each trapezoid ever considered has O(1) work done for it;
How many trapezoids are made, over the course of the algorithm?
We'll look at the expected number of trapezoids created at each step.

(For convex hulls, the analogous problem was easy: two edges created per step.)

If trapezoid T is created when p_i is added, then p_i is incident to T.

Now consider T in TD(R_i);
what is the probability that T was created when p_i was added?
That is, what is the probability that p_i is a member of the S' for T?
R_i has i segments, and at most 4 of them are in S', so this probability is <=4/i.

The expected size (number of trapezoids) of TD(R_i) is O(i+Ai²/n²), from Lemma 2, and this is independent of the choice of p_i, so

Lemma 3. The expected number of trapezoids created over the course of the algorithm is sum_i O(i+Ai²/n²)/i = O(n+A).

Traversal cost:

It may be necessary to visit trapezoids that do not meet a new segment, in order to find those that do.

This additional work occurs when the boundary segment of a trapezoid is broken up by visibility edges from above.

*When walking along* a from the left across b, about 7 boundary pieces are touched.

To bound the work of this kind, consider a configuration consisting of a trapezoid with a "tail": a trapezoid together with a visibility edge dividing its boundary.
Such a tail gives a subdivision of the boundary of the trapezoid. Since the TD is planar, the total number of such tails is O(n+A).
This trap+tail configuration is determined by a set S' of at most 6 segments, just as a trapezoid is determined by at most 4 segments.

The trapezoid T plus the edge v form a trap+tail

When a segment traverses the graph, work is done for such a configuration if the segment meets the trapezoid, after which the configuration is gone.

So the total work for traversals is proportional to the total number of such configurations, which is O(n+A); the argument is similar as for updates, but with "6" replacing "4".

Putting these facts together, we get:

Theorem. The randomized incremental algorithm needs O(A + n log n) time to find the trapezoidal diagram of n segments with A intersecting pairs.

CIS677 Notes for lecture 3: Line segment intersections

Outline

Line segment intersection

applications

Some approaches

bucketing

projection

finding all pairs of overlapping projections:

Sweep line:

Time

Space

Degeneracy

Robustness

A heap is unnecessary (if you really only want intersections)

A randomized incremental algorithm

Planar graphs:

Representation of plane graphs

A randomized incremental algorithm

A few facts about probability:

Linearity of expectation

Search

Search analysis

The expected complexity of TD(Ri):

What is the expected value of Ai?

Update cost:

Traversal cost:

CIS677
Notes for lecture 3:
Line segment intersections

The expected complexity of TD(R_i):

What is the expected value of A_i?