def apply_transform(coords_2d, xform):
assert coords_2d.shape[1] == 2
assert xform.shape[0] == 3
assert xform.shape[1] == 3
= np.concatenate([coords_2d, np.ones((coords_2d.shape[0], 1))], axis=1)
coords_full assert coords_full.shape[0] == coords_2d.shape[0]
assert coords_full.shape[1] == 3
return xform.dot(coords_full.transpose()).transpose()[:,:2]
transforms
(x,y)
coordinates.
2D Affine Transformations
For manipulating sets of stroke-3 coordinates, I often needed to transform the coordinates - rescaling, rotating, moving, etc.
2D affine transformations are perfect for this. Each transformation is expressed as a matrix, and is applied by performing matrix multiplication against the 2D matrix of coordinates (with a column of 1
values concatenated next to x
and y
values).
apply_transform
apply_transform (coords_2d, xform)
Consider a sample set of x,y coordinates:
x | y |
---|---|
0 | 0 |
0.5 | 0.5 |
1 | 1 |
The simplest affine transformation is the identity.
Multiplying the 2D coords by the following matrix will output an identical set of coordinates:
1 0 0
0 1 0
0 0 1
def identity_xform():
return np.array([
1, 0, 0],
[0, 1, 0],
[0, 0, 1]]) [
identity_xform
identity_xform ()
= np.array([
sample_coords 0.0,0.0],
[0.5,0.5],
[1.0,1.0],
[
])
test(apply_transform(sample_coords, identity_xform()),
sample_coords, np.array_equal)
Scaling a set of coordinates by sx
in the x
direction and sy
in the y
direction is as easy as multiplying by:
sx 0 0
0 sy 0
0 0 1
def scale_xform(sx, sy):
return np.array([
0, 0],
[sx, 0, sy, 0],
[0, 0, 1]]) [
scale_xform
scale_xform (sx, sy)
= np.array([
scaled_coords_2x_3y 0.0,0.0],
[1.0,1.5],
[2.0,3.0],
[
])
=2.0, sy=3.0)),
test(apply_transform(sample_coords, scale_xform(sx
scaled_coords_2x_3y, np.array_equal)
Translating a set of coordinates by tx
in the x
direction and ty
in the y
direction uses the following matrix.
1 0 tx
0 1 ty
0 0 1
def translate_xform(tx, ty):
return np.array([
1, 0, tx],
[0, 1, ty],
[0, 0, 1]]) [
translate_xform
translate_xform (tx, ty)
= np.array([
translated_coords_2x_3y 2.0,3.0],
[2.5,3.5],
[3.0,4.0],
[
])
=2.0, ty=3.0)),
test(apply_transform(sample_coords, translate_xform(tx
translated_coords_2x_3y, np.array_equal)
To rotate a set of 2D coordinates by n
degrees, we first convert the desired angle of rotation from degrees to a value in radians, theta
.
Consider a point (x1, y1)
at (4, 4)
that I want to rotate by 30 degrees around the origin. If I make a line to the origin with length r
, the angle between line1 and the x axis, which I’ll call phi
, is 45 degrees.
= r * cos(phi) = 4
x1 = r * sin(phi) = 4 y1
If I want to rotate it by 30 degrees, its final position at point (x2, y2)
will be 45 + 30 = 75 degrees from the x-axis, equivalent to phi + theta
.
= r * cos(phi + theta)
x2 = r * sin(phi + theta) y2
Since cos(a + b) = cos(a)*cos(b) - sin(a)*sin(b)
and sin(a + b) = sin(a)*cos(b) + cos(a)*sin(b)
, this breaks down into:
= r * cos(phi + theta)
x2 = r * cos(phi) * cos(theta) - r * sin(phi) * sin(theta)
= x1 * cos(theta) - y1 * sin(theta)
= r * sin(phi + theta)
y2 = r * sin(phi) * cos(theta) + r * cos(phi) * sin(theta)
= y1 * cos(theta) + x1 * sin(theta)
So, multiplying any 2D coordinates by the following matrix will produce the correct rotation:
cos(theta) -sin(theta) 0
sin(theta) cos(theta) 0
0 0 1
def rotate_xform(rotate_angle):
if rotate_angle % 360 == 0:
return identity_xform()
= np.radians(rotate_angle)
theta = np.cos(theta)
cos_theta = np.sin(theta)
sin_theta return np.array([
-sin_theta, 0],
[cos_theta, 0],
[sin_theta, cos_theta, 0, 0, 1]
[ ])
rotate_xform
rotate_xform (rotate_angle)
Bounding Boxes
class BoundingBox:
float
xmin: float
xmax: float
ymin: float
ymax: xrange: float
float
yrange:
def __init__(self, xmin, xmax, ymin, ymax):
assert xmin <= xmax
assert ymin <= ymax
self.xmin = xmin
self.xmax = xmax
self.ymin = ymin
self.ymax = ymax
self.xrange = xmax - xmin
self.yrange = ymax - ymin
def __repr__(self):
return f"BBox(({self.xmin}, {self.ymin}), ({self.xmax}, {self.ymax})) (width {self.xrange} x height {self.yrange})"
@staticmethod
def create(coords: np.ndarray):
# rank-2
if len(coords.shape) == 2:
= coords[:, 0].min()
xmin = coords[:, 0].max()
xmax
= coords[:, 1].min()
ymin = coords[:, 1].max()
ymax
return BoundingBox(
=xmin, xmax=xmax, ymin=ymin, ymax=ymax
xmin
)else:
raise Exception(f"invalid coordinates passed - expected rank-2 matrix but got rank-{len(coords.shape)}")
def merge(self, other):
return BoundingBox(
=min(self.xmin, other.xmin),
xmin=max(self.xmax, other.xmax),
xmax=min(self.ymin, other.ymin),
ymin=max(self.ymax, other.ymax),
ymax
)
def area(self):
return self.xrange * self.yrange
def intersection(self, other):
= self
bb1 = other
bb2
assert bb1.xmin <= bb1.xmax
assert bb1.ymin <= bb1.ymax
assert bb2.xmin <= bb2.xmax
assert bb2.ymin <= bb2.ymax
= max(bb1.xmin, bb2.xmin)
x_left = max(bb1.ymin, bb2.ymin)
y_top = min(bb1.xmax, bb2.xmax)
x_right = min(bb1.ymax, bb2.ymax)
y_bottom
if x_right < x_left or y_bottom < y_top:
return None
return BoundingBox(xmin=x_left, ymin=y_top, xmax=x_right, ymax=y_bottom)
def iou(self, other):
"""
Intersection over union - area of the overlap relative to combined area of the bounding boxes
"""
= self.intersection(other)
overlap if not overlap:
return 0.0
return overlap.area() / float(self.area() + other.area() - overlap.area())
def normalization_xform(self, scale=1.0):
"""
Produce a normalization transform - a set of transformations,
given the input coordinates, to convert all coords into the range (0,1)
"""
= self.xrange if self.xrange > self.yrange else self.yrange
max_range return scale_xform(scale / max_range, scale / max_range).dot(
-self.xmin, -self.ymin)
translate_xform( )
BoundingBox
BoundingBox (xmin, xmax, ymin, ymax)
Initialize self. See help(type(self)) for accurate signature.
BoundingBox.create
BoundingBox.create (coords:numpy.ndarray)
BoundingBox.merge
BoundingBox.merge (other)
BoundingBox.area
BoundingBox.area ()
BoundingBox.intersection
BoundingBox.intersection (other)
BoundingBox.iou
BoundingBox.iou (other)
Intersection over union - area of the overlap relative to combined area of the bounding boxes
BoundingBox.normalization_xform
BoundingBox.normalization_xform (scale=1.0)
Produce a normalization transform - a set of transformations, given the input coordinates, to convert all coords into the range (0,1)
Stroke-3 Conversion
strokes_to_deltas
strokes_to_deltas (strokes)
points_to_deltas
points_to_deltas (points)
strokes_to_points
strokes_to_points (strokes)
deltas_to_strokes
deltas_to_strokes (_seq)
points_to_strokes
points_to_strokes (_seq)
deltas_to_points
deltas_to_points (_seq)
RDP
rdp_strokes
rdp_strokes (strokes, epsilon=1.0)
stroke_rdp_deltas
stroke_rdp_deltas (rescaled_strokes, epsilon=2.0)