dataset

End-to-end functions that take centerline-stroke SVGs as input and output deltas in Stroke-3 format.
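
For readers unfamiliar with Stroke-3: in the format popularized by Sketch-RNN, each point is stored as `(dx, dy, pen_lifted)`, the offset from the previous point plus a flag that is 1 when the pen lifts after that point. The sketch below converts absolute polylines into that form. `strokes_to_stroke3` is a hypothetical helper, not part of this module, and the starting offset (the origin) is an assumption; the library's own conversion may differ in such details.

```python
def strokes_to_stroke3(strokes):
    """Convert polylines [[(x, y), ...], ...] into Stroke-3 rows [dx, dy, pen_lifted]."""
    rows = []
    prev_x, prev_y = 0, 0  # assumption: the first offset is measured from the origin
    for stroke in strokes:
        for i, (x, y) in enumerate(stroke):
            # The flag is 1 only on the last point of each stroke.
            pen_lifted = 1 if i == len(stroke) - 1 else 0
            rows.append([x - prev_x, y - prev_y, pen_lifted])
            prev_x, prev_y = x, y
    return rows


# Two strokes: an L-shape, then a vertical line.
example = strokes_to_stroke3([[(0, 0), (10, 0), (10, 5)], [(20, 5), (20, 15)]])
```

Note that the move between strokes (from `(10, 5)` to `(20, 5)`) is itself encoded as a delta; the pen-lift flag on the preceding row is what marks it as a jump rather than a drawn segment.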

Bulk Processing SVGs into Stroke-3


enumerate_svgs

 enumerate_svgs (input_dir)

Find all the files directly within a directory (non-recursively).
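
A non-recursive listing like the one described above can be sketched with the standard library. `list_svg_files` is a hypothetical stand-in, not the module's `enumerate_svgs`; in particular, filtering by the `.svg` extension and sorting the result are assumptions.

```python
import os


def list_svg_files(input_dir):
    """Return sorted names of .svg files directly inside input_dir (subdirectories are skipped)."""
    return sorted(
        name
        for name in os.listdir(input_dir)  # os.listdir does not descend into subdirectories
        if name.lower().endswith(".svg")  # assumption: only SVG files are wanted
        and os.path.isfile(os.path.join(input_dir, name))
    )
```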


svgs_to_deltas

 svgs_to_deltas (input_dir, output_dir=None, target_size=200,
                 total_n=1000, min_n=3, epsilon=1.0, limit=None)
# input_dir = '../data/svg/'
# output_dir = '../outputs'

# # debug: only run for the first 10 files
# limit = 10

# _ = svgs_to_deltas(input_dir, output_dir, limit=limit)
# partial_dataset = svgs_to_deltas(input_dir, output_dir, limit=None)
# len(partial_dataset)
# np.savez('../outputs/subset.npz', partial_dataset)  # `encoding` and `allow_pickle` are np.load options; pass them when loading
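
The commented-out call above writes the dataset to an `.npz` archive. A minimal round-trip sketch follows, assuming the dataset is a ragged collection of per-drawing Stroke-3 arrays held in an object array (the toy values and the temporary path are made up for illustration). `encoding` and `allow_pickle` are `np.load` arguments, so they appear on the loading side.

```python
import os
import tempfile

import numpy as np

# Toy stand-in for a dataset: drawings of different lengths, so the
# container has to be an object array rather than a regular 3-D array.
dataset = np.empty(2, dtype=object)
dataset[0] = np.array([[1, 0, 0], [0, 1, 1]], dtype=np.int16)
dataset[1] = np.array([[2, 2, 1]], dtype=np.int16)

path = os.path.join(tempfile.mkdtemp(), "subset.npz")
np.savez(path, dataset)  # a positional array is stored under the key 'arr_0'

# Object arrays are pickled inside the archive, so loading needs allow_pickle=True.
with np.load(path, encoding="latin1", allow_pickle=True) as archive:
    loaded = archive["arr_0"]
```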

Sidebar: Visualizing all the images in the dataset (up to N strokes)

# from moviepy.editor import *

# imgs_01 = sorted(enumerate_files("../outputs_segmented/png/01/"))
# abs_01 = [os.path.join("../outputs_segmented/png/01", f) for f in imgs_01]
# imgs_02 = sorted(enumerate_files("../outputs_segmented/png/02/"))
# abs_02 = [os.path.join("../outputs_segmented/png/02", f) for f in imgs_02]
# imgs_03 = sorted(enumerate_files("../outputs_segmented/png/03/"))
# abs_03 = [os.path.join("../outputs_segmented/png/03", f) for f in imgs_03]
# imgs_04 = sorted(enumerate_files("../outputs_segmented/png/04/"))
# abs_04 = [os.path.join("../outputs_segmented/png/04", f) for f in imgs_04]

# all_fnames = abs_01 + abs_02 + abs_03 + abs_04

# new_clip = ImageSequenceClip(all_fnames, fps=20)
# new_clip.write_videofile("new_file_fps20.mp4")

# new_clip = ImageSequenceClip(all_fnames, fps=24)
# new_clip.write_videofile("new_file_fps24.mp4")

Dataset Filtering


stroke_summary_df

 stroke_summary_df (dataset)

split_train_val

 split_train_val (full_dataset, output_fname, split_ratio=0.8,
                  max_strokes=None, max_points=None, min_points=None)
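
The signature suggests that drawings are filtered by stroke and point counts before being split. The sketch below illustrates one way such a filter-then-split could work on Stroke-3 data; it is not the module's implementation. `split_by_ratio`, the `seed` argument, and the shuffling step are assumptions, and writing to `output_fname` is left out.

```python
import random


def split_by_ratio(drawings, split_ratio=0.8, max_strokes=None,
                   max_points=None, min_points=None, seed=0):
    """Filter Stroke-3 drawings by size, then split them into (train, val) lists."""
    kept = []
    for drawing in drawings:
        n_points = len(drawing)
        # Each row is [dx, dy, pen_lifted]; one stroke ends wherever the flag is 1.
        n_strokes = sum(1 for row in drawing if row[2] == 1)
        if max_strokes is not None and n_strokes > max_strokes:
            continue
        if max_points is not None and n_points > max_points:
            continue
        if min_points is not None and n_points < min_points:
            continue
        kept.append(drawing)
    random.Random(seed).shuffle(kept)  # assumption: shuffle before splitting
    cut = int(len(kept) * split_ratio)
    return kept[:cut], kept[cut:]


# Ten 3-point drawings plus one 1-point drawing that min_points=2 removes.
toy = [[[1, 0, 0], [0, 1, 0], [1, 1, 1]] for _ in range(10)] + [[[0, 0, 1]]]
train, val = split_by_ratio(toy, split_ratio=0.8, min_points=2)
```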