Support multiple input files with -o being a directory

This change makes it possible for scour to consume multiple input
files in one command invocation.  E.g.

     $ scour file1.svg file2.svgz ... output-directory
     # xargs friendly variant
     $ scour -o output-directory file1.svg file2.svgz ...
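
As a rough illustration of the two calling conventions above, the positional arguments could be split as sketched below. This is not scour's actual argument handling, which is not shown here. The helper name `split_positionals` and its `output_option` parameter are hypothetical.

```python
def split_positionals(positionals, output_option=None):
    """Split positional arguments into (inputs, output).

    Hypothetical helper, not part of scour. `output_option` stands in
    for the value passed via -o, if any.
    """
    positionals = list(positionals)
    if output_option is not None:
        # -o given: every positional argument is treated as an input file.
        return positionals, output_option
    if len(positionals) >= 2:
        # No -o: the last positional argument names the output.
        return positionals[:-1], positionals[-1]
    # Zero or one positional: no output was named on the command line.
    return positionals, None
```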

This avoids most of the "startup" overhead in Python and scour when
many files are being processed.  On about 100 (already scour'ed)
gnuplot SVG graphs, this change provides an almost 40% speedup compared
to a shell alternative:

     # Original shell pipeline (~29s)
     # Note: for bash, rewriting this without the "basename" call does
     # not seem to improve performance considerably.
     $ for FILE in input/[01]* ; do \
         python3 -m scour.scour "$FILE" output/"$(basename "$FILE")" > /dev/null ; \
       done

     # With this patch (~16s)
     $ python3 -m scour.scour input/[01]* output > /dev/null
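
The shell loop above names each output file by joining the output directory with the input's basename. A minimal Python sketch of that same mapping follows. It assumes the directory mode uses the same naming, which this message implies but does not spell out. The function name `plan_outputs` is hypothetical and not part of scour.

```python
import os


def plan_outputs(input_files, output_dir):
    """Pair each input path with a same-named file inside output_dir.

    Hypothetical helper mirroring output/"$(basename "$FILE")" from the
    shell loop; it only computes paths and does not touch the filesystem.
    """
    pairs = []
    for path in input_files:
        target = os.path.join(output_dir, os.path.basename(path))
        pairs.append((path, target))
    return pairs
```

Note that two inputs with the same basename in different directories would map to the same target under this scheme.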

Signed-off-by: Niels Thykier <niels@thykier.net>
Niels Thykier 2018-03-18 11:15:46 +00:00
parent 82ce83acab
commit c42dc6b793
2 changed files with 40 additions and 20 deletions

@@ -2358,7 +2358,7 @@ class DoNotStripXmlSpaceAttribute(unittest.TestCase):
 class CommandLineUsage(unittest.TestCase):
-    USAGE_STRING = "Usage: scour [INPUT.SVG [OUTPUT.SVG]] [OPTIONS]"
+    USAGE_STRING = "Usage: scour [INPUT.SVG [[... INPUT.SVG] OUTPUT]] [OPTIONS]"
     MINIMAL_SVG = '<?xml version="1.0" encoding="UTF-8"?>\n' \
                   '<svg xmlns="http://www.w3.org/2000/svg"/>\n'
     TEMP_SVG_FILE = 'testscour_temp.svg'