scour

Author	SHA1	Message	Date
Niels Thykier	e8e104d8b8	Add optimization that prunes nested <g>-tags An optimization that prunes nested <g>-tags when they contain exactly one <g> and nothing else (except whitespace nodes). This looks a bit like `removeNestedGroups` except it only touches <g> tags without attributes (but can remove <g>-tags completely from a tree, whereas this optimization always leaves at least one <g> tag behind). Closes: #215 Signed-off-by: Niels Thykier <niels@thykier.net>	2020-05-18 06:09:02 +00:00
Niels Thykier	7e917c9ca0	g_tag_is_unmergeable: consider <g> tags with ids unmergeable If someone gave it an ID and we have not stripped it, then it is probably important and can alter the output somehow if we merge it into another node (or discard the <g> node with an ID). Signed-off-by: Niels Thykier <niels@thykier.net>	2020-05-17 20:18:18 +00:00
Niels Thykier	eb582fe44c	Refactor: Create a g_tag_is_unmergeable Both `mergeSiblingGroupsWithCommonAttributes` and `removeNestedGroups` used the same code in different forms. Extract it into its own function. Signed-off-by: Niels Thykier <niels@thykier.net>	2020-05-17 19:04:58 +00:00
Niels Thykier	a15acb3e4e	Rename testX.py to test_X.py to make py.test work out of the box (#181) This rename makes py.test/py.test-3 find the test suite out of the box. Example command lines: # Running the test suite (optionally include "-v") $ py.test-3 # Running the test suite with coverage enabled (and branch # coverage). $ py.test-3 --cov=scour --cov-report=html --cov-branch Signed-off-by: Niels Thykier <niels@thykier.net>	2020-05-17 19:55:24 +02:00
Niels Thykier	dd2155e576	Merge sibling <g> nodes with identical attributes In some cases, gnuplot generates very suboptimal SVG content of the following pattern: <g color="black" fill="none" stroke="currentColor"> <path d="m82.5 323.3v-4.1" stroke="#000"/> </g> <g color="black" fill="none" stroke="currentColor"> <path d="m116.4 323.3v-4.1" stroke="#000"/> </g> ... repeated 10+ more times here ... <g color="black" fill="none" stroke="currentColor"> <path d="m65.4 72.8v250.5h420v-250.5h-420z" stroke="#000"/> </g> A more optimal pattern would be: <g color="black" fill="none" stroke="#000"> <path d="m82.5 323.3v-4.1"/> <path d="m116.4 323.3v-4.1"/> ... 10+ more paths here ... <path d="m65.4 72.8v250.5h420v-250.5h-420z"/> </g> This patch enables that optimization by handling the merging of two sibling <g> entries that have identical attributes. In the above example that does not solve the rewrite from "currentColor" to "#000" for the stroke attribute. However, the existing code already handles that automatically after the <g> elements have been merged. This change provides comparable results to --create-groups, as shown by the following table (Test: Size, in %), while being a distinct optimization: baseline: 17961, 100%; baseline + --create-groups: 17418, 97.0%; patched: 16939, 94.3%; patched + --create-groups: 16855, 93.8%. The image used in the size table above was generated based on the instructions from https://bugs.debian.org/858039#10 with gnuplot 5.2 patchlevel 2. Beyond the test-based "--create-groups", the following scour command-line parameters were used: --enable-id-stripping --enable-comment-stripping \ --shorten-ids --indent=none Note that the baseline was scour'ed repeatedly to stabilize the image size. Signed-off-by: Niels Thykier <niels@thykier.net>	2020-05-17 19:37:32 +02:00
Patrick Storz	40753af88a	Fix whitespace handling for SVG 1.2 flowed text See `718748ff22` Fixes https://github.com/scour-project/scour/issues/235	2020-05-17 17:33:50 +02:00
Patrick Storz	f65ca60809	Fix deprecation warning	2020-05-17 17:10:26 +02:00
Patrick Storz	4fe2655f86	Merge pull request #187 from nthykier/fix-gh-186-shorten-id-recycle-used-ids Enable shortenIDs to recycle existing IDs	2020-05-17 16:48:18 +02:00
Niels Thykier	58b75c314a	Add test case for #198/#202 Signed-off-by: Niels Thykier <niels@thykier.net>	2020-05-17 16:29:08 +02:00
Niels Thykier	6846e0c9ee	Preserve xlink:href attr when collapsing referenced gradients Closes: #198 Closes: #202 Signed-off-by: Niels Thykier <niels@thykier.net>	2020-05-17 16:29:08 +02:00
Niels Thykier	f61b4d36d6	Add test case for #203 Signed-off-by: Niels Thykier <niels@thykier.net>	2020-05-17 16:13:45 +02:00
Niels Thykier	09a656287d	Avoid picking an id-less gradient to replace one with an id Closes: #203 Signed-off-by: Niels Thykier <niels@thykier.net>	2020-05-17 16:13:45 +02:00
Patrick Storz	695676e3a5	Run tests with Python 3.7 / 3.8	2020-05-17 16:03:06 +02:00
Eduard Braun	049264eba6	Scour v0.37	2018-07-04 19:16:55 +02:00
Eduard Braun	5ccba31ff9	Update HISTORY.md	2018-07-04 19:05:25 +02:00
Patrick Storz	718748ff22	Merge pull request #199 from Ede123/newline_handling Several improvements for handling whitespace including newlines, especially in text nodes	2018-07-03 22:56:36 +02:00
Eduard Braun	651694a6c0	Add unittests for whitespace handling in text node Also expand/fix the test for line endings	2018-07-03 22:53:05 +02:00
Eduard Braun	703122369e	Strip newlines from text nodes and be done with it Follow the spec "blindly" as it turns out covering all the border cases and getting reasonably styled output is just too cumbersome. This way at least scour output is consistent and it also saves us some bytes (a lot in some cases as we do not indent <tspan>s etc. anymore)	2018-07-02 22:14:14 +02:00
Eduard Braun	2200f8dc81	temp	2018-07-02 01:05:54 +02:00
Eduard Braun	e1c2699f07	Improve whitespace handling in text content elements SVG specifies special logic for handling whitespace (see https://www.w3.org/TR/SVG/text.html#WhiteSpace ). By implementing it we can even shave off some unneeded bytes here and there (e.g. consecutive spaces). Unfortunately handling of newlines by renderers is inconsistent: Sometimes they are replaced by a single space, sometimes they are removed in the output. As we cannot know the expected behavior, work around this by keeping newlines inside text content elements intact. Fixes #160.	2018-07-01 20:19:58 +02:00
Eduard Braun	7d28f5e051	Improve handling of newlines Previously we added way too many and removed empty lines afterwards (potentially destructive if xml:space="preserve") Also adds proper indentation for comment nodes	2018-07-01 19:48:18 +02:00
Eduard Braun	06ea23d0e1	fix typo	2018-07-01 13:52:51 +02:00
Patrick Storz	8c95d950af	Merge pull request #192 from nthykier/gh-189-order-vs-SVGLength Work around an exception in removeDefaultAttributeValue() caused by some rarely used filter attributes that allow an optional second value which SVGLength does not handle properly	2018-06-30 19:03:15 +02:00
Patrick Storz	5d579f8927	Also special-case baseFrequency and add 'radius'	2018-06-30 18:58:36 +02:00
Eduard Braun	3c64623a12	Discontinue official support for Python 3.3 (testing failed due to wheel now requiring Python >= 3.4) Also run flake8 in latest Python 3.6 (3.7 is not supported on Travis yet)	2018-06-29 19:29:09 +02:00
Patrick Storz	9f4a707bb7	Merge pull request #178 from nthykier/gh-163-path-rewrite Correct handling of "m0 0" vs. "z" commands	2018-06-29 19:11:53 +02:00
Niels Thykier	8a2892b458	Avoid crashing on stdDeviation attribute Signed-off-by: Niels Thykier <niels@thykier.net>	2018-04-21 06:39:08 +00:00
Niels Thykier	c504891bd7	test: Use number-optional-number variant of kernelUnitLength Signed-off-by: Niels Thykier <niels@thykier.net>	2018-04-21 06:19:38 +00:00
Tobias Oberstein	47f918e696	Merge pull request #191 from nthykier/gh-190-optimizeTransform-IndexError Avoid crashing on "scale(1)" (short for "scale(1, 1)")	2018-04-18 19:25:48 +02:00
Niels Thykier	18e57cddae	Avoid crashing on "scale(1)" (short for "scale(1, 1)") The scale function on the transform attribute has a short form, where only the first argument is used. But optimizeTransform would always assume that there were two when checking for the identity scale. Closes: #190 Signed-off-by: Niels Thykier <niels@thykier.net>	2018-04-18 05:41:35 +00:00
Niels Thykier	a459d629c1	removeDefaultAttributeValue: Special-case order attribute Scour tried to handle "order" attribute as a SVGLength. However, the "order" attribute can consist of two integers according to the [SVG 1.1 Specification] and SVGLength is not designed to handle that. With this change, we now pretend that "order" is a string, which side steps this issue. [SVG 1.1 Specification]: https://www.w3.org/TR/SVG11/single-page.html#filters-feConvolveMatrixElementOrderAttribute Closes: #189 Signed-off-by: Niels Thykier <niels@thykier.net>	2018-04-17 19:48:37 +00:00
Niels Thykier	039022ee9d	shortenID: Improve tracking of optimal ID lengths Signed-off-by: Niels Thykier <niels@thykier.net>	2018-04-16 18:52:12 +00:00
Niels Thykier	e25b0dae73	Remove a (now) unused parameter to renameID Signed-off-by: Niels Thykier <niels@thykier.net>	2018-04-15 17:36:07 +00:00
Niels Thykier	91503c6d7e	renameID: Replace referencedIDs with referringNodes This change pushes the responsibility of updating referencedIDs to its callers where needed. The only caller of renameIDs is shortenIDs and that works perfectly fine without updating its copy of referencedIDs. In shortenIDs, we need to be able to lookup which nodes referenced the "original ID" (and not the "new ID"). While shortenIDs could update referencedIDs so it remained valid, it is extra complexity for no gain. As an example of this complexity, imagine if two or more IDs are "rotated" like so: Original IDs: a, bb, ccc, dddd Mapping: dddd -> ccc ccc -> bb bb -> a a -> dddd While doable within reasonable performance, we do not need to support it at the moment, so there is no reason to handle that complexity. Signed-off-by: Niels Thykier <niels@thykier.net>	2018-04-15 17:35:05 +00:00
Niels Thykier	d6406a3470	shortenIDs: Avoid pointless renames of IDs With the current code, scour could do a pointless remap of an ID, where there is no benefit in it. Consider: ```xml <?xml version="1.0" encoding="UTF-8"?> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"> <defs> <rect id="a" width="80" height="50" fill="red"/> <rect id="b" width="80" height="50" fill="blue"/> </defs> <use xlink:href="#a"/> <use xlink:href="#b"/> <use xlink:href="#b"/> </svg> ``` In this example, there is no point in swapping the IDs - even if "#b" is used more often than "#a", they have the same length. Besides a performance win on an already scour'ed image, it also means scour will behave like a function with a fixed-point (i.e. scour eventually stops altering the image). To solve this, we no longer check whether we find exactly the same ID. Instead, we look at the length of the new ID compared to the original. This gives us a slight complication as we can now "reserve" a "future" ID to avoid the rename. Thanks to Eduard "Ede_123" Braun for providing the test case. Signed-off-by: Niels Thykier <niels@thykier.net>	2018-04-15 17:34:24 +00:00
Eduard Braun	8ddb7d8913	Add valid elements for 'spreadMethod' attribute Turns out 'default_attributes_universal' is actually empty right now so we might consider removing it altogether...	2018-04-15 18:40:06 +02:00
Eduard Braun	0ec0732447	Simplify 'default_attributes' handling a bit	2018-04-15 18:33:46 +02:00
Eduard Braun	20dcbcbe64	'default_attributes': make sure 'elements' is a list	2018-04-15 18:31:51 +02:00
Niels Thykier	1650f91ea4	Optimize removeDefaultAttributeValues Avoid looping over DefaultAttribute(s) that are not relevant for a given node. This skips a lot of calls to removeDefaultAttributeValue but more importantly, it avoids "node.nodeName not in attribute.elements" line in removeDefaultAttributeValue. As attribute.elements is a list, this becomes expensive for "larger lists" (or in this case when there are a lot of attributes). This seems to remove about 1½-2 minutes of runtime (out of ~8) on the 1_42_polytope_7-cube.svg test case provided in #184. Signed-off-by: Niels Thykier <niels@thykier.net>	2018-04-15 18:29:58 +02:00
Niels Thykier	5dc1b7a820	scour: Make optimized default_attribute data structures There are a lot of "DefaultAttribute"s and for a given tag, most of the "DefaultAttribute"s are not applicable. Therefore, we create two data structures to assist us with only dealing with the attributes that matter. Here there are two cases: * Those that always matter. These go into default_attributes_unrestricted list. * Those that matter only based on the node name. These go into the default_attributes_restricted_by_tag with the node name as key (with the value being a list of matching attributes). In the next commit, we will use those for optimizing the removal of default attributes. Signed-off-by: Niels Thykier <niels@thykier.net>	2018-04-15 18:29:58 +02:00
Niels Thykier	00cf42b554	Rename function to match PEP8 conventions Signed-off-by: Niels Thykier <niels@thykier.net>	2018-04-15 16:22:00 +00:00
Niels Thykier	0254014e06	Enable shortenIDs to recycle existing IDs This patch enables shortenIDs to remap IDs currently in use. This is very helpful to ensure that scour does not change between "optimal" and "suboptimal" choices for IDs as observed in GH#186. Closes: #186 Signed-off-by: Niels Thykier <niels@thykier.net>	2018-04-13 21:00:35 +00:00
Eduard Braun	3283d6d5ec	Simplify control point detection logic - make controlPoints() return a consistent type like flags() - rename the ambiguous "reduce_precision" to "is_control_point"	2018-04-08 16:48:33 +02:00
Eduard Braun	103dcc0a48	Fix handling of boolean flags in elliptical path commands (#183) * properly parse paths without space after boolean flags (fixes #161) * omit space after boolean flag to shave off a few bytes when not using renderer workarounds	2018-04-08 15:32:47 +02:00
Niels Thykier	ba7f4b5f18	Remove more redundant uses of .keys() Signed-off-by: Niels Thykier <niels@thykier.net>	2018-03-26 22:36:19 +02:00
Niels Thykier	f8d5af0e56	Remove now unused variable Signed-off-by: Niels Thykier <niels@thykier.net>	2018-03-26 22:36:19 +02:00
Eduard Braun	d508f59aa6	Completely remove "walltime" variable and use time.time() directly	2018-03-26 22:34:11 +02:00
Niels Thykier	b622642aa1	Simplify timer selection to always use time.time() (#175) In python2.7 and python3.3, time.time() is sufficiently accurate for our purpose and avoids jumping through hoops to select the best available function. Signed-off-by: Niels Thykier <niels@thykier.net>	2018-03-26 22:30:25 +02:00
Niels Thykier	38274f75bc	Implement a basic rewrite of redundant commands This basic implementation can drop and rewrite some cases of "m0 0" and "z" without triggering the issues experienced in #163. It works by analysing the path backwards and tracking "z" and "m" commands. Signed-off-by: Niels Thykier <niels@thykier.net>	2018-03-11 08:33:50 +00:00
Niels Thykier	a2c94c96fb	Disable the "m0 0"-optimization as it is wrong in some cases The "m0 0" rewrite gets some cases wrong, like: m150 240h200m0 0 150 150v-300z Scour rewrote that into the following: m150 240h200l150 150v-300z However, these two paths do not produce an identical figure at all. The first is a line followed by a triangle while the second is a quadrilateral. While there are some instances we can rewrite (that scour will no longer rewrite), these will require an analysis over multiple commands to determine whether the rewrite is safe. This will reappear in the next commit. Closes: #163 Signed-off-by: Niels Thykier <niels@thykier.net>	2018-03-11 08:25:46 +00:00
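
The difference described in a2c94c96fb can be checked by hand with a small sketch. The snippet below is not scour code: `subpaths`, `ORIGINAL` and `REWRITTEN` are hypothetical names introduced for illustration, and the interpreter assumes a path that uses only the relative commands m, l, h, v and z. It splits each path into subpaths and records whether each one is closed.

```python
import re


def subpaths(d):
    """Split a path using only relative m/l/h/v/z into (points, closed) tuples."""
    tokens = re.findall(r"[mlhvz]|-?\d+(?:\.\d+)?", d)
    done = []          # finished subpaths
    points = []        # points of the subpath being built
    x = y = 0.0        # current position
    cmd = None         # command currently in effect
    i = 0
    while i < len(tokens):
        if tokens[i].isalpha():
            cmd = tokens[i]
            i += 1
            if cmd == "z":
                # Close the subpath and return to its starting point.
                done.append((points, True))
                x, y = points[0]
                points = [(x, y)]
                continue
        if cmd == "m":
            # A moveto ends the current subpath (left open) and starts a new one.
            if len(points) > 1:
                done.append((points, False))
            x += float(tokens[i])
            y += float(tokens[i + 1])
            i += 2
            points = [(x, y)]
            cmd = "l"  # further coordinate pairs are implicit linetos
        elif cmd == "l":
            x += float(tokens[i])
            y += float(tokens[i + 1])
            i += 2
            points.append((x, y))
        elif cmd == "h":
            x += float(tokens[i])
            i += 1
            points.append((x, y))
        elif cmd == "v":
            y += float(tokens[i])
            i += 1
            points.append((x, y))
        else:
            raise ValueError("unsupported path data: %r" % d)
    if len(points) > 1:
        done.append((points, False))
    return done


ORIGINAL = "m150 240h200m0 0 150 150v-300z"
REWRITTEN = "m150 240h200l150 150v-300z"

for name, d in (("original", ORIGINAL), ("rewritten", REWRITTEN)):
    for pts, closed in subpaths(d):
        print(name, "closed" if closed else "open", pts)
```

Under these assumptions the original path yields an open two-point line plus a closed three-point subpath, while the rewritten path yields a single closed four-point subpath, which matches the "line followed by a triangle" versus "quadrilateral" description in the commit message.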

271 commits