shortenIDs: Avoid pointless renames of IDs
With the current code, scour could do a pointless remap of an ID, where there is no benefit in it. Consider: ```xml <?xml version="1.0" encoding="UTF-8"?> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"> <defs> <rect id="a" width="80" height="50" fill="red"/> <rect id="b" width="80" height="50" fill="blue"/> </defs> <use xlink:href="#a"/> <use xlink:href="#b"/> <use xlink:href="#b"/> </svg> ``` In this example, there is no point in swapping the IDs - even if "#b" is used more often than "#a", they have the same length. Besides a performance win on an already scour'ed image, it also mean scour will behave like a function with a fixed-point (i.e. scour eventually stops altering the image). To solve this, we no longer check whether an we find exactly the same ID. Instead, we look at the length of the new ID compared to the original. This gives us a slight complication as we can now "reserve" a "future" ID to avoid the rename. Thanks to Eduard "Ede_123" Braun for providing the test case. Signed-off-by: Niels Thykier <niels@thykier.net>
This commit is contained in:
parent
00cf42b554
commit
d6406a3470
3 changed files with 43 additions and 5 deletions
|
|
@ -715,22 +715,36 @@ def shortenIDs(doc, prefix, options):
|
|||
idList.extend([rid for rid in identifiedElements if rid not in idList])
|
||||
# Ensure we do not reuse a protected ID by accident
|
||||
protectedIDs = protected_ids(identifiedElements, options)
|
||||
# IDs that we have used "ahead of time". Happens if we are about to
|
||||
# rename an ID and there is no length difference between the new and
|
||||
# the old ID.
|
||||
consumedIDs = set()
|
||||
|
||||
curIdNum = 1
|
||||
|
||||
for rid in idList:
|
||||
curId = intToID(curIdNum, prefix)
|
||||
|
||||
# Skip ahead if the new ID is in use and protected.
|
||||
while curId in protectedIDs:
|
||||
# Skip ahead if the new ID has already been used or is protected.
|
||||
while curId in protectedIDs or curId in consumedIDs:
|
||||
curIdNum += 1
|
||||
curId = intToID(curIdNum, prefix)
|
||||
|
||||
# Now check if we found a new ID (we can happen to choose the same
|
||||
# ID. More likely on a rerun)
|
||||
if curId != rid:
|
||||
# Avoid checking the ID if it will not affect the size of the document
|
||||
# (e.g. remapping "c" to "a" is not going to win us anything)
|
||||
if len(curId) != len(rid):
|
||||
# Then go rename it.
|
||||
num += renameID(doc, rid, curId, identifiedElements, referencedIDs)
|
||||
elif curId < rid:
|
||||
# If we skip reassigning an ID because it has the same length
|
||||
# (E.g. skipping "c" -> "a"), then we have to mark the "future"
|
||||
# ID as taken ("c" in the example).
|
||||
# The "strictly less than" in the condition is to ensure that we
|
||||
# properly update curIdNum in the corner case where curId == rid.
|
||||
consumedIDs.add(rid)
|
||||
# Use continue here without updating curIdNum to avoid losing
|
||||
# the current ID.
|
||||
continue
|
||||
curIdNum += 1
|
||||
|
||||
return num
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue