Comparing changes

base repository: colinta/SublimeStringEncode
base: main
head repository: deathaxe/StringEncode
compare: main
  • 6 commits
  • 6 files changed
  • 1 contributor

Commits on May 16, 2025

  1. Rename encode() to convert()

    The method both encodes and decodes text, so `convert` is the more accurate name.
    deathaxe committed May 16, 2025
    29def6c
  2. Refactor CssEscape and CssUnescape

    This commit:
    
    1. moves the Css command classes so that the Html-related commands
       sit next to each other
    2. uses slices to strip hex or escape prefixes
    3. drops the unnecessary `enumerate()` from CssEscape
    4. optimizes and simplifies the CssUnescape implementation by
       - dropping redundant pattern matching
       - using pre-compiled patterns
       - matching valid unicode escapes only (at most 6 hex digits)
    
    Benchmark:
    
    1. CssEscapeCommand:
    
        import timeit;from StringEncode import string_encode;cmd = string_encode.CssEscapeCommand(view);timeit.timeit(lambda: cmd.encode('.tr { ≅≆≇≈∵: "∕≥-≅≉" }'))
    
       runtime: 4.0s => 3.1s
    
    2. CssUnescapeCommand:
    
        import timeit;from StringEncode import string_encode;cmd = string_encode.CssUnescapeCommand(view);timeit.timeit(lambda: cmd.encode('.tr { \\2245\\2246\\2247\\2248\\2235: "\\2215\\2265-\\2245\\2249" }'))
    
       runtime: 16.1s => 4.5s
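
The points in the list above can be illustrated with a minimal sketch. It is not the plugin's code: the names `css_escape`, `css_unescape` and `_CSS_ESCAPE_RE` are hypothetical, and the sketch assumes escapes are written unpadded and without a terminating space, as in the benchmark strings. An escape followed directly by a hex digit is therefore ambiguous and not handled here.

```python
import re

# Compiled once at import time instead of on every call.
# Matches a backslash followed by 1 to 6 hex digits (hypothetical pattern).
_CSS_ESCAPE_RE = re.compile(r"\\([0-9a-fA-F]{1,6})")


def css_escape(text: str) -> str:
    """Replace every non-ASCII character with a CSS hex escape."""
    # hex() returns e.g. '0x2245'; the slice [2:] drops the '0x' prefix.
    return "".join(
        ch if ord(ch) < 128 else "\\" + hex(ord(ch))[2:]
        for ch in text
    )


def css_unescape(text: str) -> str:
    """Replace CSS hex escapes with the characters they denote."""
    def replace(match):
        code = int(match.group(1), 16)
        # Leave escapes outside the Unicode range untouched.
        if code == 0 or code > 0x10FFFF:
            return match.group(0)
        return chr(code)

    return _CSS_ESCAPE_RE.sub(replace, text)
```

With these definitions, `css_unescape(css_escape(s))` returns `s` for the benchmark input, because no escape in it is followed by a hex digit.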
    deathaxe committed May 16, 2025
    c8a7f5e
  3. Refactor HTML Entitizer and Deentitizer

    This commit leverages a backported (and tweaked) copy of the Python 3.12
    stdlib `html` module for HTML (un-)escaping.
    
    `HtmlEntitizeCommand` and `SafeHtmlEntitizeCommand` run 10 times faster.
    
        import timeit;from StringEncode import string_encode; cmd = string_encode.HtmlEntitizeCommand(view);timeit.timeit(lambda: cmd.encode("<div attrib=\"foo&amb;bar\">≅</div>"))
    
    runtime: 34s => 4.0s
    
    `HtmlDeentitizeCommand` and `SafeHtmlDeentitizeCommand` are now synonyms.
    Both use the default `html_unescape`, because unescaping a character that
    is already unescaped has no effect.
    
        import timeit;from StringEncode import string_encode;cmd = string_encode.HtmlDeentitizeCommand(view);timeit.timeit(lambda: cmd.encode('&lt;div attrib=&quot;foo&amp;amb;bar&quot;&gt;&cong;&lt;/div&gt;'))
    
    runtime: 23.3s => 4.4s
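
As a rough illustration of the behaviour described above, the sketch below uses the upstream stdlib `html` module directly rather than the plugin's backported copy, so the plugin's tweaks are not reflected. It shows the benchmark input being unescaped, and that text without entity references passes through `html.unescape` unchanged.

```python
import html

entitized = "&lt;div attrib=&quot;foo&amp;amb;bar&quot;&gt;&cong;&lt;/div&gt;"

# Named and reserved-character entities are resolved in a single pass.
plain = html.unescape(entitized)

# Text that contains no entity references is returned as is.
untouched = html.unescape("<div>plain text</div>")

# Escaping reserved characters with the stdlib helper.
escaped = html.escape('<div attrib="foo">', quote=True)
```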
    deathaxe committed May 16, 2025
    a18c50f
  4. Refactor XML Entitizer and Deentitizer

    Benchmark:
    
        import timeit;from StringEncode import string_encode;cmd = string_encode.XmlEntitizeCommand(view);timeit.timeit(lambda: cmd.encode("<div attrib=\"foo&amb;bar\">≅</div>"))
    
    runtime: 6.5s => 4.5s
    
        import timeit;from StringEncode import string_encode;cmd = string_encode.XmlDeentitizeCommand(view);timeit.timeit(lambda: cmd.encode('&lt;div attrib=&quot;foo&amp;amb;bar&quot;&gt;&cong;&lt;/div&gt;'))
    
    runtime: 0.84s => 0.84s
    deathaxe committed May 16, 2025
    7736dd4
  5. Add HtmlEscape to escape html/xml reserved chars only

    Unicode chars are not converted into entities.
    
    Counterpart to safe_html_entitize, which converts only non-reserved chars.
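
The difference between the two behaviours can be sketched as follows. Both function names are hypothetical and the bodies are an assumption built on the stdlib `html` module, not the plugin's implementation.

```python
import html
from html.entities import codepoint2name


def escape_reserved_only(text: str) -> str:
    """Escape &, <, > and quotes; leave unicode characters untouched."""
    return html.escape(text, quote=True)


def entitize_non_reserved_only(text: str) -> str:
    """Convert non-ASCII characters to entities; leave reserved chars alone."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            # Prefer a named entity, fall back to a numeric reference.
            name = codepoint2name.get(code)
            parts.append(f"&{name};" if name else f"&#{code};")
        else:
            parts.append(ch)
    return "".join(parts)
```

Applied to `<a>≅</a>`, the first function changes only the angle brackets, while the second changes only the `≅`.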
    deathaxe committed May 16, 2025
    a48b499
  6. Reorganize urllib.parse imports

    Import the whole `urllib.parse` module, as is already done for base64, hashlib, etc.
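
A short sketch of that import style, using only stdlib calls. The sample string is made up for illustration.

```python
import urllib.parse

# Functions are referenced through the module, which makes their origin
# obvious at the call site.
quoted = urllib.parse.quote("a b&c=d/é")
restored = urllib.parse.unquote(quoted)
```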
    deathaxe committed May 16, 2025
    f751e4b