./textproc/py-ftfy, Fixes some problems with Unicode text after the fact


Branch: CURRENT, Version: 6.3.1, Package name: py312-ftfy-6.3.1, Maintainer: pkgsrc-users

Given Unicode text, make its representation consistent and possibly less broken.


Required to run:
[devel/py-setuptools] [textproc/py-html5lib] [devel/py-wcwidth] [lang/python310]

Master sites:

Filesize: 301.687 KB


CVS history:


   2025-01-15 13:46:13 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message: py-ftfy: updated to 6.3.1

Version 6.3.1 (October 25, 2024)
- Fixed `license` metadata field in pyproject.toml.
- Removed extraneous files from the `hatchling` sdist output.

Version 6.3.0 (October 8, 2024)
- Switched packaging from poetry to uv.
- Uses modern Python packaging exclusively (no setup.py).
- Added support for mojibake in Windows-1257 (Baltic).
- Detects mojibake for "Ü" in an uppercase word, such as "ZURÜCK".
- Expanded a heuristic that notices improbable punctuation.
- Fixed a false positive involving two concatenated strings, one of which began with the § sign.
- Rewrote `chardata.py` to be more human-readable and debuggable, instead of being full of keysmash-like character sets.
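To illustrate the kind of mojibake the "ZURÜCK" entry refers to, here is a minimal standard-library sketch. It is not ftfy's implementation: the helper names `make_mojibake` and `undo_mojibake` are hypothetical, and the sketch assumes the text was UTF-8 that got decoded as Windows-1252. ftfy additionally uses heuristics to decide whether such a reversal is warranted at all.

```python
def make_mojibake(text: str, wrong_encoding: str = "windows-1252") -> str:
    """Simulate the damage: UTF-8 bytes decoded with the wrong codec."""
    return text.encode("utf-8").decode(wrong_encoding)


def undo_mojibake(garbled: str, wrong_encoding: str = "windows-1252") -> str:
    """Reverse the damage by re-encoding with the wrong codec,
    then decoding the resulting bytes as UTF-8."""
    return garbled.encode(wrong_encoding).decode("utf-8")


original = "ZURÜCK"
garbled = make_mojibake(original)   # "Ü" is the UTF-8 bytes C3 9C
restored = undo_mojibake(garbled)

print(garbled)    # ZURÃœCK
print(restored)   # ZURÜCK
```

Either helper can raise `UnicodeDecodeError` or `UnicodeEncodeError` when a byte or character has no mapping in the assumed codec, so real input needs error handling that this sketch leaves out.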
   2024-04-26 22:10:48 by Adam Ciarcinski | Files touched by this commit (3)
Log message: py-ftfy: do not install additional files into site-packages directory 
   2024-04-26 18:52:00 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message: py-ftfy: updated to 6.2.0

Version 6.2.0 (March 16, 2024)
- Fixed a case where an en-dash and a space near other mojibake would be interpreted (probably incorrectly) as MacRoman mojibake.
- Added [project.urls] metadata to pyproject.toml.
- README contains license clarifications for entitled jerks.
   2024-01-06 20:51:02 by Adam Ciarcinski | Files touched by this commit (4) | Package updated
Log message: py-ftfy: updated to 6.1.3

Version 6.1.3 (November 21, 2023)
- Updated wcwidth.
- Switched to the Apache 2.0 license.
- Dropped support for Python 3.7.

Version 6.1.2 (February 17, 2022)
- Added type information for `guess_bytes`.

Version 6.1.1 (February 9, 2022)
- Updated the heuristic to fix the letter ß in UTF-8/MacRoman mojibake, which had regressed since version 5.6.
- Packaging fixes to pyproject.toml.

Version 6.1 (February 9, 2022)
- Updated the heuristic to fix the letter Ñ with more confidence.
- Fixed type annotations and added py.typed.
- ftfy is packaged using Poetry now, and wheels are created and uploaded to PyPI.

Version 6.0.3 (May 14, 2021)
- Allow the keyword argument `fix_entities` as a deprecated alias for `unescape_html`, raising a warning.
- `ftfy.formatting` functions now disregard ANSI terminal escapes when calculating text width.

Version 6.0.2 (May 4, 2021)
This version is purely a cosmetic change, updating the maintainer's e-mail address and the project's canonical location on GitHub.

Version 6.0.1 (April 12, 2021)
- The `remove_terminal_escapes` step was accidentally not being used. This version restores it.
- Specified in setup.py that ftfy 6 requires Python 3.6 or later.
- Use a lighter link color when the docs are viewed in dark mode.

Version 6.0 (April 2, 2021)
- New function: `ftfy.fix_and_explain()` can describe all the transformations that happen when fixing a string. This is similar to what `ftfy.fixes.fix_encoding_and_explain()` did in previous versions, but it can fix more than the encoding.
- `fix_and_explain()` and `fix_encoding_and_explain()` are now in the top-level ftfy module.
- Changed the heuristic entirely. ftfy no longer needs to categorize every Unicode character, but only characters that are expected to appear in mojibake.
- Because of the new heuristic, ftfy will no longer have to release a new version for every new version of Unicode. It should also run faster and use less RAM when imported.
- The heuristic `ftfy.badness.is_bad(text)` can be used to determine whether there appears to be mojibake in a string. Some users were already using the old function `sequence_weirdness()` for that, but this one is actually designed for that purpose.
- Instead of a pile of named keyword arguments, ftfy functions now take in a TextFixerConfig object. The keyword arguments still work, and become settings that override the defaults in TextFixerConfig.
- Added support for UTF-8 mixups with Windows-1253 and Windows-1254.
- Overhauled the documentation: https://ftfy.readthedocs.org
   2022-01-05 16:41:32 by Thomas Klausner | Files touched by this commit (289)
Log message: python: egg.mk: add USE_PKG_RESOURCES flag

This flag should be set for packages that import pkg_resources and thus need setuptools after the build step.

Set this flag for packages that need it and bump PKGREVISION.
   2022-01-04 21:55:40 by Thomas Klausner | Files touched by this commit (1595)
Log message: *: bump PKGREVISION for egg.mk users

They now have a tool dependency on py-setuptools instead of a DEPENDS
   2021-10-26 13:23:42 by Nia Alarie | Files touched by this commit (1161)
Log message: textproc: Replace RMD160 checksums with BLAKE2s checksums

All checksums have been double-checked against existing RMD160 and SHA512 hashes

Unfetchable distfiles (fetched conditionally?):
./textproc/convertlit/distinfo clit18src.zip
   2021-10-07 17:02:49 by Nia Alarie | Files touched by this commit (1162)
Log message: textproc: Remove SHA1 hashes for distfiles