Bump charset-normalizer from 2.0.12 to 3.0.0
Created by: dependabot[bot]
Bumps charset-normalizer from 2.0.12 to 3.0.0.
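Because the leading version component changes, this is a major bump, which is where the breaking removals listed below come from. A minimal sketch of that check, assuming plain dotted version strings; the helper name is hypothetical and not part of Dependabot or charset-normalizer:

```python
def is_major_bump(old: str, new: str) -> bool:
    """Return True when the leading (major) version component increases."""
    # Only the first dotted component is compared; suffixes such as "rc1"
    # on later components do not affect the result.
    old_major = int(old.split(".")[0])
    new_major = int(new.split(".")[0])
    return new_major > old_major


if __name__ == "__main__":
    print(is_major_bump("2.0.12", "3.0.0"))
```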
Release notes
Sourced from charset-normalizer's releases.
Version 3.0.0
3.0.0 (2022-10-20)
Added
- Extend the capability of explain=True: when cp_isolation contains at most two entries (min one), the details of the Mess-detector results are logged
- Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
- Add parameter language_threshold in from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio
- normalizer --version now specifies whether the current version provides extra speedup (meaning mypyc compilation whl)
Changed
- Build with static metadata (not pyproject.toml yet)
- Make language detection stricter
- Optional: Module md.py can be compiled using Mypyc to provide an extra speedup, up to 4x faster than v2.1
Fixed
- CLI with opt --normalize fails when using full path for files
- TooManyAccentuatedPlugin induces false positives in the mess detection when too few alpha characters have been fed to it
- Sphinx warnings when generating the documentation
Removed
- Coherence detector no longer returns 'Simple English'; it returns 'English' instead
- Coherence detector no longer returns 'Classical Chinese'; it returns 'Chinese' instead
- Breaking: Methods first() and best() from CharsetMatch
- UTF-7 will no longer appear as "detected" without a recognized SIG/mark (it is unreliable and conflicts with ASCII)
- Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches
- Breaking: Top-level function normalize
- Breaking: Properties chaos_secondary_pass, coherence_non_latin and w_counter from CharsetMatch
- Support for the backport unicodedata2
This is the last version (3.0.x) to support Python 3.6. We plan to drop it for 3.1.x.
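Before merging, it may help to check whether the codebase still touches any of the names removed in 3.0.0. The following is a rough sketch using only the standard library; the function name and the heuristics are assumptions made for illustration, not a tool shipped with charset-normalizer. Attribute names are matched regardless of the object they belong to, and removed method calls such as first() or best() on a CharsetMatch are not detected, so treat the findings as hints.

```python
import ast
from typing import List, Tuple

# Names taken from the "Removed" entries of the 3.0.0 release notes.
REMOVED_IMPORTS = {
    "CharsetDetector",
    "CharsetDoctor",
    "CharsetNormalizerMatch",
    "CharsetNormalizerMatches",
    "normalize",
}
REMOVED_PROPERTIES = {"chaos_secondary_pass", "coherence_non_latin", "w_counter"}


def find_removed_usages(source: str) -> List[Tuple[int, str]]:
    """Return sorted (line, name) pairs for usages that 3.0.0 no longer supports."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # "from charset_normalizer import <removed name>"
        if isinstance(node, ast.ImportFrom) and node.module == "charset_normalizer":
            for alias in node.names:
                if alias.name in REMOVED_IMPORTS:
                    findings.append((node.lineno, alias.name))
        # "<anything>.<removed property>" (heuristic: the object type is unknown)
        elif isinstance(node, ast.Attribute) and node.attr in REMOVED_PROPERTIES:
            findings.append((node.lineno, node.attr))
    return sorted(findings)


if __name__ == "__main__":
    sample = (
        "from charset_normalizer import CharsetNormalizerMatches, from_bytes\n"
        "match = from_bytes(b'abc').best()\n"
        "print(match.w_counter)\n"
    )
    for line, name in find_removed_usages(sample):
        print(f"line {line}: {name} was removed in charset-normalizer 3.0.0")
```

Running the scanner over each source file in the repository gives a quick list of places to review before accepting the bump.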
Version 3.0.0rc1
This is the last pre-release. If everything goes well, I will publish the stable tag.
3.0.0rc1 (2022-10-18)
Added
- Extend the capability of explain=True: when cp_isolation contains at most two entries (min one), the details of the Mess-detector results are logged
- Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
- Add parameter language_threshold in from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio
Changed
- Build with static metadata using 'build' frontend
- Make language detection stricter
Fixed
- CLI with opt --normalize fails when using full path for files
- TooManyAccentuatedPlugin induces false positives in the mess detection when too few alpha characters have been fed to it
Removed
... (truncated)
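The language_threshold parameter listed under Added can be passed to from_bytes to adjust the minimum expected coherence ratio. Below is a hedged sketch of one way to call it so that the same code also runs against a release that predates the parameter. The wrapper name and the threshold value 0.2 are arbitrary choices for illustration, and the function returns None when charset-normalizer is not installed or no match is found.

```python
import inspect
from typing import Optional


def decode_with_threshold(payload: bytes, threshold: float = 0.2) -> Optional[str]:
    """Decode payload via charset-normalizer; None if unavailable or undetected."""
    try:
        from charset_normalizer import from_bytes
    except ImportError:
        return None  # library not installed in this environment

    kwargs = {}
    try:
        # Pass language_threshold only when the installed release accepts it.
        if "language_threshold" in inspect.signature(from_bytes).parameters:
            kwargs["language_threshold"] = threshold
    except (TypeError, ValueError):
        pass  # signature not introspectable; rely on the library defaults

    best = from_bytes(payload, **kwargs).best()
    return None if best is None else str(best)


if __name__ == "__main__":
    print(decode_with_threshold("Bonjour, ceci est un test d'encodage.".encode("utf_8")))
```

Whether a stricter or looser threshold is appropriate depends on the payloads involved, so the value is worth tuning against real samples.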
Changelog
Sourced from charset-normalizer's changelog.
3.0.0 (2022-10-20)
Added
- Extend the capability of explain=True: when cp_isolation contains at most two entries (min one), the details of the Mess-detector results are logged
- Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
- Add parameter language_threshold in from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio
- normalizer --version now specifies whether the current version provides extra speedup (meaning mypyc compilation whl)
Changed
- Build with static metadata using 'build' frontend
- Make the language detection stricter
- Optional: Module md.py can be compiled using Mypyc to provide an extra speedup, up to 4x faster than v2.1
Fixed
- CLI with opt --normalize fails when using full path for files
- TooManyAccentuatedPlugin induces false positives in the mess detection when too few alpha characters have been fed to it
- Sphinx warnings when generating the documentation
Removed
- Coherence detector no longer returns 'Simple English'; it returns 'English' instead
- Coherence detector no longer returns 'Classical Chinese'; it returns 'Chinese' instead
- Breaking: Methods first() and best() from CharsetMatch
- UTF-7 will no longer appear as "detected" without a recognized SIG/mark (it is unreliable and conflicts with ASCII)
- Breaking: Class aliases CharsetDetector, CharsetDoctor, CharsetNormalizerMatch and CharsetNormalizerMatches
- Breaking: Top-level function normalize
- Breaking: Properties chaos_secondary_pass, coherence_non_latin and w_counter from CharsetMatch
- Support for the backport unicodedata2
3.0.0rc1 (2022-10-18)
Added
- Extend the capability of explain=True: when cp_isolation contains at most two entries (min one), the details of the Mess-detector results are logged
- Support for alternative language frequency set in charset_normalizer.assets.FREQUENCIES
- Add parameter language_threshold in from_bytes, from_path and from_fp to adjust the minimum expected coherence ratio
Changed
- Build with static metadata using 'build' frontend
- Make the language detection stricter
Fixed
- CLI with opt --normalize fails when using full path for files
- TooManyAccentuatedPlugin induces false positives in the mess detection when too few alpha characters have been fed to it
Removed
- Coherence detector no longer returns 'Simple English'; it returns 'English' instead
- Coherence detector no longer returns 'Classical Chinese'; it returns 'Chinese' instead
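Code that stored or compared the old language labels may need a small mapping after the upgrade. A minimal sketch, assuming previously recorded labels should be folded into the names returned from 3.0.0 onward; the helper and the dictionary name are hypothetical:

```python
# Old label -> label returned from 3.0.0 onward, per the "Removed" entries.
RENAMED_LANGUAGES = {
    "Simple English": "English",
    "Classical Chinese": "Chinese",
}


def normalize_language_label(label: str) -> str:
    """Map a label recorded with an older release to its 3.0.0 equivalent."""
    # Labels that were not renamed are returned unchanged.
    return RENAMED_LANGUAGES.get(label, label)


if __name__ == "__main__":
    for old in ("Simple English", "Classical Chinese", "French"):
        print(old, "->", normalize_language_label(old))
```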
3.0.0b2 (2022-08-21)
Added
... (truncated)
Upgrade guide
Sourced from charset-normalizer's upgrade guide.
Guide to upgrade your code from v1 to v2
- If you are using the legacy detect function, that is it. You have nothing to do.
Detection
Before
    from charset_normalizer import CharsetNormalizerMatches
    results = CharsetNormalizerMatches.from_bytes(
        '我没有埋怨,磋砣的只是一些时间。'.encode('utf_32')
    )
After
    from charset_normalizer import from_bytes
    results = from_bytes(
        '我没有埋怨,磋砣的只是一些时间。'.encode('utf_32')
    )
Methods that once were staticmethods of the class CharsetNormalizerMatches are now basic functions. from_fp, from_bytes, from_fp and `` are concerned.
Staticmethods scheduled to be removed in version 3.0
Commits
- 0ec52ef Version 3.0.0 (#223)
- db134f3 Update python-publish.yml
- 690f74c 🔧 pass --no-isolation through CIBW_CONFIG_SETTINGS --build-option
- 20996c3 ⬆ cibuildwheel v2.11.1 (fix-tag)
- 24f366c ⬆ cibuildwheel v2.11.1
- 33b7327 🔧 update universal-wheel stage (missing build pkg)
- 544595d Merge pull request #209 from Ousret/3.0
- 6367d53 📝 Missing CHANGELOG entry and add language_threshold to docs::advanced...
- b15f416 📝 Update CHANGELOG.md
- f8e1153 📝 Adjust speedup docs section
- Additional commits viewable in compare view
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- @dependabot rebase will rebase this PR
- @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
- @dependabot merge will merge this PR after your CI passes on it
- @dependabot squash and merge will squash and merge this PR after your CI passes on it
- @dependabot cancel merge will cancel a previously requested merge and block automerging
- @dependabot reopen will reopen this PR if it is closed
- @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)