diff options
author | shadchin <shadchin@yandex-team.ru> | 2022-02-10 16:44:30 +0300 |
---|---|---|
committer | Daniil Cherednik <dcherednik@yandex-team.ru> | 2022-02-10 16:44:30 +0300 |
commit | 2598ef1d0aee359b4b6d5fdd1758916d5907d04f (patch) | |
tree | 012bb94d777798f1f56ac1cec429509766d05181 /contrib/python/idna/README.rst | |
parent | 6751af0b0c1b952fede40b19b71da8025b5d8bcf (diff) | |
download | ydb-2598ef1d0aee359b4b6d5fdd1758916d5907d04f.tar.gz |
Restoring authorship annotation for <shadchin@yandex-team.ru>. Commit 1 of 2.
Diffstat (limited to 'contrib/python/idna/README.rst')
-rw-r--r-- | contrib/python/idna/README.rst | 422 |
1 files changed, 211 insertions, 211 deletions
diff --git a/contrib/python/idna/README.rst b/contrib/python/idna/README.rst index b115e83b45..84ac75bba7 100644 --- a/contrib/python/idna/README.rst +++ b/contrib/python/idna/README.rst @@ -1,211 +1,211 @@ -Internationalized Domain Names in Applications (IDNA) -===================================================== - -Support for the Internationalised Domain Names in Applications -(IDNA) protocol as specified in `RFC 5891 <http://tools.ietf.org/html/rfc5891>`_. -This is the latest version of the protocol and is sometimes referred to as -“IDNA 2008”. - -This library also provides support for Unicode Technical Standard 46, -`Unicode IDNA Compatibility Processing <http://unicode.org/reports/tr46/>`_. - -This acts as a suitable replacement for the “encodings.idna” module that -comes with the Python standard library, but only supports the -old, deprecated IDNA specification (`RFC 3490 <http://tools.ietf.org/html/rfc3490>`_). - -Basic functions are simply executed: - -.. code-block:: pycon - - # Python 3 - >>> import idna - >>> idna.encode('ドメイン.テスト') - b'xn--eckwd4c7c.xn--zckzah' - >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah')) - ドメイン.テスト - - # Python 2 - >>> import idna - >>> idna.encode(u'ドメイン.テスト') - 'xn--eckwd4c7c.xn--zckzah' - >>> print idna.decode('xn--eckwd4c7c.xn--zckzah') - ドメイン.テスト - -Packages --------- - -The latest tagged release version is published in the PyPI repository: - -.. image:: https://badge.fury.io/py/idna.svg - :target: http://badge.fury.io/py/idna - - -Installation ------------- - -To install this library, you can use pip: - -.. code-block:: bash - - $ pip install idna - -Alternatively, you can install the package using the bundled setup script: - -.. code-block:: bash - - $ python setup.py install - -This library works with Python 2.7 and Python 3.4 or later. - - -Usage ------ - -For typical usage, the ``encode`` and ``decode`` functions will take a domain -name argument and perform a conversion to A-labels or U-labels respectively. - -.. code-block:: pycon - - # Python 3 - >>> import idna - >>> idna.encode('ドメイン.テスト') - b'xn--eckwd4c7c.xn--zckzah' - >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah')) - ドメイン.テスト - -You may use the codec encoding and decoding methods using the -``idna.codec`` module: - -.. code-block:: pycon - - # Python 2 - >>> import idna.codec - >>> print u'домена.испытание'.encode('idna') - xn--80ahd1agd.xn--80akhbyknj4f - >>> print 'xn--80ahd1agd.xn--80akhbyknj4f'.decode('idna') - домена.испытание - -Conversions can be applied at a per-label basis using the ``ulabel`` or ``alabel`` -functions if necessary: - -.. code-block:: pycon - - # Python 2 - >>> idna.alabel(u'测试') - 'xn--0zwm56d' - -Compatibility Mapping (UTS #46) -+++++++++++++++++++++++++++++++ - -As described in `RFC 5895 <http://tools.ietf.org/html/rfc5895>`_, the IDNA -specification no longer normalizes input from different potential ways a user -may input a domain name. This functionality, known as a “mapping”, is now -considered by the specification to be a local user-interface issue distinct -from IDNA conversion functionality. - -This library provides one such mapping, that was developed by the Unicode -Consortium. Known as `Unicode IDNA Compatibility Processing <http://unicode.org/reports/tr46/>`_, -it provides for both a regular mapping for typical applications, as well as -a transitional mapping to help migrate from older IDNA 2003 applications. - -For example, “Königsgäßchen” is not a permissible label as *LATIN CAPITAL -LETTER K* is not allowed (nor are capital letters in general). UTS 46 will -convert this into lower case prior to applying the IDNA conversion. - -.. code-block:: pycon - - # Python 3 - >>> import idna - >>> idna.encode(u'Königsgäßchen') - ... - idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed - >>> idna.encode('Königsgäßchen', uts46=True) - b'xn--knigsgchen-b4a3dun' - >>> print(idna.decode('xn--knigsgchen-b4a3dun')) - königsgäßchen - -Transitional processing provides conversions to help transition from the older -2003 standard to the current standard. For example, in the original IDNA -specification, the *LATIN SMALL LETTER SHARP S* (ß) was converted into two -*LATIN SMALL LETTER S* (ss), whereas in the current IDNA specification this -conversion is not performed. - -.. code-block:: pycon - - # Python 2 - >>> idna.encode(u'Königsgäßchen', uts46=True, transitional=True) - 'xn--knigsgsschen-lcb0w' - -Implementors should use transitional processing with caution, only in rare -cases where conversion from legacy labels to current labels must be performed -(i.e. IDNA implementations that pre-date 2008). For typical applications -that just need to convert labels, transitional processing is unlikely to be -beneficial and could produce unexpected incompatible results. - -``encodings.idna`` Compatibility -++++++++++++++++++++++++++++++++ - -Function calls from the Python built-in ``encodings.idna`` module are -mapped to their IDNA 2008 equivalents using the ``idna.compat`` module. -Simply substitute the ``import`` clause in your code to refer to the -new module name. - -Exceptions ----------- - -All errors raised during the conversion following the specification should -raise an exception derived from the ``idna.IDNAError`` base class. - -More specific exceptions that may be generated as ``idna.IDNABidiError`` -when the error reflects an illegal combination of left-to-right and right-to-left -characters in a label; ``idna.InvalidCodepoint`` when a specific codepoint is -an illegal character in an IDN label (i.e. INVALID); and ``idna.InvalidCodepointContext`` -when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO -or CONTEXTJ but the contextual requirements are not satisfied.) - -Building and Diagnostics ------------------------- - -The IDNA and UTS 46 functionality relies upon pre-calculated lookup tables for -performance. These tables are derived from computing against eligibility criteria -in the respective standards. These tables are computed using the command-line -script ``tools/idna-data``. - -This tool will fetch relevant tables from the Unicode Consortium and perform the -required calculations to identify eligibility. It has three main modes: - -* ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``, - the pre-calculated lookup tables using for IDNA and UTS 46 conversions. Implementors - who wish to track this library against a different Unicode version may use this tool - to manually generate a different version of the ``idnadata.py`` and ``uts46data.py`` - files. - -* ``idna-data make-table``. Generate a table of the IDNA disposition - (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC - 5892 and the pre-computed tables published by `IANA <http://iana.org/>`_. - -* ``idna-data U+0061``. Prints debugging output on the various properties - associated with an individual Unicode codepoint (in this case, U+0061), that are - used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging - or analysis. - -The tool accepts a number of arguments, described using ``idna-data -h``. Most notably, -the ``--version`` argument allows the specification of the version of Unicode to use -in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata`` -will generate library data against Unicode 9.0.0. - -Note that this script requires Python 3, but all generated library data will work -in Python 2.7. - - -Testing -------- - -The library has a test suite based on each rule of the IDNA specification, as -well as tests that are provided as part of the Unicode Technical Standard 46, -`Unicode IDNA Compatibility Processing <http://unicode.org/reports/tr46/>`_. - -The tests are run automatically on each commit at Travis CI: - -.. image:: https://travis-ci.org/kjd/idna.svg?branch=master - :target: https://travis-ci.org/kjd/idna +Internationalized Domain Names in Applications (IDNA) +===================================================== + +Support for the Internationalised Domain Names in Applications +(IDNA) protocol as specified in `RFC 5891 <http://tools.ietf.org/html/rfc5891>`_. +This is the latest version of the protocol and is sometimes referred to as +“IDNA 2008”. + +This library also provides support for Unicode Technical Standard 46, +`Unicode IDNA Compatibility Processing <http://unicode.org/reports/tr46/>`_. + +This acts as a suitable replacement for the “encodings.idna” module that +comes with the Python standard library, but only supports the +old, deprecated IDNA specification (`RFC 3490 <http://tools.ietf.org/html/rfc3490>`_). + +Basic functions are simply executed: + +.. code-block:: pycon + + # Python 3 + >>> import idna + >>> idna.encode('ドメイン.テスト') + b'xn--eckwd4c7c.xn--zckzah' + >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah')) + ドメイン.テスト + + # Python 2 + >>> import idna + >>> idna.encode(u'ドメイン.テスト') + 'xn--eckwd4c7c.xn--zckzah' + >>> print idna.decode('xn--eckwd4c7c.xn--zckzah') + ドメイン.テスト + +Packages +-------- + +The latest tagged release version is published in the PyPI repository: + +.. image:: https://badge.fury.io/py/idna.svg + :target: http://badge.fury.io/py/idna + + +Installation +------------ + +To install this library, you can use pip: + +.. code-block:: bash + + $ pip install idna + +Alternatively, you can install the package using the bundled setup script: + +.. code-block:: bash + + $ python setup.py install + +This library works with Python 2.7 and Python 3.4 or later. + + +Usage +----- + +For typical usage, the ``encode`` and ``decode`` functions will take a domain +name argument and perform a conversion to A-labels or U-labels respectively. + +.. code-block:: pycon + + # Python 3 + >>> import idna + >>> idna.encode('ドメイン.テスト') + b'xn--eckwd4c7c.xn--zckzah' + >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah')) + ドメイン.テスト + +You may use the codec encoding and decoding methods using the +``idna.codec`` module: + +.. code-block:: pycon + + # Python 2 + >>> import idna.codec + >>> print u'домена.испытание'.encode('idna') + xn--80ahd1agd.xn--80akhbyknj4f + >>> print 'xn--80ahd1agd.xn--80akhbyknj4f'.decode('idna') + домена.испытание + +Conversions can be applied at a per-label basis using the ``ulabel`` or ``alabel`` +functions if necessary: + +.. code-block:: pycon + + # Python 2 + >>> idna.alabel(u'测试') + 'xn--0zwm56d' + +Compatibility Mapping (UTS #46) ++++++++++++++++++++++++++++++++ + +As described in `RFC 5895 <http://tools.ietf.org/html/rfc5895>`_, the IDNA +specification no longer normalizes input from different potential ways a user +may input a domain name. This functionality, known as a “mapping”, is now +considered by the specification to be a local user-interface issue distinct +from IDNA conversion functionality. + +This library provides one such mapping, that was developed by the Unicode +Consortium. Known as `Unicode IDNA Compatibility Processing <http://unicode.org/reports/tr46/>`_, +it provides for both a regular mapping for typical applications, as well as +a transitional mapping to help migrate from older IDNA 2003 applications. + +For example, “Königsgäßchen” is not a permissible label as *LATIN CAPITAL +LETTER K* is not allowed (nor are capital letters in general). UTS 46 will +convert this into lower case prior to applying the IDNA conversion. + +.. code-block:: pycon + + # Python 3 + >>> import idna + >>> idna.encode(u'Königsgäßchen') + ... + idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed + >>> idna.encode('Königsgäßchen', uts46=True) + b'xn--knigsgchen-b4a3dun' + >>> print(idna.decode('xn--knigsgchen-b4a3dun')) + königsgäßchen + +Transitional processing provides conversions to help transition from the older +2003 standard to the current standard. For example, in the original IDNA +specification, the *LATIN SMALL LETTER SHARP S* (ß) was converted into two +*LATIN SMALL LETTER S* (ss), whereas in the current IDNA specification this +conversion is not performed. + +.. code-block:: pycon + + # Python 2 + >>> idna.encode(u'Königsgäßchen', uts46=True, transitional=True) + 'xn--knigsgsschen-lcb0w' + +Implementors should use transitional processing with caution, only in rare +cases where conversion from legacy labels to current labels must be performed +(i.e. IDNA implementations that pre-date 2008). For typical applications +that just need to convert labels, transitional processing is unlikely to be +beneficial and could produce unexpected incompatible results. + +``encodings.idna`` Compatibility +++++++++++++++++++++++++++++++++ + +Function calls from the Python built-in ``encodings.idna`` module are +mapped to their IDNA 2008 equivalents using the ``idna.compat`` module. +Simply substitute the ``import`` clause in your code to refer to the +new module name. + +Exceptions +---------- + +All errors raised during the conversion following the specification should +raise an exception derived from the ``idna.IDNAError`` base class. + +More specific exceptions that may be generated as ``idna.IDNABidiError`` +when the error reflects an illegal combination of left-to-right and right-to-left +characters in a label; ``idna.InvalidCodepoint`` when a specific codepoint is +an illegal character in an IDN label (i.e. INVALID); and ``idna.InvalidCodepointContext`` +when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO +or CONTEXTJ but the contextual requirements are not satisfied.) + +Building and Diagnostics +------------------------ + +The IDNA and UTS 46 functionality relies upon pre-calculated lookup tables for +performance. These tables are derived from computing against eligibility criteria +in the respective standards. These tables are computed using the command-line +script ``tools/idna-data``. + +This tool will fetch relevant tables from the Unicode Consortium and perform the +required calculations to identify eligibility. It has three main modes: + +* ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``, + the pre-calculated lookup tables using for IDNA and UTS 46 conversions. Implementors + who wish to track this library against a different Unicode version may use this tool + to manually generate a different version of the ``idnadata.py`` and ``uts46data.py`` + files. + +* ``idna-data make-table``. Generate a table of the IDNA disposition + (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC + 5892 and the pre-computed tables published by `IANA <http://iana.org/>`_. + +* ``idna-data U+0061``. Prints debugging output on the various properties + associated with an individual Unicode codepoint (in this case, U+0061), that are + used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging + or analysis. + +The tool accepts a number of arguments, described using ``idna-data -h``. Most notably, +the ``--version`` argument allows the specification of the version of Unicode to use +in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata`` +will generate library data against Unicode 9.0.0. + +Note that this script requires Python 3, but all generated library data will work +in Python 2.7. + + +Testing +------- + +The library has a test suite based on each rule of the IDNA specification, as +well as tests that are provided as part of the Unicode Technical Standard 46, +`Unicode IDNA Compatibility Processing <http://unicode.org/reports/tr46/>`_. + +The tests are run automatically on each commit at Travis CI: + +.. image:: https://travis-ci.org/kjd/idna.svg?branch=master + :target: https://travis-ci.org/kjd/idna |