Intermediate changes

commit_hash: 77545ccbe0cf9f22f5ee56187fc1fc43fe1bfe1c
author: robot-piglet <[email protected]> 2025-09-19 19:22:05 +0300
committer: robot-piglet <[email protected]> 2025-09-19 19:33:20 +0300
commit: 092f5ded19ef842075ccebbc2900afb5cf009ac5 (patch)
tree: 826e14c44cbe0ef758dc963ccd54842d07428688 /contrib/python/xmltodict
parent: 1f2977c48482a39e97894637815ddb3579d0d44e (diff)
5 files changed, 100 insertions, 19 deletions
diff --git a/contrib/python/xmltodict/py3/.dist-info/METADATA b/contrib/python/xmltodict/py3/.dist-info/METADATA
index 5a038d733d3..6462e0a6b5c 100644
--- a/contrib/python/xmltodict/py3/.dist-info/METADATA
+++ b/contrib/python/xmltodict/py3/.dist-info/METADATA
@@ -1,6 +1,6 @@
-Metadata-Version: 2.1
+Metadata-Version: 2.4
 Name: xmltodict
-Version: 0.14.2
+Version: 0.15.0
 Summary: Makes working with XML feel like you are working with JSON
 Home-page: https://github.com/martinblech/xmltodict
 Author: Martin Blech
@@ -25,6 +25,17 @@ Classifier: Topic :: Text Processing :: Markup :: XML
 Requires-Python: >=3.6
 Description-Content-Type: text/markdown
 License-File: LICENSE
+Dynamic: author
+Dynamic: author-email
+Dynamic: classifier
+Dynamic: description
+Dynamic: description-content-type
+Dynamic: home-page
+Dynamic: license
+Dynamic: license-file
+Dynamic: platform
+Dynamic: requires-python
+Dynamic: summary
 
 # xmltodict
 
@@ -46,15 +57,15 @@ License-File: LICENSE
 ...  """), indent=4))
 {
     "mydocument": {
-        "@has": "an attribute", 
+        "@has": "an attribute",
         "and": {
             "many": [
-                "elements", 
+                "elements",
                 "more elements"
             ]
-        }, 
+        },
         "plus": {
-            "@a": "complex", 
+            "@a": "complex",
             "#text": "element as well"
         }
     }
@@ -110,7 +121,7 @@ True
 >>> def handle_artist(_, artist):
 ...     print(artist['name'])
 ...     return True
->>> 
+>>>
 >>> xmltodict.parse(GzipFile('discogs_artists.xml.gz'),
 ...     item_depth=2, item_callback=handle_artist)
 A Perfect Circle
@@ -178,7 +189,7 @@ Text values for nodes can be specified with the `cdata_key` key in the python di
 
 ```python
 >>> import xmltodict
->>> 
+>>>
 >>> mydict = {
 ...     'text': {
 ...         '@color':'red',
@@ -234,7 +245,7 @@ $ pip install xmltodict
 
 ### Using conda
 
-For installing `xmltodict` using Anaconda/Miniconda (*conda*) from the 
+For installing `xmltodict` using Anaconda/Miniconda (*conda*) from the
 [conda-forge channel][#xmltodict-conda] all you need to do is:
 
 [#xmltodict-conda]: https://anaconda.org/conda-forge/xmltodict
@@ -286,3 +297,13 @@ $ zypper in python2-xmltodict
 # Python3
 $ zypper in python3-xmltodict
 ```
+
+## Security Notes
+
+A CVE (CVE-2025-9375) was filed against `xmltodict` but is [disputed](https://github.com/martinblech/xmltodict/issues/377#issuecomment-3255691923). The root issue lies in Python’s `xml.sax.saxutils.XMLGenerator` API, which does not validate XML element names and provides no built-in way to do so. Since `xmltodict` is a thin wrapper that passes keys directly to `XMLGenerator`, the same issue exists in the standard library itself.
+
+It has been suggested that `xml.sax.saxutils.escape()` represents a secure usage path. This is incorrect: `escape()` is intended only for character data and attribute values, and can produce invalid XML when misapplied to element names. There is currently no secure, documented way in Python’s standard library to validate XML element names.
+
+Despite this, Fluid Attacks chose to assign a CVE to `xmltodict` while leaving the identical behavior in Python’s own standard library unaddressed. Their disclosure process also gave only 10 days from first contact to publication—well short of the 90-day industry norm—leaving no real opportunity for maintainer response. These actions reflect an inconsistency of standards and priorities that raise concerns about motivations, as they do not primarily serve the security of the broader community.
+
+The maintainer considers this CVE invalid and will formally dispute it with MITRE.
diff --git a/contrib/python/xmltodict/py3/README.md b/contrib/python/xmltodict/py3/README.md
index 6f776a8b4d8..4c24cf100f0 100644
--- a/contrib/python/xmltodict/py3/README.md
+++ b/contrib/python/xmltodict/py3/README.md
@@ -18,15 +18,15 @@
 ...  """), indent=4))
 {
     "mydocument": {
-        "@has": "an attribute", 
+        "@has": "an attribute",
         "and": {
             "many": [
-                "elements", 
+                "elements",
                 "more elements"
             ]
-        }, 
+        },
         "plus": {
-            "@a": "complex", 
+            "@a": "complex",
             "#text": "element as well"
         }
     }
@@ -82,7 +82,7 @@ True
 >>> def handle_artist(_, artist):
 ...     print(artist['name'])
 ...     return True
->>> 
+>>>
 >>> xmltodict.parse(GzipFile('discogs_artists.xml.gz'),
 ...     item_depth=2, item_callback=handle_artist)
 A Perfect Circle
@@ -150,7 +150,7 @@ Text values for nodes can be specified with the `cdata_key` key in the python di
 
 ```python
 >>> import xmltodict
->>> 
+>>>
 >>> mydict = {
 ...     'text': {
 ...         '@color':'red',
@@ -206,7 +206,7 @@ $ pip install xmltodict
 
 ### Using conda
 
-For installing `xmltodict` using Anaconda/Miniconda (*conda*) from the 
+For installing `xmltodict` using Anaconda/Miniconda (*conda*) from the
 [conda-forge channel][#xmltodict-conda] all you need to do is:
 
 [#xmltodict-conda]: https://anaconda.org/conda-forge/xmltodict
@@ -258,3 +258,13 @@ $ zypper in python2-xmltodict
 # Python3
 $ zypper in python3-xmltodict
 ```
+
+## Security Notes
+
+A CVE (CVE-2025-9375) was filed against `xmltodict` but is [disputed](https://github.com/martinblech/xmltodict/issues/377#issuecomment-3255691923). The root issue lies in Python’s `xml.sax.saxutils.XMLGenerator` API, which does not validate XML element names and provides no built-in way to do so. Since `xmltodict` is a thin wrapper that passes keys directly to `XMLGenerator`, the same issue exists in the standard library itself.
+
+It has been suggested that `xml.sax.saxutils.escape()` represents a secure usage path. This is incorrect: `escape()` is intended only for character data and attribute values, and can produce invalid XML when misapplied to element names. There is currently no secure, documented way in Python’s standard library to validate XML element names.
+
+Despite this, Fluid Attacks chose to assign a CVE to `xmltodict` while leaving the identical behavior in Python’s own standard library unaddressed. Their disclosure process also gave only 10 days from first contact to publication—well short of the 90-day industry norm—leaving no real opportunity for maintainer response. These actions reflect an inconsistency of standards and priorities that raise concerns about motivations, as they do not primarily serve the security of the broader community.
+
+The maintainer considers this CVE invalid and will formally dispute it with MITRE.
diff --git a/contrib/python/xmltodict/py3/tests/test_dicttoxml.py b/contrib/python/xmltodict/py3/tests/test_dicttoxml.py
index 470aca98a18..67e3a880979 100644
--- a/contrib/python/xmltodict/py3/tests/test_dicttoxml.py
+++ b/contrib/python/xmltodict/py3/tests/test_dicttoxml.py
@@ -231,3 +231,35 @@ xmlns:b="http://b.com/"><x a:attr="val">1</x><a:y>2</a:y><b:z>3</b:z></root>'''
         expected_xml = '<?xml version="1.0" encoding="utf-8"?>\n<x>false</x>'
         xml = unparse(dict(x=False))
         self.assertEqual(xml, expected_xml)
+
+    def test_rejects_tag_name_with_angle_brackets(self):
+        # Minimal guard: disallow '<' or '>' to prevent breaking tag context
+        with self.assertRaises(ValueError):
+            unparse({"m><tag>content</tag": "unsafe"}, full_document=False)
+
+    def test_rejects_attribute_name_with_angle_brackets(self):
+        # Now we expect bad attribute names to be rejected
+        with self.assertRaises(ValueError):
+            unparse(
+                {"a": {"@m><tag>content</tag": "unsafe", "#text": "x"}},
+                full_document=False,
+            )
+
+    def test_rejects_malicious_xmlns_prefix(self):
+        # xmlns prefixes go under @xmlns mapping; reject angle brackets in prefix
+        with self.assertRaises(ValueError):
+            unparse(
+                {
+                    "a": {
+                        "@xmlns": {"m><bad": "http://example.com/"},
+                        "#text": "x",
+                    }
+                },
+                full_document=False,
+            )
+
+    def test_attribute_values_with_angle_brackets_are_escaped(self):
+        # Attribute values should be escaped by XMLGenerator
+        xml = unparse({"a": {"@attr": "1<middle>2", "#text": "x"}}, full_document=False)
+        # The generated XML should contain escaped '<' and '>' within the attribute value
+        self.assertIn('attr="1&lt;middle&gt;2"', xml)
diff --git a/contrib/python/xmltodict/py3/xmltodict.py b/contrib/python/xmltodict/py3/xmltodict.py
index 098f62762ae..c8491b354bd 100644
--- a/contrib/python/xmltodict/py3/xmltodict.py
+++ b/contrib/python/xmltodict/py3/xmltodict.py
@@ -14,7 +14,7 @@ if tuple(map(int, platform.python_version_tuple()[:2])) < (3, 7):
 from inspect import isgenerator
 
 __author__ = 'Martin Blech'
-__version__ = "0.14.2"
+__version__ = "0.15.0"
 __license__ = 'MIT'
 
 
@@ -360,6 +360,14 @@ def parse(xml_input, encoding=None, expat=expat, process_namespaces=False,
     return handler.item
 
 
+def _has_angle_brackets(value):
+    """Return True if value (a str) contains '<' or '>'.
+
+    Non-string values return False. Uses fast substring checks implemented in C.
+    """
+    return isinstance(value, str) and ("<" in value or ">" in value)
+
+
 def _process_namespace(name, namespaces, ns_sep=':', attr_prefix='@'):
     if not namespaces:
         return name
@@ -393,6 +401,9 @@ def _emit(key, value, content_handler,
         if result is None:
             return
         key, value = result
+    # Minimal validation to avoid breaking out of tag context
+    if _has_angle_brackets(key):
+        raise ValueError('Invalid element name: "<" or ">" not allowed')
     if not hasattr(value, '__iter__') or isinstance(value, (str, dict)):
         value = [value]
     for index, v in enumerate(value):
@@ -421,12 +432,19 @@ def _emit(key, value, content_handler,
                                         attr_prefix)
                 if ik == '@xmlns' and isinstance(iv, dict):
                     for k, v in iv.items():
+                        if _has_angle_brackets(k):
+                            raise ValueError(
+                                'Invalid attribute name: "<" or ">" not allowed'
+                            )
                         attr = 'xmlns{}'.format(f':{k}' if k else '')
                         attrs[attr] = str(v)
                     continue
                 if not isinstance(iv, str):
                     iv = str(iv)
-                attrs[ik[len(attr_prefix):]] = iv
+                attr_name = ik[len(attr_prefix) :]
+                if _has_angle_brackets(attr_name):
+                    raise ValueError('Invalid attribute name: "<" or ">" not allowed')
+                attrs[attr_name] = iv
                 continue
             children.append((ik, iv))
         if isinstance(indent, int):
diff --git a/contrib/python/xmltodict/py3/ya.make b/contrib/python/xmltodict/py3/ya.make
index 7fdc1a14a9e..5dbf4dca8ff 100644
--- a/contrib/python/xmltodict/py3/ya.make
+++ b/contrib/python/xmltodict/py3/ya.make
@@ -2,7 +2,7 @@
 
 PY3_LIBRARY()
 
-VERSION(0.14.2)
+VERSION(0.15.0)
 
 LICENSE(MIT)
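
The guard added to `_emit` in this patch can be sketched with the standard library alone, without importing `xmltodict`. In the sketch below, `has_angle_brackets` mirrors the `_has_angle_brackets` helper from the diff. `emit_element` and its `validate` flag are hypothetical names introduced only for illustration; they are not part of `xmltodict`. The sketch shows that `XMLGenerator` writes element names verbatim, that the check rejects names containing `<` or `>`, and that attribute values are escaped by `XMLGenerator` itself.

```python
from io import StringIO
from xml.sax.saxutils import XMLGenerator, escape
from xml.sax.xmlreader import AttributesImpl


def has_angle_brackets(value):
    """Mirror of the `_has_angle_brackets` helper added in this diff."""
    return isinstance(value, str) and ("<" in value or ">" in value)


def emit_element(name, text, attrs=None, validate=True):
    """Hypothetical helper: serialize one element with XMLGenerator.

    With validate=True, element and attribute names containing '<' or '>'
    are rejected, which is the minimal check the patch applies.
    """
    attrs = attrs or {}
    if validate:
        for candidate in (name, *attrs):
            if has_angle_brackets(candidate):
                raise ValueError('Invalid name: "<" or ">" not allowed')
    out = StringIO()
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startElement(name, AttributesImpl(attrs))
    gen.characters(text)
    gen.endElement(name)
    return out.getvalue()


if __name__ == "__main__":
    bad_name = "m><tag>content</tag"

    # Without the check, the name is written as-is and breaks out of the tag.
    print(emit_element(bad_name, "unsafe", validate=False))
    # prints: <m><tag>content</tag>unsafe</m><tag>content</tag>

    # With the check, the same input raises ValueError.
    try:
        emit_element(bad_name, "unsafe")
    except ValueError as exc:
        print("rejected:", exc)

    # Attribute values are escaped by XMLGenerator, so they need no guard.
    print(emit_element("a", "x", {"attr": "1<middle>2"}))
    # prints: <a attr="1&lt;middle&gt;2">x</a>

    # escape() targets character data; applied to a name it yields entity
    # references, which are not legal inside an XML element name.
    print(escape("m><tag"))
    # prints: m&gt;&lt;tag
```

The check is deliberately narrow: it rejects only `<` and `>`, so it prevents leaving the tag context but does not verify that a key is a well-formed XML name.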