Fix gen_tld.py on non-Unicode locale encodings
In Python versions prior to 3.7, as well as with non-Unicode locale encodings, the `gen_tld.py` script fails:
```
ydb/ydb/library/cpp/tld % git rev-parse head
97b1a695d3be4edc08550d3ae7d200f6d9f3d42e
ydb/ydb/library/cpp/tld % LC_CTYPE=C ~/.pyenv/versions/3.6.15/bin/python gen_tld.py tlds-alpha-by-domain.txt|md5
Traceback (most recent call last):
  File "gen_tld.py", line 57, in <module>
    main()
  File "gen_tld.py", line 39, in main
    sys.stdout.write('%s*/\n' % str.rstrip())
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
```
This pull request fixes this behaviour by explicitly setting the output encoding to UTF-8.
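For illustration, here is a minimal sketch of one way to pin the output encoding so that it also works on Python 3.6. It is not necessarily the exact change made in this pull request: the helper names `utf8_text_stream` and `write_comment` are hypothetical, and only the `'%s*/\n'` format string is taken from the traceback above.

```python
import io


def utf8_text_stream(binary_stream):
    """Wrap a binary stream in a text layer that always encodes as UTF-8.

    Unlike sys.stdout, whose encoding is derived from the locale, the
    wrapper returned here ignores LC_CTYPE entirely.
    """
    # newline="\n" disables newline translation, so the emitted bytes are
    # the same on every platform.
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


def write_comment(out, text):
    """Write one stripped line followed by a comment terminator.

    Hypothetical helper that mirrors the failing call in the traceback.
    """
    out.write("%s*/\n" % text.rstrip())


# Demonstration against an in-memory buffer instead of the real stdout.
buffer = io.BytesIO()
out = utf8_text_stream(buffer)
write_comment(out, "рф  ")
out.flush()
encoded = buffer.getvalue()
```

In a script, `utf8_text_stream(sys.stdout.buffer)` could stand in for direct `sys.stdout.write` calls. `sys.stdout.reconfigure(encoding="utf-8")` is a shorter alternative, but it only exists from Python 3.7 onward, so it would not help on 3.6.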
To make sure nothing else changed, I compared the MD5 hash of the generated file before and after my changes. Before:
```
ydb/ydb/library/cpp/tld % git rev-parse head
97b1a695d3be4edc08550d3ae7d200f6d9f3d42e
ydb/ydb/library/cpp/tld % python3 gen_tld.py tlds-alpha-by-domain.txt|md5
564242d355d842db790977df3642a405
```
After:
```
ydb/ydb/library/cpp/tld % git rev-parse head
1096dd7f034c573aabdf3bac2dc4b181a6688c71
ydb/ydb/library/cpp/tld % python3 gen_tld.py tlds-alpha-by-domain.txt|md5
564242d355d842db790977df3642a405
ydb/ydb/library/cpp/tld % LC_CTYPE=C ~/.pyenv/versions/3.6.15/bin/python gen_tld.py tlds-alpha-by-domain.txt|md5
564242d355d842db790977df3642a405
```
Pull Request resolved: #279