postgres/src/common/unicode
Tom Lane 0245f8db36 Pre-beta mechanical code beautification. 3 years ago
.gitignore Update display widths as part of updating Unicode 5 years ago
Makefile Treat Unicode codepoints of category "Format" as non-spacing 4 years ago
README Add support for automatically updating Unicode derived files 7 years ago
generate-norm_test_table.pl Pre-beta mechanical code beautification. 3 years ago
generate-unicode_east_asian_fw_table.pl Update copyright for 2023 3 years ago
generate-unicode_nonspacing_table.pl Update copyright for 2023 3 years ago
generate-unicode_norm_table.pl Pre-beta mechanical code beautification. 3 years ago
generate-unicode_normprops_table.pl Pre-beta mechanical code beautification. 3 years ago
meson.build meson: don't require 'touch' binary, make use of 'cp' optional 3 years ago
norm_test.c Update copyright for 2023 3 years ago

README

This directory contains tools to generate the tables in
src/include/common/unicode_norm_table.h, used for Unicode normalization. The
generated .h file is included in the source tree, so these tools are not
needed for a normal build of PostgreSQL. You need them only to re-generate
the .h file from the Unicode data files, e.g. to update to a new version of
Unicode.

Generating unicode_norm_table.h
-------------------------------

Run

    make update-unicode

from the top level of the source tree and commit the result.
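
As a rough illustration of the kind of input the generator scripts work
from, the sketch below parses lines in the UnicodeData.txt format of the
Unicode Character Database and extracts the fields relevant to
normalization: the canonical combining class and the decomposition mapping.
It is not the Perl code in this directory. The names DecompEntry and
parse_unicode_data_line are made up for this example, and the assumption
that UnicodeData.txt is among the input files is not stated in this README.

```python
from typing import NamedTuple, Tuple


class DecompEntry(NamedTuple):
    """Normalization-related fields of one UnicodeData.txt line (hypothetical type)."""
    codepoint: int
    combining_class: int
    compat: bool                    # True if the mapping carries a <tag>
    decomposition: Tuple[int, ...]  # empty if the character does not decompose


def parse_unicode_data_line(line: str) -> DecompEntry:
    """Parse one semicolon-separated UnicodeData.txt line.

    Field 0 is the code point in hex, field 3 the canonical combining
    class, and field 5 the decomposition mapping, optionally prefixed
    with a tag such as <compat> for compatibility decompositions.
    """
    fields = line.rstrip("\n").split(";")
    codepoint = int(fields[0], 16)
    combining_class = int(fields[3])

    parts = fields[5].split()
    compat = bool(parts) and parts[0].startswith("<")
    if compat:
        parts = parts[1:]  # drop the <tag>, keep the code points
    decomposition = tuple(int(p, 16) for p in parts)

    return DecompEntry(codepoint, combining_class, compat, decomposition)


# A few lines in UnicodeData.txt format, used as sample input.
SAMPLE_LINES = [
    "00C5;LATIN CAPITAL LETTER A WITH RING ABOVE;Lu;0;L;0041 030A;;;;N;LATIN CAPITAL LETTER A RING;;;00E5;",
    "0307;COMBINING DOT ABOVE;Mn;230;NSM;;;;;N;NON-SPACING DOT ABOVE;;;;",
    "FB01;LATIN SMALL LIGATURE FI;Ll;0;L;<compat> 0066 0069;;;;N;;;;;",
]

ENTRIES = [parse_unicode_data_line(line) for line in SAMPLE_LINES]
```

A generator would then turn entries like these into C table initializers;
that step is left out here because the exact table layout is defined by the
scripts in this directory, not by this sketch.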

Tests
-----

The Unicode Consortium publishes a comprehensive test suite for the
normalization algorithm in a file called NormalizationTest.txt. This
directory also contains a Perl script and some C code that run our
normalization code against all the test strings in NormalizationTest.txt.
To download NormalizationTest.txt and run the tests:

    make normalization-check

This is also run as part of the update-unicode target.
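
To give an idea of what such a test does, here is a minimal sketch of the
conformance check described in NormalizationTest.txt itself. Each data line
holds five semicolon-separated columns of hex code points (source, NFC, NFD,
NFKC, NFKD), and a conforming normalizer must reproduce columns 2 to 5 from
the others. The sketch uses Python's unicodedata module as a stand-in
normalizer; it does not call the PostgreSQL code, and the function names
parse_test_line and check_columns are made up for this example.

```python
import unicodedata


def parse_test_line(line):
    """Parse one line of NormalizationTest.txt into five strings (c1..c5).

    Returns None for blank lines, comment-only lines, and '@Part' headers.
    """
    data = line.split("#", 1)[0].strip()  # drop the trailing comment
    if not data or data.startswith("@"):
        return None
    fields = data.split(";")[:5]  # the line ends with ';', ignore the rest
    if len(fields) < 5:
        raise ValueError("expected five columns: %r" % line)
    return [
        "".join(chr(int(cp, 16)) for cp in field.split())
        for field in fields
    ]


def check_columns(columns):
    """Return True if the stand-in normalizer satisfies the invariants
    listed in the header of NormalizationTest.txt for this line."""
    c1, c2, c3, c4, c5 = columns

    def nfc(s): return unicodedata.normalize("NFC", s)
    def nfd(s): return unicodedata.normalize("NFD", s)
    def nfkc(s): return unicodedata.normalize("NFKC", s)
    def nfkd(s): return unicodedata.normalize("NFKD", s)

    return (
        c2 == nfc(c1) == nfc(c2) == nfc(c3)
        and c4 == nfc(c4) == nfc(c5)
        and c3 == nfd(c1) == nfd(c2) == nfd(c3)
        and c5 == nfd(c4) == nfd(c5)
        and all(c4 == nfkc(c) for c in columns)
        and all(c5 == nfkd(c) for c in columns)
    )


# Sample lines in NormalizationTest.txt format (comments shortened).
SAMPLE_LINES = [
    "@Part0 # Specific cases",
    "1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE",
    "212B;00C5;0041 030A;00C5;0041 030A; # ANGSTROM SIGN",
    "FB01;FB01;FB01;0066 0069;0066 0069; # LATIN SMALL LIGATURE FI",
]

# Collect the source strings of any lines that fail the check.
FAILURES = [
    cols[0]
    for cols in map(parse_test_line, SAMPLE_LINES)
    if cols is not None and not check_columns(cols)
]
```

When running such a check against the full file, the Unicode version of the
normalizer has to match the version of NormalizationTest.txt, which is
presumably why the tests are re-run whenever the tables are regenerated.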