# mars-unicode-tables

Unicode data and generated table sources for MARS-NWE/libnwcore.

This repository intentionally keeps upstream Unicode source data separate from
MARS-NWE-generated output.

## Layout

- `UCD/`
  Unicode Character Database input files, currently imported from Unicode 17.0.0.

- `MAPPINGS/`
  Unicode mapping files from `https://www.unicode.org/Public/MAPPINGS/`,
  preserving the upstream `VENDORS/...` hierarchy.

- `scripts/`
  MARS-NWE helper scripts/generators.

- `TAB/`
  Generated C table output consumed by MARS-NWE/libnwcore.

- `LICENSES/`
  License notes for Unicode data and MARS-NWE-authored helper code.

## Policy

Do not copy Novell NSS `shared/sdk/unitables/*.tab` files into this repository.
They may be used only as compatibility/reference material outside the committed
source data.

Unicode case/codepage tables should be generated from Unicode.org data files.

## Codepage table generation

`MAPPINGS/` contains the Unicode.org vendor mapping files. The codepage
helper generator emits compact byte/code-to-Unicode descriptors under `TAB/`:

```sh
./scripts/gen_codepage_tables.py
```

`TAB/codepageTables.c` and `TAB/codepageTables.h` are generated from direct
single-BMP-code-point mappings only. Composite mappings, directional pseudo
mappings, historical `DatedVersions/`, and `WindowsBestFit/` reverse/fallback
files remain in the source tree but are not emitted into byte-to-Unicode tables.

MARS-NWE links these generated tables into `libnwcore`; they are not loaded as
runtime `.tab` files.

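The "direct single-BMP-code-point" rule above can be illustrated with a small
filter. This is only a sketch, not the code in `scripts/gen_codepage_tables.py`:
the function name `parse_direct_mapping` is hypothetical, and the line shape
(`0xNN`, whitespace, `0xNNNN`, optional `#` comment) is an assumption based on
the common two-column layout of Unicode.org mapping files.

```python
import re

# Assumed line shape: "0xNN<whitespace>0xNNNN<whitespace># comment".
_DIRECT = re.compile(r"^0x([0-9A-Fa-f]+)\s+0x([0-9A-Fa-f]+)\s*(?:#.*)?$")


def parse_direct_mapping(line):
    """Return (code, code_point) for a direct single-BMP mapping, else None."""
    match = _DIRECT.match(line.strip())
    if match is None:
        # Comments, blank lines, undefined slots, composite entries such as
        # "0x0041+0x0308" and directional entries such as "<LR>+0x0028"
        # do not fit the two-column shape and are skipped.
        return None
    code = int(match.group(1), 16)
    code_point = int(match.group(2), 16)
    if code_point > 0xFFFF:
        return None  # outside the Basic Multilingual Plane
    return code, code_point
```

A caller would typically apply this to every line of one mapping file and keep
the non-`None` results. Skipping the `DatedVersions/` and `WindowsBestFit/`
directories would happen at the file-selection level, which this sketch does
not cover.
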
## NSS-compatible unitable generation

`scripts/gen_nss_unitables.py` emits binary `UNI_*.TAB`/Macintosh `.TAB` files
under `TAB/unitables/` from Unicode.org `MAPPINGS/`. The binary layout follows
the table loader shape used by NSS `unilib.c`: a 256-byte `Version 1.xx` header,
codepage-to-Unicode lookup tables, then Unicode-to-codepage lookup tables.

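As a rough sanity check on that layout, a generated file can be split into its
header and the lookup-table bytes that follow. This is a minimal sketch under
stated assumptions, not the NSS `unilib.c` loader logic: the function name
`split_unitable` is hypothetical, the version string is assumed to appear
somewhere inside the 256-byte header, and `xx` is assumed to mean two decimal
digits. The internal structure of the lookup tables is not interpreted here.

```python
import re

HEADER_SIZE = 256  # header length stated in this README

# Assumption: "xx" in "Version 1.xx" stands for two decimal digits.
_VERSION = re.compile(rb"Version 1\.\d\d")


def split_unitable(data):
    """Split raw .TAB bytes into (header, body); raise ValueError if malformed."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file is shorter than the 256-byte header")
    header, body = data[:HEADER_SIZE], data[HEADER_SIZE:]
    if _VERSION.search(header) is None:
        raise ValueError("no 'Version 1.xx' string in the header")
    # body holds the codepage-to-Unicode tables followed by the
    # Unicode-to-codepage tables; they are returned uninterpreted.
    return header, body
```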
These files are generated compatibility data. Do not replace them with Novell
`shared/sdk/unitables/*.TAB` files; those remain reference-only because their
redistribution license is unclear.

```sh
./scripts/gen_nss_unitables.py
```

`UNI_000.TAB` is intentionally not emitted by this generator. It uses the
separate collation/case-table layout; MARS-NWE currently uses the generated C
case tables in `TAB/unicodeTables.c` for that data.