| Age | Commit message (Collapse) | Author |
|
Make it easier to keep a consistent style
|
|
|
|
|
|
Reads UTF-8 and UTF-16 into UTF-8 or UTF-16 strings.
If strict is true, fails at first invalid character.
If strict is false, invalid characters are replaced with U+FFFD.
For the replacement, I changed behavior if uN::read_replace to only
jump one byte. Otherwise a common invalid case when ISO-8859-1 or
WIN-1252 are read as UTF-8 would skip many characters.
If skip_bom is true any bom at start of stream is ignored.
If skip_bom is false any bom will be included.
Input format can be forced, if not detect is used which will
try to guess and then fallback to UTF-8.
|
|
|
|
|
|
Generate the lookup tables from UnicodeData.txt, do to that,
add gen_ugc, which uses csv, buffers, line, io and other modules
to do the job.
|
|
|
|
|
|
|
|
Only a basic argument parser to start with.
|