path: root/src
Age  Commit message  Author
2025-09-27  Add simple prefix_tree  Joel Klinghed
Will be used by tokenizer for short lists of strings
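For illustration, a minimal sketch of a prefix tree suited to short lists of strings, such as a tokenizer's operator table. The class name `PrefixTree` and the methods `insert` and `longest_match` are hypothetical and are not taken from the repository. Children are kept in a small vector and searched linearly, which is an assumption that fits short lists only.

```cpp
#include <cstddef>
#include <memory>
#include <optional>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical sketch; not the repository's prefix_tree API.
class PrefixTree {
 public:
  // Adds a word to the tree. Inserting the same word twice is harmless.
  void insert(std::string_view word) {
    Node* node = &root_;
    for (char c : word) {
      node = &node->child(c);
    }
    node->terminal = true;
  }

  // Returns the length of the longest inserted word that is a prefix of
  // input, or std::nullopt if no inserted word matches.
  std::optional<std::size_t> longest_match(std::string_view input) const {
    std::optional<std::size_t> best;
    Node const* node = &root_;
    if (node->terminal) best = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
      node = node->find(input[i]);
      if (node == nullptr) break;
      if (node->terminal) best = i + 1;
    }
    return best;
  }

 private:
  struct Node {
    bool terminal = false;
    // Linear search over children: fine for short lists (assumption).
    std::vector<std::pair<char, std::unique_ptr<Node>>> children;

    Node const* find(char c) const {
      for (auto const& [key, next] : children) {
        if (key == c) return next.get();
      }
      return nullptr;
    }

    // Returns the child for c, creating it if it does not exist yet.
    Node& child(char c) {
      for (auto& [key, next] : children) {
        if (key == c) return *next;
      }
      children.emplace_back(c, std::make_unique<Node>());
      return *children.back().second;
    }
  };

  Node root_;
};
```

A tokenizer could insert `<`, `<<` and `<<=` once and then call `longest_match` on the remaining input to pick the longest operator at the current position.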
2025-09-22  Change io::Reader and company to return ReadError::Eof instead of 0.  Joel Klinghed
It's debatable if Eof should be considered an error or not. But it is pretty clear it generally is a special response that needs special handling, so easier to keep with the unexpected lot. Also keeps better at higher abstraction levels, such as the line reader.
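A rough sketch of the idea, under stated assumptions: only `ReadError::Eof` is named in the commit message. The `MemoryReader` class, the `ReadResult` and `LineResult` aliases, the `ReadError::Error` value and the `read_line` helper are hypothetical. `std::variant` stands in for whatever result type the repository actually uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <variant>

// Hypothetical names; only ReadError::Eof appears in the commit message.
enum class ReadError { Error, Eof };

using ReadResult = std::variant<std::size_t, ReadError>;
using LineResult = std::variant<std::string, ReadError>;

// Toy in-memory reader. At end of input it reports ReadError::Eof
// instead of returning a byte count of 0.
class MemoryReader {
 public:
  explicit MemoryReader(std::string data) : data_(std::move(data)) {}

  ReadResult read(char* dst, std::size_t max) {
    if (offset_ == data_.size()) return ReadError::Eof;
    std::size_t const n = std::min(max, data_.size() - offset_);
    std::memcpy(dst, data_.data() + offset_, n);
    offset_ += n;
    return n;
  }

 private:
  std::string data_;
  std::size_t offset_ = 0;
};

// Reads up to and including '\n'. Eof is passed on only when no bytes
// were read for this line, so a final line without a trailing newline
// is still returned before Eof is reported.
LineResult read_line(MemoryReader& reader) {
  std::string line;
  char c = 0;
  while (true) {
    ReadResult r = reader.read(&c, 1);
    if (auto const* err = std::get_if<ReadError>(&r)) {
      if (*err == ReadError::Eof && !line.empty()) return line;
      return *err;
    }
    line.push_back(c);
    if (c == '\n') return line;
  }
}
```

With this shape the line reader needs no separate "zero bytes" convention of its own: it forwards the same `Eof` value it received from the layer below.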
2025-09-18  java::uescape: Unicode reader that knows about Java's \uXXXX escapes  Joel Klinghed
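As a rough illustration of what decoding Java's \uXXXX escapes involves, here is a sketch that works on a whole UTF-16 string rather than a stream. The function name `decode_java_uescapes` is hypothetical and is not the repository's java::uescape API. It follows the Java rules that a backslash starts an escape only when preceded by an even number of backslashes, and that one or more `u` characters may precede the four hex digits. Returning `std::nullopt` for a malformed escape is an assumption of this sketch.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper, not the repository's java::uescape API.
std::optional<std::u16string> decode_java_uescapes(std::u16string_view in) {
  auto hex = [](char16_t c) -> int {
    if (c >= u'0' && c <= u'9') return c - u'0';
    if (c >= u'a' && c <= u'f') return c - u'a' + 10;
    if (c >= u'A' && c <= u'F') return c - u'A' + 10;
    return -1;
  };

  std::u16string out;
  std::size_t i = 0;
  while (i < in.size()) {
    if (in[i] != u'\\') {
      out.push_back(in[i++]);
      continue;
    }
    if (i + 1 >= in.size() || in[i + 1] != u'u') {
      // Not an escape: copy the backslash and the character after it,
      // so a doubled backslash never makes a following 'u' look like
      // the start of an escape.
      out.push_back(in[i++]);
      if (i < in.size()) out.push_back(in[i++]);
      continue;
    }
    std::size_t j = i + 1;
    while (j < in.size() && in[j] == u'u') ++j;  // one or more 'u'
    if (in.size() - j < 4) return std::nullopt;  // truncated escape
    char16_t unit = 0;
    for (std::size_t k = 0; k < 4; ++k) {
      int const digit = hex(in[j + k]);
      if (digit < 0) return std::nullopt;  // non-hex digit
      unit = static_cast<char16_t>(unit * 16 + digit);
    }
    // The decoded unit is appended as-is and takes no part in further
    // escape processing.
    out.push_back(unit);
    i = j + 4;
  }
  return out;
}
```

Each escape yields one UTF-16 code unit, so a character outside the Basic Multilingual Plane is written as two escapes forming a surrogate pair.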
2025-09-17  fixup! uio: Unicode reader  Joel Klinghed
2025-09-17  fixup! uline: Add unicode line reader  Joel Klinghed
2025-09-17  fixup! Add .clang-format  Joel Klinghed
2025-09-17  uline: Add unicode line reader  Joel Klinghed
2025-09-17  uio: Remove unnecessary wrappers  Joel Klinghed
2025-09-15  Add .clang-format  Joel Klinghed
Make it easier to keep a consistent style
2025-09-15  decompress: Return better io error for BUF_ERROR  Joel Klinghed
Use new MaxTooSmall. As the comment notes though, it might be that we are lacking input as well, but until I figure out how to test for that case and determine the cause, let's at least return a more specific error.
2025-09-15  uio: Unicode reader  Joel Klinghed
Reads UTF-8 and UTF-16 into UTF-8 or UTF-16 strings. If strict is true, fails at the first invalid character. If strict is false, invalid characters are replaced with U+FFFD. For the replacement, I changed the behavior of uN::read_replace to only jump one byte. Otherwise a common invalid case, where ISO-8859-1 or WIN-1252 is read as UTF-8, would skip many characters. If skip_bom is true, any BOM at the start of the stream is ignored. If skip_bom is false, any BOM will be included. The input format can be forced; if not, detect is used, which will try to guess and then fall back to UTF-8.
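The strict and replacing behaviors can be sketched roughly as follows for UTF-8 input only. The functions `decode_one` and `decode_utf8` are hypothetical and are not the repository's uio or uN API; only the `strict` and `skip_bom` parameter names come from the commit message. On invalid input the sketch emits U+FFFD and advances a single byte, so an ISO-8859-1 byte such as 0xE9 costs one replacement character instead of swallowing the ASCII text after it.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical sketch, not the repository's uio API.
// Decodes one code point from the start of a non-empty view. On success
// len holds its length in bytes; on failure len must be ignored.
std::optional<char32_t> decode_one(std::string_view in, std::size_t& len) {
  auto byte = [&](std::size_t i) { return static_cast<std::uint8_t>(in[i]); };
  std::uint8_t const b0 = byte(0);
  char32_t cp = 0;
  char32_t min = 0;  // smallest value allowed for this length (no overlongs)
  if (b0 < 0x80) {
    len = 1;
    return static_cast<char32_t>(b0);
  }
  if ((b0 & 0xE0) == 0xC0) {
    len = 2; cp = b0 & 0x1Fu; min = 0x80;
  } else if ((b0 & 0xF0) == 0xE0) {
    len = 3; cp = b0 & 0x0Fu; min = 0x800;
  } else if ((b0 & 0xF8) == 0xF0) {
    len = 4; cp = b0 & 0x07u; min = 0x10000;
  } else {
    return std::nullopt;  // stray continuation byte or invalid lead byte
  }
  if (in.size() < len) return std::nullopt;  // truncated sequence
  for (std::size_t i = 1; i < len; ++i) {
    if ((byte(i) & 0xC0) != 0x80) return std::nullopt;
    cp = static_cast<char32_t>((cp << 6) | (byte(i) & 0x3Fu));
  }
  if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
    return std::nullopt;  // overlong, out of range, or surrogate
  }
  return cp;
}

// strict: fail at the first invalid sequence instead of replacing it.
// skip_bom: drop a U+FEFF found at the very start of the input.
std::optional<std::u32string> decode_utf8(std::string_view in, bool strict,
                                          bool skip_bom) {
  std::u32string out;
  bool at_start = true;
  while (!in.empty()) {
    std::size_t len = 0;
    std::optional<char32_t> const cp = decode_one(in, len);
    if (!cp) {
      if (strict) return std::nullopt;
      out.push_back(char32_t{0xFFFD});
      in.remove_prefix(1);  // resynchronize after one byte only
    } else {
      if (!(skip_bom && at_start && *cp == 0xFEFF)) out.push_back(*cp);
      in.remove_prefix(len);
    }
    at_start = false;
  }
  return out;
}
```

For example, the Latin-1 bytes of "café au" decode in non-strict mode to "caf", one U+FFFD, then " au" intact, while strict mode rejects the input.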
2025-09-10  Fix issues in buffer  Joel Klinghed
2025-09-10  Add unicode general category lookup  Joel Klinghed
Generate the lookup tables from UnicodeData.txt. To do that, add gen_ugc, which uses csv, buffers, line, io and other modules to do the job.
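One possible shape for such a table, sketched under assumptions: the names `CategoryRange`, `build_ranges` and `general_category` are hypothetical, and the actual gen_ugc output format is not shown in this log. The sketch reads the semicolon-separated UnicodeData.txt fields (code point, name, general category), merges adjacent code points with the same category into ranges, honors the `<..., First>` / `<..., Last>` pairs the file uses for large blocks, and answers lookups with a binary search. It assumes well-formed input and would throw on a malformed code point field.

```cpp
#include <algorithm>
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; not the tables that gen_ugc generates.
struct CategoryRange {
  std::uint32_t first;
  std::uint32_t last;
  std::string category;
};

bool ends_with(std::string const& s, std::string const& suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Builds sorted, non-overlapping ranges from UnicodeData.txt content.
std::vector<CategoryRange> build_ranges(std::string const& unicode_data) {
  std::vector<CategoryRange> ranges;
  std::istringstream lines(unicode_data);
  std::string line;
  bool open_range = false;  // previous line was a "<..., First>" entry
  while (std::getline(lines, line)) {
    auto const semi1 = line.find(';');
    if (semi1 == std::string::npos) continue;
    auto const semi2 = line.find(';', semi1 + 1);
    if (semi2 == std::string::npos) continue;
    auto const semi3 = line.find(';', semi2 + 1);
    if (semi3 == std::string::npos) continue;

    auto const cp = static_cast<std::uint32_t>(
        std::stoul(line.substr(0, semi1), nullptr, 16));
    std::string const name = line.substr(semi1 + 1, semi2 - semi1 - 1);
    std::string category = line.substr(semi2 + 1, semi3 - semi2 - 1);

    bool const closes_range = open_range && ends_with(name, ", Last>");
    bool const extends_previous =
        !ranges.empty() && ranges.back().last + 1 == cp &&
        ranges.back().category == category;
    if (!ranges.empty() && (closes_range || extends_previous)) {
      ranges.back().last = cp;
    } else {
      ranges.push_back({cp, cp, std::move(category)});
    }
    open_range = ends_with(name, ", First>");
  }
  return ranges;
}

// Code points not listed in UnicodeData.txt default to Cn (unassigned).
std::string general_category(std::vector<CategoryRange> const& ranges,
                             std::uint32_t cp) {
  auto it = std::upper_bound(
      ranges.begin(), ranges.end(), cp,
      [](std::uint32_t value, CategoryRange const& r) {
        return value < r.first;
      });
  if (it == ranges.begin()) return "Cn";
  --it;
  if (cp <= it->last) return it->category;
  return "Cn";
}
```

Storing ranges instead of one entry per code point keeps the table small, since long runs of code points share a category.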
2025-09-10  fixup! Make clang-tidy happy  Joel Klinghed
2025-09-08  Make clang-tidy happy  Joel Klinghed
2025-09-04  Add UTF-8, UTF-16 and Modified UTF-8 support  Joel Klinghed
2025-09-03  Initial commit  Joel Klinghed
Only a basic argument parser to start with.