This may need to go under Development/Compiler instead.
All I want to do is parse CR, LF, and CR-LF (and possibly CR-CR-LF) line breaks.
I've attempted custom parsers before. But this seems generalizable.
Something like the Aho–Corasick algorithm should help. I started reading and saw something called a trie, and started working on that. Of course, when I read the whole thing, I realized a straight trie can't work, because to use it with ACA, the trie needs back-references and payload information. I decided to work on a straight trie anyway, maybe to propose it here, and go back to ACA later.
Other Swift tries I perused used a class
for nodes with the element value and a dictionary to follow-up nodes. But that would be too easy for me; I'm going to punish myself with a ridiculously recursive enumeration type. I made a Swift package manager project on my macOS system, and am using a Xcode project file generated by SPM. (I haven't tried from the command line yet.)
I first made a linked-list type to represent the sequences to be stored in the trie. Then I made the trie type. They're both at a gist I made for this thread. The linked list and its tests went fine. I did the trie and its tests in piecemeal too. But when I got noticeably past the insertion test, I got devastating problems, all intermittent.
- The compile cycle crashes with a Segmentation fault 11. I just repeatedly re-build, sometimes build-cleaning and/or restarting Xcode, and I bully through.
-
testInsertion()
fails with the calculated tries with junk values. The junk values aren't from other parts of the program; they're generally new. I hope I just messed up my algorithm, because my best other guess is that indirectenum
types have a recursion limit, and frying that limit causes memory corruption!
(Maybe the bullying through the segment-fault 11 is causing the corruption.)
I also have a wrapper trie type, but didn't included it in the gist. Again, I occasionally get corrupt values. My direct trie type used Int
, while my wrapper trie type used Character
. Sometimes the junk Character
value was none at all! I was going to switch to UnicodeScalar
(since it doesn't use remote memory), but decided to ask for help here first instead. In the wrapper type, only the methods that call NeverEmptyTrie.doInsert
to make the trie bigger trigger the same corruption, the methods that call NeverEmptyTrie.init(unify: with:)
don't have (obvious) corruption.