I have written a program that turns strings of hex digits into lists of allowable 2-letter word prefixes, chosen so that the total frequency of all the words starting with the prefixes for each digit (0-f) is approximately the same.
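One way to get roughly equal totals per digit is a greedy partition: sort the prefixes by total word frequency and keep handing the next one to whichever digit currently has the smallest total. This is only a sketch of that idea, not necessarily what the program does; `assign_prefixes`, its parameters, and the word counts in the usage note are all made up for illustration.

```python
import heapq
from collections import defaultdict


def assign_prefixes(word_freqs, n_buckets=16, prefix_len=2):
    """Greedily spread word prefixes over hex digits so that the summed
    word frequency per digit stays roughly balanced.

    word_freqs: dict mapping word -> frequency (hypothetical input format).
    Returns a dict mapping hex digit -> sorted list of prefixes.
    """
    # Total frequency of all words of 3+ letters sharing each prefix.
    prefix_totals = defaultdict(float)
    for word, freq in word_freqs.items():
        if len(word) >= 3:
            prefix_totals[word[:prefix_len].lower()] += freq

    # Min-heap of (current load, bucket index): the lightest bucket pops first.
    heap = [(0.0, d) for d in range(n_buckets)]
    heapq.heapify(heap)
    buckets = {d: [] for d in range(n_buckets)}

    # Heaviest prefixes first; ties broken alphabetically for determinism.
    ordered = sorted(prefix_totals.items(), key=lambda kv: (-kv[1], kv[0]))
    for prefix, total in ordered:
        load, digit = heapq.heappop(heap)
        buckets[digit].append(prefix)
        heapq.heappush(heap, (load + total, digit))

    return {format(d, "x"): sorted(p) for d, p in buckets.items()}
```

With a real word-frequency list, `assign_prefixes(freqs)` would return sixteen prefix lists keyed `"0"` through `"f"`. Greedy assignment does not guarantee the best possible balance, but it tends to get close when there are many more prefixes than digits.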
The idea is to reversibly turn any hex string into prose or poetry. The decode process does a reverse mapping on the prefix of each word of 3 or more letters (other than "the"; I might add "and" too). Shorter words and the skipped words are ignored, which lets you add filler words to make the output flow.
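The decode side can be sketched in a few lines. The prefix table here is a tiny made-up one just to show the mechanics (a real table would cover every allowable prefix), and the names `PREFIX_TO_DIGIT`, `SKIP_WORDS`, and `decode` are hypothetical.

```python
import re

# Hypothetical toy table: several prefixes may share one hex digit.
PREFIX_TO_DIGIT = {
    "qu": "d", "br": "e", "fo": "a", "ju": "d",
    "ov": "b", "la": "e", "do": "e",
}
# Words that never carry data, regardless of length.
SKIP_WORDS = {"the"}


def decode(text, prefix_to_digit=PREFIX_TO_DIGIT, skip_words=SKIP_WORDS):
    """Recover the hex string hidden in a piece of prose.

    Words shorter than 3 letters and words in skip_words are treated as
    filler. Any other word contributes the digit mapped from its first
    two letters; an unmapped prefix raises KeyError in this sketch.
    """
    digits = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if len(word) < 3 or word in skip_words:
            continue
        digits.append(prefix_to_digit[word[:2]])
    return "".join(digits)
```

With that toy table, `decode("The quick brown fox jumps over a lazy dog.")` skips "The" and "a" and yields `"deadbee"`.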
I've also implemented functions for extracting the infohash from magnet links and constructing magnet links from infohashes, with potentially different trackers and no title, obviously. A hex infohash is 40 digits and each encoded word carries one digit, so torrents available via the mainline DHT could easily be shared through posts of 40 or more words.
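For reference, the magnet link round trip might look something like the following. This is a minimal sketch using only the standard library, assuming 40-digit hex infohashes (base32-encoded ones are not handled); the function names and the tracker URL in the usage note are placeholders.

```python
import re
from urllib.parse import parse_qs, quote, urlsplit

HEX_INFOHASH = re.compile(r"[0-9a-fA-F]{40}")


def infohash_from_magnet(magnet):
    """Return the lowercase hex infohash from a magnet link's xt parameter."""
    parts = urlsplit(magnet)
    if parts.scheme != "magnet":
        raise ValueError("not a magnet link")
    for xt in parse_qs(parts.query).get("xt", []):
        match = re.fullmatch(r"urn:btih:([0-9a-fA-F]{40})", xt)
        if match:
            return match.group(1).lower()
    raise ValueError("no hex btih infohash found")


def magnet_from_infohash(infohash, trackers=()):
    """Build a title-less magnet link, optionally with tracker URLs."""
    if not HEX_INFOHASH.fullmatch(infohash):
        raise ValueError("expected 40 hex digits")
    link = "magnet:?xt=urn:btih:" + infohash.lower()
    for tracker in trackers:
        # Percent-encode the whole tracker URL, including ':' and '/'.
        link += "&tr=" + quote(tracker, safe="")
    return link
```

For example, `magnet_from_infohash(h, ["udp://tracker.example.org:1337/announce"])` produces a link that `infohash_from_magnet` maps back to `h`. With no trackers at all, a client would have to find peers through the DHT.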
From my quick test, it's not hard to think of prose corresponding to the prefixes, but 40 words might take a while. Maybe I should feed it through some kind of language model?
@djsundog @freakazoid yeah! off the top of my head, you could do some logit warping so that the language model is more likely to (or is forced to) select only lexical items with the desired orthographic characteristics, sorta like Jeff Binder's Visions and Revisions https://github.com/jeffbinder/visions-and-revisions/
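To make the "forced" variant of that concrete, here is a toy sketch with no real language model behind it: candidate next words come with made-up scores, and any candidate whose first two letters are not allowed for the current hex digit is masked to negative infinity before the softmax. The function names, the 3-letter filler rule, and the scores are all assumptions for illustration; a real setup would apply the same mask to the model's token logits.

```python
import math


def warp_logits(logits, allowed_prefixes, min_len=3):
    """Mask out candidate words that would encode the wrong digit.

    logits: dict mapping candidate word -> raw score (hypothetical input).
    Words shorter than min_len pass through untouched, since the decoder
    would ignore them as filler.
    """
    warped = {}
    for word, logit in logits.items():
        is_filler = len(word) < min_len
        if is_filler or word[:2].lower() in allowed_prefixes:
            warped[word] = logit
        else:
            warped[word] = -math.inf
    return warped


def softmax(logits):
    """Turn scores into probabilities; masked entries end up at exactly 0."""
    top = max(logits.values())
    if top == -math.inf:
        raise ValueError("every candidate was masked out")
    exps = {w: math.exp(score - top) for w, score in logits.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}
```

Sampling from `softmax(warp_logits(scores, allowed))` then only ever picks filler or a word with an acceptable prefix. The softer "more likely to" variant would add a bonus to allowed words instead of masking the rest.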
@aparrish @djsundog That's exactly what I want. I'm thinking if I'm going to include a language model I could use it to create a much more efficient encoding, though. I know how I'd do it with a Markov model, but I have no idea how to do it with any other kind of model except through a method similar to the one I'm using now.
GPT-2-simple is 500 MB, which is more than I'd want to distribute with a program, though. Do you know of significantly smaller models that would still be useful?