-
-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decoding test #31
Comments
Not as far as I know! It would be a fun test though. Other messages in the past have been tried out informally in newsgroups and the like. There was a proposal a while back to get a bit more organized about this https://arxiv.org/pdf/1101.4968.pdf (section 5). I enjoyed decoding that team's message (a bit hard to find online now? I archived it at https://github.com/paulfitz/dearet/blob/master/Test_Message_2p1.txt) though I definitely had to go revise some more physics to do it. |
I have been purposely not reading much about the cosmicos message and try to find structure in the raw data sometimes, and it turns out you may have to "idiot-proof" messages sent to It might also be worth looking into Natural Semantic Metalanguage, which, while it may not be a true theory, might help create structures that describe and differentiate semantically more complex concepts using simpler ones. Btw, is the base-4 message in the second tab on the site really equivalent to the one using the symbols in the first tab? |
I'm in the process of decoding the message. I've started with the wrapped.txt file and first tried to analyze the frequency of individual symbols, of all pairs, all triplets and all fours. That didn't give much. After that, I've tried to spot some patterns in the message itself. The first thing I noted is a big chunk consisting almost entirely of 0s and 1s. I couldn't decipher anything from it, and I decided to take a close look at the start. After a bit of searching I've spotted the pattern of repeating 321s with something in between. I wrote a little program to tell me what are the lengths of runs of 321. I've spotted that some of the runs are of increasing length. After splitting the message by lines with a newline after every run, I saw a clear structure of small messages about numbers. There were a few undecipherable messages but the ones about numbers were first about sequential numbers, then about squares, then about primes. After that I've tried to find how can I tell these messages apart without the help of numbers. After a bit of shuffling prefixes and suffixes of messages around, I've spotted the 32233 pattern that looked like a separator. After splitting the whole message by that, I saw that the idea was probably right. I then tried to align same parts of different messages to identify the structure. After a bit of fiddling, I've spotted that chunks of numbers are preceded by 21001032xxxxxx, and that all the messages about numbers start by 210111132xxxxxx. After that, I've spotted the general pattern of 2xxxxx, the x's being 0s or 1s. It looked like these 2xxxxx were kind of labels, and so I decided to try to attach some meaningful names to them. I've quickly spotted that 210010 probably meant 'define', 2110000 probably meant 'is a number', 2110001 'is a square' and 2110010 'is a prime'. The next parts were having different overall structure but the numbers were the same. Here I thought that probably the correct prefix for number was 2111 that meant 'number'. I kinda blazed through the part about +, - and * as there were a lot of examples that just confirmed my first thought. After that something strange appeared. It looked like symbols were equated to numbers. I've somewhat quickly realized after reverting back to 2xxxx representation that that was actually equating numbers to their binary representation. A bit further there was even an arithmetic on binary numbers and that was confusing as some of the numbers looked like they were actual binary numbers and some of the numbers were symbols. I've tried to decipher the parts of the structure I've skipped before but there was no clear explanations. It looked like 2 preceded symbols and 3 was some kind of separator, but there were some weird things like there being a 2 that was not followed by binary, binary 1 not preceded by 2 and other strange things. I've decided to go on and faced a wall of some hard-to-understand arithmetic examples. They started simple with = 6 6, but quickly escalated to some cryptic structure although the overall meaning was kinda clear. I don't know how exactly I've come up with the idea of 2 and 3 being parentheses. Maybe it was influenced a lot by the prior knowledge, maybe it wasn't, but after I replaced the 2 with '(' and 3 with ')' I immediately saw that that was the right idea. The structure looked sensible apart from out-of-place '1()' sometimes. And that's when I realized that the message separator was not 32233 but 2233 meaning '(())'. I plan to continue to decode the message after that. TL;DR Final thoughts: |
Yay @Pavgran, thanks for trying, and the detailed post! Good point about parentheses. There's some needlessly cryptic constructions, I've been meaning to remap the message into a larger "symbol space" so component symbols are more distinct. |
@Pavgran nice comment on decoding cosmicos! @paulfitz sorry for chiming in: the real problem seems not to be the small alphabet (two delimiter symbols + two data symbols), but the strange syntax sometimes. I.e. the less syntax elements, the better. At the end the main problem is, that you only have a linear message string and want to transport multi dimensional data (lists of lists of lists of ...) in a first run and in a second run code in the former structure. So, for the syntax maybe it is best to stay within the Lisp/Scheme philosophy "code as data" and use S-expressions for which one variant of the message exists already (this means, we also need the two delimiter symbols). I'm not sure how to convince ET that the two symbols are in fact delimiters as @Pavgran wrote. Another thing is the ambiguity of binary representation of symbols and numbers, but I think this is no question of the alphabet, but of the encoding of the message data (maybe the prefixfree codes in #30 could help? Or one could use the binary equivalent of a random UUID during code compilation?). Does this make sense? What do you all think? |
Has anyone ever tested whether the cosmicos message can be successfully decoded by a single person or group (possibly of experts, like computer scientists), with no prior knowledge of its structure or content?
The text was updated successfully, but these errors were encountered: