The third iteration of Doug Bolden's various thoughts and musings.

Tag: ciphers

Version 2 of My Simple Sub Cipher

Ok, less “version 2” and more like “version 0.4” but still, I can engage in a bit of version-inflation if I want.

In last night’s post about an simple inline substitution cipher to help “jumble”-slash-“hide” text so that spoilers or text otherwise hidden until user action takes place, I had only the most basic pieces worked out.

This morning I worked out a few more basic features:

  • I have built a very basic “Simple Inline Substitution Cipher” page to handle the creation of these materials.
  • The cipher now should be able to pass through double- and single-quotes without breaking the HTML or Javascript.
  • Rather than paragraph tags, using span tags. This should help with adding stuff like single- to few-word elements inline with the rest of the text and no longer requires every instance to be a full paragraph of text.
  • Spans are giving a “click me” type title.
  • Spans are given a class of “gentleSubCipher” to allow for CSS to better improve their usability.
  • Spans are now given a random five-character ID to immensely reduce the issue of multiple IDs match and causing potential breakage.

It looks something like this:

CsrS rS E wESrI VHEveAV SsPbr4T Pjj QsV "D8PQVS" E4z 'Sr4TAV D8PQVS' E4z QsV PQsVf 4Vb VAVvV4QS QsEQ sEyV wVV4 EzzVz.

There are still quite a few limitations:

  • Just to clarify, it is not and will not be secure.
  • It still does not ignore HTML elements within that span.
  • It simple cannot work with feed readers and I need to test how to make it work better with screen readers.
  • Accented characters still are passed through but that still is not quite a problem.
  • Multiple posts with it might result in a problem where version shifts break previous posts when seen the front page, category page, etc.

For the latter, the idea might be to create unique scripts per post. I’ll have to give it some thought and testing.

The next version will be building in some logic for ignoring HTML elements. That should be a bit trivial for the types of things I need to ignore, but we’ll have to see.

Inline Substitution Ciphers to Play with Semi-Hidden Text

jLh 903mO moh0 m3 S6 SnE 0so Vhshn0Sh 3SnmsV3 6Y ShFS SL0S 0nh 903mO0MME ‘Lmoohs ms QM0ms 3mSh’ (C2r!) 9tS 0M36 Vhshn0MME nhO6Vsmd09Mh 03 LtU0s-BnmSShs ShFS 9E nhS0msmsV UtOL 6Y SLh QtsOSt0Sm6s, BLmSh3Q0Oh, 0so 6SLhn hMhUhsS3. jLm3 B6tMo hs09Mh Uh, Y6n ms3S0sOh, S6 BnmSh ShFS SL0S O6sS0msho 3Q6mMhn3 6n L0o 6SLhn 03QhOS3 s6S msShsoho S6 9h nh0o 9E jLh 7MV6nmSLU BLmMh 3tnn6tsoho 9E ShFS SL0S m3 QhnYhOSME LtU0s- 0so U0OLmsh-nh0o09Mh. a O6tMo 30E ‘J6t = Q66 Q66 Lh0o’ BmSL6tS SL0S 9hmsV msohFho. uE NhhQmsV mS 0 36UhBL0S 3mUQMh 3t93SmStSm6s OEQLhn, SLm3 Uh0s3 SL0S mS h03E Y6n Qh6QMh S6 Sn0s3M0Sh h1hs BmSL6tS 0sE 3OnmQS 0so 0MM6B3 mS S6 9h nhM0Sm1hME tso6sh 0S 0 M0Shn o0Sh.


If you click the text above, it should “solve out” to a line of text that reads:

The basic idea is to try and generate strings of text that are basically ‘hidden in plain site’ (PUN!) but also generally recognizable as human-written text by retaining much of the punctuation, whitespace, and other elements. This would enable me, for instance, to write text that contained spoilers or had other aspects not intended to be read by The Algorithm while surrounded by text that is perfectly human- and machine-readable. I could say ‘You = poo poo head’ without that being indexed. By keeping it a somewhat simple substitution cypher, this means that it easy for people to translate even without any script and allows it to be relatively undone at a later date.

And then if you click it again (without refreshing the page), it should do essentially nothing. This is my basic first pass on coming up with an idea I have had for Dickens of a Blog since way back. I am unsure when I first posited it but likely around 2006 or 2007.

The idea was simple: set aside some portion of the text in an otherwise open-to-read blog post {e.g., spoilers, info semi-hidden from scrapers, bits that otherwise might be triggers} through a simple enough cipher or baseline encryption that solving it would not become hostile to Doug’s happiness if keys/etc were lost.

The Code Behind It

Version 1 is above. What happens if I have a fairly simple Python code:

from random import sample

def scramble_AlphaNum(oldAlphaNum):
    return ''.join(sample(oldAlphaNum, len(oldAlphaNum)))

alphaNum = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz0123456789"
newAlphaNum = scramble_AlphaNum(alphaNum)

text = "The basic idea is to try and generate strings of text that are basically 'hidden in plain site' (PUN!) but also generally recognizable as human-written text by retaining much of the punctuation, whitespace, and other elements. This would enable me, for instance, to write text that contained spoilers or had other aspects not intended to be read by The Algorithm while surrounded by text that is perfectly human- and machine-readable. I could say 'You = poo poo head' without that being indexed. By keeping it a somewhat simple substitution cypher, this means that it easy for people to translate even without any script and allows it to be relatively undone at a later date."
txet = ""
paraName = "demo01"

for t in text:
    try:
        txet = txet + newAlphaNum[alphaNum.index(t)]
    except:
        txet = txet + t
        
output = """ <p id=\"""" + paraName + """\" onclick="gentleScramble('""" + newAlphaNum + """', '""" + paraName + """'); this.onclick=null;">""" + txet + """</p>"""
        
print(output)

Right now, I have to manually edit the file to have the paragraph, div, or span ID and then the contents. It’s fairly trivial to more generalize this. Running that, it spits out a paragraph tag that looks like:

<p id="demo01" onclick="gentleScramble('70u9wOboihzYgVXLamGHINxMAUrsk6CQfcpnv3jS2tR1DB4FJEZdeyWqP8Tl5K', 'demo01'); this.onclick=null;">jLh 903mO moh0 m3 S6 SnE 0so Vhshn0Sh 3SnmsV3 6Y ShFS SL0S 0nh 903mO0MME 'Lmoohs ms QM0ms 3mSh' (C2r!) 9tS 0M36 Vhshn0MME nhO6Vsmd09Mh 03 LtU0s-BnmSShs ShFS 9E nhS0msmsV UtOL 6Y SLh QtsOSt0Sm6s, BLmSh3Q0Oh, 0so 6SLhn hMhUhsS3. jLm3 B6tMo hs09Mh Uh, Y6n ms3S0sOh, S6 BnmSh ShFS SL0S O6sS0msho 3Q6mMhn3 6n L0o 6SLhn 03QhOS3 s6S msShsoho S6 9h nh0o 9E jLh 7MV6nmSLU BLmMh 3tnn6tsoho 9E ShFS SL0S m3 QhnYhOSME LtU0s- 0so U0OLmsh-nh0o09Mh. a O6tMo 30E 'J6t = Q66 Q66 Lh0o' BmSL6tS SL0S 9hmsV msohFho. uE NhhQmsV mS 0 36UhBL0S 3mUQMh 3t93SmStSm6s OEQLhn, SLm3 Uh0s3 SL0S mS h03E Y6n Qh6QMh S6 Sn0s3M0Sh h1hs BmSL6tS 0sE 3OnmQS 0so 0MM6B3 mS S6 9h nhM0Sm1hME tso6sh 0S 0 M0Shn o0Sh.</p>

I add that to my document via Custom HTML. The first string is the randomized a-z/A-Z/0-9 alphanumeric characters of the common American English alphabet (etc). It is randomized per running of the script.

Then at the bottom of the page, I insert another Custom HTML section with this Javascript:

<script>
function gentleScramble(newAlpha,para) {
	const AlphaNum = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz0123456789";
	const newAlphaNum = newAlpha;
	
	let victim = document.getElementById(para).textContent;
	let solution = "";
		
	for (let v = 0; v < victim.length; v++) {
		foundIt = newAlphaNum.indexOf(victim[v]);
		if (foundIt != -1) {
			solution = solution + AlphaNum[foundIt];
		} else {
			solution = solution + victim[v];
		}
	}
		
	document.getElementById(para).textContent=solution;
}
</script>	

That text the paragraph and the substitution cipher and runs it the first click before passing the “this.onclick=null” to stop it from glitching out if a reader spam clicks it.

As it runs through it checks for characters in the defined “alphaNum” and ignores any that are not included. Those that are included it just re-subs them back to their original.

Voila.

Before you might say that this is fairly insecure, that is kind of the point. Is not trying to deeply encode the text, it is more just trying to play at gently hiding the text in a somewhat breakable pattern.

Current Issues

The first issue is that it is pretty hands on to generate the content, which is not 100% a problem for me but if I have several of these elements it will start to wear.

The solution I’m going to do is build a quick tool that allows for different element types {div, p, span} and a bit more of a GUI, probably through just a quick HTML page with text areas and buttons.

The second issue is that it only accepts characters in the a-z/A-Z/0-9 ranges. If I am typing in French and other languages, characters with diacritical marks will be ignored. This means that “ä” will show up as “ä” in the enciphered text. It’s not a deal breaker since the bulk of the text will be gently scrambled, but it can lead to potential weirdness.

The solution to this could be either to scan the contents and generate a shortened “alphaNum” that only includes characters in it while ignoring all the punctuation OR creating a new diaAlphaNum that includes a separate list of diacritically marked text.

I’m not sure which I prefer. I think I prefer to not worry about that so much.

The final issue at a glance is that any HTML elements inside that element {em, a, strong} would likewise be translated which at best would simply glitch them and at worse could in theory create HTML that is broken if it happens to stumble upon a different element than intended.

My solution to this problem is just to not do any of that.

There is a slight non-issue that feed readers and such will likely break in trying to help, but that’s a bit ok for the moment. Not for driving clicks or any such thing, just in that earlier attempts to build CSS/Javascript spoiler type solutions sometimes resulted in said spoilers being clearly visible to feed readers. It does possibly interfere with screen readers and that is a much bigger problem, but I’ll have to test it.

Possibilities for Expansion

My possible end goal for this would include this as a checklist:

  • Perhaps using a Vigenère cipher instead of a simple substitution one [because I prefer those],
  • Making it at least “smart” enough to ignore interior HTML elements, and
  • Generating a bit of styling that makes it more obvious what the reader is supposed to do, possibly including a failsafe type option if the reader has all javascript blocked, etc.

Powered by WordPress & Theme by Anders Norén