Changes

For the past couple weeks, I’ve been reading crypto fiction and essays, starting with True Names by Vernor Vinge. His vision was revolutionary for its time, and still inspires with its imagery of warlocks and witches, spells and incantations (of course, it’s all code in the real world).

I also read “True Nyms and Crypto Anarchy” by Timothy May in the same collection, most of which I did not agree with. His vision of a crypto-anarchist/libertarian “utopia” is misguided at best and dangerous at worst. Without getting sidetracked by politics: libertarians ignore the fact that their system would leave many out in the cold, because the market doesn’t give a shit about people compared to profits. In short, it would be chaos for many. It would take another, lengthy post to dive into the details, but suffice it to say that the “crypto-anarchy” May describes is not desirable for most.

The most important essay I read, however, was “From Anonymity to Identification” by A. Michael Froomkin. To quickly summarize, privacy is in grave peril, both in the digital and real worlds. I’d highly recommend reading the paper, but the takeaway for me was that something has to happen now. Someone has to write the code to make privacy viable in the digital age now. I can’t just sit around and hope someone else will do it, so it might as well be me.

So, with this context in mind, I set out to begin coding a pseudonymous publication system for journalists based on the work of David Chaum, using the Dining Cryptographers protocol and “mix networks”, among other techniques, building on the “web of trust” that originated with PGP. As promised before, a post detailing the system is forthcoming.

The first choice I had to make was: which language should I write this in? This choice was largely limited by which cryptography libraries I decided to use. An article by Thomas Ptacek (whom I very much admire and respect) suggested Google Keyczar or cryptlib. Keyczar is out from the get-go, simply because I don’t trust Google not to insert backdoors into their products at the request of a three-letter agency. As for cryptlib, I’m wary: I don’t know when the library was last updated, the license is a concern, and links to the zip file itself are outdated and appear to be broken.

At this point, I’m looking at libgcrypt, a library developed by the folks behind GnuPG. Like cryptlib, it exposes a C API, so I was torn between writing the whole thing in C or using a language with a solid FFI. I bought a book on C, Learn C the Hard Way by Zed Shaw, but as I quickly discovered firsthand, C is a dangerous language. As I’ve heard for years, ever since I started learning to code, C is fraught with buffer overflows, memory-handling issues, and other pitfalls that make it exceedingly difficult to write safe code.

So, which language should I use? I considered the possibilities: C and C++ were out, for security reasons; both Java and C# are undesirable, as the JVM and .NET respectively are both closed-source in their most widely used implementations; and Python seemed like a reasonable choice, but the GIL makes it too hard to write concurrent code in a reasonable fashion. What does that leave me with?

It was at that point that I remembered Rust. It’s developed primarily by Mozilla, one of the few companies I trust. It has an excellent ecosystem, gaining popularity by the day. Most importantly, its compiler guarantees memory safety and easy-peasy concurrency, it’s “blazing fast”, and it has a reasonable FFI for interfacing with C libraries. With these benefits in mind, I’ve started reading the official book, The Rust Programming Language, freely available on the official site.

I’m afraid. I’m afraid I don’t have the coding chops to pull this off. I’m afraid I can’t do it. I’m afraid, most of all, that I’m going to fuck up the crypto code and introduce a subtle bug or vulnerability that will be discovered after the fact. So, the plan is to release an open source alpha as soon as humanly possible, and solicit cryptography experts to audit the code, try it out, to see what breaks and what works. At the same time, I hope to recruit better developers than myself to work on the code. Finally, I am going to meticulously study the Signal codebase for inspiration and guidance.

My hopes are slim. As Froomkin describes, privacy, both in the digital world and in real life, is rapidly dwindling. In fact, there are many arguments against privacy in the post-9/11 age, mainly centered on the dangers that guaranteed, unbreakable privacy poses.

There’s an interesting solution proposed by David Chaum that essentially allows users of the system to “out” bad actors. Using the Dining Cryptographers protocol, that’s certainly possible. But these issues are not my main concern — my concern is that I’m going to write buggy code, that I’m going to create an ultimately broken system, that I’m going to create a system that costs lives rather than saving them.

Even so, I have to try.

Working the Problem #2

I finally solved challenge #11 of Cryptopals, after sorting out a few key misunderstandings. Now, I’m working on challenge #12, “Byte-at-a-time ECB decryption (Simple)”. God help me — if this is simple, I hate to see what’s complicated. Granted, most of the steps are outlined, but I’m missing something key here when it comes to decrypting bytes beyond the first 15.

Let’s back up. What follows is basically a stream-of-consciousness: you’re seeing me work the problem in real-time, “rubber duck debugging”, essentially. Spoiler alert: if you’re doing the Cryptopals challenges or plan on doing so, this might ruin the fun, depending on how far I get.

The challenge is an interesting one. Essentially, if you can prepend chosen plaintext, one byte at a time, to mystery text encrypted using ECB (Electronic Code Book) mode, you can decrypt the mystery text without the key. If you write software, this should scare the living shit out of you.

This is how it works: you start by providing a plaintext byte (let’s say “A”). Appended to this byte is the mystery text, in what we’ll call an “oracle function”. You keep adding bytes, the same byte, until you see that the blocks start repeating — that’s the key that tells you it’s using ECB mode. You can determine the block size using the same technique, observing when the first block stops changing. That’s the easy part.
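For concreteness, here’s a sketch of those two steps in Ruby. The oracle below is my own stand-in (fixed key, made-up secret), not the challenge’s actual function:

```ruby
require 'openssl'

# Hypothetical stand-in for the challenge's oracle: AES-128-ECB
# encryption of (attacker input + mystery text) under a fixed key.
KEY    = "YELLOW SUBMARINE"
SECRET = "Rollin' in my 5.0"

def oracle(input)
  c = OpenSSL::Cipher.new('AES-128-ECB')
  c.encrypt
  c.key = KEY
  c.update(input + SECRET) + c.final
end

# Feed ever-longer inputs; when the ciphertext length jumps,
# the size of the jump is the block size.
def detect_block_size
  base = oracle("").bytesize
  (1..64).each do |n|
    len = oracle("A" * n).bytesize
    return len - base if len > base
  end
  nil
end

# Enough identical input to span several blocks forces repeated
# ciphertext blocks under ECB.
def ecb?(block_size)
  blocks = oracle("A" * (block_size * 4)).bytes.each_slice(block_size).to_a
  blocks.size != blocks.uniq.size
end
```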

The hard part is decrypting the mystery text, byte by byte. The first byte? That’s easy enough. You take the block size, subtract one, and feed that many repeated bytes (our “A”s) as plaintext. Run it through the oracle function. Now, you go through all the possible bytes, 0 through 255, and append each to your repeated “A”s to get a full block. Then, you compare each block to the actual ciphertext given by the oracle function — when you hit a match, boom, you have the first byte of the mystery text, because you know the rest of the block and you now know that last byte.

Now how do we get the rest of the bytes? I get the first 15, no problem. What you do is essentially the same as the last paragraph, with a few exceptions. First of all, let’s say the block size is 16, which using AES in ECB mode, it is. That would mean, in the last paragraph, your plaintext was 15 bytes of “A”. This time, for the second byte, we’re going to make it 14 bytes of “A”. That way, the last byte in the 16 byte block will be the second byte of mystery text. Remember, we already know the first byte, so when we iterate over bytes 0 through 255, we can append the bytes we’ve already seen to the plaintext, followed by the byte we’re iterating. In this way, we discover bytes 0-14 (15 bytes total, like I said at the beginning).

This is where I’m really stuck right now. Once I exhaust the first block, I don’t quite know where to go. Sure, I could keep appending the known bytes to my repeated text. And that should give me bytes 15 to n. I just know I’m doing something wrong with the math for the block selection, and the multiple for the block length. But! I have a debugger (pry/byebug with Ruby) and I know what the values should be, so I just have to work backwards and figure out the arithmetic for the block index and block multiple. I think looking at it with fresh eyes tomorrow night will help, as I’ve been working on this literally all day, since I woke up.
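For what it’s worth, I suspect the arithmetic generalizes like this: to recover byte i (zero-indexed), pad with block_size - 1 - (i % block_size) filler bytes, and compare block i / block_size in both ciphertexts. A sketch against a hypothetical oracle of my own (made-up key and secret, not the challenge’s):

```ruby
require 'openssl'

BLOCK  = 16
KEY    = "YELLOW SUBMARINE"               # hypothetical key
SECRET = "Attack at dawn; bring snacks."  # hypothetical mystery text

def oracle(input)
  c = OpenSSL::Cipher.new('AES-128-ECB')
  c.encrypt
  c.key = KEY
  c.update(input + SECRET) + c.final
end

# The pad positions byte i at the end of block i / BLOCK, so that's
# the block we compare between the target and guess ciphertexts.
def recover_secret(secret_len)
  known = ""
  secret_len.times do |i|
    pad    = "A" * (BLOCK - 1 - (i % BLOCK))
    index  = i / BLOCK
    target = oracle(pad).bytes.each_slice(BLOCK).to_a[index]
    (0..255).each do |b|
      guess = oracle(pad + known + b.chr).bytes.each_slice(BLOCK).to_a[index]
      if guess == target
        known << b.chr
        break
      end
    end
  end
  known
end
```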

That should do it for now — I’m getting a bit verbose, compared to my other posts. From here, I think a good night’s sleep and reflection on what I’ve written here will allow me to solve the problem tomorrow.

UX Review: SSH

Inspired by my friend Dane’s UX review of Hacker News, I’ve decided to start a mini-series of UX reviews myself, of various crypto apps and libraries. The crypto world has serious problems when it comes to user experience, ease of use, and sensible defaults, among other issues, and I’d like to bring them to light. Let’s start with SSH.

For those of you who don’t write code for a living, SSH stands for “Secure Shell”, and it lets you log into another machine on the internet running an SSH server, opening a command line session. You can even start a graphical session, although it takes a lot of know-how, chutzpah, luck, and just the right planetary alignment.

In terms of practicality, SSH is pretty solid — I use it at least every few days. If you need a remote terminal session, it pretty much does what it says on the tin. However, the difficulty of starting a graphical session is too damn high. Personally, I prefer Microsoft Remote Desktop over SSH here (I know, I know, totally different operating systems, but the difference in ease of use is striking). Don’t get me wrong, graphical sessions are doable. Given time and effort, I can get one going. But your average user? You’re better off giving them a couple tin cans and some string (just to be clear, that’s a comment on SSH, not your average user).

Another glaring flaw with SSH is key generation. The standard key generation program, ssh-keygen, requires use of the command line — that’s strike one against its usability. Even if you’re a command line aficionado, there are multiple questions to be answered. Which algorithm do you use? RSA? What about key size? What’s enough? Assuming you can answer those, you still have to decide where to store the key, whether to protect it with a passphrase, and what comment to attach (usually an email address)…what’s relevant and why?

Let’s say you get through the thicket of questions around key generation. What about key management? If you have a single key pair (yes, there are two to worry about), it’s not so bad, assuming the user doesn’t mix up their public and private keys. If you have multiple key pairs, maybe one for personal use, one for work, and one for a side project, you’re fucked. You’ll have to either specify which key to use every time you connect, or use a combination of configuration files and ssh-agent. Good luck, and may God have mercy on your soul.
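For reference, the configuration-file route ends up looking something like this (hypothetical hosts, users, and key paths; IdentitiesOnly keeps ssh from offering every key it knows about):

```
# ~/.ssh/config -- hypothetical hosts and key paths
Host work
    HostName git.example-employer.com
    User deploy
    IdentityFile ~/.ssh/id_rsa_work
    IdentitiesOnly yes

Host sideproject
    HostName vps.example.net
    User me
    IdentityFile ~/.ssh/id_rsa_side
    IdentitiesOnly yes
```

After that, “ssh work” quietly picks the right key. Simple enough for us, but not something a regular user would ever discover on their own.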

These problems sound trivial to a computer expert, because to someone who writes code, these problems are entirely tractable and make sense for software designed for developers. But if you’re designing software for general use, for ease of use, and you want regular folks to be able to use it, these problems are almost absurdly difficult to overcome. Think about the description of SSH: it lets you use another computer as though you were sitting in front of it. Given that promise, SSH delivers to the computer cognoscenti and fails the average user miserably.

Working the Problem

If there was a problem, yo I’ll solve it!

After quite a long break, I’m back to work on the Cryptopals challenges. Spoiler alert: if you haven’t done the challenges yet, and you plan on doing so, don’t read this post — it will give away key information that you need to learn on your own.

With that out of the way, let’s explore Set 2 / Challenge 11, “An ECB/CBC detection oracle”. I’ve had a lot of trouble with this particular challenge, to say the least. The idea is to encrypt a chosen plaintext with either ECB or CBC mode, chosen randomly, and write code to detect which is being used. This post will be, in part, a stream-of-consciousness, as I figure out what the problem may be with my code.

The key, as far as I can tell, is the 5-10 bytes (count chosen randomly) added before and after the plaintext. I know this is important because ECB mode, given the same plaintext block (16 bytes) and the same key (again, 16 bytes), will produce the same ciphertext block (not to mention, the authors put that clue in italics). However, I haven’t been able to see this pattern in the ciphertext bytes I’m logging to the console, and therefore haven’t been able to tell the difference between ECB and CBC (CBC mode doesn’t have this weakness).

So, what am I doing wrong? There are a few possibilities. One possibility is that my plaintext is too short, such that the 5-10 bytes being appended are part of the same block, and won’t be repeated. In other words, if my plaintext is “zeldaSECRETzelda”, I won’t see repeated bytes because “zelda” is in the same block both times.

Another possibility is that ECB padding is screwing things up somehow…I had to add a switch in the code for CBC mode when I implemented CBC in terms of ECB as part of an earlier challenge.

I just had a thought: it doesn’t matter if part of a piece of text is repeated — it only matters if an entire block is repeated! All this time, I’ve been working under the subconscious assumption that if my appended bytes are repeated anywhere, they’ll be repeated in the ciphertext, but that just isn’t the case. Since I choose the plaintext, I should be able to make the repetition happen (it has to be possible, or this challenge would be unsolvable).
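Here’s the shape of the fix, sketched against my own stand-in oracle (random key each call, 5-10 random bytes before and after, mode chosen by coin flip; those details are my assumptions from reading the challenge). Three blocks of one repeated byte guarantee at least two identical aligned blocks under ECB, no matter how the random prefix shifts things:

```ruby
require 'openssl'
require 'securerandom'

BLOCK = 16

# Stand-in for the challenge's encryption oracle.
def encryption_oracle(input)
  mode = [:ecb, :cbc].sample
  data = SecureRandom.random_bytes(rand(5..10)) + input +
         SecureRandom.random_bytes(rand(5..10))
  c = OpenSSL::Cipher.new(mode == :ecb ? 'AES-128-ECB' : 'AES-128-CBC')
  c.encrypt
  c.key = SecureRandom.random_bytes(BLOCK)
  c.iv  = SecureRandom.random_bytes(BLOCK) if mode == :cbc
  [mode, c.update(data) + c.final]
end

# A repeated 16-byte block in the ciphertext betrays ECB.
def ecb_ciphertext?(ciphertext)
  blocks = ciphertext.bytes.each_slice(BLOCK).to_a
  blocks.size != blocks.uniq.size
end

def detect_mode
  mode, ct = encryption_oracle("A" * (BLOCK * 3))
  [mode, ecb_ciphertext?(ct) ? :ecb : :cbc]
end
```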

And just like that, I think I’m halfway to a solution for Challenge 11! I’m going to get to work, constructing plaintext that will give me the repetition I need to detect ECB mode. More to come…

Simulacra

I took a short, productive break from reading Codebreakers; the part I was reading was particularly dry, and I needed a rest. During this break, I read Occupation: The Ordeal of France 1940-1944 by Ian Ousby, deserving of its own post one of these days, and started Cryptonomicon by Neal Stephenson. The latter has proved to be quite a tale, some of the best historical fiction I’ve ever read.

I call it historical fiction, and not speculative fiction as most would, because it is so accurate in its depiction of cryptology (among other things). The inspiration for this post was a particularly ingenious metaphor for WWII-era cipher machines. In this metaphor, a fictional Alan Turing (!) describes a broken bicycle chain, along with a broken spoke on the rear wheel. Every time the spoke hits the weak link on the chain, the chain comes off. He goes on to describe the period of the breakage, based on the rotation of the wheel and the chain, and their least common multiple. This metaphor is an ideal representation of the periodic repetition of these wheel-based cipher machines, as described in Codebreakers.
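With made-up numbers (mine, not the novel’s), the period drops right out of Ruby’s built-in lcm:

```ruby
# Hypothetical numbers, not the novel's: the weak link meets the bent
# spoke again after the chain advances lcm(chain, wheel) links.
chain_links = 101  # links in the chain
wheel_links = 25   # links of chain per wheel revolution

period = chain_links.lcm(wheel_links)
puts period  # 2525 link-lengths between derailments
```

The same least-common-multiple logic gives the period of a wheel-based cipher machine: the lcm of its rotor sizes.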

I can’t truly do it justice here, but I would highly recommend reading the book, which I’ve linked to above (the link is not an affiliate link, by the way — I gain nothing from your clicks). The book also describes Turing machines, WWII battles, information theory, codebreaking efforts, computer networks (albeit circa 1999), and myriad other topics which I can’t list in full here. More than this, it does so with enthusiasm and, above all, accuracy! I’ve yet to find a topic I’m familiar with that it covers incorrectly, earning it the title, in my mind at least, of historical fiction.

In other news, I haven’t abandoned my original quest, described in my earliest posts. I still intend to finish Codebreakers, and I still intend to complete the Cryptopals challenges. Once I complete both of those, I feel I’ll be ready to design my ultimate cryptosystem, the pseudonymous publication system I’ve decided to codename “Charlie”. Unfortunately, I’ve been held up by persistent illness, but the good news is that my doctors are taking steps to treat it. Anyway, I plan on continuing my reading of Codebreakers tonight, pushing through the dry matter and getting to the bulk of the book, its treatment of WWII codebreaking. That, I imagine, will be an exciting read, and if I become bored, I can always take a constructive break with Cryptonomicon.

A Report to the Shareholders

A recent article brought to light some shady behavior from Amazon. Mere weeks after Google sparked internal protests by taking on defense contracts, Amazon has essentially done the same thing, albeit in a much scarier setting: its facial recognition service, Rekognition, is allowing police to use real-time facial recognition technology with zero oversight.

They even went so far as to call Orlando a “smart city”, as they’re using facial recognition in cameras around the city to track persons of interest (emphasis being on the phrase “persons of interest” — these aren’t people who have necessarily been convicted of a crime, or even charged with a crime). Due process is totally lacking in this horror show. Moreover, I wonder who defines “persons of interest”.

Beyond the eyes in the sky, the makers of the increasingly ubiquitous police body cameras are either building in facial recognition technology or making it an option in later versions. I’m all for body cameras, but their current usage is rife with problems: the main issue is that they can be turned off by the officers themselves, rendering them useless! To compound the problem, facial recognition gives officers the opportunity to turn their cameras off in advance in the presence of a “troublesome” individual, “troublesome” being defined by the officer themselves, right or wrong. Imagine if you were in a place like China, with its “social credit” system…you could be targeted merely based on who you associate with, or whether you or your associates are critical of those in power!

The threat all of these technologies pose to our privacy and our other essential rights arises when they are applied with zero oversight, in the hands of individuals who are incentivized to misuse them. After all, what kind of officer wouldn’t use this technology if they thought it would help protect them? The question is: should we protect the officer, who has immense power, or the innocent-until-proven-guilty individual, whose power is extremely limited? When protection of both parties is not possible, I tend to vote for the latter, although this is a complex subject for another post.

So how can we thwart these potentially heinous, despicable technologies which threaten to turn our society into some Kafkaesque nightmare, something out of Minority Report? The first thought that comes to mind is “scramble suits”, something out of science fiction (specifically A Scanner Darkly, another Philip K. Dick story), which are basically a body suit that “scrambles” together different faces and bodies such that an observer will never know who’s truly behind the suit — the viewer is left with an impression of generic faces they could never hope to recall.

This is far-fetched, to say the least. Aside from the technological feasibility, it mainly suffers from the same problem as the solution I proposed in my last post: getting people to actually adopt such a technology. I still plan on addressing how to get people to adopt privacy-friendly tech in future posts (this is no small problem). “Juggalos” (a subculture from my native Michigan) have a simpler idea: paint your face.

To cap off this post, I’d like to quote Amazon regarding this horrific real-time facial recognition technology:

Our quality of life would be much worse today if we outlawed new technology because some people could choose to abuse the technology.

On the face of it, I agree with this argument, as there is inherent danger in any freedom. However, this is a classic false dilemma: the fallacious argument that if we can’t let technology run rampant, it must be outlawed. There’s a lot of gray area between “rampant” and “outlawed”. In particular, this gray area includes legislation that could regulate the technology such that power is placed in the hands of the people, rather than the faceless entities that would use it to their own ends. With that, I’d like to leave you with a question: are you willing to put a price on privacy?

Stingray

I found a story on Hacker News detailing “Stingray”, and the criminal who uncovered it:

Rigmaiden eventually pieced together the story of his capture. Police found him by tracking his Internet Protocol (IP) address online first, and then taking it to Verizon Wireless, the Internet service provider connected with the account. Verizon provided records that showed that the AirCard associated with the IP address was transmitting through certain cell towers in certain parts of Santa Clara. Likely by using a stingray, the police found the exact block of apartments where Rigmaiden lived.

Let’s get this out of the way first: what Rigmaiden did was wrong, and he deserved to be punished for his fraud. However, the abuse of power by law enforcement, and the way in which they were able to track him with Stingray, without a warrant, was even more heinous.

So what is Stingray, and why should we care? First, the “what”: Stingray is essentially a fake cell tower. Your phone connects to it, and it proxies the traffic so you don’t realize what’s happening. All the while, your phone is being tracked by law enforcement. So why should we care? Isn’t this only going to affect criminals?

As we’ve seen from the Snowden revelations, such operations by government agencies and law enforcement start out with the best of intentions, and quickly devolve into unconstitutional, unwarranted surveillance operations that target even the most minor of suspects, or even the innocent in the case of NSA spying. Even worse, these tools can be used to create an all-out police state in the hands of a tyrant.

So what does this have to do with crypto?

We can see from the Rigmaiden case that cryptography is not the end-all, be-all solution touted by some. Quite the opposite — crypto is but one piece in the security puzzle. Consider this: if you’re using Signal for end-to-end encryption of your communications, and some malicious entity has surreptitiously installed a keylogger on your phone, does it matter that your communications are encrypted?

The Rigmaiden case also tells us how important cryptography and security are for keeping unconstitutional behavior in check. Had there been a mesh network in use, connecting via an exit node to the wider internet, Stingray would have been useless. Combined with a DC-net, communications could have been kept truly anonymous.
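A DC-net’s core trick fits in a few lines. This is a toy three-party round of my own devising (honest participants, a single sender, pads shared out-of-band), nowhere near Chaum’s full protocol:

```ruby
require 'securerandom'

# XOR two equal-length byte strings.
def xor(a, b)
  a.bytes.zip(b.bytes).map { |x, y| x ^ y }.pack('C*')
end

# Each pair of participants shares a random pad. Everyone announces the
# XOR of their two pads; the sender also XORs in the message. XORing all
# announcements cancels every pad and leaves the message, with nothing
# revealing which participant sent it.
def dc_net_round(message, sender)
  len  = message.bytesize
  pads = { [0, 1] => SecureRandom.random_bytes(len),
           [1, 2] => SecureRandom.random_bytes(len),
           [0, 2] => SecureRandom.random_bytes(len) }
  announcements = (0..2).map do |i|
    mine = pads.select { |pair, _| pair.include?(i) }.values
    out  = xor(mine[0], mine[1])
    i == sender ? xor(out, message) : out
  end
  announcements.reduce { |acc, a| xor(acc, a) }
end
```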

My previous two statements are fairly broad and assume a lot, so let’s narrow them down a bit. What if each apartment complex used a mesh network, such that all of its inhabitants shared one high-throughput exit node? Of course, this only raises the question: how do we encourage adoption of such technologies?

I thought it would take something like the Snowden revelations, but unfortunately, that story ended not with a bang, but a whimper. It may take a seismic event, even more shocking than Snowden’s story, to shake us into action. What will it take? I’ll write more about the technologies we can use to subvert these abuses of power, and what it will take to spur adoption, in future posts.

Today I Learned #4

Today I learned (more) about William F. Friedman, the true “father of American cryptography”:

  • Friedman wrote the de facto standard texts on cryptanalysis. His pamphlets are still considered the prerequisite for cryptanalysts today, or at least, they were at the time The Codebreakers was written.
  • He invented the term “cryptanalysis”, as well as “monoalphabetic” and “polyalphabetic”!
  • His magnum opus, The Index of Coincidence and its Applications in Cryptanalysis, brought cryptanalysis out of isolation and meshed it with the world of statistics and mathematics. I cannot overstate how important this work was to modern cryptanalysis; without it, cryptanalysis would still exist, but in a stunted, deformed state. If I recommended Codebreakers for one reason, it would be for the sublime beauty in Friedman’s techniques described within. I have to admit, I don’t understand a lot of the descriptions of cryptanalysis in the book, but Friedman’s methods are as simple and sweet as apple pie, and as powerful as a howitzer.
  • They say “behind every great man is a great woman”, and Friedman was no exception. Elizebeth Smith Friedman most notably worked with William to disprove the theory that Francis Bacon was the author of Shakespeare’s plays, by tearing apart the supposed “cryptograms” hidden within.
  • Friedman not only brought America to the forefront of worldwide cryptological prowess, but also fathered the NSA. The NSA is the direct descendant of the organization that Friedman created single-handedly (with Elizebeth by his side, of course).
  • In an episode that appears to be somewhat common throughout Codebreakers, Friedman sadly suffered a nervous breakdown in 1941 and was hospitalized for months, due to the sheer stress of the work involved. We see this today in software engineers in the form of “burnout”. In other cases (unrelated to Friedman), some cryptanalysts babbled incoherently, hallucinated, and suffered all manner of horrific things due to the pure fatigue of working on a problem nonstop.
  • Without Friedman, America would likely have been unable to solve the PURPLE cipher used by the Japanese in WWII. Who knows what the outcome of the war would have been without the intelligence gathered thanks to Friedman and his team…

It should go without saying that Friedman is now on my list of personal heroes, along with Étienne Bazeries. He was a true genius, a polymath, a visionary, a scholar, a man of the greatest importance in cryptology. I can’t truly do him justice here, but hopefully, I’ve shed some light on the brilliance that was William F. Friedman.

Today I Learned #3

Today I learned about Herbert Yardley, architect of the American Black Chamber, named in homage to the “Black Chambers” of Europe a couple centuries prior. In this organization, subsidized by the State Department, Yardley and his team cracked the diplomatic messages of every major ally we had at the time (shortly after WWI). He even wrote a book about it, “The American Black Chamber”, which was both praised and panned — some critics noted it was the first book of its kind by an American, offering a glimpse into a world never before seen by the general public, much like “The Codebreakers”, while others criticized it as jeopardizing foreign relations.

The reason I decided to write this post is the reaction of Congress to Yardley’s book. They essentially wanted to make it a crime to do what Yardley did: disseminate information gained in a government position. Some representatives rightly pointed out that this would limit the freedom of the press to publish communications which they thought were damaging to the American public or the fabric of society, but nevertheless, with the backing of the administration at the time, the law passed and sits on the books to this day.

Could you imagine such a scenario today? Could you imagine regular folks, let alone representatives, giving two shits about freedom of the press, in an age where the press is vilified by our own president? Where Snowden was seen as a traitor by his predecessor? When trust in the media is at an all-time low? I know, there are plenty of people fighting the waves upon waves of “fake news” (I prefer the term “propaganda”, and will use that word instead from here on out). There are people who are fighting tooth-and-nail to restore faith in the press, and I respect them to no end. However, propaganda, and those who seek to discredit the press, seem to be winning.

So what does this have to do with crypto, you might be asking? Well, the entire reason I’m studying crypto is to create a secure pseudonymous publication platform for journalists. I want to restore that trust. With crypto, along with network analysis, trust graphs, and a whole host of other techniques, the details of which I haven’t completely hammered out yet, we can at once restore trust in journalism while protecting those writers from persecution (or prosecution, at that). America’s founders put freedom of the press in the very first amendment for a reason — tyranny cannot survive in the light. With cryptography, this dream can be realized.

In future posts, I will be fleshing out the details of this pseudonymous publication platform, as promised. I’ve been trying to work my way through “Codebreakers” as quickly as possible (without skimming or missing details, of course), so I’ve been distracted, in a good way. I’m learning a lot. I’ve also been trying to work my way through the Cryptopals challenges, although I have to admit, I’m stuck on the “ECB/CBC detection oracle” challenge, in which I have to distinguish between ciphertext encrypted randomly with one of those two modes (more to come on that in another post). I have a lot of pies in the oven, but I’m making forward progress. More to come, folks…

The Importance of Being Simple

Reading Codebreakers, I came across a section on American cryptography during WWI. The production of codebooks was both secure and efficient, but the front line was a different story. According to Kahn, no other army could match the Americans’ frustration when it came to actually using the damn things. One general even commanded his division not to use the codes at all before or during crucial operations! That’s better than sending messages improperly encoded, or re-sending them in the clear after the fact, but it’s still incredibly reckless.

User experience is not a new concept. Luckily, we live in an age where UX has come to the forefront of app development concerns, teaching more people than ever about the necessity of designing something that’s not just easy, but simple for actual people to actually use.

Despite this, we have APIs like OpenSSL, which allow you to do insanely stupid things like using ECB mode, which is no better than a simple codebook, or CBC mode with a null IV, which will encrypt the same plaintext/key combo the same way, every time. Then we have GPG, which makes encrypting emails about as easy as pulling your own teeth.
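The ECB complaint is easy to demonstrate with Ruby’s own OpenSSL bindings: identical plaintext blocks come out as identical ciphertext blocks, leaking structure exactly like a codebook would.

```ruby
require 'openssl'

cipher = OpenSSL::Cipher.new('AES-128-ECB')
cipher.encrypt
cipher.key = 'YELLOW SUBMARINE'  # any 16-byte key shows the same leak

# Two identical 16-byte plaintext blocks...
plaintext  = 'ATTACK AT DAWN!!' * 2
ciphertext = cipher.update(plaintext) + cipher.final

# ...encrypt to two identical ciphertext blocks.
blocks = ciphertext.bytes.each_slice(16).to_a
puts blocks[0] == blocks[1]  # true
```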

I get why this is the case: developers want to give their users options. They want to make the tool as widely usable as possible. But it’s like Kurt Vonnegut said, “if you open a window and make love to the world, your story will get pneumonia”. He was talking about writing, but the same applies to the UX of your software. If you try to be everything to everyone, you’ll end up being nothing to no one.

So how do we create simple systems? Easy: put your system in front of regular folks, and listen to what they say. Put your app in front of someone who has no idea what it does, and see how they use it. Again, this isn’t a new concept. Cryptanalysis has a similar dictum: only real-world experience will prove (or disprove) the security of your system. No amount of theoretical hand-waving will do this for you. If your user has to worry about key sizes and verifying signatures manually (I’m looking at you, GPG), you’ve already lost.

Signal does an excellent job of simply securing communications, without making the user worry about details that are insignificant to them (but crucial to actual security). I’m not saying the problem is easy to solve. But I am saying it’s tractable.