1 00:00:00,360 --> 00:00:02,520 Thank you. 2 00:00:06,440 --> 00:00:08,160 I'm David. 3 00:00:08,440 --> 00:00:10,400 So I started using scratch 4 00:00:10,400 --> 00:00:13,680 when I was 11 years old, and I wasn't very good at it back then. 5 00:00:13,880 --> 00:00:17,520 I was doing more of the cat spending on the spot that was mentioned earlier, 6 00:00:18,520 --> 00:00:20,880 But now I'm back 13 years later 7 00:00:21,040 --> 00:00:23,560 with a bit more programming experience under my belt. 8 00:00:24,640 --> 00:00:27,760 Kind of returning to my origins to see what I can actually do with Scratch now. 9 00:00:28,640 --> 00:00:31,680 So by day, I'm a security researcher, 10 00:00:31,800 --> 00:00:34,120 especially in software reverse engineering. 11 00:00:34,640 --> 00:00:37,280 I'm particularly interested in file formats, protocols, 12 00:00:37,600 --> 00:00:40,320 serialization and parsing and lots of other things. 13 00:00:40,680 --> 00:00:43,480 And I also enjoy breaking weak cryptography, 14 00:00:44,360 --> 00:00:46,480 but I'm not a cryptographer or a mathematician. 15 00:00:47,400 --> 00:00:50,160 So my mathematical knowledge is actually pretty weak. 16 00:00:50,440 --> 00:00:53,680 So this might be interesting for other people in the same boat 17 00:00:53,680 --> 00:00:58,640 as to kind of see how I get by without actually understanding some of the maths. 18 00:00:59,200 --> 00:01:02,880 So the first thing that people ask when I tell them that I've implemented 19 00:01:02,880 --> 00:01:07,280 some photography in scratch is that like, why, why on earth have you done this? 20 00:01:07,360 --> 00:01:09,200 But why would you do that? 21 00:01:09,200 --> 00:01:12,600 And the first answer, and the shortest answer is because I can. 22 00:01:13,920 --> 00:01:15,920 And that's the only answer, really. 23 00:01:15,920 --> 00:01:17,640 I felt like I had to make up some other answers 24 00:01:17,640 --> 00:01:20,720 after the fact, because otherwise I'd seem a bit insane. 25 00:01:21,520 --> 00:01:25,320 So one answer is that as a reverse engineer, 26 00:01:26,120 --> 00:01:30,040 it's all about spotting patterns in code to figure out what it's doing. 27 00:01:30,440 --> 00:01:33,360 And the only way you can spot a pattern is if you've seen it before. 28 00:01:33,600 --> 00:01:36,600 And the best way to say it before is to implement the algorithm yourself. 29 00:01:36,840 --> 00:01:39,400 So by implementing algorithms from scratch, 30 00:01:39,920 --> 00:01:42,120 it improves your reverse engineering skills. 31 00:01:42,120 --> 00:01:46,200 And implementing things in scratch is kind of an added level of difficulty, really. 32 00:01:46,200 --> 00:01:48,080 And it means you can't cut any corners. 33 00:01:48,080 --> 00:01:50,480 You really do have to implement everything from scratch with no 34 00:01:50,720 --> 00:01:52,440 no cheating, using libraries and things like that. 35 00:01:53,720 --> 00:01:56,480 The next reason is that I want to demonstrate that 36 00:01:56,720 --> 00:02:00,360 cryptography isn't like this magic black box that does stuff. 37 00:02:00,640 --> 00:02:03,680 You can actually construct it yourself with quite primitive, 38 00:02:04,200 --> 00:02:06,960 primitive components like scratch. 39 00:02:08,000 --> 00:02:12,360 And finally, I'd like to show that scratch is a real programing language 40 00:02:12,360 --> 00:02:14,720 because a lot of people like to say it's just a toy. 41 00:02:14,720 --> 00:02:16,120 It's not a real programing language, 42 00:02:16,120 --> 00:02:19,480 but it's really no reason that it wouldn't be a real programing language. 43 00:02:19,520 --> 00:02:22,200 You can do anything in scratch that you can do with a computer. 44 00:02:22,760 --> 00:02:25,920 It might just be a bit more tedious. 45 00:02:26,400 --> 00:02:28,440 So my goals for this project 46 00:02:29,160 --> 00:02:32,600 is to have like reasonably fast cryptography, scratch it. 47 00:02:32,600 --> 00:02:34,280 So is quite a slow language. 48 00:02:34,280 --> 00:02:37,080 It's interpreted and the interpreter runs in JavaScript. 49 00:02:37,200 --> 00:02:39,040 So there's like a lot of layers of slowness going on, 50 00:02:40,040 --> 00:02:41,520 but I don't want to be waiting all day 51 00:02:41,520 --> 00:02:44,160 for a small message to encrypt, so I want it to be reasonably fast. 52 00:02:44,680 --> 00:02:45,200 And of course, 53 00:02:45,200 --> 00:02:48,480 I want it to be reasonably secure, but I probably wouldn't rely on it anyway. 54 00:02:49,560 --> 00:02:52,280 It is more of a prototype than something you would actually want to use. 55 00:02:52,840 --> 00:02:55,000 And finally, I really want it to be correct. 56 00:02:55,200 --> 00:02:59,200 So I want my implementation to match the specifications to the letter 57 00:02:59,400 --> 00:03:03,240 so that if I encrypt something in scratch, I could put it into Python 58 00:03:03,240 --> 00:03:08,840 and decrypt it in Python so that it is compatible with other implementations. 59 00:03:09,960 --> 00:03:13,680 And finally, what do I mean by modern cryptography? 60 00:03:14,280 --> 00:03:17,320 Because that's a bit vague, deliberately so, but 61 00:03:18,480 --> 00:03:21,320 for my project, I want to have standardized 62 00:03:21,680 --> 00:03:23,960 standardized ciphers that are actually used in the real world, 63 00:03:24,480 --> 00:03:26,400 because I think that makes it more interesting. 64 00:03:26,400 --> 00:03:30,080 And also, I don't want to use any known vulnerable ciphers. 65 00:03:30,360 --> 00:03:34,240 So something like C4 for that you might have heard of is pretty old at this point. 66 00:03:34,920 --> 00:03:37,200 There's multiple published vulnerabilities in it. 67 00:03:37,200 --> 00:03:38,760 So I wouldn't want to use a cipher like that, 68 00:03:38,760 --> 00:03:42,440 even though it'd be very easy to implement and scratch. 69 00:03:43,360 --> 00:03:46,440 So the algorithms that I chose to implement, 70 00:03:48,240 --> 00:03:49,200 the first one 71 00:03:49,200 --> 00:03:52,120 is chapter 20.35, 72 00:03:52,120 --> 00:03:55,160 which is no antiquated symmetrical encryption algorithm. 73 00:03:55,440 --> 00:03:58,160 I'll explain what that means a bit later, But I chose it because it's 74 00:03:58,160 --> 00:04:01,560 secure and fast and it's standardized and it's used in Thales 1.3 75 00:04:02,440 --> 00:04:06,800 and also the x two 5519 key exchange algorithm, 76 00:04:07,080 --> 00:04:11,160 also secure and fast, also standardized and also used in Thales 1.3. 77 00:04:11,240 --> 00:04:16,520 And that's interesting because Thales is what underpins HTTPS used by web browsers. 78 00:04:17,160 --> 00:04:20,200 So every time you load a web page over actually taps at the chance 79 00:04:20,200 --> 00:04:23,360 that your browser is using these same ciphers to encrypt a webpage. 80 00:04:24,600 --> 00:04:26,560 And finally, as a hash algorithm 81 00:04:26,560 --> 00:04:30,000 I implement, I'd like to be also first also standardized 82 00:04:30,160 --> 00:04:33,920 and it's used in Wireguard, which is a very modern VPN protocol. 83 00:04:34,560 --> 00:04:35,640 And one nice thing about it 84 00:04:35,640 --> 00:04:39,360 is it only uses 32 bit integers compared to some other hash functions 85 00:04:39,760 --> 00:04:42,680 and a lot of other functions like to use 64 bit integers, which as I 86 00:04:42,680 --> 00:04:44,640 explain later, are annoying to do in scratch. 87 00:04:45,640 --> 00:04:45,960 Another 88 00:04:45,960 --> 00:04:48,960 reason I picked it is because it shares a lot of code of transparency, 89 00:04:48,960 --> 00:04:51,800 so I can kind of copy and paste a lot of my implementation. 90 00:04:52,600 --> 00:04:55,880 At some point I would like to implement a signature algorithm. 91 00:04:56,080 --> 00:05:00,600 I haven't done that yet, but when I do it will probably be as two 5519. 92 00:05:01,080 --> 00:05:03,000 I don't know how it works yet, so I can't implement it. 93 00:05:03,000 --> 00:05:05,840 But one day I will. 94 00:05:06,760 --> 00:05:10,920 So, so just a quick overview of Chapter 20. 95 00:05:11,400 --> 00:05:16,080 It's a symmetric encryption algorithm, which means that you can encrypt a message 96 00:05:16,080 --> 00:05:20,560 using a key and then decrypt the message at some later point using the same key. 97 00:05:21,240 --> 00:05:23,200 So that might be after you've sent it to someone else, 98 00:05:23,200 --> 00:05:25,960 or it might be saving data to your hard drive or something like that. 99 00:05:27,800 --> 00:05:28,560 And it was published in 100 00:05:28,560 --> 00:05:30,840 2008 by Daniel Bernstein. 101 00:05:31,640 --> 00:05:35,040 And that's interesting because Scratch was published in 2007. 102 00:05:35,200 --> 00:05:37,480 So this is actually more modern than scratch itself, 103 00:05:37,680 --> 00:05:40,000 even though it might seem a bit old in the grand scheme of things. 104 00:05:41,600 --> 00:05:44,480 And it's an example of what's known as an hour X construction, 105 00:05:44,760 --> 00:05:49,920 which means that only users of three or four operations edition rotation and XOR. 106 00:05:50,320 --> 00:05:52,440 And I'll explain what those mean a bit later. 107 00:05:52,440 --> 00:05:55,840 But it's ultimately quite simple only only three things I need to worry about. 108 00:05:57,000 --> 00:05:59,080 And it also, as I mentioned earlier, uses 109 00:05:59,080 --> 00:06:01,520 32 bit numbers which work nicely in scratch. 110 00:06:02,320 --> 00:06:05,800 And it's a stream cipher, which means it works by producing 111 00:06:06,240 --> 00:06:11,040 a pseudo random sequence of bytes based on the key, which it then combines 112 00:06:11,040 --> 00:06:16,640 with the bytes from your message that you want to encrypt using XOR. 113 00:06:17,840 --> 00:06:19,840 So I did say this is from scratch. 114 00:06:19,840 --> 00:06:22,200 So I'm going to start off with the very basics. 115 00:06:22,200 --> 00:06:24,480 How do computers represent numbers? 116 00:06:25,040 --> 00:06:28,920 So obviously as humans we're used to using the digits 049 117 00:06:29,280 --> 00:06:32,520 and then when we want to do numbers higher than nine, we use more digits. 118 00:06:32,680 --> 00:06:35,520 So we got up to one zero as ten as 119 00:06:35,520 --> 00:06:38,760 binary only uses the digits zero at one. 120 00:06:38,760 --> 00:06:42,120 We call a binary digit a bit and there are eight bits and bytes 121 00:06:42,840 --> 00:06:46,080 and commonly you'll see other lengths of numbers 122 00:06:46,080 --> 00:06:49,000 are 32 bit for example, or 64 bit or even more 123 00:06:50,080 --> 00:06:52,440 frequently used in computing. 124 00:06:52,440 --> 00:06:55,080 So before I give an example of binary, 125 00:06:55,240 --> 00:06:57,200 just I will remind you all 126 00:06:57,200 --> 00:07:00,120 how decimal works because you kind of take it for granted, but 127 00:07:00,280 --> 00:07:02,640 you've got the number 137 on the left there. 128 00:07:03,160 --> 00:07:06,200 The first column is kind of worth 100. 129 00:07:06,960 --> 00:07:10,680 The second column is worth 30 and the last column is worth seven. 130 00:07:10,800 --> 00:07:13,400 And if you add those three components together, 131 00:07:13,760 --> 00:07:16,280 you end up at the number 137. 132 00:07:16,920 --> 00:07:21,800 And so binary is exactly the same concept, but we've just got zeros and ones. 133 00:07:22,080 --> 00:07:28,440 And so in this example, the number 137 is 10001001. 134 00:07:28,960 --> 00:07:31,080 The first one is worth 128. 135 00:07:31,480 --> 00:07:32,960 The next one is 12 eight. 136 00:07:32,960 --> 00:07:35,400 And the last one is one. 137 00:07:35,760 --> 00:07:38,120 And if you add them together, you will also get 147. 138 00:07:38,600 --> 00:07:42,440 So hopefully you can see how that kind of translates conceptually across 139 00:07:44,200 --> 00:07:45,160 and obviously, 140 00:07:45,160 --> 00:07:48,000 I said Joshua, 24, was an error X construction. 141 00:07:48,200 --> 00:07:50,040 So we've got additional rotation in X 142 00:07:50,040 --> 00:07:52,080 and obviously the first of those three is addition. 143 00:07:52,080 --> 00:07:54,840 And that's exactly what you'd expect it to be, except 144 00:07:55,160 --> 00:07:58,280 there's one catch because we're dealing with 32 bit integers. 145 00:07:58,760 --> 00:08:02,160 You can only represent numbers less than two to the power of 32, 146 00:08:02,320 --> 00:08:04,360 otherwise you run out of bits. 147 00:08:04,360 --> 00:08:08,480 So what happens is when a computer adds two separate numbers together, 148 00:08:08,720 --> 00:08:12,240 if the result is too big, it just kind of chops off the high bits 149 00:08:12,480 --> 00:08:14,920 which results in it, wrapping around to a smaller number again. 150 00:08:15,320 --> 00:08:18,120 So for example, if I had 151 00:08:18,120 --> 00:08:20,000 the number, that's one less in search of the path. 152 00:08:20,000 --> 00:08:22,040 MP two and then I added one, 153 00:08:22,560 --> 00:08:25,840 I would end up back at zero or if I added two, I would end up at one. 154 00:08:26,680 --> 00:08:30,120 So that's called wrap around. 155 00:08:30,320 --> 00:08:32,720 So the next two for three of our axes rotation. 156 00:08:33,680 --> 00:08:34,080 So for 157 00:08:34,080 --> 00:08:38,120 rotation, you, you look at the bits of your free number. 158 00:08:38,120 --> 00:08:41,720 In this example, I've got 23, the number 23 in bits 159 00:08:42,280 --> 00:08:45,320 and you shift all the bits one way or another. 160 00:08:45,360 --> 00:08:47,360 In this example, I'm shifting to the left by one. 161 00:08:48,480 --> 00:08:51,120 So all digits move to left by one and then the digit 162 00:08:51,120 --> 00:08:54,040 that would kind of fall off the end comes back round to the start. 163 00:08:54,960 --> 00:08:57,960 And if I was shifting by two or three then there'd be two or four digits 164 00:08:58,320 --> 00:09:00,000 wrapping around back to the start. 165 00:09:00,000 --> 00:09:02,440 So in this example, I had 23. 166 00:09:02,440 --> 00:09:05,680 I shifted that for the left by one, rotated it, the left by one given, 167 00:09:06,200 --> 00:09:11,840 and the resulting bits are equal to 46 as a decimal. 168 00:09:12,040 --> 00:09:14,000 And finally we've got XOR. 169 00:09:14,000 --> 00:09:16,520 So fundamentally XOR is a function 170 00:09:16,840 --> 00:09:20,040 that takes a pair of bits as inputs and outputs one bit. 171 00:09:20,600 --> 00:09:23,760 If both of the input bits are the same, then the output is zero. 172 00:09:24,040 --> 00:09:26,280 And if they're not the same, then the output is one. 173 00:09:26,280 --> 00:09:29,120 And bitwise XOR, which is kind of what I mean 174 00:09:29,120 --> 00:09:32,240 when I say XOR in general is where you have two numbers. 175 00:09:32,240 --> 00:09:37,000 You take that bits and you apply that XOR function to each pair of bits H number. 176 00:09:37,320 --> 00:09:39,960 So in this example I've got seven being excited before 177 00:09:40,960 --> 00:09:43,200 and you can see how in the first column 178 00:09:43,400 --> 00:09:47,560 I've got 000 becoming zero because they're about the same. 179 00:09:47,880 --> 00:09:51,680 And in the last column you've got one being zero zero 180 00:09:51,680 --> 00:09:55,560 becoming one because they're different and that results in 0011 181 00:09:55,560 --> 00:09:58,840 at the end, which is equal to three, so seven x or four. 182 00:09:58,840 --> 00:10:01,680 And this is all in this example is three. 183 00:10:01,680 --> 00:10:03,920 And obviously there's only four bits in this example, 184 00:10:03,920 --> 00:10:06,400 but the exact same thing is done with 32 bits 185 00:10:06,720 --> 00:10:10,080 as part of the 29. Now. 186 00:10:11,320 --> 00:10:16,600 So with those three co operations adding rotate, rotating 187 00:10:16,640 --> 00:10:20,400 and so they're combined together to make the whole cipher. 188 00:10:21,040 --> 00:10:24,840 So Jojo 20 has 16 189 00:10:26,040 --> 00:10:28,200 1632 bit integers 190 00:10:28,200 --> 00:10:30,760 called the states and they're arranged in this 4x4 grid. 191 00:10:31,760 --> 00:10:35,880 Now the key is only 256 bits or 30 bytes, 192 00:10:36,360 --> 00:10:38,560 but there's 512 bits for states. 193 00:10:38,880 --> 00:10:42,360 So the key only occupies half of it and it occupies the middle two rows 194 00:10:42,360 --> 00:10:45,440 split up into 32 bit chunks for each number. 195 00:10:46,440 --> 00:10:49,800 Now at the top row we've got this phrase. 196 00:10:49,800 --> 00:10:53,000 It says Expand 32 byte K split 197 00:10:53,000 --> 00:10:55,400 across four four words. 198 00:10:56,000 --> 00:10:59,400 And what that is, is an example of a nothing up my sleeve number, 199 00:10:59,520 --> 00:11:01,440 which cryptographers call them. 200 00:11:01,440 --> 00:11:05,600 And that's basically just an arbitrary, completely arbitrary number. 201 00:11:06,240 --> 00:11:09,120 And that's their way of proving that they're not doing anything sneaky 202 00:11:09,120 --> 00:11:09,880 with that number. 203 00:11:09,880 --> 00:11:11,920 It is purely an arbitrary number. 204 00:11:12,240 --> 00:11:15,520 They're not trying to introduce any vulnerabilities into the cipher Somehow 205 00:11:16,520 --> 00:11:20,320 way. And on the lower row you've got the counter and the nonce. 206 00:11:20,720 --> 00:11:21,960 The nonsense quite important. 207 00:11:21,960 --> 00:11:25,600 It has to be unique for every message that you encrypt. 208 00:11:25,720 --> 00:11:29,880 If it's not, then you end up with the cipher being vulnerable 209 00:11:29,880 --> 00:11:32,560 and you could potentially crack some or all of the messages. 210 00:11:34,040 --> 00:11:36,160 And there's also the counter, which I'll explain in just a bit. 211 00:11:36,520 --> 00:11:39,760 So, yeah, 212 00:11:40,600 --> 00:11:43,360 so once you've got the site state set up, 213 00:11:43,360 --> 00:11:45,920 there are 20 rounds, hence the name chatter 20. 214 00:11:46,240 --> 00:11:49,640 And in each round, something called the quarter round function 215 00:11:49,880 --> 00:11:53,160 is applied to the columns of the states 216 00:11:54,240 --> 00:11:57,000 or rather on every other round. 217 00:11:57,000 --> 00:12:00,520 It's applied to the columns on ones, on the other rounds 218 00:12:00,760 --> 00:12:03,680 it's applied to the diagonal columns. 219 00:12:03,680 --> 00:12:06,360 So that quarter round function looks like this. 220 00:12:07,160 --> 00:12:09,960 This is the same function represented in two different ways. 221 00:12:10,200 --> 00:12:12,960 So on the left we've got a kind of pseudocode representation. 222 00:12:13,040 --> 00:12:16,120 It's not quite valid syntax, but it's close enough. 223 00:12:16,120 --> 00:12:19,720 And on the right we've got this kind of I should note the name of this diagram is, 224 00:12:19,720 --> 00:12:23,360 but it's a, a diagrammatic representation of the same operations 225 00:12:23,640 --> 00:12:26,400 where the yellow box represents addition. 226 00:12:26,760 --> 00:12:32,280 The blue circle is X0 and the green, the green box is rotation. 227 00:12:33,400 --> 00:12:36,240 So you do 2020 rounds of that 228 00:12:37,040 --> 00:12:39,720 and then finally you add 229 00:12:39,720 --> 00:12:42,120 the initial state into your current state. 230 00:12:43,240 --> 00:12:46,880 The reason that's done is to make sure that the cipher isn't invisible. 231 00:12:47,040 --> 00:12:49,880 If that wasn't done, you could kind of undo the quarter 232 00:12:49,880 --> 00:12:51,840 round functions and derive the key. 233 00:12:51,840 --> 00:12:54,680 So that's quite an important step there at the end. 234 00:12:54,680 --> 00:12:58,880 So finally, after all that's been done, you've kind of you've randomly mixed up 235 00:12:59,040 --> 00:13:01,680 all the old data that was in your initial state 236 00:13:01,800 --> 00:13:03,760 and I've got your pseudo random values 237 00:13:03,760 --> 00:13:07,120 and this is what you xor with your message that you're trying to encrypt 238 00:13:07,680 --> 00:13:10,760 and to decrypt it again, you are with the exact same, 239 00:13:12,120 --> 00:13:15,120 the exact same data and you end up back where you started. 240 00:13:15,120 --> 00:13:18,880 Because something I didn't mention earlier about XOR is 241 00:13:18,880 --> 00:13:22,520 that if you XOR two things together and get the results 242 00:13:23,040 --> 00:13:26,640 if you xor the result with one of your inputs, you'll always end up 243 00:13:27,040 --> 00:13:30,000 with the value of the other input, if that makes any sense. 244 00:13:30,920 --> 00:13:34,320 So XOR is kind of reversible like that. 245 00:13:38,080 --> 00:13:40,360 So enough enough cryptography theory. 246 00:13:40,520 --> 00:13:42,520 Now it's time to talk a bit about scratch. 247 00:13:42,520 --> 00:13:44,600 So I'm sure you've all heard of scratch at least to some extent, 248 00:13:45,240 --> 00:13:48,000 but you might not realize quite how popular it is. 249 00:13:48,000 --> 00:13:51,600 There have been over 100 million projects shared on the scratch websites. 250 00:13:52,000 --> 00:13:55,880 And by contrast, GitHub has about 150 million 251 00:13:56,040 --> 00:13:59,520 public repositories, which is of course more, but is not very much more. 252 00:13:59,760 --> 00:14:03,640 And I would have never have guessed they were in the same order of magnitude. 253 00:14:03,840 --> 00:14:06,760 And scratch is open source, which means I don't use reverse engineering. 254 00:14:06,800 --> 00:14:09,960 Fortunately, the source code for that runtime is on GitHub, 255 00:14:09,960 --> 00:14:13,240 so you can look at how their internals work, which is very useful. 256 00:14:13,800 --> 00:14:17,280 And I'm just going to give a quick demo 257 00:14:17,280 --> 00:14:20,760 of like the capabilities of Scratch won't take very long, hopefully. 258 00:14:21,040 --> 00:14:26,320 So I've got a little test project up here on the on the right here. 259 00:14:26,600 --> 00:14:29,240 You've got your sprites and you've got your stage. 260 00:14:30,480 --> 00:14:31,200 The stage is where the 261 00:14:31,200 --> 00:14:35,120 sprites of positions and on the left hand, you've got your scripts. 262 00:14:35,360 --> 00:14:38,080 So you write, you write code by dragging these blocks around 263 00:14:39,120 --> 00:14:41,000 and you've got all the standard control flow 264 00:14:41,000 --> 00:14:43,640 that you'd expect from any other programing language. You've got loops, 265 00:14:43,680 --> 00:14:46,680 you've got if statements, you've got variables, you've got lists, 266 00:14:47,800 --> 00:14:50,480 and you've got basic mathematical functions, 267 00:14:50,760 --> 00:14:54,440 addition, subtraction, etc.. 268 00:14:55,400 --> 00:14:58,880 And yeah, I'll just make 269 00:14:58,880 --> 00:15:01,720 I'll make the count spin around since we were talking about that earlier. 270 00:15:02,240 --> 00:15:05,280 Um, there are 271 00:15:05,960 --> 00:15:06,760 there you go. 272 00:15:06,760 --> 00:15:09,840 So that's just an example of the kind of things you can do in scratch 273 00:15:10,680 --> 00:15:14,480 in your own use. 274 00:15:17,200 --> 00:15:19,040 So scratch, 275 00:15:19,040 --> 00:15:21,720 although in theory it can do anything because it's Turing complete 276 00:15:22,120 --> 00:15:25,080 and it has quite a lot of limitations when it comes to trying 277 00:15:25,080 --> 00:15:26,040 to implement cryptography. 278 00:15:27,040 --> 00:15:28,280 So run. 279 00:15:28,280 --> 00:15:31,480 It's all the numbers in scratch 280 00:15:31,480 --> 00:15:33,640 because it's running on the JavaScript under the hood. 281 00:15:33,640 --> 00:15:36,120 They're all 64 bit floating point totals. 282 00:15:36,840 --> 00:15:38,080 And I'll explain. 283 00:15:38,080 --> 00:15:40,440 Sorry, 64 bit double precision floats, even. 284 00:15:41,360 --> 00:15:43,640 I'll explain what that means in detail a bit later, 285 00:15:44,000 --> 00:15:46,680 but it's not effective integers, so we can't just 286 00:15:47,520 --> 00:15:49,640 drop in the algorithm and have it work. 287 00:15:50,640 --> 00:15:54,800 It also lacks X or electrodes, which might sound like a bit of a problem, 288 00:15:54,960 --> 00:15:58,200 but I'll go into how I address those shortcomings in just a bit. 289 00:15:58,880 --> 00:16:01,040 And of course, you've got to use your mouse 290 00:16:01,920 --> 00:16:04,040 making longer scripts gets very tiring. 291 00:16:04,040 --> 00:16:05,720 Like you've just got to do 292 00:16:05,720 --> 00:16:08,640 miles and miles of dragging to get your blocks into position. 293 00:16:08,640 --> 00:16:12,040 And it's very tedious if you're a fast touch type in comparison. 294 00:16:12,720 --> 00:16:14,760 And also there's no version control. 295 00:16:15,200 --> 00:16:18,120 When I write code, normally I'm very used to committing it to get, 296 00:16:18,400 --> 00:16:21,360 and when I can't do that, I feel pretty lost and feel like I'm about to lose. 297 00:16:21,360 --> 00:16:26,800 On my changes at any point, which of course you can. So 298 00:16:28,120 --> 00:16:30,680 before I get into how I overcome those difficulties, 299 00:16:30,680 --> 00:16:35,760 I will explain how floating point numbers work so well. 300 00:16:35,760 --> 00:16:38,640 So you, as I said, scratch as variables are stored as floating point numbers. 301 00:16:39,120 --> 00:16:41,880 And if you're familiar with scientific notation in physics, 302 00:16:42,160 --> 00:16:43,640 it's a very similar concept. 303 00:16:43,640 --> 00:16:46,640 So rather than writing that the speed of light is 300 million 304 00:16:46,680 --> 00:16:49,440 meters per second, we write three times ten to the power eight 305 00:16:50,440 --> 00:16:54,400 and floating point numbers are basically exactly the same concept. 306 00:16:54,680 --> 00:16:58,680 You've got the exponent, which is like the green section of that diagram 307 00:16:58,920 --> 00:17:01,640 and the fraction, which is the red section of that diagram. 308 00:17:02,520 --> 00:17:05,120 And so in the speed of light example, 309 00:17:05,120 --> 00:17:08,720 three would be the fraction, an eight would be the exponent. 310 00:17:08,880 --> 00:17:09,920 And there's one extra bit 311 00:17:09,920 --> 00:17:11,960 that says whether it's a positive or negative number or not. 312 00:17:12,960 --> 00:17:15,920 So that can represent like arbitrary numbers. 313 00:17:15,920 --> 00:17:19,040 So you can have fractions, you can have really big numbers. 314 00:17:19,440 --> 00:17:22,040 But there's one catch 315 00:17:22,040 --> 00:17:24,320 if your integer goes above. 316 00:17:24,440 --> 00:17:28,840 So it's the power of 53 because there's 52 bits for the fraction. 317 00:17:29,120 --> 00:17:31,560 If it goes above that, then you start losing precision. 318 00:17:32,040 --> 00:17:34,320 And what that means is the accuracy of your 319 00:17:34,360 --> 00:17:37,400 your maths will start to drift from the correct values, 320 00:17:37,600 --> 00:17:43,040 which is a big problem if you're trying to do something precise like cryptography. 321 00:17:43,520 --> 00:17:45,760 But fortunately, because 322 00:17:46,920 --> 00:17:49,920 32 bit numbers only go up to two to the power 32, 323 00:17:50,240 --> 00:17:53,040 that's actually not much of a problem for 32 bit numbers. 324 00:17:53,040 --> 00:17:57,800 As long as we make sure that we keep keep our numbers rounded to whole numbers, 325 00:17:58,120 --> 00:18:02,040 we won't run out of precision. 326 00:18:02,040 --> 00:18:04,800 So the next the next challenge is bit was XOR 327 00:18:05,200 --> 00:18:08,240 and of course Scratch has no active operator, but 328 00:18:08,240 --> 00:18:12,520 as a bit of a cheat, we can make a list of every possible result. 329 00:18:12,520 --> 00:18:16,480 So we can make we can make a list of every pair of numbers. 330 00:18:17,000 --> 00:18:20,880 And what the result is if you hold them together, every pair of numbers that is. 331 00:18:21,440 --> 00:18:24,960 And so the list for eight bit numbers is 65,000 entries long. 332 00:18:25,400 --> 00:18:28,280 So of course I wrote a Python scripts to generate that list 333 00:18:29,240 --> 00:18:30,120 and then 334 00:18:30,120 --> 00:18:33,720 you can you can implement that and scratch like so in there. 335 00:18:34,200 --> 00:18:37,640 So scratch lists are only one dimensional, but we kind 336 00:18:37,640 --> 00:18:38,880 of got a two dimensional table here. 337 00:18:38,880 --> 00:18:42,640 So there's a little bit of arithmetic to, to convert the indexes 338 00:18:43,200 --> 00:18:46,520 and let's just has a plus one on the end because scratches lists 339 00:18:46,520 --> 00:18:49,680 start counting at one instead of zero, which is a bit annoying if you're used to 340 00:18:49,680 --> 00:18:54,160 it being starting at zero. 341 00:18:54,160 --> 00:18:55,800 Now that's for eight bit numbers. 342 00:18:55,800 --> 00:18:59,400 But for chapter 20 we need to XOR 32 bit numbers together 343 00:18:59,840 --> 00:19:01,840 and this is where it gets a bit crazy. 344 00:19:01,840 --> 00:19:04,480 So we need to split our 32 345 00:19:04,480 --> 00:19:06,920 bit inputs into eight bit chunks 346 00:19:08,000 --> 00:19:11,680 and then we xor those chunks together separately using the lookup table 347 00:19:11,880 --> 00:19:15,000 and then we recombine them into 132 bit results. 348 00:19:16,240 --> 00:19:17,200 And the methods for 349 00:19:17,200 --> 00:19:20,640 splitting the numbers up is called shifting and masking. 350 00:19:21,320 --> 00:19:24,480 So shifting is very much like 351 00:19:24,480 --> 00:19:28,480 the rotation that I mentioned earlier, except the numbers that fall off the end. 352 00:19:28,680 --> 00:19:30,960 You don't bring them back to the start, they just disappear. 353 00:19:31,560 --> 00:19:36,000 So shifting left is actually equivalent to multiplying by a power of two. 354 00:19:36,280 --> 00:19:40,560 So five shifted left by two is the same as multiplying five by four 355 00:19:41,280 --> 00:19:43,680 and conversely five shifted right 356 00:19:43,680 --> 00:19:46,680 by two is the same as dividing five by four. 357 00:19:46,960 --> 00:19:49,120 But you need to round down once you're done. 358 00:19:49,400 --> 00:19:52,240 Otherwise you would end up with a fraction number at the end. 359 00:19:52,240 --> 00:19:54,640 But you actually you want a whole number at the end. 360 00:19:54,640 --> 00:19:57,120 So if you round down five divided by four, it's 361 00:19:57,120 --> 00:20:00,720 exactly the same as doing a bit shifts right by two. 362 00:20:00,720 --> 00:20:02,360 And that can be done in scratch very easily. 363 00:20:03,720 --> 00:20:05,520 So we have 364 00:20:05,520 --> 00:20:08,240 the XOR lookup tables, the shifting and masking. 365 00:20:08,400 --> 00:20:10,760 Oh, I actually haven't covered masking yet. 366 00:20:10,760 --> 00:20:13,480 So masking is quite a common operation. 367 00:20:13,480 --> 00:20:18,200 You do in bitwise logic if you want to extract some portion of a number. 368 00:20:18,240 --> 00:20:23,160 For example, the last eight bits you might end up with 255, 369 00:20:23,640 --> 00:20:25,920 which what that essentially means is 370 00:20:26,280 --> 00:20:28,280 you extract the last eight bits at a number, 371 00:20:29,040 --> 00:20:32,040 the scratch doesn't have a bitwise and operator, but in a pinch 372 00:20:32,040 --> 00:20:34,880 we can use the modulo operator, which fortunately it does have. 373 00:20:35,240 --> 00:20:39,600 So if you take a number and you put it modulo 256, 374 00:20:39,600 --> 00:20:42,760 you extract the last eight bits of the value. 375 00:20:43,640 --> 00:20:46,360 And for example, if I wanted to get the second last eight bits, 376 00:20:46,480 --> 00:20:50,760 I could shift it first and then do the modulo and I would get the eight bits 377 00:20:50,760 --> 00:20:54,000 that were the second, so the second lot of bits of the number. 378 00:20:54,600 --> 00:20:57,120 So combining all that together with if. 379 00:20:57,120 --> 00:20:58,600 XOR Yeah. 380 00:20:58,600 --> 00:21:02,920 XOR masking and shifting, you have this horrible abomination of scratch code, 381 00:21:03,320 --> 00:21:07,160 it doesn't even fit on one slide, it's going off the off to the right. 382 00:21:07,440 --> 00:21:09,360 If I zoomed out all the way and it still wouldn't fit. 383 00:21:10,320 --> 00:21:13,400 So as you can see, this is pretty unwieldy. 384 00:21:13,880 --> 00:21:16,400 So I didn't want to have to write a whole program 385 00:21:16,640 --> 00:21:18,640 with hundreds of blocks like this. 386 00:21:18,640 --> 00:21:21,640 So I decided that I needed to automate this process somehow. 387 00:21:22,640 --> 00:21:25,440 So I'm sure you've all seen the XKCD comic 388 00:21:25,800 --> 00:21:28,560 about some writing, writing tools. 389 00:21:28,560 --> 00:21:30,480 You'll always spend more time 390 00:21:30,480 --> 00:21:32,640 writing the tools than you do spend actually using them, 391 00:21:32,880 --> 00:21:35,200 and you'll probably never finish them either, which is very true, 392 00:21:36,720 --> 00:21:41,160 but I still did it so. 393 00:21:41,160 --> 00:21:43,720 So scratch scripts when you save them to a file 394 00:21:44,920 --> 00:21:50,400 they are, they're stored inside a file called with Don't SP3 395 00:21:50,400 --> 00:21:53,120 extension, which is basically just a zip file in disguise. 396 00:21:53,440 --> 00:21:57,760 And inside that zip file is adjacent file, which contains a list 397 00:21:57,760 --> 00:21:59,280 of all the blocks in your scripts 398 00:21:59,280 --> 00:22:02,280 and that contains all the attributes, like what their arguments are, 399 00:22:02,280 --> 00:22:04,680 what their position is, and what other books they connected to. 400 00:22:06,240 --> 00:22:07,320 So rather 401 00:22:07,320 --> 00:22:10,640 than dragging and dropping books around to the mouse, I could, in theory, 402 00:22:11,040 --> 00:22:15,000 write some code to generate this Jason file with all the blocks already in it. 403 00:22:15,440 --> 00:22:17,080 And that's exactly what I did. 404 00:22:17,080 --> 00:22:18,960 I made a Python library. 405 00:22:18,960 --> 00:22:21,000 I called Boyega 406 00:22:21,360 --> 00:22:23,160 for generating a scratch code. 407 00:22:23,160 --> 00:22:24,120 Now, the reason I called it 408 00:22:24,120 --> 00:22:27,520 that is because there's apparently a species of snake called Boyega 409 00:22:27,520 --> 00:22:31,200 Crostini, which supposedly is nicknamed the Cat Snake. 410 00:22:31,440 --> 00:22:34,080 I'm not quite sure why, because I don't see the resemblance myself. 411 00:22:34,080 --> 00:22:36,600 But that's why. That's why I picked the name. 412 00:22:36,600 --> 00:22:39,960 Because the scratch mascot is a cat, and the Python logo is a snake. 413 00:22:39,960 --> 00:22:42,560 So you put the two together and you've got a Boyega. 414 00:22:43,440 --> 00:22:47,240 So this this library that I made is is a bit weird. 415 00:22:47,240 --> 00:22:50,680 It's not quite converting Python code to scratch code. 416 00:22:51,160 --> 00:22:54,240 What it is, is a library for expressing scratch 417 00:22:54,600 --> 00:22:57,200 syntax, a scratch code for Python syntax. 418 00:22:57,720 --> 00:23:01,320 So it's actually more powerful than translating Python into scratch 419 00:23:03,000 --> 00:23:04,760 because you can, you can 420 00:23:04,760 --> 00:23:07,160 write a program to generate other programs 421 00:23:08,880 --> 00:23:10,920 or something. 422 00:23:11,160 --> 00:23:13,520 So I'm just going to show you a quick example. 423 00:23:14,280 --> 00:23:16,800 So with that horrible XOR example I showed you 424 00:23:16,800 --> 00:23:19,800 before can actually be 425 00:23:19,800 --> 00:23:22,320 implemented fairly concisely 426 00:23:23,440 --> 00:23:24,400 in Python. 427 00:23:24,400 --> 00:23:27,480 This might look more complicated, perhaps, but it's much easier to work with 428 00:23:27,480 --> 00:23:30,520 because you don't have to drag and drop million books around. So. 429 00:23:30,520 --> 00:23:32,560 So this is an example of co-written with the library 430 00:23:33,920 --> 00:23:34,680 here. 431 00:23:34,920 --> 00:23:38,040 Actually, I might zoom in a bit so you can see it, but 432 00:23:39,000 --> 00:23:40,800 okay, so here 433 00:23:40,800 --> 00:23:43,680 I'm declaring to scratch variables 434 00:23:44,120 --> 00:23:46,880 and I x 435 00:23:46,920 --> 00:23:49,800 all of them together and make the cat say the result. 436 00:23:50,400 --> 00:23:54,000 And here I'm doing all the shifting and masking and lookup tables 437 00:23:54,000 --> 00:23:55,760 and I talked about earlier 438 00:23:55,760 --> 00:23:58,280 and it's all implemented through Python syntax 439 00:23:58,640 --> 00:24:02,800 and you'll notice that I'm using the Python bit shift operator here that is 440 00:24:03,840 --> 00:24:04,480 under the hood 441 00:24:04,480 --> 00:24:07,840 translates that into the scratch books, just like I described earlier. 442 00:24:08,680 --> 00:24:11,880 And if I run my script, which hopefully still works, 443 00:24:12,920 --> 00:24:16,440 exploit it, um, open up around the scripts 444 00:24:19,440 --> 00:24:25,040 and if I open that in scratch. 445 00:24:28,920 --> 00:24:31,560 So this code was just generated by that script that I just ran. 446 00:24:32,160 --> 00:24:35,400 And if I, if I click that green flag, I shall make it full screen. 447 00:24:36,040 --> 00:24:37,880 I see the scratch that says Hello World. 448 00:24:37,880 --> 00:24:43,040 And then Axel's two numbers together and tells us the result. 449 00:24:44,280 --> 00:24:50,520 So? So 450 00:24:50,520 --> 00:24:53,880 now that I can write scratch code much more easily, effectively in Python. 451 00:24:54,480 --> 00:24:55,800 The sky's the limit. Really. 452 00:24:55,800 --> 00:24:58,920 So this, this is what I came 453 00:24:58,960 --> 00:25:07,760 up. So. 454 00:25:07,880 --> 00:25:08,640 So what you're looking at 455 00:25:08,640 --> 00:25:11,640 here is all the almost life, as I mentioned in the initial slide. 456 00:25:11,840 --> 00:25:15,640 So we've got the Cheshire Cat of 20.35, we've got the 457 00:25:17,240 --> 00:25:19,560 go to go now go to hashing 458 00:25:19,960 --> 00:25:23,040 and we've got the x 5519 key exchange algorithm. 459 00:25:23,680 --> 00:25:27,160 Now, I wish I had more time to explain what every bit of this does in detail. 460 00:25:27,480 --> 00:25:31,320 So everyone I find me afterwards and ask me, but for now 461 00:25:31,320 --> 00:25:35,840 I will just show you it working because I think it is very cool. 462 00:25:36,000 --> 00:25:40,320 So the code, the code for that you just saw on the screen 463 00:25:40,320 --> 00:25:44,360 that was generated by this Python script and it's actually pretty short. 464 00:25:44,360 --> 00:25:48,120 There's only 24 lines of typing in this file, but it's all calling 465 00:25:48,120 --> 00:25:52,120 into libraries that I wrote again, using using this library. 466 00:25:52,440 --> 00:25:55,800 So I've I've effectively written a cryptography library from scratch 467 00:25:56,160 --> 00:26:00,840 that you can use with my tool to embed cryptography within any scratch projects. 468 00:26:01,680 --> 00:26:05,440 And so if I generate the scratch code just now 469 00:26:06,720 --> 00:26:08,880 and open it and scratch 470 00:26:10,640 --> 00:26:12,360 it takes a little bit to it 471 00:26:12,360 --> 00:26:17,440 because there's almost 4000 books in here. 472 00:26:17,440 --> 00:26:24,120 But I'm just going to I'm going to shrink that down a bit. 473 00:26:24,360 --> 00:26:28,240 And Scratch has this handy function code, clean up books that artfully arranges 474 00:26:28,240 --> 00:26:29,800 your books into a joint list. 475 00:26:29,800 --> 00:26:32,160 So I can I can scroll for 476 00:26:34,400 --> 00:26:34,760 that. 477 00:26:36,480 --> 00:26:38,400 But actually, if we get down to the bottom, 478 00:26:38,400 --> 00:26:42,000 we've got this is like the main the main loop effect or the main function even. 479 00:26:42,360 --> 00:26:47,000 And if I, if I Zaman is doing fairly readable things. 480 00:26:47,280 --> 00:26:50,920 So we've got some C 481 00:26:51,520 --> 00:26:54,240 where I'm initializing a random number generator, 482 00:26:54,560 --> 00:26:59,280 we're getting some random bytes, we're performing an X to 5519 483 00:26:59,280 --> 00:27:02,040 scalar multiplication, which is part of the key generation process, 484 00:27:03,280 --> 00:27:04,520 etc. etc. 485 00:27:04,520 --> 00:27:07,120 So now I will show you actually doing something. 486 00:27:07,680 --> 00:27:09,360 So here we go. 487 00:27:09,360 --> 00:27:12,480 Sorry, just Jennifer, what I just did here is it generated 488 00:27:12,560 --> 00:27:15,840 a next two 5519 keeper. 489 00:27:15,840 --> 00:27:20,200 It generated a public no, it did a different human key 490 00:27:20,200 --> 00:27:23,280 exchange against public here that I embedded within the program 491 00:27:23,600 --> 00:27:26,400 generated a shared secret and then encrypted a message 492 00:27:27,120 --> 00:27:30,120 or sorry derived the session key from the shared secret. 493 00:27:30,400 --> 00:27:33,000 And now if I type something in here like hello, 494 00:27:34,120 --> 00:27:35,680 it encrypts it very quickly. 495 00:27:35,680 --> 00:27:38,280 And I'm sure you just noticed that was like a blink of an eye. 496 00:27:38,280 --> 00:27:41,640 So I'm quite glad that it was it was that fast after that. 497 00:27:42,480 --> 00:27:46,640 And also, obviously in this demo, the key is being printed on the screen. 498 00:27:46,640 --> 00:27:49,200 You probably wouldn't want to do that in any real application. 499 00:27:49,560 --> 00:27:53,040 So if it if anyone if anyone wants to try decrypting this message, 500 00:27:53,040 --> 00:28:02,760 you should be able to use standard, standard tools to decrypt it. 501 00:28:02,760 --> 00:28:03,680 So yeah, that's it. 502 00:28:03,680 --> 00:28:09,040 Thank you for listening, everybody. 503 00:28:09,040 --> 00:28:12,720 Give it up for David Buchanan.