Part Dalawa - Still trying to write 'Hello, world'
Those few of you who read the first part of this tutorial series, "Part Isa - Introduction, Manners & 'Hello, world'", will know that we are roughly 8% of the way to completing the classic program 'Hello, world' in INTERCAL. In short, we had managed to write the character 'h' to our standard output, but we were not entirely sure how we had achieved this. In this tutorial we will boldly attempt to finish writing our first INTERCAL program and possibly try to understand how the program actually works.
INTERCAL has very good support for the input and output of numbers. In accepting numeric input the following would be allowed:
EIGHT OH NINE
EIGHT ZERO NINER
WALO WALA SIYAM
:where all of these inputs mean '809' (the last being written in the Tagalog language). The 'WRITE IN' statement accepts numbers written in English, Sanskrit, Basque, Tagalog, Classical Nahuatl, Georgian, Kwakiutl, and Volapük. INTERCAL will output numbers in Roman numerals, so in the example above we could use 'READ OUT' to output the value 809 as DCCCIX.
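To make the numeric round trip concrete, here is a minimal Python sketch (not INTERCAL, and not how any INTERCAL compiler is implemented). The names `DIGIT_WORDS`, `parse_spelled_digits` and `to_roman` are invented for this illustration, the word table covers only English digits plus the three Tagalog words from the example above, and the Roman numeral conversion only handles values below 4000.

```python
# Illustrative only: a partial digit-word table. The Tagalog entries are just
# the three words that appear in the example; the real vocabulary is larger.
DIGIT_WORDS = {
    "OH": 0, "ZERO": 0, "ONE": 1, "TWO": 2, "THREE": 3, "FOUR": 4,
    "FIVE": 5, "SIX": 6, "SEVEN": 7, "EIGHT": 8, "NINE": 9, "NINER": 9,
    "WALA": 0, "WALO": 8, "SIYAM": 9,
}

# Standard subtractive Roman numerals, largest value first.
ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def parse_spelled_digits(line: str) -> int:
    """Turn a line of spelled-out digits into an integer, one digit per word."""
    value = 0
    for word in line.split():
        value = value * 10 + DIGIT_WORDS[word.upper()]
    return value


def to_roman(n: int) -> str:
    """Convert 1..3999 to a Roman numeral (assumption: no larger values)."""
    if not 0 < n < 4000:
        raise ValueError("this sketch only handles 1..3999")
    parts = []
    for value, numeral in ROMAN:
        count, n = divmod(n, value)
        parts.append(numeral * count)
    return "".join(parts)


if __name__ == "__main__":
    for line in ("EIGHT OH NINE", "EIGHT ZERO NINER", "WALO WALA SIYAM"):
        print(line, "->", to_roman(parse_spelled_digits(line)))
```

All three example inputs parse to 809 and come back out as DCCCIX.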
Character input
To write our 'Hello, world' program, we are much more interested in character output than in Roman numeral output. To understand character output in INTERCAL, you must first understand character input. When speaking of character input, the INTERCAL manual says:
"The programmer desiring to handle input on a character basis should consider using another language."
Indeed, character input in INTERCAL is handled in a fashion that is significantly different from all other programming languages and, as we will soon see, character output is even more unique. To understand text input and output in INTERCAL we need to understand the Turing Text Model. It is possibly best described using the following diagram, which I drew on the back of an envelope:
We can imagine that INTERCAL has a circular input tape with all of the 256 available characters printed on it. INTERCAL also has an 'input head' which is positioned at the location of the last character entered by the user. The input head starts at position 0 (ASCII 0) when your INTERCAL program starts. If the user types 'g', as in the diagram, the input head will be moved to the 'g' on the input tape and the decimal value 103 (g is ASCII 103) will be stored in the first position of your array.
So far, so good. It is when the user keeps on typing that things get tricky. If the user finishes typing the word 'goat', by typing 'oat', the following will result:
o: The input head will move to the right by 8 positions to reach the 'o' from its initial position of 'g', so the decimal value 8 will be stored in the second position of your array.
a: The input head can only travel to the right. So to reach the letter 'a', it must travel past the end of the 256 available characters and keep traveling until it reaches 'a'. To do this, it must travel 242 positions. So the decimal value 242 will be stored in the third position of the array.
t: As with the simple case of 'o', the input head travels from 'a' to 't', storing the decimal value 19 in the fourth position of the array.
Simple.
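The input-tape arithmetic above can be sketched in a few lines of Python. This is only a model of the description in this post, assuming every character has a code below 256; the function name `input_moves` is made up for the illustration.

```python
def input_moves(text: str, start: int = 0) -> list[int]:
    """Return the values stored in the array as each character is typed.

    The head only moves to the right around a 256-position loop, so each
    stored value is the forward distance from the previous character.
    """
    moves = []
    head = start  # the input head starts at position 0
    for ch in text:
        target = ord(ch)
        moves.append((target - head) % 256)  # wrap past the end of the tape
        head = target
    return moves


if __name__ == "__main__":
    print(input_moves("goat"))
```

For 'goat' this reproduces the values worked out above: 103, 8, 242 and 19.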
Character Output
When it comes to character output, there is some good news and some bad news. The good news is that the input tape and output tape (and their corresponding heads) are independent, which is much simpler than if they were connected. The bad news is that the previous sentence is the only good news.
As with input in INTERCAL, described in the previous section, there is an output tape with all 256 ASCII characters printed on it and an output head. The tape travels in the same direction as the input tape, but the output head is on the inside of the tape. This results in two subtle differences:
1. The numbers required to move the head from one position to another are different because the output head is effectively moving in the opposite direction to the corresponding input head.
2. Because the output head is on the inside of the tape, it sees the binary representation of the ASCII characters printed on it backwards. For example, to print the ASCII character 'b', binary 0110 0010, you would need to move the output head to the ASCII 'F', binary 0100 0110.
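The "backwards binary" in the second point is an 8-bit reversal. A small Python sketch, with `reverse_bits` as a hypothetical helper name chosen for this illustration:

```python
def reverse_bits(value: int) -> int:
    """Reverse the order of the 8 bits in a byte (0..255)."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (value & 1)  # move the lowest bit across
        value >>= 1
    return result


if __name__ == "__main__":
    b = ord("b")
    print(f"{b:08b} -> {reverse_bits(b):08b} ({chr(reverse_bits(b))})")
```

Reversing 'b' (0110 0010) gives 0100 0110, which is 'F', matching the example in the text. Reversing twice returns the original byte.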
As with the input head, the output head starts at position zero. We can calculate the required head moves for the string 'Hello, world' in the following table:
Head position | Required output | Required binary | Reverse binary | Required head position | Move head by |
---|---|---|---|---|---|
0 | H | 0100 1000 | 0001 0010 | 18 | 238 |
18 | e | 0110 0101 | 1010 0110 | 166 | 108 |
166 | l | 0110 1100 | 0011 0110 | 54 | 112 |
54 | l | 0110 1100 | 0011 0110 | 54 | 0 |
:and so on. Continuing these calculations, you end up with the program:
DO ,1 <- #13
PLEASE ,1SUB#1 <- #238
DO ,1SUB#2 <- #108
DO ,1SUB#3 <- #112
DO ,1SUB#4 <- #0
DO ,1SUB#5 <- #64
PLEASE ,1SUB#6 <- #194
PLEASE ,1SUB#7 <- #48
DO ,1SUB#8 <- #22
DO ,1SUB#9 <- #248
DO ,1SUB#10 <- #168
DO ,1SUB#11 <- #24
DO ,1SUB#12 <- #16
DO ,1SUB#13 <- #214
DO READ OUT ,1
PLEASE GIVE UP
:which when compiled and run gives the enormously satisfying output:
Hello, world
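If you would rather not fill in the table by hand, the same calculation can be sketched in Python. This follows the model described in this post (reverse the bits of each character, then count how far the head must travel in the opposite direction to the input head); it is a sketch under that assumption, and `output_moves` and `decode_moves` are names invented here.

```python
def reverse_bits(value: int) -> int:
    """Reverse the 8 bits of a byte by flipping its binary string."""
    return int(f"{value:08b}"[::-1], 2)


def output_moves(text: str, start: int = 0) -> list[int]:
    """Compute the array values that make READ OUT print `text`."""
    moves = []
    head = start  # the output head also starts at position 0
    for ch in text:
        target = reverse_bits(ord(ch))       # required head position
        moves.append((head - target) % 256)  # opposite direction to input
        head = target
    return moves


def decode_moves(moves: list[int], start: int = 0) -> str:
    """Inverse of output_moves: recover the text from the array values."""
    chars = []
    head = start
    for move in moves:
        head = (head - move) % 256
        chars.append(chr(reverse_bits(head)))
    return "".join(chars)


if __name__ == "__main__":
    print(output_moves("Hello, world\n"))
```

Under this model the thirteen values in the program above decode to 'Hello, world' followed by a newline: the final value, 214, is the newline.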
I have written 'Hello, world' in many different programming languages over the years, but I have never felt the sense of achievement that writing 'Hello, world' in INTERCAL has given me. Imagine the pride I would feel if I built a small operating system using INTERCAL!
'Hello, world' has only scratched the surface of the power and flexibility of INTERCAL. The astute reader will note that our program is quite linear, running from start to finish without branching or looping. In the next tutorial in my INTERCAL series we will explore some of the options that INTERCAL provides us with to create more complex programs.
Labels: character input, character output, goat, Reverse Binary, Turing Text Model
9 Comments:
Don't know if it helps, but I put together an INTERCAL Cheatsheet a while back.
This series is classic. :D
"I have written 'Hello, world' in many different programming languages over the past many years but I have never felt the sense of achievement that writing 'Hello, world' in INTERCAL has given me."
it could be worse, it took genetic programming to breed a working "Hello, world" program for malbolge. it was the first program written in malbolge, and it came out two years after the language was released.
To create a birthday greeting for a friend, I wrote a Google spreadsheet that simplifies the process of determining the values to store in the array:
https://docs.google.com/spreadsheets/d/12Qoq4T1vq5Q291ZUgW7C_KhpsIKW7Nm9pwtEQYD7rK4/edit?usp=sharing
DO ,1 <- #6
PLEASE ,1SUB#1 <- #94
DO ,1SUB#2 <- #156
DO ,1SUB#3 <- #32
DO ,1SUB#4 <- #96
DO ,1SUB#5 <- #88
PLEASE ,1SUB#6 <- #52
DO READ OUT ,1
PLEASE GIVE UP
Just arrived at this post, having noticed that the code was lifted for the wikipedia article and slightly improved by the addition of a new feature, the '!' at the end of Hello, world!
Since I was running the result on a Linux terminal, the lack of a newline was bothering me, so I have added a newline. Also, I like well commented code, so I figured I would annotate the code properly (which also serves to demonstrate the impressively expressive and polite commentary capabilities of this excellent language). The result:
DO NOTICE THAT THE FOLLOWING SETS UP AN ARRAY OF 14 ELEMENTS
DO NOT FORGET THAT ANGLE-WORM IS THE GETS OPERATOR AND ALSO
PLEASE DO NOT FORGET THAT AN ARRAY GETS ITS SIZE FROM A CONSTANT AS A
SCALAR GET OPERATION
DO ,1 <- #14
DO NOT FORGET THAT CHARACTERS FOR OUTPUT ARE ON THE OUTPUT TURING TAPE LOOP
PLEASE DO NOTICE THAT CAPITAL H IS POSITION 238 ON THE OUTPUT TURING LOOP
BECAUSE THE OUTPUT CHARACTER LOOP IS ON THE INSIDE OF THE TAPE SO THE BITS ARE
MIRRORED RELATIVE TO THE INPUT TURING LOOP MEANING THE OUTPUT LOOP REQUIRES
PROPER ADJUSTMENT OF THE ASCII VALUE AND EFFECTIVELY ROTATES BACKWARD. THE
POSITION OF H WHICH IS 01001000 OR DECIMAL 72 WHEN MIRRORED IS 00010010 OR
DECIMAL 18 WHICH IS REALLY -18 OR 238 SPACES ON THE LOOP FROM 0
DO ,1 SUB #1 <- #238
DO NOT FAIL TO UNDERSTAND THAT e IS THEN 108 POSITIONS BEYOND H FOR SIMILAR
REASONS
DO ,1 SUB #2 <- #108
DO ,1 SUB #3 <- #112
PLEASE DO NOTHING BUT REALIZE THAT WE ARRIVED AT l AND WANT TWO OF THEM SO WE
DON'T WANT TO MOVE THE READ HEAD
DO ,1 SUB #4 <- #0
DO ,1 SUB #5 <- #64
DO NOT OVERLOOK THAT WE HAVE LOADED Hello ABOVE
DO NOT FORGET THE COMMA AND SPACE
DO ,1 SUB #6 <- #194
PLEASE DO ,1 SUB #7 <- #48
PLEASE NOTICE THAT WE ARE STARTING THE world PART HERE
DO ,1 SUB #8 <- #22
DO ,1 SUB #9 <- #248
DO ,1 SUB #10 <- #168
DO ,1 SUB #11 <- #24
DO ,1 SUB #12 <- #16
PLEASE NOTICE WE FINISHED THE world PART
DON'T FORGET THE EXCLAMATION POINT
DO ,1 SUB #13 <- #162
PLEASE DO NOT LEAVE OFF THE NEWLINE AS ITS OMISSION MAKES THE OUTPUT
AESTHETICALLY UNSATISFACTORY ON MOST TERMINALS
DO ,1 SUB #14 <- #52
DON'T FORGET TO PUT THE RESULT ON STANDARD OUTPUT
PLEASE READ OUT ,1
PLEASE GIVE UP
It occurs to me that the above can be misread as suggesting I think the code here was lifted from wikipedia. I am quite confident that that is NOT the case, but the reverse is, in fact, true. I applaud this site for providing an excellent starting point for a rewarding career in Intercal programming.
Hey 'Unknown',
I can confirm that the code for 'Hello World' was my own work, as your second comment says. Your first comment was pretty clear about which direction the lifting happened in, which is fine by me.
I spent hours getting my Hello World working back in 2007 so if it helps some young orphan child browsing Wikipedia then that is excellent. INTERCAL programmers are the future of the human race. It would have been nice if the Wikipedia author could have cited my excellent three post blog, but I am not bitter. Not bitter at all.
I am a little bitter, to be honest.
Sorry about the 'unknown' part of all of this. I can figure out how to add a newline to your code, but I can't figure out how to make Google share my nickname without sharing a bunch of other stuff that I would just as soon keep private.
Now that gate level logic and assembly programming are no longer considered an important part of a computer science education, the only way to teach what bits are and how they work is Intercal. If you can wrap your head around mingle, select and unary logic operators you will be okay, albeit a bit twisted. Honestly, I can imagine that it took you hours. It took me about half an hour to add the newline, which is just one character. Admittedly 15 minutes of this was spent figuring out the implications of the write tape mechanics, but the rest was just twiddling bits on paper (where nature intended) to get the head movement right. Multiply that by 13 characters, and I would estimate at least 3 and a half hours of steady analysis to arrive at "Hello, world!"
Behold the power of Intercal!
Aha! Now I have truly conquered, I figured out how to have a real nickname!