First C compiler pops up on Github

If you have a nostalgic turn of mind, there's a new posting over on Github that you'll just love: the earliest known C compiler by the legendary Dennis Ritchie has been published on the repository. It's not new: long before his death in 2011, Ritchie wrote about the effort to find, recover and preserve the early work on C here …

COMMENTS


Silver badge

Allow me

The compiler can be found at https://github.com/mortdeus/legacy-cc ; the oddest bit to my eyes is the apparent need to declare storage explicitly, e.g. at the bottom of https://github.com/mortdeus/legacy-cc/blob/master/last1120c/c00.c — my experience goes only a little way back beyond C89, so it's possible I'm completely misreading what's going on, but it looks like the equivalent of an assembler's defb or similar, with extern being used in functions to import globals (so maybe scope wasn't well established yet?). Can anyone enlighten me?

2
0
Silver badge

Re: Allow me

They're initialised variables with file scope - visible to all routines in that file if declared as extern within the function.

I'm not 100% sure, but if they were not declared within the file then the compiler would look in the global space for anything declared extern in a function.

I don’t know if the rest of the tools are available (see the Unix rebuild) but it would be fun to play with re-building the compiler with the code...

0
0
Silver badge

Re: Allow me

I think they're a list of variables with global scope; there are two programs (c0?.c and c1?.c) and the lists of variables only appear in c00.c and c10.c. It looks like your global variables went in the first .c/.o file whether you liked it or not (unless it was a style of programming they consciously chose).

Agree that in functions it looks like variables declared extern were global and variables not declared extern were local.
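
For illustration, roughly the pattern being described, sketched with invented names (this is not the actual legacy-cc source): storage gets defined once at file scope, functions that want a global re-declare it with extern inside the function body, and anything not declared extern is a local.

/* Hypothetical early-C-style sketch, not the real c00.c */
int symtab[200];                /* storage defined at file scope, much like an   */
int opdope[] = { 0, 1, 2 };     /* assembler reserving and initialising words    */

int lookup()
{
        extern int symtab[];    /* pull the file-scope array into this function  */
        return symtab[0];
}

int scratch()
{
        int t;                  /* no extern: an ordinary automatic local        */
        t = 1;
        return t;
}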

0
0
Silver badge
Thumb Up

Re: Allow me

> It looks like your global variables went in the first .c/.o file whether you liked it or not

My PDP-11 use was mostly RT-11/RSX-11M, not Unix, and on those OSes you had to manage linker overlays yourself, by defining which bits of the address space were permanently in memory, and which bits could be swapped in and out from disk. If the Unix environment used the same linker model it would make sense to keep all globals together in the permanent space (I forget what that was officially called), with the overlay control code, so that as each overlay came and went the variables would always be available.

Maybe this summer is a good time to see if my PDP-11/83 still boots... :)

1
0
Silver badge
Boffin

Re: Allow me @Phil

Bell Labs PDP11 UNIX V7 and earlier did not have any support for overlays. I worked with them on RSX-11M, so I understand what you are talking about.

As far as I am aware, there was some prototype overlay code in the later BSD PDP11 releases, but it would only work on machines with 22-bit addressing and separate Instruction and Data (I&D) spaces (11/70, 11/44 and later systems). Before this, the standard trick used for large software programs (and I saw this done for the BSD release of Ingres) was to split large programs into several executables, and use some proto-IPC interface to spread the function around. IIRC, Ingres from a BSD 2.3 tape used named pipes. Shared memory and message queues were all in the future, but I believe there was a primitive semaphore implementation in UNIX V7. Have to look to find out.

There was some work done by Keele University in the UK to produce an overlay loader for UNIX V7 on PDP11s, which I managed to get working on my Systime 5000E (a strange beast, being a PDP11/34E [normally 18-bit addressing, no I&D], but actually with 22-bit addressing added by Systime). I used it with some success, but I never managed to get vi working on my small machine. It was all good fun, as was debugging the Calgary device buffer modifications to maximise the number of device drivers you could compile into the kernel on this 22-bit, non-I&D machine. Out of the box, the mods assumed that if you had 22-bit addressing you had to have separate I&D spaces, because no DEC PDP11 had one without the other.

Fun times, long gone.

0
0
Silver badge

Re: Allow me

C was then just the idea of a portable macro assembler to make it faster to port UNIX to new machines.

Hence the awful reliance even today on the #define abhorred by Bjarne Stroustrup.

1
0
Silver badge

#define is C's "killer feature"

No matter what BS says, one of the main reasons C continues to thrive - particularly in embedded systems - is that it has an embedded macro language (#define et al).

Like any tool, #defines can be used poorly or well. Slagging off #defines just because they are sometimes used poorly is plain stupid.

Used judiciously, the C macro preprocessor can simplify code, improve abstraction and do a lot of stuff that is impossible to do in other languages.
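
For instance, a minimal sketch of the kind of thing it does well in embedded work - the peripheral address and bit number below are invented for the example:

#include <stdint.h>

#define GPIO_BASE   0x40020000u                                   /* hypothetical peripheral base address */
#define GPIO_ODR    (*(volatile uint32_t *)(GPIO_BASE + 0x14u))   /* output data register */
#define LED_PIN     5u
#define BIT(n)      (1u << (n))

#define LED_ON()    (GPIO_ODR |=  BIT(LED_PIN))
#define LED_OFF()   (GPIO_ODR &= ~BIT(LED_PIN))

void blink_once(void)
{
    LED_ON();
    /* ... delay ... */
    LED_OFF();
}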

4
0
Anonymous Coward

Re: Allow me

I think the following gives an impression of some of the difficulties faced by people at the time:

"A second, less noticeable, but astonishing peculiarity is the space allocation: temporary storage is allocated that deliberately overwrites the beginning of the program, smashing its initialization code to save space. The two compilers differ in the details in how they cope with this. In the earlier one, the start is found by naming a function; in the later, the start is simply taken to be 0. This indicates that the first compiler was written before we had a machine with memory mapping, so the origin of the program was not at location 0, whereas by the time of the second, we had a PDP-11 that did provide mapping. (See the Unix History paper). In one of the files (prestruct-c/c10.c) the kludgery is especially evident."

0
0
Silver badge

PDP-8 any good?

Used to have one in the garage but I think it's now gone to a better (less damp) home.

0
0

Re: PDP-8 any good?

Don't think so, I worked with PDP-8 derived HP1000s which had a totally different instruction set/register architecture to the PDP-11. Much as I loved the HP1000s, I have to admit that the PDP-11 had a much more elegant architecture.

0
0
Silver badge

Re: PDP-8 any good?

The PDP-11 was a 16-bit machine - while you could theoretically get this working on the PDP-8 (which had 12-bit addressing, IIRC) it would require a heavy rewrite to do lots of paging - you remember the stuff in DOS/Windows that held up computing for 15 or so years.

0
0
Headmaster

Re: PDP-8 any good?

Much, much too hard. Main problem was no stack on the PDP-8 hardware. Memory addressing was very limited at 128 words in the local page and the zero page. The PDP-11 was like a breath of fresh air with a nice flat address space and excellent addressing modes in the instruction set.

That said, I'd love a PDP-8/e with 32K of core, and a disc drive. The only one I used had an ASR33 for paper tape and interactive I/O (plus full A/D and D/A stuff).

Phil.

0
0
Silver badge

I'll delve into my archives ...

I should have similar available to me. I archived the tapes to my "home cloud" in roughly 1990. The data should still be accessible.

Not certain if I'm allowed to provide Bell/Berkeley Unics & UnixTSS code to the general public, though ... and not sure who to ask for permission, either. It's a can o'worms at this point ...

8
3
Bronze badge

Re: I'll delve into my archives ...

Someone downvoted your comment!! Why?

1
3
Silver badge

@Condiment (was: Re: I'll delve into my archives ... )

"Someone downvoted your comment!! Why?"

Because there are a few folks who choose to downvote my comments on sight, regardless of content. The reasoning is in the eye of the beholder.

::shrugs:: Methinks most of 'em should look within. Regardless, no skin off my teeth.

4
4
Silver badge
Holmes

Re: @Condiment (was: I'll delve into my archives ... )

> Because .. no skin off my teeth.

This!!

I once saw a TV film about some German hacker/carder guy in his teens who ordered up an old PDP-11 because of its nostalgia value. He was then rather surprised when a truck showed up at his door...

Sadly, the guy was later found hanged in a public park. Apparently "suicide", but then he had become involved with the Turkish mafia.

1
1
Headmaster

Re: mixing your metaphors

"no skin off my teeth"

It's either no skin off your nose, or you escape by the skin of your teeth

4
2
Silver badge

Re: I'll delve into my archives ...

I would have thought that the Berkeley code would be re-distributable. The Berkeley software license was pretty permissive from the word go.

I'd love to know about the UnixTSS myself. Not because I have any (I obeyed the rules and always left it behind when I changed jobs), but I would love to see some of it again, especially the STREAMS and RFS code.

I just wish I had taken copies of the Bell Labs V6 PDP11 distribution, and the BSD 2.1 and 2.3 tapes I worked with in the very early '80s. I know that V7 and later PDP-11 BSD tape images are available, but by that time they were already getting difficult to work with on PDP11 systems without separate I&D.

0
0
Bronze badge

Re: mixing your metaphors

So no possibility that the author well knew these two metaphors, but mixed them for comedic effect?

"Metaphors be with you..."

1
0
Silver badge

@DaM (was: Re: @Condiment (was: I'll delve into my archives ... ))

"> Because .. no skin off my teeth.

This!!"

Oh. I see. Provincial, insular xenophobia rearing its ugly head.

Makes sense. I guess. Because, as everybody knows, the English language has never changed. Ever. Despite the obvious fact that most of the technical terms used on El Reg were not actually invented within 3000 miles of the British Isles.

Kinda makes you sound French, Destroy All Monsters. HTH, HAND.

1
3
Silver badge

@Peter Gathercole (was: Re: I'll delve into my archives ...)

The post-AT&T BSD code is available online. The code I'm talking about came to Berkeley from Bell Labs along with ken ... Was an interesting time.

I'd have trashed it decades ago, but I haven't changed that particular job yet ;-)

0
1
Silver badge

@Anonymous IV (was: Re: mixing your metaphors)

Actually, it's a common variation in Blue Collar circles here in Northern California.

::shrugs::

0
2
Anonymous Coward

Re: mixing your metaphors

I'm pretty sure he knows that. The deliberate mixing of metaphors is a form of meta-metaphor, used to express boredom and disinterest. Sometimes rules are deliberately broken, because fuck rules.

0
0
Bronze badge

Re: mixing your metaphors

"no skin off my teeth"

It's either no skin off your nose, or you escape by the skin of your teeth

That's a mixing of idioms, or idiom-blending, a species of malapropism. I don't offhand see how to construe it as a mixing of metaphors. (You could claim that, in the phrase, losing skin is a metaphor for experiencing insult or irritation, but then it's not clear what the teeth are the vehicle for. Or that the teeth represent feelings, but then skinning simply represents injury, which is in keeping with the original metaphor.)

I assumed the blended idiom was deliberate; blending idioms is a common trope in contemporary US English, and indeed has been since at least the middle of the twentieth century. (Note its use in Kelly's Pogo, for example, alongside mondegreens and eggcorns and similar wordplay.)

0
0

Wow that code is ugly.

0
6
Silver badge

@Your Majesty

Not "ugly", Your Majesty. Rather, "primitive". We had to start somewhere.

The innocence of youth, demonstrated across generations ...

11
2

This post has been deleted by its author

Re: @Your Majesty

Yeah, I know. It was more like a comment on how much programming style has evolved since then. Languages like Python and the rest are comparatively a breeze to look at.

I still maintain that it's ugly. Not that it's a bad thing.

0
0
Boffin

Or a PDP-11 emulator in javascript

http://pdp11.aiju.de/

2
0

Re: Or a PDP-11 emulator in javascript

Is it me, or is there something ironic about running a C compiler on a JavaScript emulator of an ancient computer?

5
1
Thumb Up

Ironic? No

Amusingly convoluted, yes ;-)

0
0
Anonymous Coward

Re: Or a PDP-11 emulator in javascript

It's just what we need to get that 1970s nuclear reactor control system compiled and running

3
0
Silver badge
Happy

Re: Ironic? No

Convoluted? Let's use this old C compiler, on the PDP emulator, to compile a ZX Spectrum emulator, then run a BASIC interpreter on that to run a computer game.

That's convoluted.

1
0

K&R c...

...was elegant and clean. I loved it. And the compilation was predictable. I could look at the object code and see exactly where it came from.

4
1
Silver badge
Holmes

Re: K&R c...

Don't tell me you prefer that boring predictability over debugging objects with threads.

2
0
Yag
Bronze badge

Re: K&R c...

I prefer that boring predictability, and so do the FAA and EASA; I'd guess the pilots and passengers prefer it as well.

1
0
Silver badge

Re: K&R c...

No, it was not.

The language was ambiguous. Two "bug free" C compilers could produce different code from the same source.

Or you could write "correct" C code for an agreed "compatible" compiler and it would not do as expected.

C++ was supposed to sort out some of these issues, but unfortunately the original version was too backward-compatible.

C was not used for many critical control systems for this reason.
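
For a feel of the latitude involved - and both of these are still true today - argument evaluation order is unspecified and the signedness of plain char is implementation-defined, so two perfectly correct compilers can give different answers for the same source:

#include <stdio.h>

static int next(int *p) { return (*p)++; }

int main(void)
{
    int i = 0;
    /* The order in which the two calls run is unspecified, so this
       may print "0 1" with one compiler and "1 0" with another. */
    printf("%d %d\n", next(&i), next(&i));

    /* Implementation-defined: -1 if plain char is signed, 255 if unsigned. */
    char c = (char)0xFF;
    printf("%d\n", (int)c);
    return 0;
}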

0
4
Yag
Bronze badge

Re: K&R c...

No, no, don't go that far... K&R was indeed quite messy, but ANSI did a (mostly) fine job in '89 and '99, despite a few oddities.

And ol' C89 is still well loved in aeronautical embedded systems.

Speaking of ANSI C89 & C99, there is a nice trivia question to ask computer programmers: "What is the range of a signed char as defined by the standard?" I surprised a lot of people with this one...
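
For anyone who wants the punchline: C89/C99 only guarantee that a signed char covers at least -127..+127 (with CHAR_BIT at least 8); the familiar -128 is what two's-complement implementations happen to provide, not something those standards demand. <limits.h> will tell you what your own compiler does:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("CHAR_BIT  = %d\n", CHAR_BIT);
    printf("SCHAR_MIN = %d\n", SCHAR_MIN);    /* -128 on most machines; -127 is all the standard requires */
    printf("SCHAR_MAX = %d\n", SCHAR_MAX);
    printf("CHAR_MIN  = %d, CHAR_MAX = %d\n", CHAR_MIN, CHAR_MAX);   /* plain char may be signed or unsigned */
    return 0;
}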

1
0
Facepalm

Stupid Question

I understand that I may be a bit thick here but it appears to be written in C.

Now I get that you can write a new C compiler these days, compile it in GCC or whatever and then get it to compile itself.

But this is the first C compiler. How did they compile it? By hand?

0
0
Silver badge

Re: Stupid Question

The first proto-versions of the C compiler were compiled from B, the later proto-versions of the C compiler were compiled from the earlier proto-versions of C, and then finally we had a recognisable pre-C89 C language emerging from the swamp... several years later a pre-C89 C compiler compiled GCC. (And finally GCC compiled Emacs.)

5
0

This post has been deleted by its author

Silver badge
Childcatcher

Re: Stupid Question

> But this is the first C compiler. How did they compile it? By hand?

It's magic! People don't have to explain bits!

K&R are the Siegfried and Roy of....

Awww.. ok: http://en.wikipedia.org/wiki/Bootstrapping_%28compilers%29

3
0
Bronze badge

Re: Stupid Question

Er... yes?

When you bootstrap GCC onto a new platform how do you think they do it? They write a "proto-C" compiler that's capable of making a C compiler that's capable of creating a C++ compiler that's capable of creating the full GCC suite.

The first step always has to be done by hand for any "new" platform, because how do you manage memory without knowing the in-depth details of a new architecture and informing the compiler of them? A lot of the time the first step is just assembler and then eventually some sort of "proto-C" compiler of the most basic type that has none of the features of C but allows you to compile some existing base of code that can end up as a C compiler (usually with hints on how to actually DO things like open files, allocate memory, page RAM, etc.). Always has been, always will have to be.

The thing is, if you write the C compiler in C, then you know it's "complete" (self-compiling) and can be used as the second stage the second you get things working on a new platform. Writing a C compiler entirely in assembler is not an easy task. You'd write some pseudo-cut-down-proto-version-of-C and then use it to compile this code into something closer to a "real" compiler.

1
1

Actually, you only have to work on the code generator

Compilers have been ported before - GCC has been ported many times.

This has isolated the machine-dependent part to just the code generator.

So all you have to do is take an existing code generator (that interprets the intermediate code), and create a new output form.

You now have a full gcc compiler that runs on the original hardware (not the target). Thus, it is a cross compiler.

Of course, to finish the port, you must also port the linker (and the runtime libraries) to the target system - though it too has been worked over to isolate the hardware-dependent part, so again, the amount of work is reduced.

Run the C compiler source through the cross compiler and you get a binary that will run on the target system.

0
0

Re: Stupid Question

It's fun, and it's related to how you can end up with malicious code in a compiler that does not exist in its source. Once you have a working C compiler (from anywhere) you can compile your "written in C" C compiler with it - and now you have a self-hosting C compiler. If you write a C compiler that injects malicious code into certain object code - such as a C compiler - then you can use it to compile a clean C compiler and distribute the clean sources with malicious object code. Check the sources really carefully and they are safe. Compile them and the object code is nasty!!
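
A deliberately over-simplified sketch of that trick - it's the scenario from Ken Thompson's "Reflections on Trusting Trust"; every name here is invented, and a real attack hides itself far more carefully:

#include <stdio.h>
#include <string.h>

/* Stub "back end" pieces so the sketch stands alone. */
static void emit_object_code(const char *src) { printf("compiling: %s\n", src); }
static void emit_backdoor(void)               { printf("  [backdoor injected]\n"); }
static void emit_self_replicator(void)        { printf("  [this trap re-inserted]\n"); }

static void compile(const char *src, const char *filename)
{
    if (strstr(filename, "login.c"))
        emit_backdoor();          /* compiling login: add a backdoor the source never shows */
    if (strstr(filename, "cc.c"))
        emit_self_replicator();   /* compiling the compiler: re-insert this whole trap, so
                                     a clean compiler source still yields a dirty binary  */
    emit_object_code(src);        /* otherwise behave perfectly normally */
}

int main(void)
{
    compile("/* login source */", "login.c");
    compile("/* clean compiler source */", "cc.c");
    compile("/* anything else */", "prog.c");
    return 0;
}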

0
0
Anonymous Coward

Re: Stupid Question

Lee D wrote "When you bootstrap GCC onto a new platform how do you think they do it?" And then described something different.

The short answer is, you don't bootstrap a compiler on a new platform. You do all the work for the new platform on an old platform, where you already have a working toolchain. In the entire history of the world, we only need to bootstrap a compiler from assembly language once. You can then use that compiler to bootstrap other compilers for new languages, or you can port it to another platform with no bootstrapping at all.

2
0

Re: Stupid Question

Back in the early '90s this was more or less how you had to compile GCC on machines that did not come with a compiler.

You downloaded a bit of binary code that would compile some source code into a proto-compiler

(or, if you had access to a Convex, you could use the Convex C compiler to build the proto-compiler).

The proto-compiler would then compile a second set of source code into a simple compiler (I'm tempted to write "a basic compiler", but that would confuse matters).

The simple compiler would then use the GCC source code to compile GCC.

[but this was not the end of the build cycle, there was more ...]

The GCC you had just built would then recompile the GCC source code to create a proper GCC compiler.

Note that the downloading process could involve having to uudecode some uuencoded files to get the proto-compiler, and there might not be a uudecode program for your OS... Fear not: as long as your OS had a hex editor, you could usually find a printed copy of the hex code for a basic uudecode for your system.

All you had to do was retype the hex exactly as it appeared in the printout.

Though I think most of the FTP sites (like the wsmr one) had binaries of uudecode for pretty much any conceivable platform...

0
0
Silver badge

Re: Stupid Question

Yes. Compiler writers of ANY language love to write the compiler in that same language.

0
0
Silver badge

Re: Stupid Question

"And finally GCC compiled Emacs."

Seems to me that some wag compiled GCC (and libraries), using EMACS as the compiler ... Worked on Motorola & Intel versions of Unix. Useful? Maybe ... as a learning tool.

I built SmallC using MS-DOS "debug" once, to prove a point. Wasn't fun, and it was a bloody useless hack, but the end result was self compiling, and I learned a lot in the process :-)

0
2
Silver badge

First C Compiler

Or, the great grandmother of all compiler errors.

I can use = instead of == in an expression and that's valid?

0
0
