The GCC for THEOS project



News

Last update of the files: May 30, 2001

February 27, 2003: After a rather long absence, the project has resurfaced: Jean-Michel Dubois posted a request on the Ask THEOS forum about how to port Ghostscript to the THEOS operating system environment. I mentioned that using GCC for THEOS might be an alternative, and he jumped on board. So did Gary Walters, and the three of us have been working together on this project for a few weeks now.

May 19, 2001: Some minor fixes have been incorporated mainly into the code generation process. The big thing this time is an update of the web-site and a page with the contributors to the project. If I missed your name on this list, please let me know.

The CVS port is still pending (I don't have time). Currently I am fiddling with the conversion of .cvsignore to CVSIGNORE. and vice versa (notice how the dot in the filename moves). This is not as easy as I thought, because it seems I need to patch CVS pretty heavily.

Welcome to Andrés Hernández Schafhauser who joined the project and will be working on porting GCC to become a real THEOS program.

May 7, 2001: As of today I can compile and run CVS on THEOS. I successfully checked out an existing source code tree, modified some files in it and saw that CVS recognizes the changes. I can also see all history and status information on the THEOS client. I have not yet verified that I can check changes back in (which is the next step).

Some minor tweaks were necessary to get CVS working under THEOS. For one, the THEOS CSI converts all characters to upper case and filenames are handled differently (makefile becomes MAKEFILE. on THEOS; notice the trailing dot). All that has been changed, and I am confident that within the next couple of weeks I can get the CVS port to a stage where it is of general help for the THEOS community.
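
The renaming described above can be sketched in C. This is only an illustration and not code from the CVS port: the function name theos_name is hypothetical, and the rules it applies (upper-case everything, drop a leading dot, append a trailing dot when the name has no extension) are assumptions derived from the two examples makefile and .cvsignore.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: map a UNIX-style file name to a THEOS-style one.
 * Assumed rules (inferred from the examples in the text, not from THEOS docs):
 *   - all characters are converted to upper case
 *   - a leading dot is dropped (".cvsignore" -> "CVSIGNORE.")
 *   - a name without any extension gets a trailing dot ("makefile" -> "MAKEFILE.")
 * Returns 0 on success, -1 if the output buffer is too small. */
int theos_name(const char *unix_name, char *out, size_t out_size)
{
    const char *src = unix_name;
    size_t len, i;
    int add_dot;

    assert(unix_name != NULL && out != NULL);

    /* Drop a leading dot, but keep a name that consists of the dot only. */
    if (src[0] == '.' && src[1] != '\0')
        src++;

    /* No dot left in the name: the THEOS spelling ends with one. */
    add_dot = (strchr(src, '.') == NULL);

    len = strlen(src);
    if (len + (add_dot ? 1 : 0) + 1 > out_size)
        return -1;

    for (i = 0; i < len; i++)
        out[i] = (char)toupper((unsigned char)src[i]);
    if (add_dot)
        out[len++] = '.';
    out[len] = '\0';
    return 0;
}
```

The reverse direction (CVSIGNORE. back to .cvsignore) cannot be derived from the name alone, because the upper-case form no longer says whether the original had a leading dot; that is presumably part of why CVS itself needs patching.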

April 2, 2001: In the last couple of weeks, I made numerous modifications to the compiler (patches) as well as the runtime system (t_* wrapper etc.). I am now able to compile, assemble, link and run the zlib of CVS. With the supplied test program I can gzip a file on THEOS and gunzip it on LINUX and vice versa.

Also, I created a Change Log to keep a history of changes in this project. Please visit this page to see more details of the changes.

Motivation

While I was working on the Y2K modifications of the POS ACCESS report writer, I produced several test versions for Ron Gibbs and some beta testers. After a while it became very confusing to assign error reports to a specific version. Worse, it was very time consuming to revert to the version in question to analyze and find the problems.

When I had some time off from the ACCESS Y2K stuff, I thought about getting CVS working on THEOS. CVS is a version control system I know from my work at S.E.S.A., and it would solve all the version control related problems I encountered.

So I FTPed all the source over to my THEOS box and started to compile the first modules (if I remember right, it was the ZLIB stuff). Anyway, I did not get very far, as the THEOS C compiler complained about some typedef constructs. So I looked at them and figured out how to modify the source to get it compiled under THEOS. Then I looked at the other modules. I found the same typedef stuff all over the place. I got in touch with Tim Williams and asked him if he could fix this problem in the THEOS C compiler. He told me it was too big a change and that he would not do it at this time.

I stopped working on CVS again, because I did not want to modify it that much and decided that I would need the GNU C-Compiler on THEOS in order to compile CVS.

Project history

I talked to Axel Zeuner, a colleague of mine back then, and he said it would be a large project. While looking at GCC I found the following things that confirmed his assessment:

  1. GCC's executable output is ELF or AOUT which are incompatible with the THEOS executable layout
  2. GCC's assembler output is incompatible with the THEOS assembler
  3. GCC pushes the arguments of a function call in the opposite order from the THEOS C compiler
  4. GCC produces code such that a caller to a function cleans the arguments pushed on the stack whereas the THEOS compiler generates code that leaves that to the called function.

Here are some thoughts on the above points:

  1. This is a show stopper: all I can use from GCC is the assembly output. I did not want to change the linker as well.
  2. I had to change the code generation part of GCC in order to write statements that can be assembled using the THEOS assembler.
  3. This is another tough point, because I wanted to use the standard THEOS C library. Code produced with GCC should be linked against the library functions that come with THEOS. An adaptation layer would solve this problem. It costs some runtime overhead, but for now that is OK.
  4. The adaptation layer mentioned above will solve that problem too.

I poked around a little bit in the GCC source and tried to get it working some time in October 2000. But I did not have enough time to work on the project and forgot most of the things I did.

It was not until sometime in January 2001 that I started on point 2. It was very challenging, but I found a very well structured program that had all the hooks I needed to make the necessary changes. Once I had this done, I actually started on points 3 and 4 and found out that I had not only a C compiler for THEOS but also a C++ compiler. Compiling a C++ program required some more modifications to the adaptation layer to get global constructors/destructors working, but fortunately the THEOS assembler/linker supports multiple segments (which are called PABs in THEOS).

Some day in February, I was able to compile libgcc2.c. The first test program was written and I could actually compile the source, transfer the assembler output to my THEOS system, assemble, link and run it! While working on this, I realized that it is not that easy to port the whole of GCC to run on THEOS. I decided to keep it a cross-compiler for now.

The next step was to automate the process of transferring the C source files from the THEOS system to the LINUX system and the assembly language file back to the THEOS box. I solved it using the THEOS EXEC language, FTP and an RSH command I wrote (The RSH command comes with the GCC helper files).

Link collection of files you will need for setup of the project

The following is a list of files / projects you will need in order to get the project running in your environment. Please see the Setup HowTo below for instructions on how to install all this software.

HowTo setup the environment and build the compiler

Here's a small graphic about my setup.
                      +-----------+      +-----------+
                      | LINUX Box |      | THEOS Box |
                      +-----------+      +-----------+
                           | .3                | .2
                     [-----+-------------------+------]
                               192.168.1.0/24

Both computers have the network stuff set up and the LINUX box is running a full DNS server. I modified the ~/.rhosts file so that I can execute commands from the THEOS box on the LINUX box via rsh.

The following steps are necessary on the LINUX box. (I assume an account called thb on it; you must replace it with your local account name. I also assume that all downloaded files are stored in your home directory.)

  1. Download GCC 2.95.2 from gcc.gnu.org and extract it in your HOME directory. This will create a directory called ~/gcc-2.95.2
  2. Download the THEOS specific header files to your HOME directory and install them in /usr/local/i386-theos/ with the following commands (type the part after each prompt, followed by CR):
    thb@linux:~ > su -
    root@linux:~ > mkdir /usr/local/i386-theos
    root@linux:~ > chown thb:users /usr/local/i386-theos
    root@linux:~ > exit
    thb@linux:~ > tar -C /usr/local/i386-theos -xvzf ~/theos-header.tgz
  3. Apply the THEOS specific patch to the GCC compiler. Make sure that you download this file as gcc-theos.diff.gz. My browser tends to remove the trailing .gz when I download the file but does not gunzip it. Then use the following commands to apply the patch:
    thb@linux:~ > cd gcc-2.95.2
    thb@linux:~/gcc-2.95.2 > gunzip ~/gcc-theos.diff.gz
    thb@linux:~/gcc-2.95.2 > patch -p1 < ~/gcc-theos.diff
    
  4. Create a directory gcc-theos in your HOME directory
  5. cd into gcc-theos
  6. Run the following commands:
    thb@linux:~/gcc-theos > ../gcc-2.95.2/configure --target=i386-theos
    thb@linux:~/gcc-theos > make LANGUAGES="c c++"
    thb@linux:~/gcc-theos > su
    root@linux:~/gcc-theos > make LANGUAGES="c c++" install
    root@linux:~/gcc-theos > cd gcc
    root@linux:~/gcc-theos/gcc > make LANGUAGES="c c++" install-driver
    root@linux:~/gcc-theos/gcc > exit
    thb@linux:~/gcc-theos > 
    

    Note: the first two make commands produce a lot of output and stop with an error message like:

    ...
    mv: tmplibgcc2.a: No such file or directory
    make[1]: *** [libgcc2.a] Error 1
    make[1]: Leaving directory `/home/thb/gcc-theos/gcc'
    make: *** [all-gcc] Error 2 
    

    This is perfectly OK for now. If you get that far, you have already produced some THEOS assembly language files. They are located in ~/gcc-theos/gcc. Use any editor to have a look at, e.g., ~/gcc-theos/gcc/frame.s.

    The third make command installs the driver program for our compiler. It will be named i386-theos-gcc and will reside in /usr/local/bin. You only need to run this make command once, and after that only if you change something concerning the driver program.

The following steps are necessary on the THEOS box:

  1. Install the THEOS C compiler; I have PL:50031. You have to have it because, for now, we use its assembler, linker and library functions. Some of the files in GCC.SOURCE are also compiled using the THEOS C compiler.
  2. Download and install the GCC helper files on the system account of your THEOS box. They consist of the libraries GCC.SOURCE and GCC.C32LIB as well as some EXEC, C and BASIC programs.
  3. Set the following environment variables for your account:
    GCCHOST
      The name or IP address of the LINUX host where the cross compiler resides
    GCCUSER
      The account name that can be used on the LINUX box for FTP and RSH access
    GCCPWD
      The password for this account
    LOCALUSER
      The name of the local (THEOS) user that has been used in the .rhosts entry
    If one or more of these variables are not set, the compilation process will not work.
  4. Now try to compile the test program TEST.C by issuing the commands:

    SYSTEM>gcc test
    SYSTEM>asm test.s
    SYSTEM>ld test
    
    and check that it works:
    SYSTEM>test
    
  5. I also include a C++ test program called TESTP.CPP. Check whether you can compile and run that as well with the following commands:

    SYSTEM>gpp testp
    SYSTEM>asm testp.s
    SYSTEM>ldp testp
    SYSTEM>testp
    
    Note that you must use a different compiler and linker command here. A different linker shell is needed because C++ requires specific library files that must be linked in a specific order. The program itself does not do anything useful for now, but you can see the order in which global constructors and destructors are called automagically.
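
Since a missing variable from step 3 makes the whole compilation process fail, it can help to check all four up front. The following is a minimal sketch in standard C, not part of the GCC helper files; the function name count_missing_env and the two global names are made up for this illustration, and it assumes that the C library's getenv sees the variables set for the account.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* The four variables named in step 3 of the THEOS setup. */
const char *const gcc_env_names[] = { "GCCHOST", "GCCUSER", "GCCPWD", "LOCALUSER" };
const size_t gcc_env_count = sizeof gcc_env_names / sizeof gcc_env_names[0];

/* Hypothetical helper: count how many of the given environment variables
 * are unset or empty. If report is not NULL, one line per missing variable
 * is written to it. */
int count_missing_env(const char *const *names, size_t count, FILE *report)
{
    size_t i;
    int missing = 0;

    assert(names != NULL);

    for (i = 0; i < count; i++) {
        const char *value = getenv(names[i]);
        if (value == NULL || value[0] == '\0') {
            if (report != NULL)
                fprintf(report, "%s is not set\n", names[i]);
            missing++;
        }
    }
    return missing;
}
```

A caller would typically run count_missing_env(gcc_env_names, gcc_env_count, stderr) and stop before starting the FTP/RSH round trip if the result is not zero.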

The THEOS runtime environment

In order to get the compiled programs working with the standard THEOS functions, I decided to implement a wrapper layer between the GCC code and the THEOS C library. This layer is necessary for three reasons:

Order of arguments on the stack

The THEOS C compiler pushes arguments on the stack from left to right, thus the C fragment
  f(1, 2, 3);
is translated to the assembly language sequence
  push  1
  push  2
  push  3
  call  f
and the code compiled for function f with the THEOS compiler assumes this order of arguments. The GCC compiler pushes the arguments in the reverse order, so the assembly language looks like:
  push  3
  push  2
  push  1
  call  f
  add   esp,+12
The wrapper layer reverses this order before it calls the THEOS C library function. When a wrapper function is known to the THEOS port of GCC, the compiler converts the function name from f to t_f, so that the call goes to the wrapper function. The wrapper function itself will call f once it has finished rearranging the arguments. For some functions the call to f could end up in a THEOS system call. The program THEOWRAP.BASIC in conjunction with THEOFUNC.LIST (both part of the GCC helper files, which you should already have downloaded) constructs all wrapper source automatically.

The last line in the above example leads to the second reason for the wrapper functions.

Stack cleanup

The runtime environment of the THEOS C compiler assumes (with some exceptions) that the called function (the callee) removes the arguments from the stack. The code compiled by GCC, in contrast, assumes that this is done by the caller of the function. Again, the wrapper function takes care of this difference.
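
The two differences described so far (argument order and stack cleanup) can be modelled with a toy stack in C. This is a sketch for illustration only and not the output of THEOWRAP.BASIC: the real wrappers are generated assembly language, return addresses are ignored here, and all names except f and t_f (toy_stack, toy_push, theos_f, gcc_call_f) are made up. How the real wrappers rearrange the arguments is not specified above; the sketch simply re-pushes copies in THEOS order.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_SLOTS 16

/* Toy model of a downward-growing stack of 32-bit slots. */
struct toy_stack {
    int slot[STACK_SLOTS];
    size_t sp;                 /* index of the top slot; STACK_SLOTS when empty */
};

void toy_init(struct toy_stack *s)
{
    s->sp = STACK_SLOTS;
}

void toy_push(struct toy_stack *s, int value)
{
    assert(s->sp > 0);
    s->slot[--s->sp] = value;
}

/* Stand-in for a THEOS library function f(a, b, c). It expects the
 * arguments pushed left to right (so c is on top) and removes them
 * itself before returning (callee cleanup). */
int theos_f(struct toy_stack *s)
{
    int a, b, c;

    assert(s->sp + 3 <= STACK_SLOTS);
    c = s->slot[s->sp];
    b = s->slot[s->sp + 1];
    a = s->slot[s->sp + 2];
    s->sp += 3;                /* callee removes its arguments */
    return a * 100 + b * 10 + c;
}

/* Wrapper t_f: entered with the GCC layout (a on top). It pushes copies
 * in THEOS order and calls f, which pops those copies again. The original
 * GCC arguments stay on the stack for the caller to remove. */
int t_f(struct toy_stack *s)
{
    int a = s->slot[s->sp];
    int b = s->slot[s->sp + 1];
    int c = s->slot[s->sp + 2];

    toy_push(s, a);
    toy_push(s, b);
    toy_push(s, c);
    return theos_f(s);
}

/* GCC-style call site for f(a, b, c): push right to left, call the
 * wrapper, then clean up in the caller (the "add esp,+12" step). */
int gcc_call_f(struct toy_stack *s, int a, int b, int c)
{
    int result;

    toy_push(s, c);
    toy_push(s, b);
    toy_push(s, a);
    result = t_f(s);
    s->sp += 3;                /* caller removes its arguments */
    return result;
}
```

In this model each side removes exactly what it pushed, so the stack is balanced after the call even though the two conventions disagree about who cleans up.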

CPU Register usage

GCC uses registers of the CPU to hold C variables and, for C++ code, the this pointer. Register usage for variables can be forced by using the register storage class in a C program, and the compiler itself may also assign registers during optimization. The THEOS C library does not save these registers (on the x86 architecture namely EBX, ECX, ESI and EDI) before it uses them. Also, in some THEOS C functions the register ES is overridden to access special THEOS memory segments. The GCC runtime environment assumes DS == ES, and again the wrapper takes care of that.

The following diagram shows where the wrapper layer sits in the hierarchy of a typical application:
                      +----------------------+
                      | GCC compiled program |
                      +----------------------+
                      | t_* wrapper library  |
                      +----------------------+
                      |   THEOS C  library   |
                      +----------------------+
                      |   Operating System   |
                      +----------------------+

Once all THEOS C library functions are available as GCC compiled files, the wrapper library can be eliminated.

Things that need to be done

I want to use this list to make sure that I do not miss anything important in the course of this project. The order in which the items are listed here does not imply any priority.

Final comments

If you find anything in the above which is incorrect, please let me know. Also, I am still looking for people who want to work with me on this project. If you think you can afford some time to work on it, please send a mail to thb@net-bembel.de. I am sure that together we can get GCC running and help others to get some goodies into the THEOS operating system environment.

If you want to be informed about new file versions or anything else that is important in this project, please send me a mail and I will put you on the distribution list of the news-messages.