User:MessiahAndrw/LLVM OS Specific Toolchain


These instructions cover building an OS Specific Toolchain using LLVM and Clang instead of GCC.

THIS IS A WORK IN PROGRESS - DON'T FOLLOW THESE YET!

Note that LLVM by default builds cross-compilers for all targets; the right target simply has to be activated (which raises the question of why there is a separate LLVM Cross-Compiler page). These instructions add your operating system as a new target. By following them, you should end up with an LLVM-based toolchain for building applications that run under your OS!

Note that LLVM is developed in C++11 and uses many of the language's modern features, and there have been reported difficulties compiling some components on other compilers (such as GCC). If you run into trouble, you can try compiling LLVM with Clang.


Tools

These are the tools (built on LLVM) that we will be building and using:

  • clang - A C/C++/Objective-C compiler frontend.
  • libc++ - A standard C++ library with all of the C++11 bells and whistles.
  • libc++abi - The portable ABI behind libc++.
  • lld - A linker.
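  • llvm-as - The assembler that comes with LLVM. Note that it assembles LLVM IR into bitcode; machine code assembly is handled by Clang's integrated assembler.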

Checking out

First we must check out the source code and create the basic structure. We'll assume we want two directories - a 'llvm' directory containing the source code, and 'build' containing our build toolchain.

Check out the LLVM source code:

git clone http://llvm.org/git/llvm

Check out Clang:

cd llvm/tools
git clone http://llvm.org/git/clang

Check out the extra Clang tools (optional):

cd clang/tools
git clone http://llvm.org/git/clang-tools-extra extra

Check out Compiler-RT:

cd ../../../projects
git clone http://llvm.org/git/compiler-rt

Check out libc++:

git clone http://llvm.org/git/libcxx

Check out libc++abi:

git clone http://llvm.org/git/libcxxabi

Check out lld:

git clone http://llvm.org/git/lld

Make a build directory:

cd ../..
mkdir build

Modifying LLVM

The first step is to get LLVM to recognize your OS as a platform. As in the GCC-based OS Specific Toolchain tutorial, we'll assume your OS is called MyOS. Of course, you'd replace MyOS with your own operating system's name (unless your OS is called MyOS).

llvm/configure

Modifying configure isn't actually needed, because unlike GCC, LLVM builds its compilers with every target added. The --target= option you pass to configure specifies the target you want the compiler to run on, not build for. One day, we might want our OS to be self hosting, so we can add our target to configure now so we don't forget later.

Around line 4085 you will see:

case $target in
 *-*-aix*)
   llvm_cv_target_os_type="AIX" ;;
 *-*-irix*)
   llvm_cv_target_os_type="IRIX" ;;
 *-*-cygwin*)

Add your OS:

*-*-myos*)
   llvm_cv_target_os_type="MyOS" ;;

Repeat this in llvm/autoconf/configure.ac in case you need to rebuild your configure file.
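
The patterns in that case statement are ordinary shell globs. As a rough illustration (this is not part of LLVM), the C++ sketch below uses POSIX fnmatch, which follows the same wildcard rules, to show which target strings the *-*-myos* pattern would accept. The helper name matches_myos is made up for this example.

```cpp
#include <fnmatch.h>  // POSIX glob matching, same wildcard rules as shell `case`

// Hypothetical helper: returns true if `target` would take the
// *-*-myos* branch of the case statement in configure.
bool matches_myos(const char *target) {
  return fnmatch("*-*-myos*", target, 0) == 0;
}
```

The pattern expects a full three-part triple such as x86_64-pc-myos. A short form like i686-myos does not match on its own; configure normally canonicalizes the triple through config.sub first, which is one reason config.sub also has to learn about myos.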

llvm/include/llvm/ADT/Triple.h

In the enum called OSType (around line 120), add your OS:

 enum OSType {
   ...
   MyOS,
   ...
 };

llvm/lib/Support/Triple.cpp

In the function Triple::getOSTypeName (around line 135) add your OS:

 case MyOS: return "myos";

In the function parseOS (around line 326), add your OS:

 .StartsWith("myos", Triple::MyOS)
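
To see why a prefix match is used here rather than an exact comparison, here is a small standalone model of the lookup. It is only a sketch: OSType, starts_with and parse_os are stand-ins written for this page, not the real LLVM definitions.

```cpp
#include <string>

// Stand-in for llvm::Triple::OSType; only a few entries, for illustration.
enum class OSType { Unknown, Linux, MyOS };

// Returns true if `s` begins with `prefix`.
static bool starts_with(const std::string &s, const std::string &prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Simplified model of parseOS: the first matching prefix wins, so
// "myos" and versioned names such as "myos1.0" both map to OSType::MyOS.
OSType parse_os(const std::string &os_name) {
  if (starts_with(os_name, "linux")) return OSType::Linux;
  if (starts_with(os_name, "myos"))  return OSType::MyOS;
  return OSType::Unknown;
}
```

Because the first match wins, a name that is a prefix of another OS name would shadow it, so the order of the entries can matter.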

Around line 414 is getDefaultFormat that returns the default executable format type for a platform. The fallback is ELF, but if you want to use PE or MachO (or maybe your own) you can stick it here.
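
The shape of that decision can be modelled in a few lines. This is a simplified sketch with made-up names (OS, ObjectFormat, default_format), not the actual LLVM function; it only shows where a MyOS case would go and that leaving it out falls through to ELF.

```cpp
// Stand-ins for the LLVM enums, for illustration only.
enum class OS { Unknown, Darwin, Win32, MyOS };
enum class ObjectFormat { ELF, COFF, MachO };

// Picks the default object format for a platform; anything not listed
// explicitly falls back to ELF.
ObjectFormat default_format(OS os) {
  switch (os) {
  case OS::Darwin:
    return ObjectFormat::MachO;
  case OS::Win32:
    return ObjectFormat::COFF;
  // A MyOS case would be added here if the OS should not default to ELF.
  default:
    return ObjectFormat::ELF;
  }
}
```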

llvm/lib/Support/* Notes

There is some platform-specific code in llvm/lib/Support/* (particularly Host.cpp and the subdirectories), but it appears to be support code for the platform the compiler runs on, not for targets.

llvm/autoconf/config.sub

Find the section that begins with the comment "First accept the basic system types" and add "-myos*" to the list.

Modifying LLD

If your operating system uses its own executable format, you can find the relevant code under llvm/projects/lld/lib/ReaderWriter/, but this is a much more difficult job than porting using a common executable format like ELF, MachO, or PE.

NOTE: I am a different contributor (not MessiahAndrw), and my OS does use a different executable format. I will document my progress on that later.

Modifying Clang

llvm/tools/clang/lib/Basic/Targets.cpp

We need to create a target so Clang knows a little bit about the platform it's compiling for, so we will create a TargetInfo object called MyOSTargetInfo. You can override some compiler internals here (such as setting the size of long ints) - look at what the other targets do for example.

Somewhere in this file, above AllocateTarget, create your target object:

// MyOS target
template<typename Target>
class MyOSTargetInfo : public OSTargetInfo<Target> {
protected:
  void getOSDefines(const LangOptions &Opts, const llvm::Triple &Triple,
                    MacroBuilder &Builder) const override {
    Builder.defineMacro("_MYOS");
  }

public:
  MyOSTargetInfo(const llvm::Triple &Triple)
      : OSTargetInfo<Target>(Triple) {
    this->UserLabelPrefix = "";
  }
};
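
Once the target defines _MYOS, code compiled for your OS can test for it like any other predefined platform macro. A minimal sketch follows; the function name platform_name is only an example.

```cpp
// Returns a label for the platform this translation unit was compiled for.
// `_MYOS` is only defined when Clang targets MyOS (see getOSDefines above).
const char *platform_name() {
#if defined(_MYOS)
  return "myos";
#elif defined(__linux__)
  return "linux";
#else
  return "other";
#endif
}
```

Built with a host compiler this returns "linux" or "other"; built with the modified Clang and a MyOS target it should return "myos".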

In AllocateTarget, you'll need to add your OS in switch(Triple.getArch()):

switch (Triple.getArch()) {
  ...
  case llvm::Triple::x86: // and/or llvm::Triple::x86_64
    ...
    switch (os) {
      ...
      case llvm::Triple::MyOS:
        return new MyOSTargetInfo<X86_32TargetInfo>(Triple); // or MyOSTargetInfo<X86_64TargetInfo>
      ...
    }
  ...
}

llvm/tools/clang/lib/Driver/ToolChains.h

Next, we have to create a toolchain object that Clang uses to figure out how to connect to the other toolchain components (namely the linker and assembler) for our target.

Add this somewhere:

 class LLVM_LIBRARY_VISIBILITY MyOS : public Generic_ELF {
 public:
   MyOS(const Driver &D, const llvm::Triple &Triple,
        const llvm::opt::ArgList &Args);
 
 protected:
   Tool *buildAssembler() const override;
   Tool *buildLinker() const override;
 };

Note that we're inheriting from the Generic_ELF toolchain, but you can look at some other examples (Windows, Mac OS) for alternatives.

llvm/tools/clang/lib/Driver/ToolChains.cpp

Here's the code for the toolchain object, insert it somewhere in this file:

/// MyOS - MyOS tool chain which can call as(1) and ld(1) directly.
 
MyOS::MyOS(const Driver &D, const llvm::Triple& Triple, const ArgList &Args)
  : Generic_ELF(D, Triple, Args) {
   // Fill this in with your default library paths one day..
   //getFilePaths().push_back(getDriver().Dir + "/../lib");
   //getFilePaths().push_back("/usr/lib");
}
 
Tool *MyOS::buildAssembler() const {
  return new tools::myos::Assemble(*this);
}
 
Tool *MyOS::buildLinker() const {
  return new tools::myos::Link(*this);
}

Note that the constructor is where we can add default library search paths, which the toolchain uses to locate files for the linker. We leave them commented out for now so that Clang doesn't automatically pick up our host system's libraries when we compile code for our OS.

llvm/tools/clang/lib/Frontend/InitHeaderSearch.cpp

In the function InitHeaderSearch::AddDefaultCIncludePaths, add this somewhere, so we don't automatically add /usr/local/include as an include path:

  case llvm::Triple::MyOS:
    // Fill this in with your default include paths...
    // AddPath("/usr/local/include", System, false);
    break;

If you want your target to automatically add default include paths, you can customize this file. Add your target under AddDefaultCIncludePaths, AddDefaultCPlusPlusIncludePaths, AddDefaultIncludePaths, etc.

llvm/tools/clang/lib/Driver/Driver.cpp

We need to make Clang use our toolchain object when it targets our OS, so in Driver::getToolChain, add your OS to switch (Target.getOS()):

    case llvm::Triple::MyOS:
      TC = new toolchains::MyOS(*this, Target, Args);
      break;

llvm/tools/clang/lib/Driver/Tools.h

In here, we define the Assemble and Link classes that our toolchain object references:

/// myos -- Directly call GNU Binutils assembler and linker
namespace myos {
  class LLVM_LIBRARY_VISIBILITY Assemble : public GnuTool  {
  public:
    Assemble(const ToolChain &TC) : GnuTool("myos::Assemble", "assembler",
                                         TC) {}
 
    bool hasIntegratedCPP() const override { return false; }
 
    void ConstructJob(Compilation &C, const JobAction &JA,
                      const InputInfo &Output,
                      const InputInfoList &Inputs,
                      const llvm::opt::ArgList &TCArgs,
                      const char *LinkingOutput) const override;
  };
  class LLVM_LIBRARY_VISIBILITY Link : public GnuTool  {
  public:
    Link(const ToolChain &TC) : GnuTool("myos::Link", "linker", TC) {}
 
    bool hasIntegratedCPP() const override { return false; }
    bool isLinkJob() const override { return true; }
 
    void ConstructJob(Compilation &C, const JobAction &JA,
                      const InputInfo &Output,
                      const InputInfoList &Inputs,
                      const llvm::opt::ArgList &TCArgs,
                      const char *LinkingOutput) const override;
  };
} // end namespace myos

TODO: We're inheriting from GnuTool for now; switch to LLD and LLVM-as.

llvm/tools/clang/lib/Driver/Tools.cpp

Here's the code for our Assemble and Link classes - they invoke 'as' and 'ld'.

 void myos::Assemble::ConstructJob(Compilation &C, const JobAction &JA,
                                   const InputInfo &Output,
                                   const InputInfoList &Inputs,
                                   const ArgList &Args,
                                   const char *LinkingOutput) const {
  ArgStringList CmdArgs;
 
  Args.AddAllArgValues(CmdArgs, options::OPT_Wa_COMMA, options::OPT_Xassembler);
 
  CmdArgs.push_back("-o");
  CmdArgs.push_back(Output.getFilename());
 
  for (const auto &II : Inputs)
    CmdArgs.push_back(II.getFilename());
 
  const char *Exec = Args.MakeArgString(getToolChain().GetProgramPath("as"));
  C.addCommand(llvm::make_unique<Command>(JA, *this, Exec, CmdArgs));
 }
 
 void myos::Link::ConstructJob(Compilation &C, const JobAction &JA,
                               const InputInfo &Output,
                               const InputInfoList &Inputs,
                               const ArgList &Args,
                               const char *LinkingOutput) const {
  const Driver &D = getToolChain().getDriver();
  ArgStringList CmdArgs;
 
  if (Output.isFilename()) {
    CmdArgs.push_back("-o");
    CmdArgs.push_back(Output.getFilename());
  } else {
    assert(Output.isNothing() && "Invalid output.");
  }
 
  /* if (!Args.hasArg(options::OPT_nostdlib) &&
      !Args.hasArg(options::OPT_nostartfiles)) {
      CmdArgs.push_back(Args.MakeArgString(getToolChain().GetFilePath("crt1.o")));
      CmdArgs.push_back(Args.MakeArgString(getToolChain().GetFilePath("crti.o")));
      CmdArgs.push_back(Args.MakeArgString(getToolChain().GetFilePath("crtbegin.o")));
      CmdArgs.push_back(Args.MakeArgString(getToolChain().GetFilePath("crtn.o")));
  }*/
 
  Args.AddAllArgs(CmdArgs, options::OPT_L);
  Args.AddAllArgs(CmdArgs, options::OPT_T_Group);
  Args.AddAllArgs(CmdArgs, options::OPT_e);
 
  AddLinkerInputs(getToolChain(), Inputs, Args, CmdArgs);
 
  addProfileRT(getToolChain(), Args, CmdArgs);
 
  if (!Args.hasArg(options::OPT_nostdlib) &&
      !Args.hasArg(options::OPT_nodefaultlibs)) {
    if (D.CCCIsCXX()) {
      getToolChain().AddCXXStdlibLibArgs(Args, CmdArgs);
      CmdArgs.push_back("-lm");
    }
  }
 
  // We already have no stdlib...
  /*if (!Args.hasArg(options::OPT_nostdlib) &&
      !Args.hasArg(options::OPT_nostartfiles)) {
    if (Args.hasArg(options::OPT_pthread))
      CmdArgs.push_back("-lpthread");
    CmdArgs.push_back("-lc");
    CmdArgs.push_back("-lCompilerRT-Generic");
    CmdArgs.push_back("-L/usr/pkg/compiler-rt/lib");
    CmdArgs.push_back(
         Args.MakeArgString(getToolChain().GetFilePath("crtend.o")));
  }*/
 
  const char *Exec = Args.MakeArgString(getToolChain().GetLinkerPath());
  C.addCommand(llvm::make_unique<Command>(JA, *this, Exec, CmdArgs));
 }
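
To make the resulting linker command line easier to picture, here is a standalone model of the argument ordering used by the Link job. It is only a sketch: LinkRequest and build_link_args are invented for this example, and the -lc++ entry assumes libc++ is the selected C++ standard library.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified inputs to a link step.
struct LinkRequest {
  std::string output;                     // empty means "no output file"
  std::vector<std::string> library_dirs;  // values of -L options
  std::vector<std::string> inputs;        // object files and libraries
  bool is_cxx = false;                    // driver invoked as a C++ compiler
  bool nostdlib = false;                  // -nostdlib was passed
};

// Builds the argument list in the same order as the Link job above:
// output file, search paths, inputs, then the C++ runtime libraries.
std::vector<std::string> build_link_args(const LinkRequest &req) {
  std::vector<std::string> args;
  if (!req.output.empty()) {
    args.push_back("-o");
    args.push_back(req.output);
  }
  for (const auto &dir : req.library_dirs)
    args.push_back("-L" + dir);
  for (const auto &in : req.inputs)
    args.push_back(in);
  if (req.is_cxx && !req.nostdlib) {
    args.push_back("-lc++");  // assumption: libc++ is the C++ standard library
    args.push_back("-lm");
  }
  return args;
}
```

For a C++ link of main.o into app with /lib as a search path, this model yields -o app -L/lib main.o -lc++ -lm. The real job also forwards the -T and -e options and the profiling runtime, which are left out here.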

TODO: Replace with LLD and LLVM-as.

Compiling the toolchain

You can compile LLVM now that your OS has been added. This is very slow the first time, but if you made a mistake above, 'make' will continue where it left off once you correct your errors.

cd build
../llvm/configure --enable-optimized
make

This will build the entire LLVM toolchain. It takes several hours on my machine (x86_64, dual-core 3.0 GHz, 8 GiB RAM). Without --enable-optimized, the compiler will be built without optimizations and will be both slow and really huge (my 'clang' executable was 810 MiB!).

You can install it with:

sudo make install

WARNING: Only do this if you're as crazy as I am. On my desktop, I can easily reinstall Clang and the other LLVM tools if I break them (and I also have GCC installed), so I don't mind doing this. Do this at your own risk, or pass --prefix= to configure to install into a custom directory.

Compiling your first program for your OS

Let's test out the compiler! Create a simple C file:

 int do_something(int a, int b) {
   return a * b;
 }

And compile it with the Clang you just built. (Use -target x86_64-myos if you added your OS as a 64-bit platform.) Note that the 32-bit x86 architecture is spelled i386 (or i486, i586, i686) in a target triple, not x86:

clang -target i386-myos -c -o test.o test.c

TODO: The -c option tells Clang to compile only and skip linking, because we haven't finished with the linker yet.

This is great, but you will notice you can't just start including <stdio.h>, because you'll need to port a C library first. If you don't need a C library, you can stop here, but without one you won't be able to use libc++ (the C++ standard library) or the fancier C++11 features that require runtime support.
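
As a rough guide to what still works without any library, the sketch below sticks to C++11 features that are handled entirely by the compiler: constexpr, static_assert, templates and capture-less lambdas. The function names are made up for this example. It avoids headers, the heap, exceptions and RTTI, which is the sort of code you would typically build with flags such as -ffreestanding -nostdlib -fno-exceptions -fno-rtti.

```cpp
// Evaluated by the compiler; needs no runtime support.
constexpr int square(int x) { return x * x; }
static_assert(square(4) == 16, "constexpr is evaluated at compile time");

// Templates are expanded at compile time as well.
template <typename T>
T clamp_to(T value, T low, T high) {
  return value < low ? low : (high < value ? high : value);
}

// extern "C" keeps the symbol name unmangled so it is easy to find
// in the object file or to call from C or assembly.
extern "C" int sum_of_squares(int n) {
  // A lambda without captures is just a function; no allocation is involved.
  auto add = [](int acc, int v) { return acc + square(v); };
  int total = 0;
  const int limit = clamp_to(n, 0, 10);
  for (int i = 1; i <= limit; ++i)
    total = add(total, i);
  return total;
}
```

Features such as exceptions, dynamic_cast, and new/delete are the ones that need libc++abi or a C library behind them.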

Porting a C Library

At some point you'll likely want a C Library. Libc++ depends on a functioning C library.

TODO: Any Clang specific stuff here.

Hosting on our OS

This section is for compiling the LLVM toolchain to run under your OS, rather than just build programs for your OS.

TODO: I haven't gotten this far yet!
