NE

Introduction

The NE (New Executable) format was introduced with Windows 1.0. Like its predecessor MZ, it is a 16-bit format, but it lifts some of MZ's maximum size restrictions and adds new features, such as references to other files.

All numbers in the NE file's structures are in little-endian byte order.

Support

NE files are the native executable format of Windows 1.01 through 3.xx as well as OS/2 1.x. They can also be run on later 32-bit Windows systems such as Windows 95 and Windows NT 3.1 (in a Virtual DOS Machine), but the native format of those systems is PE.

Because 64-bit mode on x86 processors lacks virtual 8086 support, NE executables cannot be run on 64-bit Windows.

Inside the NE file

DOS Stub

Every NE executable is also a valid MZ executable. This enables the developer to package both an MS-DOS and Win16 version of the program in one file, but for most programs, the MZ executable only prints "This Program requires Microsoft Windows" and exits, so the MZ part is often referred to as the DOS Stub.

To detect whether an MZ executable is also an NE executable, the following must be done:

  1. Read the e_lfarlc field (uint16_t at offset 0x18). If it does not equal 0x0040, the executable is not an NE executable.
  2. Read the e_lfanew field (uint32_t at offset 0x3C). Seek to that location in the file and read 2 bytes. If these bytes are 'N' and 'E', the executable is an NE executable. (Other common possibilities at that location are 'L' and 'E' for LE, and 'P' and 'E' for PE executables.)
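
The two steps above can be sketched in C as follows. This is a minimal sketch that assumes the whole file has already been read into memory; the names ne_find_header, img and len are hypothetical, and the check for the 'M', 'Z' signature at offset 0 is an extra sanity check that the steps above do not mention.

```c
#include <stddef.h>
#include <stdint.h>

/* Read little-endian integers from a byte buffer. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the file offset of the NE header, or 0 if the image is not an
 * NE executable. img/len (hypothetical names) hold the whole file. */
static uint32_t ne_find_header(const uint8_t *img, size_t len)
{
    /* Extra sanity check (assumption): a full MZ header with its signature. */
    if (len < 0x40 || img[0] != 'M' || img[1] != 'Z')
        return 0;
    if (rd16(img + 0x18) != 0x0040)      /* step 1: e_lfarlc */
        return 0;
    uint32_t off = rd32(img + 0x3C);     /* step 2: e_lfanew */
    if (off > len || len - off < 2)      /* signature must lie inside the file */
        return 0;
    return (img[off] == 'N' && img[off + 1] == 'E') ? off : 0;
}
```

Returning 0 for "not NE" is safe here because offset 0 is always occupied by the MZ header itself.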

NE Header

The NE header is a relatively large structure: 64 bytes in the layout below.

struct NE_Header {
    char sig[2];                 // {'N', 'E'}
    uint8_t MajLinkerVersion;    //The major linker version
    uint8_t MinLinkerVersion;    //The minor linker version (also known as the linker revision)
    uint16_t EntryTableOffset;   //Offset of entry table from start of NE_Header
    uint16_t EntryTableLength;   //Length of entry table in bytes
    uint32_t FileLoadCRC;        //32-bit CRC of entire contents of file
    uint16_t FlagWord;           // Uses the FlagWord enum
    uint16_t AutoDataSegIndex;   //The automatic data segment index
    uint16_t InitHeapSize;       //The initial local heap size
    uint16_t InitStackSize;      //The initial stack size
    uint32_t EntryPoint;         //CS:IP entry point, CS is index into segment table
    uint32_t InitStack;          //SS:SP initial stack pointer, SS is index into segment table
    uint16_t SegCount;           //Number of segments in segment table
    uint16_t ModRefs;            //Number of module references (DLLs)
    uint16_t NoResNamesTabSiz;   //Size of non-resident names table in bytes
    uint16_t SegTableOffset;     //Offset of segment table from start of NE_Header
    uint16_t ResTableOffset;     //Offset of resources table from start of NE_Header
    uint16_t ResidNamTable;      //Offset of resident names table from start of NE_Header
    uint16_t ModRefTable;        //Offset of module reference table from start of NE_Header
    uint16_t ImportNameTable;    //Offset of imported names table from start of NE_Header
    uint32_t OffStartNonResTab;  //Offset of non-resident names table from start of file (!)
    uint16_t MovEntryCount;      //Count of moveable entry point listed in entry table
    uint16_t FileAlnSzShftCnt;   //File alignment size shift count (0 is treated as 9, i.e. the default 512-byte sectors)
    uint16_t nResTabEntries;     //Number of resource table entries (often inaccurate!)
    uint8_t targOS;              //Target OS

    //
    // The rest of these are not defined in the Windows 3.0 standard and
    // appear to be specific to OS/2.

    uint8_t OS2EXEFlags;         //Other OS/2 flags
    uint16_t retThunkOffset;     //Offset to return thunks or start of gangload (fast-load) area
    uint16_t segrefthunksoff;    //Offset to segment reference thunks or size of gangload area
    uint16_t mincodeswap;        //Minimum code swap area size
    uint8_t expctwinver[2];      //Expected windows version (minor first)
};

//
// In 16-bit DOS/Windows terminology, DGROUP is a segment class that refers
// to segments that are used for data.
//
// Win16 used segmentation to permit a DLL or program to have multiple
// instances along with an instance handle and manage multiple data
// segments. This allowed one NOTEPAD.EXE code segment to execute
// multiple instances of the notepad application.
//
enum FlagWord {
    // how is data handled?
    NOAUTODATA = 0x0000,
    SINGLEDATA = 0x0001, // shared among instances of the same program
    MULTIPLEDATA = 0x0002, // separate for each instance of the same program

    // additional flags:
    LINKERROR = 0x2000, // Linker error, module cannot load
    LIBMODULE = 0x8000, // if this flag is set, this is a DLL;
                        // see the "Dynamic Libraries" section below
};

#define GLOBINIT  (1 << 2)    //global initialization
#define PMODEONLY (1 << 3)    //Protected mode only
#define INSTRUC86 (1 << 4)    //8086 instructions
#define INSTRU286 (1 << 5)    //80286 instructions
#define INSTRU386 (1 << 6)    //80386 instructions
#define INSTRUx87 (1 << 7)    //80x87 (FPU) instructions

//Application flags
//Application type
enum apptype {
    none,
    fullscreen,     //fullscreen (not aware of Windows/P.M. API)
    winpmcompat,    //compatible with Windows/P.M. API
    winpmuses       //uses Windows/P.M. API
};

// #define OS2APP 1<<3    //OS/2 family application
// //bit 4 reserved?
// #define IMAGEERROR 1<<5    //errors in image/executable
// #define NONCONFORM 1<<6    //non-conforming program?
// #define DLL        1<<7

//Target Operating System
enum targetos {
    unknown,    //Obvious ;)
    os2,        //OS/2 (as if you hadn't worked that out!)
    win,        //Windows (Win16)
    dos4,       //European DOS 4.x
    win386,     //Windows for the 80386 (Win32s). 32 bit code.
    BOSS        //The boss, a.k.a Borland Operating System Services
};
//Other OS/2 flags
#define LFN   (1 << 0)   //OS/2 Long File Names (finally, no more 8.3 conversion :) )
#define PMODE (1 << 1)   //OS/2 2.x Protected Mode executable
#define PFONT (1 << 2)   //OS/2 2.x Proportional Fonts
#define GANGL (1 << 3)   //OS/2 Gangload area

In the following examples, it is assumed that neHeader is a variable of type struct NE_Header.
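
As a small illustration of how the header fields above might be interpreted, the following sketch decodes the data model and the DLL bit from FlagWord and turns FileAlnSzShftCnt into a sector size. The helper names are hypothetical, and treating shift counts above 31 as invalid is an assumption made only to keep the shift well defined.

```c
#include <stdint.h>

/* Data-segment model stored in the low two bits of FlagWord. */
enum ne_data_model {
    NE_NOAUTODATA   = 0,
    NE_SINGLEDATA   = 1,
    NE_MULTIPLEDATA = 2
};

/* Extracts the data model; the value 3 is not defined by the enum above. */
static enum ne_data_model ne_get_data_model(uint16_t flag_word)
{
    return (enum ne_data_model)(flag_word & 0x0003);
}

/* LIBMODULE (0x8000) marks the module as a DLL. */
static int ne_is_dll(uint16_t flag_word)
{
    return (flag_word & 0x8000) != 0;
}

/* Sector size in bytes for a given FileAlnSzShftCnt.
 * A shift count of 0 means the default of 9 (512-byte sectors).
 * Returns 0 for shift counts that cannot be represented (assumption). */
static uint32_t ne_sector_size(uint16_t shift_count)
{
    if (shift_count == 0)
        shift_count = 9;
    if (shift_count > 31)
        return 0;
    return (uint32_t)1 << shift_count;
}
```

For example, ne_sector_size(neHeader.FileAlnSzShftCnt) yields the multiplier used for the sector offsets in the segment table.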

Segment Table

The segment table is similar to a section table in newer executable formats. Each segment can contain code, data, relocation information, etc.

It is found at neHeader.SegTableOffset from the beginning of neHeader and consists of neHeader.SegCount entries, each 8 bytes long.

// The type is a 3-bit integer or'ed together with the other flags.
#define SEGFLAGS_TYPE_CODE  0
#define SEGFLAGS_TYPE_DATA  1
#define SEGFLAGS_TYPE_MASK  0x0007

#define SEGFLAGS_MOVABLE    0x0010
#define SEGFLAGS_PRELOAD    0x0040
#define SEGFLAGS_HAS_RELOCS 0x0100
#define SEGFLAGS_DISCARD    0xF000

// Segment table entry
typedef struct {
    uint16_t    SectorBase; // offset in sectors from beginning of file;
                            // byte offset: segEnt.SectorBase * (1 << neHeader.FileAlnSzShftCnt)
    uint16_t    SegBytes;   // length of segment in file, in bytes
    uint16_t    SegFlags;   // see SEGFLAGS_*
    uint16_t    MinAlloc;   // minimum number of bytes to allocate
} NE_SegEnt;
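
A rough sketch of reading one segment table entry from raw bytes and locating its data in the file is shown below. The function names are hypothetical, seg_table is assumed to point at the first entry of the table in memory, and index is the zero-based position within the table (the caller has to keep it below neHeader.SegCount). The shift argument is the alignment shift count after the "0 means 9" default has been applied.

```c
#include <stdint.h>

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }

/* Same layout as the segment table entry described above. */
typedef struct {
    uint16_t SectorBase;
    uint16_t SegBytes;
    uint16_t SegFlags;
    uint16_t MinAlloc;
} NE_SegEnt;

/* Decodes the entry at zero-based position index (8 bytes per entry). */
static NE_SegEnt ne_read_seg_ent(const uint8_t *seg_table, unsigned index)
{
    const uint8_t *p = seg_table + 8u * index;
    NE_SegEnt e = { rd16(p), rd16(p + 2), rd16(p + 4), rd16(p + 6) };
    return e;
}

/* Byte offset of the segment's data from the start of the file. */
static uint32_t ne_seg_file_offset(const NE_SegEnt *seg, unsigned shift)
{
    return (uint32_t)seg->SectorBase << shift;
}

/* Type is the low 3 bits of SegFlags: 0 = code, 1 = data. */
static int ne_seg_is_data(const NE_SegEnt *seg)
{
    return (seg->SegFlags & 0x0007) == 1;
}

/* SEGFLAGS_HAS_RELOCS (0x0100). */
static int ne_seg_has_relocs(const NE_SegEnt *seg)
{
    return (seg->SegFlags & 0x0100) != 0;
}
```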

Resource Table

The resource table contains structures such as icons and version information.

It is found at neHeader.ResTableOffset from the beginning of neHeader. Its length can only be determined by decoding it.

It begins with a short header:

typedef struct {
    uint16_t    AlignmentShiftCount;
} NE_ResourceTableHeader;

Then follows a table of resource types and resources.

typedef struct {
    uint16_t    TypeID;
    uint16_t    ResourceCount;
    uint32_t    Reserved;
} NE_ResourceTypeHeader;

#define RESOURCE_FLAGS_MOVEABLE 0x0010
#define RESOURCE_FLAGS_PURE     0x0020
#define RESOURCE_FLAGS_PRELOAD  0x0040

typedef struct {
    uint16_t    FileOffset;     // byte offset of resource from start of file:
                                // resource.FileOffset * (1 << resourceTableHeader.AlignmentShiftCount)
    uint16_t    ResourceLength; // actual number of bytes:
                                // resource.ResourceLength * (1 << resourceTableHeader.AlignmentShiftCount)
    uint16_t    ResourceFlags;  // see RESOURCE_FLAGS_*
    uint16_t    ResourceID;
    uint32_t    Reserved;
} NE_Resource;

These have to be decoded in a loop:

  1. Read a uint16_t TypeID.
  2. If TypeID is 0x0000, this is the end of the resource table; break out of the loop.
  3. Read the rest of NE_ResourceTypeHeader.
  4. Loop resourceTypeHeader.ResourceCount times to read the NE_Resource for each resource of the given type.
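
The loop above might look like this in C. It is only a sketch: it assumes the resource table is available as a byte buffer starting at its header, NE_ResInfo and ne_list_resources are hypothetical names, and rejecting shift counts above 16 is an assumption made so that the scaled values fit in 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }

/* Decoded view of one resource (hypothetical helper type). */
typedef struct {
    uint16_t type_id;
    uint16_t res_id;
    uint16_t flags;
    uint32_t file_offset;   /* in bytes from start of file */
    uint32_t length;        /* in bytes */
} NE_ResInfo;

/* Walks the resource table in tab[0..len). Stores up to max_out resources
 * in out and returns the total number found, or -1 if the table is
 * truncated or the shift count is unusable. */
static int ne_list_resources(const uint8_t *tab, size_t len,
                             NE_ResInfo *out, int max_out)
{
    if (len < 2)
        return -1;
    unsigned shift = rd16(tab);              /* NE_ResourceTableHeader */
    if (shift > 16)
        return -1;
    size_t pos = 2;
    int n = 0;
    for (;;) {
        if (len - pos < 2)
            return -1;
        uint16_t type_id = rd16(tab + pos);
        if (type_id == 0)                    /* end of resource table */
            return n;
        if (len - pos < 8)
            return -1;
        unsigned count = rd16(tab + pos + 2);
        pos += 8;                            /* skip NE_ResourceTypeHeader */
        for (unsigned i = 0; i < count; i++) {
            if (len - pos < 12)
                return -1;
            if (n < max_out) {
                out[n].type_id     = type_id;
                out[n].file_offset = (uint32_t)rd16(tab + pos) << shift;
                out[n].length      = (uint32_t)rd16(tab + pos + 2) << shift;
                out[n].flags       = rd16(tab + pos + 4);
                out[n].res_id      = rd16(tab + pos + 6);
            }
            n++;
            pos += 12;                       /* sizeof on-disk NE_Resource */
        }
    }
}
```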

Resource Type IDs and Resource IDs

Resource type IDs and resource IDs are either integers or strings.

If the top bit of the ID is set (0x8000), the ID is an integer ID.

If the top bit is not set, the ID is an offset, in bytes relative to the beginning of the resource table, to a string containing the name.

These strings are encoded as Pascal strings: one uint8_t specifying the length, followed by that many bytes of character data. Make sure to add the NUL terminator at the end of the string if you are using Pascal strings with C string functions!
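
One possible way to resolve such an ID is sketched below. The function names are hypothetical, res_tab is assumed to point at the start of the resource table in memory, and masking off the top bit to obtain the numeric value of an integer ID is an assumption that matches the usual convention rather than something stated above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies the Pascal string at buf[off] into out as a NUL-terminated
 * C string. Returns the string length, or -1 if it runs past len. */
static int ne_pascal_to_c(const uint8_t *buf, size_t len, size_t off,
                          char out[256])
{
    if (off >= len)
        return -1;
    size_t n = buf[off];                 /* length byte, at most 255 */
    if (len - off - 1 < n)
        return -1;
    memcpy(out, buf + off + 1, n);
    out[n] = '\0';                       /* Pascal strings carry no terminator */
    return (int)n;
}

/* Resolves a resource type ID or resource ID.
 * Returns 1 and sets *num for integer IDs, 0 and fills name for
 * string IDs, or -1 if the string offset is out of range. */
static int ne_resolve_id(const uint8_t *res_tab, size_t len, uint16_t id,
                         uint16_t *num, char name[256])
{
    if (id & 0x8000) {
        *num = id & 0x7FFF;              /* assumption: value is the low 15 bits */
        return 1;
    }
    return ne_pascal_to_c(res_tab, len, id, name) < 0 ? -1 : 0;
}
```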

Resident-Name Table

The resident-name table contains the name of the module at index 0 followed by the module's exported procedures that stay resident in memory.

It is found at neHeader.ResidNamTable from the beginning of neHeader. Its length depends on the length of the strings contained within.

The structure of a resident-name table entry is roughly this:

typedef struct {
    uint8_t  NameLength;
    char     Name[NameLength];
    uint16_t OrdinalNumber;
} NE_ResidentNameEntry;

As is often the case with NE structures, the resident-name table must be read in a loop:

  1. Read uint8_t NameLength.
  2. If NameLength is 0x00, this is the end of the table; break out of the loop.
  3. Read NameLength bytes to obtain the Name.
  4. Read uint16_t OrdinalNumber.
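
As a sketch of that loop, the following function searches a name table held in memory for a given ordinal. The name ne_name_for_ordinal and the buffer-based interface are hypothetical; tab is assumed to point at the first entry of the table.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }

/* Looks up the name stored with the given ordinal in tab[0..len).
 * Returns 1 and fills out (NUL-terminated) if found, 0 if the table
 * ends without a match, or -1 if the table is truncated. */
static int ne_name_for_ordinal(const uint8_t *tab, size_t len,
                               uint16_t ordinal, char out[256])
{
    size_t pos = 0;
    while (pos < len) {
        size_t n = tab[pos++];           /* NameLength */
        if (n == 0)
            return 0;                    /* end of table */
        if (len - pos < n + 2)
            return -1;                   /* name or ordinal cut off */
        if (rd16(tab + pos + n) == ordinal) {
            memcpy(out, tab + pos, n);
            out[n] = '\0';
            return 1;
        }
        pos += n + 2;                    /* skip Name and OrdinalNumber */
    }
    return -1;                           /* missing terminator */
}
```

Calling it with ordinal 0 returns the module name, since that is the entry at index 0.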

Module-Reference Table

The module-reference table is an additional layer of indirection when looking up module names in the imported names table.

It is found at neHeader.ModRefTable from the beginning of neHeader. Its entries are all of type uint16_t and its length in bytes is neHeader.ModRefs times 2 (the size of uint16_t).

The usage of the module reference table is described below, in the section concerning relocations.

Imported-Name Table

The imported name table contains the names of modules and procedures that are being imported by the executable. Its entries are referenced by the module-reference table and the relocation tables of segments.

It is found at neHeader.ImportNameTable from the beginning of neHeader. Its entries are all Pascal strings:

typedef struct {
    uint8_t  NameLength;
    char     Name[NameLength];
} NE_ImportedNameTableEntry;

i.e. one uint8_t of length followed by that many bytes of characters. (Make sure to add the NUL terminator at the end of the string if you are using Pascal strings with C string functions!)

The usage of the imported name table is described below, in the section concerning relocations.

Entry Table

The entry table contains information about the entry points in the executable and is organized in bundles.

It is found at neHeader.EntryTableOffset from the beginning of neHeader. Its format differs depending on the segment indicator of each bundle. Let's begin with a few definitions:

#define ENTRY_FLAGS_EXPORTED   0x01
#define ENTRY_FLAGS_GLOBALDATA 0x02

typedef struct {
  uint8_t  EntryFlags;       // see ENTRY_FLAGS_*
  uint16_t EntryPointOffset; // entry point offset within segment
} NE_FixedSegmentEntry;

typedef struct {
  uint8_t  EntryFlags;       // see ENTRY_FLAGS_*
  uint16_t Int3fh;           // constant value (interrupt opcode)
  uint8_t  SegmentNumber;    // number of the movable segment
  uint16_t EntryPointOffset; // entry point offset within segment
} NE_MovableSegmentEntry;

The entry table can be decoded bundle for bundle in a loop as follows:

  1. Read uint8_t EntryCount.
  2. If EntryCount is 0x00, this is the end of the entry table; break out of the loop.
  3. Read uint8_t SegmentIndicator.
  4. If SegmentIndicator is 0x00, this is a bundle of unused entries. Read no further bytes for this bundle, mark the next EntryCount ordinals as unused and continue with step 1.
  5. If SegmentIndicator is 0xFF, these are movable segment entries. Read EntryCount structures of type NE_MovableSegmentEntry, then continue with step 1.
  6. If SegmentIndicator is any other value, these are fixed segment entries. Read EntryCount structures of type NE_FixedSegmentEntry (using SegmentIndicator as the segment number), then continue with step 1.
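
The bundle loop can be sketched as a lookup from ordinal to segment and offset. This is an illustration under a few assumptions: the entry table is held in memory starting at tab, the name ne_lookup_entry is hypothetical, and ordinals are taken to be numbered consecutively starting at 1 across all bundles, which the steps above imply but do not state. On disk a fixed entry occupies 3 bytes and a movable entry 6 bytes, with no padding.

```c
#include <stddef.h>
#include <stdint.h>

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }

/* Finds the entry point for ordinal in the entry table tab[0..len).
 * Returns 1 and sets *seg / *off if found, 0 if the ordinal is unused
 * or past the end of the table, or -1 if the table is truncated. */
static int ne_lookup_entry(const uint8_t *tab, size_t len, unsigned ordinal,
                           uint8_t *seg, uint16_t *off)
{
    size_t pos = 0;
    unsigned next = 1;                   /* ordinal of first entry in bundle */
    while (len - pos >= 1) {
        unsigned count = tab[pos];       /* EntryCount */
        if (count == 0)
            return 0;                    /* end of entry table */
        if (len - pos < 2)
            return -1;
        uint8_t ind = tab[pos + 1];      /* SegmentIndicator */
        pos += 2;
        size_t esize = (ind == 0x00) ? 0 : (ind == 0xFF) ? 6 : 3;
        if (esize != 0 && (len - pos) / esize < count)
            return -1;
        if (ordinal >= next && ordinal < next + count) {
            if (esize == 0)
                return 0;                /* bundle of unused entries */
            const uint8_t *e = tab + pos + (ordinal - next) * esize;
            if (ind == 0xFF) {           /* NE_MovableSegmentEntry */
                *seg = e[3];
                *off = rd16(e + 4);
            } else {                     /* NE_FixedSegmentEntry */
                *seg = ind;
                *off = rd16(e + 1);
            }
            return 1;
        }
        pos += count * esize;
        next += count;
    }
    return -1;                           /* missing terminator */
}
```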

Nonresident-Name Table

The nonresident-name table contains a description of the module at index 0 followed by the module's exported procedures that do not stay resident in memory.

It is found at neHeader.OffStartNonResTab from the beginning of the executable file, in contrast to all other NE tables. Its length depends on the length of the strings contained within.

Its format is the same as that of the resident-name table, described in a previous section.

Relocation Data

If a segment has the SEGFLAGS_HAS_RELOCS flag set in SegFlags, its content is followed directly by its relocation table.

To access a segment's relocation data, seek to segEnt.SectorBase * (1 << neHeader.FileAlnSzShftCnt) + segEnt.SegBytes. The structure is as follows:

typedef struct {
  uint16_t RelocEntryCount;
  NE_SegRelocationEntry Entries[RelocEntryCount];
} NE_SegRelocationInfo;

typedef struct {
  uint8_t  Source;            // see RELOC_SOURCE_*
  uint8_t  FlagsAndTarget;    // one of RELOC_TARGET_* OR'ed with any number of RELOC_FLAGS_*
  uint16_t SourceChainOffset;
  union { // variant depends on target within FlagsAndTarget (see RELOC_TARGET_*)
    struct {
      uint8_t  SegmentNumber; // 0xFF == movable segment, other value == fixed segment
      uint8_t  Zero;          // always 0
      uint16_t SegmentIndex;  // if fixed segment: offset into segment;
                              // if movable segment: ordinal number index into Entry Table
    } InternalRef;
    struct {
      uint16_t ModRefTableIndex; // index into module reference table
      uint16_t ProcNameOffset;   // offset from start of imported-names table to procedure name string
    } ImportName;
    struct {
      uint16_t ModRefTableIndex; // index into module reference table
      uint16_t OrdinalNumber;    // ordinal number of the procedure
    } ImportOrdinal;
    struct {
      uint16_t OSFixupType; // see OS_FIXUP_TYPE_*
      uint16_t Zero;        // always 0
    } OSFixup;
  } Value;
} NE_SegRelocationEntry;

#define RELOC_SOURCE_LOW_BYTE 0x00
#define RELOC_SOURCE_SEGMENT 0x02
#define RELOC_SOURCE_FAR_ADDR 0x03 /* (32-bit pointer) */
#define RELOC_SOURCE_OFFSET 0x05 /* (16-bit offset) */

#define RELOC_TARGET_INTERNAL_REF 0x00
#define RELOC_TARGET_IMPORT_ORDINAL 0x01
#define RELOC_TARGET_IMPORT_NAME 0x02
#define RELOC_TARGET_OS_FIXUP 0x03
#define RELOC_TARGET_MASK 0x03

#define RELOC_FLAGS_ADDITIVE 0x04
#define RELOC_FLAGS_MASK 0xFC

#define OS_FIXUP_TYPE_FIARQQ_FJARQQ 0x0001
#define OS_FIXUP_TYPE_FISRQQ_FJSRQQ 0x0002
#define OS_FIXUP_TYPE_FICRQQ_FJCRQQ 0x0003
#define OS_FIXUP_TYPE_FIERQQ        0x0004
#define OS_FIXUP_TYPE_FIDRQQ        0x0005
#define OS_FIXUP_TYPE_FIWRQQ        0x0006

To find the module name for an IMPORT_NAME or IMPORT_ORDINAL relocation, do the following:

  1. Remember your current location in the file.
  2. Calculate the offset startOfNeHeader + neHeader.ModRefTable + 2 * (segRelocationEntry.ImportName.ModRefTableIndex - 1) and seek to it.
  3. Read uint16_t moduleNameOffset.
  4. Calculate the offset startOfNeHeader + neHeader.ImportNameTable + moduleNameOffset and seek to it.
  5. Read the Pascal string (one uint8_t of length, that many bytes of character data) at that location. (Make sure to add the NUL terminator at the end of the string if you are using Pascal strings with C string functions!)
  6. Seek to the location remembered in the first step.
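
When the whole file is held in memory, the remember and seek steps disappear and the lookup reduces to two offset calculations. The following is a sketch under that assumption; ne_module_name and its parameter names are hypothetical, ne_off stands for startOfNeHeader, and the caller is expected to make sure index does not exceed neHeader.ModRefs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Resolves a 1-based module reference table index to a module name.
 * img/len hold the whole file, ne_off is the file offset of the NE header,
 * mod_ref_table and import_name_table are the header fields ModRefTable
 * and ImportNameTable. Returns the name length, or -1 on error. */
static int ne_module_name(const uint8_t *img, size_t len, uint32_t ne_off,
                          uint16_t mod_ref_table, uint16_t import_name_table,
                          uint16_t index, char out[256])
{
    if (index == 0)
        return -1;                       /* indices start at 1 */
    uint64_t ref = (uint64_t)ne_off + mod_ref_table + 2u * (index - 1u);
    if (ref + 2 > len)
        return -1;
    uint16_t name_off = (uint16_t)(img[ref] | (img[ref + 1] << 8));
    uint64_t str = (uint64_t)ne_off + import_name_table + name_off;
    if (str + 1 > len)
        return -1;
    size_t n = img[str];                 /* Pascal string length byte */
    if (str + 1 + n > len)
        return -1;
    memcpy(out, img + str + 1, n);
    out[n] = '\0';                       /* add the NUL terminator */
    return (int)n;
}
```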

(You may wish to store the names in a lookup table mapping ModRefTableIndex to module names to speed up repeated lookups of the same index.)

To find the procedure name for an IMPORT_NAME relocation, do the following:

  1. Remember your current location in the file.
  2. Calculate the offset startOfNeHeader + neHeader.ImportNameTable + segRelocationEntry.ImportName.ProcNameOffset and seek to it.
  3. Read the Pascal string (one uint8_t of length, that many bytes of character data) at that location. (Make sure to add the NUL terminator at the end of the string if you are using Pascal strings with C string functions!)
  4. Seek to the location remembered in the first step.

Dynamic Libraries

If the module is a DLL, the CS:IP in the header is instead the load procedure of the DLL and SS:SP is unused (since libraries do not get their own stacks). The load procedure takes the module handle in AX; a module handle is an opaque value that represents an EXE or a DLL loaded in memory. On return, AX != 0 indicates success and AX == 0 indicates failure, which is unusual but very important.
