#BEGIN_LEGAL
#
#Copyright (c) 2025 Intel Corporation
#
#  Licensed under the Apache License, Version 2.0 (the "License");
#  you may not use this file except in compliance with the License.
#  You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
#  Unless required by applicable law or agreed to in writing, software
#  distributed under the License is distributed on an "AS IS" BASIS,
#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#  See the License for the specific language governing permissions and
#  limitations under the License.
#  
#END_LEGAL

// This file does not contain any code
// it just contains additional information for
// inclusion with doxygen


// ===========================================================================  
/*! 
@mainpage Intel&reg; X86 Encoder Decoder User Guide

September 2024

@section INTRO Introduction


Intel&reg; XED is an acronym for X86 Encoder Decoder. The
latter part is pronounced like the (British) English "z".

Intel&reg; X86 Encoder Decoder (Intel&reg; XED) is a software library 
(and associated headers) written in C for encoding and decoding X86 
(IA-32 instruction set and Intel&reg; 64 instruction set) instructions. 
The decoder takes sequences of 1 to 15 bytes along with machine mode information 
and produces a data structure describing the opcode, operands, and flags. 
The generic encoder takes a similar data structure and produces a sequence 
of 1 to 15 bytes.

There is another encoder called "enc2" available that is much faster than
the generic encoder mentioned above.  Rather than using a generic
interface, in enc2, instruction encoding is done by calling one of a
very large number of functions, passing as arguments the registers and
constants that would be used in the assembly language description of the
instruction.  There are two interfaces to the enc2 encoder:
unchecked and checked. The unchecked version is faster and assumes
that the arguments passed in are in the correct ranges. The checked
version validates that the arguments passed in are in the correct
ranges and if that succeeds, it calls the corresponding unchecked
version of the function.  The checking can be skipped if desired using
a runtime setting. The enc2 encoder is available in builds with
the "--enc2" option. Due to the large amount of code generated, that
build takes longer.

Intel&reg; XED is multi-thread safe.

Intel&reg; XED was designed to be very fast and extensible.

Intel&reg; XED compiles with the following compilers:
    <ul>
    <li> GNU GCC
    <li> Microsoft Visual Studio
    <li> LLVM/Clang
    </ul>


Intel&reg; XED works with the following operating systems:
    <ul>
    <li> Linux
    <li> Microsoft Windows  (with and without cygwin)
    <li> FreeBSD
    </ul>

The Intel&reg; XED examples (@ref EXAMPLES) also include binary image readers for
Windows PECOFF, ELF, and Mac OS X* MACHO binary file formats for 32b and
64b. These allow Intel&reg; XED to be used as a simple (not symbolic)
disassembler. The Intel&reg; XED disassembler supports 3 output formats: Intel,
ATT SYSV, and a more detailed internal format describing all resources
read and written.


@section TOC Table of Contents
    - @ref BUILD    "Building"      Building your program with Intel&reg; XED
    - @ref EXTERN   "External"      External Requirements
    - @ref TERMS    "Terms"         Terminology
    - @ref OVERVIEW "Overview"      Overview of the Intel&reg; XED approach
    - @ref API_REF  "API reference" Detailed descriptions of the API
    - @ref EXAMPLES "Examples"      Examples
    - @ref LEGAL    "Disclaimer and Legal Information"     


@section BUILD Building your program using Intel&reg; XED.

This section describes the requirements for compiling with Intel&reg; XED and
linking the libxed.a library. It assumes you are building from an
Intel&reg; XED kit and not directly from the sources. (See the "install"
option in the Intel&reg; XED build manual for information on making kits).

The structure of an Intel&reg; XED kit is as follows:
@code

                              |-bin------
                              |-doc------|-html-
                              |-examples-
               |-xed-kit-name-|-include--
                              |-lib------
                              |-misc-----
@endcode


To use Intel&reg; XED your sources should include the top-most header file: xed-interface.h. 

Your compilation statement must include:
@code
-Ixed-kit-name/include
@endcode
where "xed-kit-name" is where you've unpacked the Intel&reg; XED kit.

Your Linux or Mac OS X* link statement must reference the libxed library:
@code
-Lxed-kit-name/lib -lxed
@endcode

(or link against xed.lib for Windows).

Intel&reg; XED uses base types with the following names: xed_uint8_t,
xed_uint16_t, xed_uint32_t, xed_uint64_t, xed_int8_t, xed_int16_t,
xed_int32_t, and xed_int64_t. Intel&reg; XED also defines a "xed_uint_t" type
that is shorthand for "unsigned int".


Please see the section @ref INIT for more information about using
Intel&reg; XED, and also the examples in @ref EXAMPLES.

@section EXTERN External Requirements

Intel&reg; XED was designed to have minimal external requirements. Intel&reg; XED makes no
system calls. Intel&reg; XED allocates no memory. (The examples are
different). The following external functions/symbols are required for
linking a program with libxed, with one caveat: the functions fprintf
and abort and the data object stderr are optional. If users register
their own abort handler using #xed_register_abort_function(), then
fprintf, stderr, and abort are not required and can be stubbed out to
satisfy the linker.

Required:
<ul>
<li>memcmp
<li>memcpy
<li>memset
<li>strcmp
<li>strlen
<li>strncat
</ul>

Optional:
<ul>
<li>abort
<li>fprintf
<li>stderr
</ul>

@section TERMS Terminology


X86 instructions are 1-15 byte values. They consist of several
well-defined components:
    <ul>
    <li> Prefix bytes. 
         <ul>
            <li> Legacy prefix bytes used for many purposes (described further below).
            
            <li> REX prefix byte but only in 64b mode. It has 4 1-bit
            fields: W, R, X, and B.  The W bit modifies the operation
            width. The R, X and B fields extend the register
            encodings. The REX byte must immediately precede the opcode
            bytes; otherwise it is ignored.

            <li> REX2 prefix, a 2-byte variant of the REX prefix introduced with the Intel&reg; APX extensions (see @ref APX). 
            It makes the 16 Extended General Purpose Registers (EGPRs) available across the legacy instruction set. 
            It has eight 1-bit fields: M0, R4, X4, B4, W, R3, X3 and B3.
            The W, R3, X3 and B3 bits have the same meaning as the W, R, X and B bits in the REX prefix.
            R4, X4 and B4 are additional bits used to address all 32 general-purpose registers (the 16 legacy GPRs plus the 16 EGPRs).
            The M0 bit selects between legacy maps 0 and 1 (1-byte opcodes with no escape and 2-byte opcodes with the 0x0F escape, respectively).
       
            <li> VEX prefix byte sequence. The VEX prefix is used
            mostly for AVX1 and AVX2 instructions as well as BMI1/2
            instructions and mask operations in Intel&reg; AVX512. The VEX prefix
            comes in two forms. The 2-byte sequence begins with an
            0xC5 byte. The 3-byte sequence begins with an 0xC4 byte.

            <li> EVEX prefix. The EVEX prefix is a 4-byte sequence that begins with an 0x62 byte
            and is used for encoding Intel&reg; AVX512 instructions. Intel&reg; APX provides
            an extended version of the prefix, where the semantics of several payload bits are redefined.
            The extension provides Intel&reg; APX features for legacy instructions that cannot be provided 
            by other prefixes, such as support for the new data destination (see @ref APX) and status flags update suppression
            ("no flags"), which are represented by the ND and NF bits, respectively, in the third payload byte.
            Note that the byte following the extended EVEX prefix is always interpreted as the main opcode byte.
            
         </ul>

         There are somewhat complex rules about which prefixes are
         allowed, in what order, and in what modes. Intel&reg; XED handles that
         complexity.
         
    <li> 1-3 opcode bytes. When more than one opcode byte is required
    the leading bytes (called escapes) are either 0x0F, 0x0F 0x38, or
    0x0F 0x3A.  With VEX and EVEX prefixes, the escape bytes are
    encoded differently.
    
    <li> MODRM byte. Used for addressing memory, refining opcodes,
    and specifying registers.  Optional, but common.  It has three fields: the
    2-bit "mod", the 3-bit "reg" and 3-bit "r/m" fields.
       
    <li> SIB byte. Used for specifying memory addressing, optional.
     It has three fields: the 2-bit scale, 3-bit index, and 3-bit base.
       
    <li> Displacement bytes. Used for specifying memory offsets, optional.
    <li> Immediate bytes.  Optional
    </ul>
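
The bit fields of the REX, MODRM and SIB bytes listed above can be
pulled apart with a few shifts and masks. The following is a minimal
sketch for illustration only: the type and function names are
hypothetical, they are not part of the Intel&reg; XED API, and Intel&reg; XED
performs this work internally when decoding.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper types for illustration; these are not XED types.
typedef struct { unsigned w, r, x, b; } rex_fields_t;
typedef struct { unsigned mod, reg, rm; } modrm_fields_t;
typedef struct { unsigned scale, index, base; } sib_fields_t;

// A REX byte has the bit pattern 0100WRXB, i.e. 0x40 through 0x4F.
int is_rex(uint8_t byte)
{
    return (byte & 0xF0) == 0x40;
}

// Split a REX byte into its four 1-bit fields.
rex_fields_t split_rex(uint8_t byte)
{
    rex_fields_t f;
    assert(is_rex(byte));
    f.w = (byte >> 3) & 1;
    f.r = (byte >> 2) & 1;
    f.x = (byte >> 1) & 1;
    f.b = byte & 1;
    return f;
}

// Split a MODRM byte into the 2-bit mod, 3-bit reg and 3-bit r/m fields.
modrm_fields_t split_modrm(uint8_t byte)
{
    modrm_fields_t f;
    f.mod = (byte >> 6) & 3;
    f.reg = (byte >> 3) & 7;
    f.rm  = byte & 7;
    return f;
}

// Split a SIB byte into the 2-bit scale, 3-bit index and 3-bit base fields.
// The scale field is an exponent: the scale factor is 1 << scale.
sib_fields_t split_sib(uint8_t byte)
{
    sib_fields_t f;
    f.scale = (byte >> 6) & 3;
    f.index = (byte >> 3) & 7;
    f.base  = byte & 7;
    return f;
}
```

For the byte sequence 48 89 D8, split_rex(0x48) yields W=1 with R, X
and B clear, and split_modrm(0xD8) yields mod=3, reg=3 and r/m=0.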


Immediates and displacements are usually limited to 4 bytes, but several 
variants of the MOV instruction can take 8B values. 
The AMD 3DNow ISA extension uses the immediate field to
provide additional opcode information.

The legacy prefix bytes are used for:
    <ul>
    <li> operand size overrides (1 prefix), 
    <li> address size overrides (1 prefix), 
    <li> atomic locking (1 prefix), 
    <li> default segment overrides (6 prefixes), 
    <li> repeating certain instructions (2 prefixes), and
    <li> opcode refinement. 
    </ul>

There are 11 distinct legacy prefixes. Three of them (operand size
and the two repeat prefixes) have different meanings in different
contexts. Sometimes they are used for opcode refinement and do not
have their default meaning. Less frequently, two of the segment
overrides can be used for conditional branch hints.
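
As a rough illustration of the prefix groups above, the sketch below
classifies a byte by its default legacy prefix meaning and counts the
leading legacy prefixes of a byte sequence. The enum and function names
are hypothetical and not part of Intel&reg; XED, and the sketch deliberately
ignores the context-dependent meanings (opcode refinement, branch hints)
and the ordering rules that Intel&reg; XED handles.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical classification for illustration; not an XED enumeration.
typedef enum {
    PFX_NONE,
    PFX_OPERAND_SIZE,
    PFX_ADDRESS_SIZE,
    PFX_LOCK,
    PFX_SEGMENT,
    PFX_REPEAT
} legacy_prefix_kind_t;

// Return the default meaning of a legacy prefix byte, or PFX_NONE.
legacy_prefix_kind_t classify_legacy_prefix(uint8_t byte)
{
    switch (byte) {
    case 0x66: return PFX_OPERAND_SIZE;
    case 0x67: return PFX_ADDRESS_SIZE;
    case 0xF0: return PFX_LOCK;
    case 0x26: case 0x2E: case 0x36:   // ES, CS, SS
    case 0x3E: case 0x64: case 0x65:   // DS, FS, GS
        return PFX_SEGMENT;
    case 0xF2: case 0xF3:              // REPNE, REP/REPE
        return PFX_REPEAT;
    default:
        return PFX_NONE;
    }
}

// Count how many legacy prefix bytes lead a byte sequence.
unsigned count_legacy_prefixes(const uint8_t *bytes, unsigned len)
{
    unsigned n = 0;
    assert(bytes != 0 || len == 0);
    while (n < len && classify_legacy_prefix(bytes[n]) != PFX_NONE)
        n++;
    return n;
}
```

Exactly 11 byte values classify as something other than PFX_NONE,
matching the count given above. A REX byte such as 0x48 is not a legacy
prefix, so counting stops there.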

There are also multiple ways to encode certain instructions, with the
same or differing length. 

For additional information on the instruction semantics and encodings:
<ul>
<li>  <a href="http://www.intel.com/sdm">http://www.intel.com/sdm</a> The Intel&reg; 64 and IA-32 Architectures Software Developer's Manuals
<li>  <a href="http://www.intel.com/software/isa">http://www.intel.com/software/isa</a> Information on future ISA extensions.
</ul>

@subsection APX Intel&reg; APX

Intel&reg; Advanced Performance Extensions (Intel&reg; APX) expands the Intel&reg; 64 instruction set architecture with
access to more registers and adds various new features that improve general-purpose performance. The
extensions are designed to provide efficient performance gains across a variety of workloads without
significantly increasing the silicon area or power consumption of the core.
The main features of Intel&reg; APX include:
<ul>
    <li> Extended GPRs, also known as EGPRs (see @ref APX_OPERANDS)
    <li> Three-operand instructions with a new data destination (NDD); legacy integer instructions can now use EVEX to encode a dedicated
    destination register operand – turning them into three-operand instructions and reducing the need for extra register move instructions.
    The NDD receives the result of the computation, and all other operands (including the original destination operand) become read-only source operands
    <li> Legacy-promoted instructions that support status flag update suppression, "no flags" (NF): an option for the compiler to suppress the status flag writes 
    of common instructions (none of the CSPAZO flags, such as Parity or Overflow, are written)
    <li> Conditional ISA improvements: new conditional load, store, and compare instructions
    <li> Optimized register state save/restore operations
    <li> A new 64-bit absolute direct jump instruction
    <li> Zero Upper (ZU) support for several APX-Promoted instructions, which zero the upper bits of a destination GPR. The destination GPR will get the 
    instruction’s result in bits [OSIZE-1:0] and, if OSIZE < 64b, have its upper bits [63:OSIZE] zeroed
</ul>
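
The Zero Upper rule in the last item can be written down as a small
model of a GPR write. This is only a sketch of the semantics described
above, with hypothetical function names; the legacy function models the
usual 64b-mode behavior, where 32b writes zero the upper bits and
narrower writes leave them unchanged (high-byte registers such as AH
are not modeled).

```c
#include <assert.h>
#include <stdint.h>

// Mask covering bits [osize-1:0]; osize must be smaller than 64 here.
uint64_t low_bits_mask(unsigned osize)
{
    assert(osize == 8 || osize == 16 || osize == 32);
    return (UINT64_C(1) << osize) - 1;
}

// Legacy write: 8b and 16b results are merged into the old register
// value, 32b results zero the upper 32 bits.
uint64_t write_gpr_legacy(uint64_t old_value, uint64_t result, unsigned osize)
{
    uint64_t mask;
    if (osize == 64)
        return result;
    mask = low_bits_mask(osize);
    if (osize == 32)
        return result & mask;
    return (old_value & ~mask) | (result & mask);
}

// Zero Upper write: the result lands in bits [osize-1:0] and
// bits [63:osize] are zeroed for every osize below 64.
uint64_t write_gpr_zero_upper(uint64_t result, unsigned osize)
{
    if (osize == 64)
        return result;
    return result & low_bits_mask(osize);
}
```

With a 16b result of 0x1234, the legacy model keeps the upper 48 bits of
the old register value, while the Zero Upper model returns 0x1234 with
all upper bits cleared.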

Intel&reg; XED defines Intel&reg; APX instructions as follows:

Legacy:
  - Instructions with REX2 prefix are not defined with new iforms or new ISA-SETs
  - The APXLEGACY extension group includes new APX-F instructions

EVEX:
  - Existing (non-APX) EVEX instructions with EGPRs are not defined with new iforms or new ISA-SETs
  - Promoted and new instructions are defined with new iforms using the '_APX' suffix
  - Promoted new data destination instructions are marked with the 'APX_NDD' attribute
  - Promoted no flags instructions are marked with the 'APX_NF' attribute
  - The APXEVEX extension group includes new and promoted APX-F instructions

@subsection AVX10 Intel&reg; AVX10

Intel&reg; Advanced Vector Extensions 10 (Intel&reg; AVX10) establishes a common, converged vector instruction set across all Intel&reg; architectures, incorporating the modern 
vectorization aspects of Intel&reg; AVX-512.

The Intel&reg; AVX10 architecture introduces several new features and capabilities:
<ul>
    <li> Introduces a version-based instruction set enumeration
    <li> Allows a converged implementation supported on all Intel&reg; CPUs to include all the existing Intel&reg;  AVX-512 capabilities such 
    as EVEX encoding, 32 vector registers and 8 32-bit opmask registers at maximum vector length of 256 (Intel&reg; AVX10/256)
    <li> Allows an implementation to include support for 512-bit vector and 64-bit opmask registers on P-Core CPUs (Intel&reg; AVX10/512) for 
    heavy vector compute applications that can leverage the additional vector length
    <li> Introduces embedded rounding and Suppress All Exceptions (SAE) control for YMM versions of the instructions
</ul>


@section OVERVIEW Overview of Intel&reg; XED approach

Intel&reg; XED has two fundamental interfaces: encoding and decoding. Supporting
these interfaces are many data structures, but the two starting points
are the #xed_encoder_request_t and the #xed_decoded_inst_t .  The
#xed_decoded_inst_t has more information than the
#xed_encoder_request_t , but both types are derived from a set of
common fields called the #xed_operand_values_t. 

The output of the decoder, the #xed_decoded_inst_t , includes additional
information that is not required for encoding but provides more
information about the instruction resources.

The common operand fields, used by both the encoder and decoder, hold
the operands and the memory addressing information. 

The decoder has an operands array that holds the order of the decoded
operands. This array indicates whether or not the operands are read or
written.

The encoder has an operand array where the encoder user must specify
the order of the operands used for encoding.

@subsection CPUID CPUID

Intel&reg; XED ISA-SETs can be mapped to one or more CPUID groups, each being mapped to one or more CPUID records.
The CPUID record contains information about the register containing the bits to be set, the leaf, subleaf and bit indices. 
When the leaf and subleaf values are loaded into the EAX and ECX registers, respectively, the CPUID instruction sets the specified
bits of the specified register, indicating support for the ISA or the feature, which is often the CPUID name field.

Intel&reg; AVX10 introduced a versioned approach for enumeration that ensures that all Intel&reg; CPUs support the same features 
and instructions at a given Intel&reg; AVX10 version number. This approach also reduced the required number of CPUID feature flags
to be checked to determine feature support. In most cases, only three fields need to be checked:
1. A CPUID feature bit indicating that the Intel&reg; AVX10 ISA is supported
2. A version number to ensure that the supported version is greater than or equal to the desired version
3. A vector length bit indicating the maximum supported vector length 

Determining whether an ISA-SET is supported by a chip:
For ISA-SETs with a single CPUID group, all of its CPUID records must be set for the ISA-SET to be supported by the chip.
For ISA-SETs with multiple CPUID groups, at least one CPUID group must be satisfied. To match a group, all of its CPUID records
must be set. This can be written as a logical expression:
@code
        "CPUID GROUP A"                                        OR     "CPUID GROUP B"                                       OR ...
        ("CPUID RECORD A.A" AND "CPUID RECORD A.B" AND ... )   OR    ("CPUID RECORD B.A" AND "CPUID RECORD B.B" AND ... )   OR ...
@endcode
If one CPUID group is satisfied, the whole expression will be satisfied ("OR" relationship), thus indicating chip support for the ISA.
Since the CPUID group itself is an "AND" expression between all of its CPUID records, all CPUID records must be set (satisfied)
in order to satisfy the sub-expression.

For instance, the ISA-SET AVX512F_512 has the following CPUID groups: 
The Intel&reg; AVX10 CPUID group with three CPUID records:
<ul>
    <li> CPUID name avx10_enabled, leaf 0x7, sub-leaf 0x1, register EDX, bit 19
    <li> CPUID name avx10_ver1, leaf 0x24, sub-leaf 0x0, register EBX, bits 0 to 7
    <li> CPUID name avx10_512vl, leaf 0x24, sub-leaf 0x0, register EBX, bit 18
</ul>
The feature group with a single CPUID record:
<ul>
    <li> CPUID name avx512f, leaf 0x7, sub-leaf 0x0, register EBX, bit 16
</ul>
This means that a chip supports AVX512F_512 ISA if at least one of the two groups has a match.
In order to match one CPUID group, all of its records must be set. So either the first group's three CPUID records or the second 
group's single CPUID record must be set.
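
The OR-of-ANDs rule can be sketched in plain C using the AVX512F_512
records listed above. Everything here is an assumption made for
illustration: the record layout, the function names and the two
made-up chips are hypothetical and are not the Intel&reg; XED CPUID types or
APIs. The version record is treated as "field value is at least 1".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical record layout for illustration; not the XED record type.
typedef struct {
    uint32_t leaf, subleaf;
    int      reg;                 // 0 = EAX, 1 = EBX, 2 = ECX, 3 = EDX
    unsigned bit_start, bit_end;  // inclusive bit range in the register
    uint32_t min_value;           // field must be >= this to count as set
} cpuid_record_t;

typedef struct {
    const cpuid_record_t *records;
    size_t                nrecords;
} cpuid_group_t;

// Caller-supplied query that fills regs[] with EAX, EBX, ECX, EDX.
typedef void (*cpuid_query_fn)(uint32_t leaf, uint32_t subleaf, uint32_t regs[4]);

int record_is_set(const cpuid_record_t *r, cpuid_query_fn query)
{
    uint32_t regs[4] = { 0, 0, 0, 0 };
    unsigned width;
    uint32_t mask;
    assert(r->reg >= 0 && r->reg < 4);
    assert(r->bit_start <= r->bit_end && r->bit_end < 32);
    width = r->bit_end - r->bit_start + 1;
    mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    query(r->leaf, r->subleaf, regs);
    return ((regs[r->reg] >> r->bit_start) & mask) >= r->min_value;
}

// A group is an AND over all of its records.
int group_is_satisfied(const cpuid_group_t *g, cpuid_query_fn query)
{
    size_t i;
    for (i = 0; i < g->nrecords; i++)
        if (!record_is_set(&g->records[i], query))
            return 0;
    return 1;
}

// An ISA-SET is an OR over its groups.
int isa_set_is_supported(const cpuid_group_t *groups, size_t ngroups,
                         cpuid_query_fn query)
{
    size_t i;
    for (i = 0; i < ngroups; i++)
        if (group_is_satisfied(&groups[i], query))
            return 1;
    return 0;
}

// The two AVX512F_512 groups from the text above.
const cpuid_record_t avx10_records[] = {
    { 0x7,  0x1, 3, 19, 19, 1 },  // avx10_enabled
    { 0x24, 0x0, 1,  0,  7, 1 },  // avx10_ver1: version field >= 1
    { 0x24, 0x0, 1, 18, 18, 1 },  // avx10_512vl
};
const cpuid_record_t avx512f_records[] = {
    { 0x7, 0x0, 1, 16, 16, 1 },   // avx512f
};
const cpuid_group_t avx512f_512_groups[] = {
    { avx10_records, 3 },
    { avx512f_records, 1 },
};

// Made-up chip that reports only the avx512f feature bit.
void chip_with_avx512f(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    if (leaf == 0x7 && subleaf == 0x0)
        regs[1] = 1u << 16;
}

// Made-up chip that reports AVX10 version 1 without the 512-bit VL bit.
void chip_with_avx10_256(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    if (leaf == 0x7 && subleaf == 0x1)
        regs[3] = 1u << 19;
    if (leaf == 0x24 && subleaf == 0x0)
        regs[1] = 1u;
}
```

The first chip satisfies the single-record feature group, so the
expression is true. The second chip sets two of the three records in the
Intel&reg; AVX10 group but not the vector length bit, so neither group is
satisfied.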

To provide further insight on Intel&reg; AVX10 CPUID, let's discuss the first CPUID group of AVX512F_512:
The first record ("AVX10 Converged Vector ISA Enable" bit) is indicative of processor support of Intel&reg; AVX10 ISA. The second CPUID record 
specifies the processor's minimal required Intel&reg; AVX10 version (in this case, AVX10.1). The last CPUID record is the vector length bit indicating 
the maximum supported VL (512).

For the recommended usage of the Intel&reg; XED CPUID APIs, see @ref SMALLEXAMPLES .

// ===========================================================================  
@section ICLASS Instruction classes 

The #xed_iclass_enum_t class describes the instruction names. The
names are (mostly) taken from the Intel manual, with exceptions only
for certain ambiguities.  This is what is typically thought of as the
instruction mnemonic. Note, Intel&reg; XED does not typically distinguish
instructions based on width unless the ISA manuals do so as well.  For
example, #xed_iclass_enum_t's are not suffixed with "w", "l" or "q"
typically. There are instructions whose #xed_iclass_enum_t ends in a
"B" or a "Q" (including all byte operations and certain string
operations) and those names are preserved as described in the Intel
programmers' reference manuals.




@subsection SPECIAL Special Cases

There are many special cases that must be accounted for in attempting
to handle all the nuances of the ISA. This is an attempt to explain
the nonstandard handling of certain instruction names.

The FAR versions of 3 opcodes (really 6 distinct opcodes) are given
the opcode names CALL_FAR, JMP_FAR, and RET_FAR. The AMD documentation
lists the far return as RETF. Intel&reg; XED calls that RET_FAR to be consistent
with the other far operations.

To distinguish the SSE2 MOVSD instruction from the base string
instruction MOVSD, Intel&reg; XED calls the SSE version MOVSD_XMM.

In March 2015, a change was made to certain Intel&reg; XED iclasses to simplify
the implementation. The changes are as follows:
    <ul>
    <li> XED_ICLASS_JRCXZ was split into three distinct iclasses:
    XED_ICLASS_JCXZ, XED_ICLASS_JECXZ and XED_ICLASS_JRCXZ.
    <li> The REP-prefixed (0xF2, 0xF3) string instructions were split
    into new iclasses making them distinct from the underlying
    non-REP-prefixed instructions.  For example XED_ICLASS_REP_STOSW
    is distinct from XED_ICLASS_STOSW.  The CMPS{B,W,D,Q} and
    SCAS{B,W,D,Q} instructions have "REPE_" or "REPNE_" prefixes to
    correspond to REPE (0xF3) or REPNE (0xF2).
    <li> LOCK-prefixed (0xF0) atomic read-modify-write memory
    instructions were split into separate iclasses that contain the
    substring "_LOCK".  LOCK-prefixed instructions have the attribute
    XED_ATTRIBUTE_LOCKED. Memory instructions that can have a lock
    prefix added to them when encoding have the attribute
    XED_ATTRIBUTE_LOCKABLE.  For example, XED_ICLASS_CMPXCHG16B_LOCK
    has a lock prefix, but XED_ICLASS_CMPXCHG16B does not have a lock
    prefix.  As always, XCHG is atomic with or without a LOCK prefix
    as per the rules of the ISA, so XED_ICLASS_XCHG does not have a
    _LOCK suffix in the xed_iclass_enum_t name.
    </ul>

@subsection NOPs NOPs

NOPs are very special. Intel&reg; XED allows for encoding NOPs of 1 to 9 bytes
through the use of the XED_ICLASS_NOP (the one-byte nop), and
XED_ICLASS_NOP2 ... XED_ICLASS_NOP9. These use the recommended NOP
sequences from the Intel&reg; 64 and IA-32 Architectures Software Developer's Manual.

The instruction 0x90 is very special in the instruction set because it
gets special treatment in 64b mode. In 64b mode, 32b register writes
normally zero the upper 32 bits of a 64b register. Not so for 0x90. If
it did zero the upper 32 bits, it would not be a NOP.

There are two important NOP categories: XED_CATEGORY_NOP and
XED_CATEGORY_WIDENOP. The XED_CATEGORY_NOP applies only to the 0x90
opcode. The WIDENOP category applies to the NOPs in the two-byte table
row 0F19...0F1F. The WIDENOPs take MODRM bytes, and optional SIB and
displacements.

// ===========================================================================
// @section X86-OPERANDS Operands


Intel&reg; XED uses the operand order documented in the Intel Programmers'
Reference Manual.  In most cases, the first operand is a source and
destination (read and written) and the second operand is just a source
(read).

For decode requests (#xed_decoded_inst_t), the operands array is
stored in the #xed_inst_t structure once the instruction is
decoded. The request's operand order is stored in the #xed_encoder_request_t
for encode requests.

There are several types of operands: 
      <ul>
      <li> registers (#xed_reg_enum_t)
      <li> branch displacements  
      <li> memory operations (which include base, index, segment and memory displacements)
      <li> immediates
      <li> pseudo resources (which are listed in the #xed_reg_enum_t)
      </ul>

Each operand has two associated attributes: the R/W action and a
visibility. The R/W actions (#xed_operand_action_enum_t) indicate
whether the operand is read, written or both read-and-written, or
conditionally read or written.  The visibility attribute
(#xed_operand_visibility_enum_t) is described in the next subsection.

The memory operation operand is really a pointer to separate fields
that hold the memory operation information. The memory operation information consists of the following:
     <ul>
     <li> a segment register
     <li> a base register
     <li> an index register
     <li> a displacement
     </ul>

There are several important things to note:
      <ul>
      <li> There can only be two memory operations, MEM0 and MEM1.
      
      <li> MEM0 could also be an AGEN, which stands for "Address
         Generation". AGEN is a special operand that uses memory
         information but does not actually read memory. This is only
         used for the LEA instruction.
         
      <li> There can only be an index and displacement associated with
         MEM0 (or AGEN).
      
      <li> There is just one displacement associated with the common
         fields. It could be associated with either the AGEN/MEM0 or
         with a branch or call instruction.
         
      </ul>

@subsection AVX512_OPERANDS Intel&reg; AVX512 Operands

Intel&reg; AVX512 adds write masking, merging, and zeroing to the
instruction set via the EVEX encodings.  Write masking, merging, and
zeroing are properties of the instruction encoding and are not visible
by looking at individual operands. Write masking with merging makes it
possible for values of the destination register to live on from prior
to the execution of the instruction. Write masking with merging
results in an extra register read of the destination operand. In
contrast, write masking with zeroing always completely overwrites the
destination operand, either with values computed by the instruction or
with zeros for elements that are "masked off".

For most operands, to learn if the operand reads or writes its
associated resource, one can use #xed_operand_rw(const xed_operand_t*
p). However, because masking, merging and zeroing are properties of the
instruction, and not just the operand, use of a different function is
required.

To handle this, Intel&reg; XED has a new interface function
#xed_decoded_inst_operand_action(), which takes a #xed_decoded_inst_t
pointer and an operand index and indicates how the read/write behavior
is modified in the presence of masking with merging or masking with
zeroing.

The following list attempts to summarize how the value returned from
xed_operand_rw() is internally modified for the 0th operand, except
for stores:
<ul>
<li> no masking: no change. 
<li> masking with zeroing: no change. 
<li> masking with merging : destination register operands 
     that are nominally "rw" or "w" become "rcw" indicating
     a read with a conditional write.
</ul>
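
The three rules above amount to a small mapping from the nominal
read/write action and the masking mode to the effective action. The
sketch below restates that mapping with hypothetical enumerations; they
are not the Intel&reg; XED enumerations, and the function only covers the
destination register case described above (the 0th operand, excluding
stores).

```c
#include <assert.h>

// Hypothetical enumerations for illustration; not XED enumerations.
typedef enum { ACT_R, ACT_W, ACT_RW, ACT_RCW } action_t;
typedef enum { MASK_NONE, MASK_ZEROING, MASK_MERGING } masking_t;

// Effective action of a destination register operand under masking.
action_t effective_dest_action(action_t nominal, masking_t masking)
{
    assert(masking == MASK_NONE || masking == MASK_ZEROING ||
           masking == MASK_MERGING);
    // Merging keeps old destination elements alive, so a nominal "w"
    // or "rw" becomes a read with a conditional write.
    if (masking == MASK_MERGING && (nominal == ACT_W || nominal == ACT_RW))
        return ACT_RCW;
    // No masking and masking with zeroing leave the action unchanged.
    return nominal;
}
```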

@subsection APX_OPERANDS Intel&reg; APX Operands

2023 saw the introduction of Intel&reg; Advanced Performance Extensions (Intel&reg; APX),
which expands the entire x86 instruction set with access to more registers.
Intel&reg; APX doubles the number of general-purpose registers (GPRs) from 16 to 32 (Extended GPRs or EGPRs).

New and promoted APX-F instructions are defined in one of the following Intel&reg; XED extension groups:
- XED_EXTENSION_APXLEGACY: For new APX-F instructions within the Legacy encoding space
- XED_EXTENSION_APXEVEX: For new and promoted APX-F instructions within the EVEX encoding space

CCMP and CTEST are new conditional CMP and TEST instructions that introduce a 4-bit "Default Flags Values" 
field, represented in Intel&reg; XED as the DFV XED-operand.
The DFV XED-operand encodes default flag bits as:
- DFV.bit[0]: Carry Flag (CF)
- DFV.bit[1]: Zero Flag (ZF)
- DFV.bit[2]: Sign Flag (SF)
- DFV.bit[3]: Overflow Flag (OF)

To use DFV:
- Decoder:
  - Use xed_decoded_inst_has_default_flags_values() to detect DFV-supported instructions.
  - Use xed_decoded_inst_get_default_flags_values() to retrieve default flags as a xed_flag_dfv_t struct.
- Intel&reg; XED CLI encode request:
  - Set the DFV XED-operand as an integer representing these 4 bits in the encoder request.
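
The integer form of the DFV operand follows directly from the bit
assignments above. The following sketch packs and unpacks the four
flags; the struct and function names are hypothetical stand-ins for
illustration and are not the Intel&reg; XED xed_flag_dfv_t type or its
accessors.

```c
#include <assert.h>

// Hypothetical flag container for illustration; not xed_flag_dfv_t.
typedef struct { unsigned cf, zf, sf, of; } dfv_flags_t;

// Pack the flags into the 4-bit DFV integer:
// bit 0 = CF, bit 1 = ZF, bit 2 = SF, bit 3 = OF.
unsigned dfv_pack(dfv_flags_t f)
{
    assert(f.cf <= 1 && f.zf <= 1 && f.sf <= 1 && f.of <= 1);
    return f.cf | (f.zf << 1) | (f.sf << 2) | (f.of << 3);
}

// Unpack a 4-bit DFV integer into the individual flags.
dfv_flags_t dfv_unpack(unsigned dfv)
{
    dfv_flags_t f;
    assert(dfv <= 0xF);
    f.cf = dfv & 1;
    f.zf = (dfv >> 1) & 1;
    f.sf = (dfv >> 2) & 1;
    f.of = (dfv >> 3) & 1;
    return f;
}
```

For example, requesting default values of OF=1 and CF=1 with ZF and SF
clear corresponds to the integer 9.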

Developers can dynamically disable Intel&reg; APX encoder support using the 'NO_APX' API xed3_operand_set_no_apx().
The xed3_operand_set_must_use_evex() API can also be used with APX-promoted instructions to force the encoding request into the EVEX space.

Developers wishing to encode No-Flags Intel&reg; APX instructions should set the NF Intel&reg; XED operand.

@subsection OPERAND_VISIBILITY Operand Resource Visibilities

See #xed_operand_visibility_enum_t .

There are three basic types of resource visibilities: 
      <ul>
      <li> EXPLICIT (EXPL), 
      <li> IMPLICIT (IMPL), and
      <li> IMPLICIT SUPPRESSED (SUPP) (usually referred to as just "SUPPRESSED").
      </ul>

Explicit resources are what you think they are: resources that
are required for the encoding, and for each explicit resource there is
a field in the corresponding instruction encoding.  The implicit and
suppressed resources are more subtle.


SUPP operands are:
 <ul>
 <li> not used in picking an encoding, 
 <li> not printed in disassembly, 
 <li> not represented using operand bits in the encoding.
 </ul>
IMPL operands are:
 <ul>
 <li> used in picking an encoding, 
 <li> expressed in disassembly, and 
 <li> not represented using operand bits in the encoding (like SUPP).
 </ul>

The implicit resources are required for selecting an encoding but do
not show up as a specific field in the instruction
representation. Implicit resources do show up in a conventional
instruction disassembly. In the IA-32 instruction set or Intel&reg; 64
instruction set, there are many instructions that use EAX or RAX
implicitly, for example.  Sometimes, the CL or RCX register is
implicit. Also, some instructions have an implicit 1 immediate. The
opcode you choose fixes your choice of implicit register or immediate.

The suppressed resources are a form of implicit resource, but they are
resources not required for encoding. The suppressed operands are not
normally displayed in a conventional disassembly.  The suppressed
operands are emitted by the decoder but are not used when
encoding. They are ignored by the encoder. Examples are the stack
pointer for PUSH and POP operations. There are many others, like
pseudo resources. 


The explicit and implicit resources are expressed resources -- they show
up in disassembly and are required for encoding.
The suppressed resources are considered a kind of implicit 
resource that is not expressed in ATT System V or Intel disassembly formats.

The suppressed operands are always after the implicit and explicit operands
in the operand order.  
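
The differences between the three visibilities can be summarized as a
small property table. This is only a restatement of the rules above
using hypothetical names; it is not the #xed_operand_visibility_enum_t
type or any Intel&reg; XED function.

```c
#include <assert.h>

// Hypothetical names for illustration; not the XED visibility enum.
typedef enum { VIS_EXPLICIT, VIS_IMPLICIT, VIS_SUPPRESSED } visibility_t;

typedef struct {
    int selects_encoding;    // used when picking an encoding
    int printed;             // shown in conventional disassembly
    int has_encoding_field;  // has its own operand bits in the encoding
} visibility_props_t;

// Look up the properties of a visibility class.
visibility_props_t visibility_props(visibility_t v)
{
    static const visibility_props_t table[] = {
        { 1, 1, 1 },   // explicit
        { 1, 1, 0 },   // implicit
        { 0, 0, 0 },   // suppressed
    };
    assert((int)v >= 0 && (int)v <= 2);
    return table[v];
}
```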



@subsection X87_REG_STACK x87 Register stack popping

The Intel&reg; 64 and IA-32 Architectures Software Developer's Manual indicates that "FADDP st2"
reads st0 and st2, writes st2, and pops the x87 stack. The result ends up
in st1 after the instruction executes. That is not how Intel&reg; XED represents
the operation.  Intel&reg; XED will say that "FADDP st2" reads st0 and st2 and
writes st2. The output register that Intel&reg; XED provides is essentially "pre
pop". The pop occurs afterward, conceptually. The actual result ends
up in the st1 register after the stack pop operation.  Intel&reg; XED also lists
the pseudo resources indicating that a stack pop has occurred. This
behavior affects the output register of the following instructions: FADDP,
FMULP, FSUBRP, FSUBP, FDIVRP, FDIVP.
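
The "pre pop" view can be made concrete with a toy model of the x87
register stack. The sketch below is an assumption-laden illustration
with hypothetical names: it models only eight data registers and a top
index (no tag word, no exceptions) and is not related to any Intel&reg; XED
data structure.

```c
#include <assert.h>

// Toy model of the x87 register stack; hypothetical, for illustration.
typedef struct {
    double   reg[8];
    unsigned top;      // index of the physical register that is st0
} x87_stack_t;

// Address of stack-relative register st(i).
double *st(x87_stack_t *s, unsigned i)
{
    assert(i <= 7);
    return &s->reg[(s->top + i) & 7];
}

// Pop the stack: every st(i) becomes st(i-1).
void x87_pop(x87_stack_t *s)
{
    s->top = (s->top + 1) & 7;
}

// Model of "FADDP st(i)".
void faddp(x87_stack_t *s, unsigned i)
{
    // Step 1, the operands as XED reports them: read st0 and st(i),
    // write st(i).
    *st(s, i) += *st(s, 0);
    // Step 2, the pop happens conceptually afterward; the sum written
    // to st(i) is now addressed as st(i-1).
    x87_pop(s);
}
```

Starting with st0=1.0, st1=2.0 and st2=4.0, faddp(&s, 2) writes 5.0 to
the register that is st2 before the pop, and that same register is
addressed as st1 afterward.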

@subsection PSEUDO_RESOURCES Pseudo Resources

Some instructions reference machine registers or perform interesting
operations that we need to represent.  For example, the IDTR and GDTR
are represented as pseudo resources. Operations that pop the x87
floating point register stack can have an X87POP or X87POP2 "register"
to indicate if the x87 register stack is popped once or twice. These
are part of the #xed_reg_enum_t.

@subsection IMM_DIS Immediates and Displacements

Use the API functions for setting immediates, memory displacements,
and branch displacements.  Immediates and displacements are stored in
normal integers internally, but they are stored endian swapped and
left justified.  The API functions take care of all the endian
swapping and positioning so you don't have to worry about that detail.

Immediates and displacements are different things in the ISA. They can
be 1, 2, 4 or 8 bytes.  Branch displacements (1, 2 or 4 bytes) and
memory displacements (1, 2, 4 or 8 bytes) refer to the signed
constants that are used for relative distances or memory "offsets"
from a base register (including the instruction pointer) or start of a
memory region.

Immediates are signed or unsigned and are used for numerical
computations and shift distances. They also hold things like segment
selectors for far pointers for certain jump or call instructions.

There is also a second 1B immediate used only for the ENTER
instruction.

Intel&reg; XED will try to use the shortest allowed width for a displacement or
immediate. You can control Intel&reg; XED's selection of allowed widths using a
notion of "legal widths".  A "legal width" is a binary number where
each bit represents a legal desired width. For example, when you have
a valid base register in 32 or 64b addressing, and a displacement is
required, your displacement must be either 1 byte or 4 bytes
long. This is expressed by OR'ing 1 and 4 together to get 0101 (base
2) or 5 (base 10).

If a four-byte displacement was required, but the value was
representable in fewer than four bytes, then the legal width should be
set to 0100 (base 2) or 4 (base 10). 
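
The legal-width selection described above can be sketched in plain C.
This is only an illustration of the bit-mask idea, not the Intel&reg; XED
implementation: the helper name pick_shortest_legal_width is
hypothetical, and the sketch assumes that a bit value of 1, 2, 4 or 8 in
the mask marks a legal width of that many bytes, as in the 0101 example.

```c
#include <stdint.h>

// Hypothetical helper, not part of the Intel XED API: return the smallest
// width in bytes (1, 2, 4 or 8) that is allowed by the legal_widths bit mask
// and can hold the signed value, or 0 if no legal width can hold it.
static unsigned int pick_shortest_legal_width(int64_t value,
                                              unsigned int legal_widths)
{
    unsigned int width;
    for (width = 1; width <= 8; width <<= 1) {
        int fits;
        if ((legal_widths & width) == 0)
            continue; // this width is not allowed by the mask
        if (width == 8) {
            fits = 1; // any int64_t fits in 8 bytes
        } else {
            int64_t limit = (int64_t)1 << (8 * width - 1);
            fits = (value >= -limit && value < limit);
        }
        if (fits)
            return width;
    }
    return 0;
}
```

With a mask of 5, the sketch picks 1 byte for a displacement of 0x10
and 4 bytes for 0x11223344. With a mask of 4, it picks 4 bytes even for
0x10, which corresponds to forcing a four-byte displacement.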

@section API_REF  API Reference

  - @ref INIT         "INIT"       Initialization
  - @ref DEC          "DEC"        Decoding instructions
  - @ref ENC          "ENC"        Generic API for encoding instructions
  - @ref ENCHL        "ENCHL"      High level API for the generic encoder
  - @ref ENCHLPATCH   "ENCHLPATCH" Patching instructions
  - @ref ENC2         "ENC2"      Fast encoder for specific instructions
  
  - @ref OPERANDS     "OPERANDS"   Operand storage fields
  - @ref IFORM        "IFORM"      Iforms
  - @ref ISASET       "ISASET"     ISA-sets and chips
  - @ref PRINT        "PRINT"      Printing (disassembling) instructions
  - @ref REGINTFC     "REGINTFC"   Register interface functions
  - @ref FLAGS        "FLAGS"      Flags interface functions
  - @ref AGEN         "AGEN"       Address generation calculation support
  - @ref ENUM         "ENUM"       Enumerations
  - @ref EXAMPLES     "Examples"   Examples



@section LEGAL  Disclaimer and Legal Information

The information in this manual is subject to change without notice and
Intel Corporation assumes no responsibility or liability for any
errors or inaccuracies that may appear in this document or any
software that may be provided in association with this document. This
document and the software described in it are furnished under license
and may only be used or copied in accordance with the terms of the
license. No license, express or implied, by estoppel or otherwise, to
any intellectual property rights is granted by this document. The
information in this document is provided in connection with Intel
products and should not be construed as a commitment by Intel
Corporation.

EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH
PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS
ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL
PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A
PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT,
COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not
intended for use in medical, life saving, life sustaining, critical
control or safety systems, or in nuclear facility applications.

Designers must not rely on the absence or characteristics of any
features or instructions marked "reserved" or "undefined." Intel
reserves these for future definition and shall have no responsibility
whatsoever for conflicts or incompatibilities arising from future
changes to them.

The software described in this document may contain software defects
that may cause the product to deviate from published
specifications. Current characterized software defects are available
on request.

Intel, the Intel logo, Intel SpeedStep, Intel NetBurst, Intel
NetStructure, MMX, Intel386, Intel486, Celeron, Intel Centrino, Intel
Xeon, Intel XScale, Itanium, Pentium, Pentium II Xeon, Pentium III
Xeon, Pentium M, and VTune are trademarks or registered trademarks of
Intel Corporation or its subsidiaries in the United States and other
countries.

Other names and brands may be claimed as the property of others.

Copyright (c) 2002-2023 Intel Corporation. All Rights Reserved.

*/

// =============================================================
/*! @defgroup DEC Decoding Instructions

    To decode an instruction you are required to provide 
      <ul>
      <li> a machine state (operating mode and stack addressing width)
      <li> a pointer to the instruction text array of bytes
      <li> a length of the text array
      </ul>
 
    The machine state is passed to the decoder via a #xed_state_t .
    That state is set when each #xed_decoded_inst_t is initialized.

    After a successful decode, the #xed_decoded_inst_t contains the
    results of decoding.

    The #xed_decoded_inst_t includes an array of #xed_operand_values_t
    and that is where most of the information about the operands,
    resources etc. is stored. See the @ref OPERANDS interface. The
    array is indexed by the #xed_operand_enum_t enumeration. Do not
    access it directly though; use the interface functions in the @ref
    OPERANDS interface for portability.

    After decoding, the #xed_decoded_inst_t contains a pointer to the
    #xed_inst_t which acts like a kind of template giving static
    information about the decoded instruction: what are the types of
    the operands, the iclass, category, extension, etc. The #xed_inst_t
    is accessed via the #xed_decoded_inst_inst(const
    xed_decoded_inst_t* xedd) function.

    Before every decode, you must call one of the initialization
    functions. The most common case would be to use
    #xed_decoded_inst_zero_keep_mode() or maybe
    #xed_decoded_inst_zero_set_mode().

  */


/*! @defgroup ENC Encoding Instructions

    When you call xed_encode() to encode an instruction you must pass:
       <ul>
       <li> an encode structure that includes a machine state ( #xed_state_t )
       <li> a pointer to the instruction text
       <li> a length of the text array
       </ul>
    The #xed_encoder_request_t structure includes a #xed_operand_values_t
    and that is where most of the information about the operands,
    resources etc. is stored.


    To get nondefault width operands, during encoding, you have to
    call #xed_encoder_request_set_effective_operand_width() .


    To set nondefault addressing widths, you must call
    #xed_encoder_request_set_effective_address_size().


       
    To encode instructions you must set the following
    in the #xed_encoder_request_t:
    <ol>
    <li> the machine mode (machine width, and stack addressing width)
    <li> the effective operand width
    <li> the iclass
    <li> for some instructions you need to specify prefixes (like REP,
    REPNE or LOCK).
    <li> the operands:
           <ol>

           <li>operand kind
            (XED_OPERAND_{AGEN,MEM0,MEM1,IMM0,IMM1,RELBR,ABSBR,PTR,REG0...REG15})

           <li>operand order <BR>
       xed_encoder_request_set_operand_order(&req,operand_index, XED_OPERAND_*);
        where the operand_index is a sequential index starting at zero.

           <li>operand details
                 <ol>
                 <li> FOR MEMOPS: base, segment, index, scale, displacement
                 <li> FOR REGISTERS: register name
                 <li> FOR IMMEDIATES: immediate values
                 </ol>
           </ol>
    </ol>

    See @ref ENCODE_EXAMPLE for an example of using the encoder.
 
 */

/*! @defgroup ENCHL High Level API for Encoding Instructions

This is a higher level API for encoding instructions.

A full example is present in examples/xed-ex5-enc.c

In the following example we create one instruction template that can
be passed to the encoder.

    @code
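 #include <stdio.h>
 #include "xed-interface.h"

 // Assumes xed_tables_init() has already been called once.
 int encode_add_example(void)
 {
     xed_encoder_instruction_t x;
     xed_encoder_request_t enc_req;
     xed_state_t dstate;
     xed_uint8_t itext[XED_MAX_INSTRUCTION_BYTES];
     unsigned int ilen = XED_MAX_INSTRUCTION_BYTES;
     unsigned int olen = 0;
     xed_bool_t convert_ok;
     xed_error_enum_t xed_error;

     dstate.mmode=XED_MACHINE_MODE_LEGACY_32;
     dstate.stack_addr_width=XED_ADDRESS_WIDTH_32b;

     // 0 selects the default effective operand width
     xed_inst2(&x, dstate, XED_ICLASS_ADD, 0,
               xed_reg(XED_REG_EAX),
               xed_mem_bd(XED_REG_EDX, xed_disp(0x11223344, 32), 32));

     xed_encoder_request_zero_set_mode(&enc_req, &dstate);
     convert_ok = xed_convert_to_encoder_request(&enc_req, &x);
     if (!convert_ok) {
          fprintf(stderr,"conversion to encode request failed\n");
          return 1;
     }
     xed_error = xed_encode(&enc_req, itext, ilen, &olen);
     return (xed_error == XED_ERROR_NONE) ? 0 : 1;
 }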

    @endcode


The high-level encoder interface allows passing the effective operand
width for the xed_inst*() function as 0 (zero) when the effective
operand width is the default.

The default width in 16b mode is 16b. The default width in 32b or 64b
modes is 32b.  So if you do a 16b operation in 32b/64b mode, you must
set the effective operand width. If you do a 64b operation in 64b
mode, you must set it (the default is 32). Or if you do a rarer
32b operation in 16b mode you must also set it.

When all the operands are "suppressed" operands, then the effective
operand width must be supplied for nondefault operation widths.
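
The default-width rule above can be summarized in a short sketch. The
two helper names are hypothetical and not part of the Intel&reg; XED API;
the sketch only models the 16b/32b/64b rule stated in this section and
ignores byte operations.

```c
// Hypothetical helpers, not part of the Intel XED API.
// mode_bits is 16, 32 or 64; the result is the default operand width in bits.
static unsigned int default_operand_width(unsigned int mode_bits)
{
    return (mode_bits == 16) ? 16 : 32;
}

// Value to pass as the effective operand width to the high-level encoder:
// 0 when the operation uses the default width, the width in bits otherwise.
static unsigned int effective_operand_width_arg(unsigned int mode_bits,
                                                unsigned int op_bits)
{
    return (op_bits == default_operand_width(mode_bits)) ? 0 : op_bits;
}
```

For example, a 64b operation in 64b mode would pass 64, while a 32b
operation in 32b or 64b mode could pass 0.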

*/

/*! @defgroup ENCHLPATCH Patching instructions

These functions are useful for JITs and other uses where one must
modify certain fields of instructions after encoding. To modify an
instruction, one must encode it (creating an itext array of bytes) and
then decode it (so that the patching routines know where the various
fields are located). Once the itext and the decoded instruction are
available, certain fields can be modified.

The decode step required to create patchable instructions obviously
takes additional time so it is suggested one only create patchable
instructions once as templates and re-use them as needed.

See examples/xed-ex9-patch.c for an example.
*/


/*! @defgroup ENC2 Fast Encoder for Specific Instructions

The basic idea for the ENC2 fast encoder is that there is one encode
function per variant of every instruction. The instructions are
encoded in 3 encoding spaces (legacy, VEX and EVEX). We need to have
different function names for every variation as well. To come up with
unique names, ENC2 uses a few function naming conventions.  For legacy
encoded instructions, we often have 3 variations in 64b mode (2 in
other modes) to handle 16-bit, 32-bit and 64-bit operands. Those 3
sizes are usually differentiated with "_o16", "_o32" and "_o64" in the
ENC2 function names.  Having unique names is complicated as there are
often multiple encodings for the same operation in the instruction
set. To disambiguate alias encodings, the function names include the
substring "_vrN" where N is an integer.  Similarly, VEX and EVEX
encodings for related instructions often need to be distinguished when
their instruction name and operands are the same. To accomplish that
all ENC2 EVEX encoding function names contain the substring "_e".
The checked interface functions end with "_chk".

For instructions that take conventional x86 memory operands, there are
6 functions generated depending on the addressing mode required. The 6
functions are denoted: b, bd8, bd32, bis, bisd8, and bisd32 where:
<ul>
<li> "b" indicates a base register,
<li> "d8" indicates an 8-bit displacement,
<li> "d32" indicates a 32-bit displacement,
<li> "i" indicates an index register, and
<li> "s" indicates a scale factor (1,2,4,8) for the index register.
</ul>
The idea behind having different functions for the different addressing
modes is to make the encode functions simpler and more straight-line code.
Memory instructions also indicate their effective addressing width
with one of "_a16", "_a32" or "_a64" substrings.
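
As a rough illustration of how the six addressing forms map onto the
naming convention, the following sketch picks the memory-operand
substring for a given addressing form. The helper enc2_mem_suffix is
hypothetical; it is not part of the generated ENC2 headers.

```c
#include <stddef.h>

// Hypothetical helper, not part of the Intel XED API: map an addressing form
// to the ENC2 memory-operand substring described above. has_index is nonzero
// when an index register and scale are used; disp_bits is 0, 8 or 32.
static const char* enc2_mem_suffix(int has_index, unsigned int disp_bits)
{
    switch (disp_bits) {
      case 0:  return has_index ? "bis"    : "b";
      case 8:  return has_index ? "bisd8"  : "bd8";
      case 32: return has_index ? "bisd32" : "bd32";
      default: return NULL; // no such ENC2 addressing form
    }
}
```

A base register plus a scaled index and a 32-bit displacement yields
"bisd32", the substring that appears in names such as
xed_enc_lea_rm_q_bisd32_a64_chk.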


The libraries for the ENC2 encoder are built when the "--enc2" switch 
is included during the build process.  There is one set of
libraries and headers generated for each supported
configuration. Currently Intel&reg; XED ENC2 supports 64b mode with 64b addressing
(m64,a64) and 32b mode with 32b addressing (m32,a32).  The build
process creates an enc2-m64-a64 directory and an enc2-m32-a32
directory, each with two libraries for the checked and unchecked
interfaces. There are 2 headers as well, one for each version of each
library in the hdr/xed subdirectory of their respective enc2-*
directory. On Linux, for a static build, you'd see:
@code
enc2-m64-a64/
            libxed-chk-enc2-m64-a64.a
            libxed-enc2-m64-a64.a
            hdr/
                xed/
                    xed-chk-enc2-m64-a64.h
                    xed-enc2-m64-a64.h
@endcode

Given the large size of the generated ENC2 headers, doxygen
documentation is not created for those header files. Please view the
headers directly in your editor.

Even with the unchecked interface, some register checking is done for the
addressing registers.  In the x86 encoding system, some choices of
base register require that an 8-bit or 32-bit displacement is also
used. In those cases, the ENC2 encoder is capable of supplying a
zero-valued displacement.

Intel&reg; XED also offers the capability to test ENC2 with either the "--enc2-test-checked" flag or
the "--enc2-operands-checked" flag. Building Intel&reg; XED with either of these flags leads to a longer build.
The former flag allows developers to test the ENC2 checked interface in a more sparing manner, where each
encoded instruction is decoded and its IFORM is validated. The latter flag offers more rigorous testing: each
encoded instruction is decoded and then its IFORM and all operands involved in the encoding are validated as well.

Users can install their own error handler by calling
#xed_enc2_set_error_handler() passing a function pointer that takes
stdarg variable arguments.  See examples/xed-enc2-2.c for an example.

When using the checked interface, one can disable the checking at
runtime by calling
#xed_enc2_set_check_args() with an integer value 0.
With a nonzero argument, the argument checking can be re-enabled.

To minimize copying, ENC2 users are required to supply a pointer to an
output buffer where the encoding bytes will be placed. That buffer is
required to be 15 bytes in length. Valid x86 encodings are at most
15 bytes long and only reach that length if redundant legacy prefixes
are employed. XED ENC2 does not generate redundant legacy prefixes.

Here is an example of creating an LEA instruction using the checked
interface and several fixed registers:
@code
xed_uint32_t create_lea_64b(xed_uint8_t* output_buffer)
{
    xed_reg_enum_t dest, base, index;
    xed_uint_t scale;
    xed_int32_t disp32;
    xed_enc2_req_t request;
    xed_enc2_req_t_init(&request, output_buffer);
    dest = XED_REG_R11;
    base = XED_REG_R12;
    index = XED_REG_R13;
    scale = 1;
    disp32 = 0x11223344;
    xed_enc_lea_rm_q_bisd32_a64_chk(&request,
                                    dest,
                                    base, index, scale, disp32);
    return xed_enc2_encoded_length(&request);
}
@endcode

The call to #xed_enc2_req_t_init() zeros out the request structure and
sets up the pointer to the output buffer.  It is very important to
zero the request structure before using it as much of the ENC2 code is
optimized to not set zero-valued bits to zero.  The call to
#xed_enc2_encoded_length() returns the number of bytes placed in the
output buffer. Getting the length of the encoding is useful for
setting the correct buffer pointer for subsequent encoder requests.
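
The buffer handling described above can be sketched without the
generated ENC2 headers. Everything prefixed with demo_ is hypothetical:
demo_encode_fn_t stands in for a wrapper with the same shape as the
create_lea_64b() example, and demo_emit_nop is a trivial stand-in that
writes the one-byte NOP opcode 0x90.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_INST_BYTES 15

// Hypothetical callback type: encodes one instruction at out and returns
// the number of bytes written (at most 15).
typedef uint32_t (*demo_encode_fn_t)(uint8_t* out);

// Trivial stand-in encoder that emits a one-byte NOP.
static uint32_t demo_emit_nop(uint8_t* out)
{
    out[0] = 0x90;
    return 1;
}

// Encode a sequence of instructions back to back. Returns the total number
// of bytes written, stopping early if fewer than 15 bytes remain.
static size_t demo_encode_sequence(uint8_t* buf, size_t capacity,
                                   const demo_encode_fn_t* encoders,
                                   size_t count)
{
    size_t used = 0;
    size_t i;
    for (i = 0; i < count; i++) {
        if (capacity - used < DEMO_MAX_INST_BYTES)
            break; // every request needs 15 bytes of room
        used += encoders[i](buf + used); // advance by the encoded length
    }
    return used;
}
```

The returned length of each request is what moves the output pointer
forward, so the instructions end up packed with no gaps even though
each request is given 15 bytes of room.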


See examples/xed-enc2-1.c and
    examples/xed-enc2-2.c 
for examples.
 */


/*! @defgroup OPERANDS Operand storage fields

The operand storage fields are an array of values used for decoding
and for encoding.  This holds derived semantic information from decode
or required fields used during encoding.  They are accessible from a
#xed_decoded_inst_t or a #xed_encoder_request_t .  */


/*! @defgroup IFORM Iforms

Intel&reg; XED classifies instructions as iclasses (ADD, SUB, MUL, etc.) of type
#xed_iclass_enum_t.  To get more information about instructions and
their operands, Intel&reg; XED creates iforms of type #xed_iform_enum_t. The
iforms are supposed to aid in creating dispatch tables for
instructions. You can often use a flat array indexed by iform. The
maximum iform is #XED_IFORM_LAST.
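
A flat dispatch table of the kind mentioned above might look like the
following sketch. To keep it compilable without the Intel&reg; XED headers
it uses a made-up three-entry enumeration, demo_iform_t; real code would
presumably index by the iform enumeration and size the table with
XED_IFORM_LAST. The handler functions are placeholders as well.

```c
#include <stddef.h>

// Made-up stand-in for the iform enumeration so the sketch compiles
// without the Intel XED headers.
typedef enum {
    DEMO_IFORM_INVALID,
    DEMO_IFORM_ADD_GPR8_GPR8_00,
    DEMO_IFORM_ADD_GPR8_IMMb_80r0,
    DEMO_IFORM_LAST
} demo_iform_t;

typedef int (*demo_handler_t)(void);

static int handle_add_reg_reg(void) { return 1; }
static int handle_add_reg_imm(void) { return 2; }

// Flat dispatch table indexed by iform; entries left out are NULL.
static demo_handler_t const demo_dispatch[DEMO_IFORM_LAST] = {
    [DEMO_IFORM_ADD_GPR8_GPR8_00]   = handle_add_reg_reg,
    [DEMO_IFORM_ADD_GPR8_IMMb_80r0] = handle_add_reg_imm,
};

// Returns the handler's result, or -1 when no handler is registered.
static int demo_dispatch_iform(demo_iform_t iform)
{
    if ((unsigned int)iform >= DEMO_IFORM_LAST || demo_dispatch[iform] == NULL)
        return -1;
    return demo_dispatch[iform]();
}
```

Because the enumeration is dense, the table lookup is a single array
index with a bounds check and no searching.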

The iforms sometimes do not uniquely identify instructions. For
example, many instructions in the ISA are "scalable" in that their
operand width depends on the machine mode and the prefixes. The memory
operation of these scalable opcodes is either 16 bits, 32 bits or 64
bits. The same opcode can represent several instructions if you factor
in the machine mode and prefixes. Those instructions often map to a
single iform and need to be further refined by the
#xed_operand_values_get_effective_operand_width function.

The names of the iforms are derived from information about the
#xed_iclass_enum_t and the names of their explicit operands (the name of 
nonterminals in the Intel&reg; XED internal grammar) and the data types of those
operands. Other information is sometimes included to disambiguate
similar instructions. For example, there are several opcodes and
operands for encoding a certain 1-byte register-register ADD
instruction as well as the 1-byte register-immediate ADD, so to
differentiate those, Intel&reg; XED includes the opcode bytes as suffixes for the
iform name:

@code
  ADD_GPR8_GPR8_00      
  ADD_GPR8_GPR8_02    
  ADD_GPR8_IMMb_80r0  
  ADD_GPR8_IMMb_82r0  
@endcode

The naming scheme for iforms can get rather complex and continues to
evolve over time as the instruction set architecture grows.  They
mostly use the lower-case letter codes from the opcode map
in the appendix to the Intel&reg; 64 and IA-32 Architectures Software
Developer's Manual.  For example the scalable instructions
mentioned above use the "v" code which the manuals describe as
representing 16, 32 or 64b operands depending on the effective operand
size.  The code "z" implies either 16 or 32b operation; when the
effective operand size is 64, the operand is still 32b. Other common
suffixes one might see are "d" for 32b and "q" for 64b. The codes "ps"
and "pd" stand for packed single (single precision floating point) and
packed double (double precision floating point). The code "dq" is used
to describe 128b (16B) quantities typically in memory or an XMM
register. Similarly "qq" describes a 256b (32B) quantity in memory or
a YMM register.  In many cases the codes were sufficient to describe
what is needed; in other cases I had to improvise.

All the iclasses and iforms are listed in the misc/idata.txt file in
the Intel&reg; XED kit.  

The iform enumeration #xed_iform_enum_t is dense and it has some
built-in structure. All the iforms for a particular iclass are sequential.
The function #xed_iform_max_per_iclass() indicates the number of iforms
for a particular iclass. 

To get the first iform of a particular iclass you can use
#xed_iform_first_per_iclass() at runtime.  There is also the
#xed_iformfl_enum_t which indicates for every iclass, the first and
last iform in the #xed_iform_enum_t.

Given an iform, to get #xed_category_enum_t, #xed_extension_enum_t,
and #xed_iclass_enum_t information, you can use #xed_iform_map(), or
there are accessors listed below to get the iclass, category or
extension from that table directly.  */


/*! @defgroup ISASET Groupings of features for chips

Every Intel&reg; XED iform belongs to one #xed_isa_set_enum_t. Each Intel&reg; XED chip of
type #xed_chip_enum_t represents a collection of xed "isa-sets".  If
you have a #xed_decoded_inst_t, you can get the isa set via
the function #xed_decoded_inst_get_isa_set.

Intel&reg; XED chip-check supports the detection of all Intel&reg; APX instructions and flavors, whether it is a new Intel&reg; APX instruction,
a legacy instruction with a REX2 prefix, an EVEX instruction with an EGPR as one of its operands (register or memory) or an EVEX instruction with an ignored EGPR encoding.

*/

/*! @defgroup CPUID CPUID Interface

Each Intel&reg; XED ISA-SET can be mapped to one or more CPUID groups (feature bit, Intel&reg; AVX10...) and each CPUID group is mapped to one or
more CPUID records.
For each ISA-SET, the CPUID scan should proceed in two nested loops: the outer loop iterates over all of the ISA-SET's CPUID groups and the inner loop iterates over all of the group's CPUID records.
For a usage example, please check the xed-ex1.c example (@ref SMALLEXAMPLES).

*/

/*! @defgroup PRINT Printing (disassembling) Instructions

    There are two primary instruction printing
    functions: #xed_format_generic() and #xed_format_context() .
    Both emit disassembly to a user specified buffer.
    #xed_format_generic() takes all the required information in a
    pointer to a structure of type #xed_print_info_t.  In contrast,
    #xed_format_context(), takes its arguments individually. Both
    versions can take a void* context argument that is passed to
    an optional symbolic disassembly callback function.  

    The disassembly dialect (order of operands and formatting) is
    specified by the #xed_syntax_enum_t parameter. For finer control
    on certain aspects of disassembly, the parameter to
    #xed_format_generic() has a field specifying lower level formatting
    options (#xed_format_options_t).

 */

/*! @defgroup REGINTFC Register Interface

    There are several functions that provide more information about
    the GPRs and the nesting of GPRs.

 */

/*! @defgroup FLAGS Flags Interface

    There are several functions that provide more information about
    the flags read and written.

    The flags are available from the #xed_decoded_inst_t via the
    #xed_decoded_inst_get_rflags_info()  function which
    returns a #xed_simple_flag_t pointer.

    The type #xed_flag_set_t keeps the integer flags in the order
    specified by the RFLAGS register. The x87 flags are stored in the
    most significant 4 bits of the flag set. This should not affect use
    by the normal integer operations; those bits are reserved as zero
    in the RFLAGS.
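
    The layout described above can be illustrated with a small sketch.
    It assumes the flag set is viewed as one flat 32-bit value, so that
    the four x87 flag bits are bits 28 through 31. That width, and the
    demo_ names, are assumptions made for this illustration and are not
    taken from the Intel&reg; XED headers.

```c
#include <stdint.h>

// Assumption for this sketch: the flag set is one flat 32-bit value, so the
// four x87 flag bits occupy bits 28..31.
#define DEMO_X87_FLAG_SHIFT 28u
#define DEMO_X87_FLAG_MASK  (0xFu << DEMO_X87_FLAG_SHIFT)

// Integer (RFLAGS-ordered) part of the flat flag set.
static uint32_t demo_integer_flags(uint32_t flat)
{
    return flat & ~DEMO_X87_FLAG_MASK;
}

// The four x87 flag bits, moved down to bits 0..3.
static uint32_t demo_x87_flags(uint32_t flat)
{
    return (flat & DEMO_X87_FLAG_MASK) >> DEMO_X87_FLAG_SHIFT;
}
```

    Masking off the top four bits leaves the integer flags in their
    RFLAGS positions, which is why the two groups do not interfere.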

 */


/*! @defgroup AGEN Address generation calculation support

    There are several functions available that help with computation
    of addresses.  Note the "big real" or "unreal" address calculation
    is not currently supported.  Two callbacks are defined for
    providing register values or segment base values.  For real mode,
    the selector value is used in the address computation. In protected
    mode or long mode, the segment descriptor callbacks are used.
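
    For real mode, the standard x86 computation shifts the selector left
    by four bits and adds the 16-bit effective address. The sketch
    below shows only that arithmetic; real_mode_linear_address is a
    hypothetical helper, not an Intel&reg; XED function, and it does not
    model address wrap-around at 1MB.

```c
#include <stdint.h>

// Hypothetical helper, not part of the Intel XED API: real-mode linear
// address from a 16-bit segment selector and a 16-bit effective address.
static uint32_t real_mode_linear_address(uint16_t selector, uint16_t offset)
{
    return ((uint32_t)selector << 4) + offset;
}
```

    For example, selector 0x1234 with offset 0x0010 gives the linear
    address 0x12350.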

 */


/*! @defgroup ENUM Intel&reg; XED enumerations

Almost all the enumerations in Intel&reg; XED are automatically generated and
have conversion functions to and from strings. There is also a
function for finding out what the last element of the enumeration is.

 */


/*! @defgroup INIT Intel&reg; XED initialization

    This section describes the base class used for initializing the
    encoder / decoder requests and the Intel&reg; XED library initialization
    function.

    To use Intel&reg; XED, you must
    include "xed-interface.h" 

    @code
    #include "xed-interface.h"
    @endcode

    If you are calling Intel&reg; XED from C++, you must wrap this include:

    @code
    extern "C" {
    #include "xed-interface.h"
    }
    @endcode

    Once, before using Intel&reg; XED, you must call #xed_tables_init() to
    initialize the tables Intel&reg; XED uses for encoding and decoding:
    @code
    xed_tables_init();
    @endcode

    Once initialized, Intel&reg; XED is reentrant (multithread safe). All values
    used for encoding and decoding live on the caller's stack or in
    the passed-in parameters.

    If your program is multithreaded, initialize Intel&reg; XED once (and only
    once) using the above call before you attempt to decode or encode
    from any thread. Each thread does NOT need to initialize Intel&reg; XED. The
    idea is to initialize Intel&reg; XED before creating your threads. 

   */

/*! @defgroup CMDLINE Intel&reg; XED command interface

The command line tool called xed or xed.exe is built when you build
the examples (@ref EXAMPLES) that come with Intel&reg; XED.


This tool is useful for encoding and decoding or even
decoding-then-re-encoding a single instruction or all the instructions
in the text segment of an ELF binary (32 or 64b). For decoding, just
jump to the examples.


This section also explains a little language for writing the
instructions for encode requests (-e option).  This tool is constantly updated.
The xed-ex3 (xed-ex3.exe) example is just
the encoder portion of the xed command line tool.

The SUPPRESSED operands emitted by the decoder are not used when
encoding. They are ignored. They are not required to select an
encoding.

The syntax for encodable strings is as follows:
@code
             Opcode[/width]   [operand [operand]]
@endcode

The width is 8, 16, 32 or 64, indicating the effective operand width
if it differs from the default. 8b operations generally require
this. Since most operations default to 32b widths in 64b mode,
it is also required for 64b operation widths in 64b mode.

The operand specifier is one of the following.  

- A register name such as EAX or R8B, etc. Case does not matter.

- An immediate specifier such as IMM:12ff 

- A branch displacement specifier such as BRDISP:0000001f

- A memory specifier that indicates the base register, index register,
scale value, and displacement value. If one of the fields is not
required, a - is necessary.  The displacement is omittable. For
example: MEM4:ESI,EAX,8,ff or MEM4:EBX. The first one specifies that
the memory operation is 4 bytes and the address should be ESI + EAX * 8 + 0xff.  The
second one specifies that EBX should be used to access 4 bytes of
memory; note the displacement is omitted.  A segment override can be
specified as follows: MEM4:GS:EAX by using a segment-name followed by
a ":" before the base register. If there is no base register, you can
use a "-", for example: MEM4:GS:-,-,11223344.  One also needs to
specify a memory operation width. This can be accomplished by
indicating a number of bytes just after the MEM specifier. For
example: MEM2:EAX indicates a 2 byte memory operation. 


- An address generation specifier that has the same syntax as the above
MEM: specifier, but is only used for LEA instructions.  Example:
AGEN:EAX,EBX,2,-


Here is the help message:

@code

% xed -h
Usage: xed [options]
One of the following is required:
    -i input_file             (decode pecoff-format file)
    -ir raw_input_file        (decode a raw unformatted binary file)
    -ih hex_input_file        (decode a raw unformatted ASCII hex file)
    -d hex-string             (decode a sequence of bytes, must be last)
    -j                        (just decode one instruction when using -d)
    -F prefix                 (decode ascii hex bytes after prefix)
                              (running in filter mode from stdin)
    -ide input_file           (decode/encode file)
    -e instruction            (encode, must be last)
    -f                        (encode force, skip encoder chip check)
    -ie file-to-assemble      (assemble the contents of the file)
    -de hex-string            (decode-then-encode, must be last)

Optional arguments:

    -v N          (0=quiet, 1=errors, 2=useful-info, 3=trace,
                    5=very verbose)
    -xv N         (XED engine verbosity, 0...99)

    -chip-check CHIP   (count instructions that are not valid for CHIP)
    -chip-check-list   (list the valid chips)

    -s section    (target section for file disassembly,
                    PECOFF and ELF formats only)

    -n N          (number of instructions to decode. Default 100M,
                    accepts K/M/G qualifiers)

    -b addr       (Base address offset, for DLLs/shared libraries.
                    Use 0x for hex addresses)
    -as addr      (Address to start disassembling.
                    Use 0x for hex addresses)
    -ae addr      (Address to end   disassembling.
                    Use 0x for hex addresses)
    -no-resync    (Disable symbol-based resynchronization algorithm
                    for disassembly)
    -ast          (Show the AVX/SSE transition classfication)
    -histo        (Histogram decode times)

    -I            (Intel syntax for disassembly)
    -A            (ATT SYSV syntax for disassembly)
    -isa-set      (Emit the XED "ISA set" in dissasembly)
    -xml          (XML formatting)
    -uc           (upper case hex formatting)
    -pmd          (positive memory displacement formatting)
    -nwm          (Format AVX512 without curly braces for writemasks, include k0)
    -emit-ignored-branch-hint (emit ignored branch hints during disassembly)
    -emit         (Output __emit statements for the Intel compiler)
    -S file       Read symbol table in "nm" format from file
    -dot FN       (Emit a register dependence graph file in dot format.
                    Best used with -as ADDR -ae ADDR to limit graph size.)

    -r            (for REAL_16 mode, 16b addressing (20b addresses),
                    16b default data size)
    -r32          (for REAL_32 mode, 16b addressing (20b addresses),
                    32b default data size)
    -16           (for LEGACY_16 mode, 16b addressing,
                    16b default data size)
    -32           (for LEGACY_32 mode, 32b addressing,
                    32b default data size -- default)
    -64           (for LONG_64 mode w/64b addressing
                    Optional on windows/linux)
    -mpx          (Turn on MPX mode for disassembly, default is off)
    -cet          (Turn on CET mode for disassembly, default is off)
    -s32          (32b stack addressing, default, not in LONG_64 mode)
    -s16          (16b stack addressing, not in LONG_64 mode)
    -set OP VAL   (Set a XED operands to some integer value, repeatable)
    -version      (The version message)
    -help         (This help message)
@endcode

Here are a couple of examples:

@code
% xed -d 0000
ICLASS:     ADD
CATEGORY:   BINARY
EXTENSION:  BASE
IFORM:      ADD_MEMb_GPR8
ISA_SET:    I86
ATTRIBUTES: BYTEOP LOCKABLE
SHORT:      add byte ptr [eax], al

% xed -e ADD EAX EBX
Request: ADD MODE:1, REG0:EAX, REG1:EBX, SMODE:1
OPERAND ORDER: REG0 REG1
Encodable! 01D8
.byte 0x01,0xd8

% xed -e ADD EAX MEM4:ESP,EBX,4
Request: ADD EASZ:2, MEM0:dword ptr [ESP+EBX*4], MEM_WIDTH:4, MODE:1, REG0:EAX, SMODE:1
OPERAND ORDER: REG0 MEM0
Encodable! 03049C
.byte 0x03,0x04,0x9c

% xed -d 6a00
ICLASS:     PUSH
CATEGORY:   PUSH
EXTENSION:  BASE
IFORM:      PUSH_IMMb
ISA_SET:    I186
ATTRIBUTES: FIXED_BASE0 SCALABLE STACKPUSH0
SHORT:      push 0x0

% xed -e MOV EAX MEM4:SS:ESP
Request: MOV EASZ:2, MEM0:dword ptr SS[ESP], MEM_WIDTH:4, MODE:1, REG0:EAX, SMODE:1
OPERAND ORDER: REG0 MEM0
Encodable! 8B0424
.byte 0x8b,0x04,0x24

% xed -64 -e CCMPB r8b r9b dfv14
Request: CCMPB MODE:2, REG0:R8B, REG1:R9B, REG2:DFV14, SMODE:2
OPERAND ORDER: REG0 REG1 REG2 
Encodable! 6254740238C8
.byte 0x62,0x54,0x74,0x02,0x38,0xc8

@endcode 

Or using the xed-ex3 example tool:
@code 
% obj/xed-ex3
Usage: obj/xed-ex3 [-16|-32|-64] [-s16|-s32] encode-string
@endcode

The -16, -32 or -64 are for specifying the major mode of the machine.
The major mode of the machine determines the default data operand
size and default addressing width.  In 64b mode, the default data
size is 32b and the default addressing mode is 64b
addressing.  In 32b mode, the default addressing width is 32b. In 16b
mode, the default addressing width is 16b. In 32b mode or 16b mode,
the stack addressing width must also be specified. Usually it matches
the major mode.  The -s16 option is for specifying 16b stack
addressing in 32b mode. The -s32 is for specifying 32b stack
addressing in 16 bit mode.

@code
% obj/xed-ex3 -64 PUSH/64 RAX
Encode request:
PUSH EOSZ:3, MODE:2, REG0:RAX, SMODE:2
OPERAND ORDER: REG0

Encodable! 50

% obj/xed-ex3 MOV MEM4:EAX IMM:11223344
Encode request:
MOV EASZ:2, IMM0:0x11223344, IMM_WIDTH:32, MEM0:dword ptr [EAX], MEM_WIDTH:4, MODE:1, SMODE:1
OPERAND ORDER: MEM0 IMM0

Encodable! C70044332211
@endcode
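
In the last example the immediate 0x11223344 shows up in the encoding
C70044332211 as the bytes 44 33 22 11, because x86 places immediates
and displacements in the instruction least significant byte first. The
following sketch shows that byte order; store_le32 is an illustrative
helper and not part of the Intel&reg; XED API.

```c
#include <stdint.h>

// Illustrative helper, not part of the Intel XED API: store a 32-bit value
// least significant byte first, the order used in x86 instruction bytes.
static void store_le32(uint8_t* out, uint32_t value)
{
    unsigned int i;
    for (i = 0; i < 4; i++)
        out[i] = (uint8_t)(value >> (8 * i));
}
```

Calling store_le32 with 0x11223344 fills the four output bytes with
0x44, 0x33, 0x22 and 0x11, matching the tail of the encoding above.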

@section ENCODE_EXAMPLE An example of using the encoder

The encoder language file which is part of the xed command line tool
shows how to build up instructions from scratch.  The example uses a
string to drive the creation of the instruction, but that is just an
example. Look at the parse_encode_request function for the required
pieces.

\include xed-enc-lang.c
 

 */

/*! @defgroup EXAMPLES  Examples of using Intel&reg; XED

The source code for the examples is in the "examples" subdirectory.

There is a makefile that will build all the examples on Linux or
Windows.

There are several examples, mainly:
      <ul>
      
      <li> xed.c: This is the main Intel&reg; XED example. It's a complete example touching on most of Intel&reg; XED's capabilities
      and APIs, including decoding, encoding, image file reading and chip-check APIs and functionality.
      <li> xed-ex1.c: A quintessential decoder example that uses most of the decode APIs.
      It is recommended for users who want to understand in depth the APIs for decoded instructions and their usage.
      It is included in the "Small Examples" section below.
      This example supports emitting detailed instruction metadata: prefixes, CPUID leaves, rounding modes, branch hints, flags, operands and more.
      It can also elaborate on EVEX/VEX prefix bit info using a specified verbosity mode, and it supports CPUID-based defeaturing.
      <li> xed-ex3.c: A lightweight encoder example used to parse command line arguments and encode them into a stream of bytes. 
      For more insight on the encode request syntax and the encoder examples, please see @ref CMDLINE page.
      <li> xed-ex4.c: A simple decoder example with different disassembly output formats. It contains highly descriptive
      usages of the new decode-with-features APIs.
      <li> xed-ex5-enc.c: A simple non-dynamic encoder example using the high-level encoding API.
      <li> xed-ex6.c: This is a subset of the main example, which works as a concentrated example of the decode-encode (-de) feature.
      <li> xed-ex9-patch.c: A non-dynamic example that shows how to patch (modify) the xed_encoder_instruction data structure
      yielded from the encoder.
      <li> xed-asmparse-main.c: A unique encoder example showcasing how to process command-line encode requests in assembly format.
      It allows processing multiple requests separated by a semicolon. This is different from other encoder examples in that the input
      is not in Intel&reg; XED encode request format.

      </ul>

Other specific use-case examples:
      <ul>
      
      <li> xed-dll-discovery.c: When using Intel&reg; XED as a DLL or shared object, the enumerations can change from one version to another if instructions (or features) 
      are added. Each Intel&reg; XED enumeration must be mapped to something your Intel&reg; XED client can use for independent compilation (a custom fixed enumeration).
      This example shows how to discover the Intel&reg; XED values for the xed_iclass_enum_t and construct a mapping from names that are constant to your tool to 
      names that can vary. You would need to do this for each Intel&reg; XED enumeration that your Intel&reg; XED client uses.
      <li> xed-enc2-*.c: These examples showcase the use of ENC2 encode APIs. ENC2 is our much faster, low-level encoder.
      <li> xed-ex-agen.c: Creates an artificial AGEN (address generation) calculator for testing and serves as an example of using the xed_agen API.
      <li> xed-ild-dec.c: An example for using the instruction length decoder API.
      <li> xed-ild-dec2.c: An example that compares the results of regular decoding and instruction length decoding for further validation.
      <li> xed-tables.c: An example that shows how to access Intel&reg; XED's internal tables, which contain static data for all Intel&reg; XED-supported instructions.
      <li> xed-size.c: Outputs the sizes of Intel&reg; XED data structures. Can be useful for analytical purposes and auto-generation workflows.
      <li> xed-reps.c: Shows the use of the REP prefix APIs, which return the non-REP variant of a REP ICLASS and vice versa.

      </ul>
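
The mapping idea behind xed-dll-discovery.c can be sketched without the
library. In the C sketch below, the client owns a fixed enumeration and a
list of names, and asks the library for the current value of each name at
startup. All identifiers are hypothetical, and fake_library_lookup is a
stand-in with made-up values. A real client would presumably use the
library's own string conversion for xed_iclass_enum_t (such as
str2xed_iclass_enum_t) instead; see xed-dll-discovery.c for the actual
approach.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Fixed, client-owned enumeration. These values stay constant for the client.
typedef enum {
    MY_ICLASS_INVALID = 0,
    MY_ICLASS_ADD,
    MY_ICLASS_NOP,
    MY_ICLASS_LAST
} my_iclass_t;

// Names are the stable link between the client and the library.
static const char* const my_iclass_names[MY_ICLASS_LAST] = {
    "INVALID", "ADD", "NOP"
};

// A run-time lookup from a name to the library's current enumeration value.
typedef int (*name_lookup_fn)(const char* name);

// Stand-in for the library lookup. The values 7 and 412 are made up.
// Unknown names map to 0, which plays the role of "invalid" here.
int fake_library_lookup(const char* name)
{
    if (strcmp(name, "ADD") == 0) return 7;
    if (strcmp(name, "NOP") == 0) return 412;
    return 0;
}

// Records, for each client value, the value the library uses right now.
void build_iclass_map(name_lookup_fn lookup, int lib_value[MY_ICLASS_LAST])
{
    int i;
    assert(lookup != NULL && lib_value != NULL);
    for (i = 0; i < MY_ICLASS_LAST; i++)
        lib_value[i] = lookup(my_iclass_names[i]);
}

// Translates a value returned by the library into the client's enumeration.
my_iclass_t lib_to_client(const int lib_value[MY_ICLASS_LAST], int v)
{
    int i;
    for (i = 1; i < MY_ICLASS_LAST; i++)
        if (lib_value[i] == v)
            return (my_iclass_t)i;
    return MY_ICLASS_INVALID;
}
```

The table is built once at startup, so the client code can be compiled
against its own fixed values and still work when the library's values shift
between versions.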


The above examples use the following utilities to process different inputs:
      <ul>
      
      <li> xed-disas-elf: Used to disassemble ELF (Executable and Linkable Format) binaries. 
      If the binary contains DWARF debugging information (generated by the compiler in a specific format), this example can map 
      addresses to source code line numbers, making it easier to correlate the disassembled code with the original source code.
      <li> xed-disas-macho: Useful for disassembling Mach-O binaries on macOS.
      <li> xed-disas-hex: Used to disassemble hexadecimal byte sequences from a file.
      <li> xed-disas-raw: Used to disassemble raw binary data.
      <li> xed-disas-pecoff: Used to disassemble PE/COFF (Portable Executable/Common Object File Format) binaries on Windows.
      <li> xed-enc-lang: A very detailed component that shows how to parse encode requests from the command line and then perform the encoding.
      <li> xed-dot: Generates a dependency graph from a set of decoded instructions and writes the results to a dot file.
      The dot file can be converted into a .png to visualize the results. Each node represents an instruction, and an edge is added for each
      dependency between the registers involved (RAW, WAW, WAR).

      </ul>
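
The three dependency kinds named for xed-dot can be stated precisely in
terms of register read and write sets. The following C sketch is not taken
from xed-dot. The reg_use_t type, the one-bit-per-register numbering and the
function name are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Dependency kinds, usable as bit flags.
enum { DEP_NONE = 0, DEP_RAW = 1, DEP_WAW = 2, DEP_WAR = 4 };

// Register usage of one instruction. Each bit stands for one register
// (hypothetical numbering: bit 0 is register 0, and so on).
typedef struct {
    uint32_t reads;
    uint32_t writes;
} reg_use_t;

// Returns the dependencies of a later instruction on an earlier one.
int classify_dependency(const reg_use_t* earlier, const reg_use_t* later)
{
    int deps = DEP_NONE;
    assert(earlier != NULL && later != NULL);
    if (earlier->writes & later->reads)
        deps |= DEP_RAW;  // read after write
    if (earlier->writes & later->writes)
        deps |= DEP_WAW;  // write after write
    if (earlier->reads & later->writes)
        deps |= DEP_WAR;  // write after read
    return deps;
}
```

A graph builder in this style would add an edge between two instruction
nodes whenever the result is not DEP_NONE, and could label the edge with the
kinds that were found.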


The examples are described in the following subsections:
    - @ref SMALLEXAMPLES  "Small Examples"  Minimal and detailed decoder examples
    - @ref CMDLINE        "Command line"    Intel&reg; XED's command line testing tool
    - @ref ENCODE_EXAMPLE "Encode Example"  An example of using the encoder

*/

/*! @defgroup SMALLEXAMPLES  Small Examples of using Intel&reg; XED

Here is a minimal example of using Intel&reg; XED from the file examples/xed-min.c.

\include xed-min.c

There is a makefile in the examples directory. Here's how to compile
it from a kit:
@code
% gcc -Ipath-to-xed-kit/include -Ipath-to-xed-kit/examples \
      -c path-to-xed-kit/examples/xed-min.c
% gcc -o xed-min xed-min.o path-to-xed-kit/lib/libxed.a
@endcode
where path-to-xed-kit is the directory that contains the include,
examples and lib directories of an installed Intel&reg; XED kit.


Here is a more detailed example (examples/xed-ex1.c) that walks the
operands much like the printing routines do for the
#xed_decoded_inst_t .

\include xed-ex1.c

Here are a few examples of running the program:

@code

% ./xed-ex1 01 c0

iclass ADD      category BINARY ISA-extension BASE      ISA-set I86
instruction-length 2
operand-width 32
effective-operand-width 32
effective-address-width 32
stack-address-width 32
iform-enum-name ADD_GPRv_GPRv_01
iform-enum-name-dispatch (zero based) 14
iclass-max-iform-dispatch 42
Nominal opcode position 0
Nominal opcode 0x01
Operands
#   TYPE               DETAILS        VIS  RW       OC2 BITS BYTES NELEM ELEMSZ   ELEMTYPE   REGCLASS
#   ====               =======        ===  ==       === ==== ===== ===== ======   ========   ========
0   REG0              REG0=EAX   EXPLICIT  RW         V   32     4     1     32        INT        GPR
1   REG1              REG1=EAX   EXPLICIT   R         V   32     4     1     32        INT        GPR
2   REG2           REG2=EFLAGS SUPPRESSED   W         Y   32     4     1     32        INT      FLAGS
Memory Operands
  MemopBytes = 0
FLAGS:
  must-write-rflags of-mod sf-mod zf-mod af-mod pf-mod cf-mod
       read:                                mask=0x0
    written:             of sf zf af pf cf  mask=0x8d5
  undefined:                                mask=0x0
ATTRIBUTES: SCALABLE
ISA SET: [I86]

===============================================================================

% ./xed-ex1 f2 0f 58 9c 24 e0 00 00 00

iclass ADDSD    category SSE    ISA-extension SSE2      ISA-set SSE2
instruction-length 9
operand-width 32
effective-operand-width 32
effective-address-width 32
stack-address-width 32
iform-enum-name ADDSD_XMMsd_MEMsd
iform-enum-name-dispatch (zero based) 0
iclass-max-iform-dispatch 2
Nominal opcode position 2
Nominal opcode 0x58
Operands
#   TYPE               DETAILS        VIS  RW       OC2 BITS BYTES NELEM ELEMSZ   ELEMTYPE   REGCLASS
#   ====               =======        ===  ==       === ==== ===== ===== ======   ========   ========
0   REG0             REG0=XMM3   EXPLICIT  RW        SD   64     8     1     64     DOUBLE        XMM
1   MEM0           (see below)   EXPLICIT   R        SD   64     8     1     64     DOUBLE    INVALID
Memory Operands
  0    read SEG= SS BASE= ESP/GPR DISPLACEMENT_BYTES= 4 0x00000000000000e0 base10=224 ASZ0=32
  MemopBytes = 8
ATTRIBUTES: MXCSR SIMD_SCALAR
F2 PREFIX
EXCEPTION TYPE: SSE_TYPE_3
SSE
SCALAR
Number of legacy prefixes: 1
ISA SET: [SSE2]
0       CPUID GROUP NAME: [SSE2]
        0       CPUID RECORD NAME: [SSE2]
                {Leaf 0x00000001, subleaf 0x00000000, EDX[26:26]} = 1

===============================================================================

% ./xed-ex1 f3 90

iclass PAUSE    category MISC   ISA-extension PAUSE     ISA-set PAUSE
instruction-length 2
operand-width 32
effective-operand-width 32
effective-address-width 32
stack-address-width 32
iform-enum-name PAUSE
iform-enum-name-dispatch (zero based) 0
iclass-max-iform-dispatch 1
Nominal opcode position 1
Nominal opcode 0x90
Operands
#   TYPE               DETAILS        VIS  RW       OC2 BITS BYTES NELEM ELEMSZ   ELEMTYPE   REGCLASS
#   ====               =======        ===  ==       === ==== ===== ===== ======   ========   ========
Memory Operands
  MemopBytes = 0
ATTRIBUTES: NOTSX
F3 PREFIX
Number of legacy prefixes: 1
ISA SET: [PAUSE]

===============================================================================

@endcode
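
Some fields in the output above can be cross-checked directly against the
instruction bytes. The C sketch below splits the ModRM and SIB bytes of the
ADDSD example, reads its little-endian displacement, and rebuilds the
written-flags mask of the ADD example from the architectural EFLAGS bit
positions. The helper names are hypothetical and the code does not use the
Intel&reg; XED API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { unsigned mod, reg, rm; } modrm_t;
typedef struct { unsigned scale, index, base; } sib_t;

// Splits a ModRM byte into its 2-bit mod, 3-bit reg and 3-bit rm fields.
modrm_t split_modrm(uint8_t b)
{
    modrm_t m;
    m.mod = (unsigned)(b >> 6);
    m.reg = (unsigned)((b >> 3) & 7);
    m.rm  = (unsigned)(b & 7);
    return m;
}

// Splits a SIB byte into its 2-bit scale, 3-bit index and 3-bit base fields.
sib_t split_sib(uint8_t b)
{
    sib_t s;
    s.scale = (unsigned)(b >> 6);
    s.index = (unsigned)((b >> 3) & 7);
    s.base  = (unsigned)(b & 7);
    return s;
}

// Reads a 32b little-endian value, such as a 4-byte displacement.
uint32_t read_le32(const uint8_t* p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Architectural EFLAGS bit positions of the six arithmetic flags.
enum { CF_BIT = 0, PF_BIT = 2, AF_BIT = 4, ZF_BIT = 6, SF_BIT = 7, OF_BIT = 11 };

// Mask with the of, sf, zf, af, pf and cf bits set.
uint32_t arith_flags_mask(void)
{
    return (1u << OF_BIT) | (1u << SF_BIT) | (1u << ZF_BIT) |
           (1u << AF_BIT) | (1u << PF_BIT) | (1u << CF_BIT);
}
```

For the ADDSD bytes, ModRM 0x9c splits into mod=2, reg=3 (XMM3) and rm=4
(a SIB byte follows), SIB 0x24 gives base=4 (ESP) with no index, and the
displacement bytes e0 00 00 00 read as 224. The six arithmetic flag bits
combine to 0x8d5, the written mask reported for ADD.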




*/
