# Architecture updater

This is Capstone's updater for some architectures.
Unfortunately, not all architectures are supported yet.

## Install dependencies

Install clang-format

```
sudo apt install clang-format-18
```

Set up the Python environment and Tree-sitter

```
cd <root-dir-Capstone>
# Python version must be at least 3.11
sudo apt install python3-venv
# Set up a virtual environment in the Capstone root dir
python3 -m venv ./.venv
source ./.venv/bin/activate
pip3 install -r dev_requirements.txt
```
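If the setup fails early, it can help to confirm that the interpreter meets the 3.11 minimum and that the virtual environment is active. The following is a minimal sketch; the helper names `python_is_new_enough` and `in_virtualenv` are made up for illustration and are not part of the updater.

```python
import sys

# Minimum version taken from the comment in the setup commands above.
MIN_VERSION = (3, 11)


def python_is_new_enough(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) part of version_info meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def in_virtualenv():
    """Return True when running inside a venv (prefix differs from the base prefix)."""
    return sys.prefix != sys.base_prefix


print("Python version OK:", python_is_new_enough())
print("Inside virtual environment:", in_virtualenv())
```

Run it with the same `python3` you intend to use for the updater, after sourcing `./.venv/bin/activate`.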

Clone C++ grammar

```
cd suite/auto-sync/
git submodule update --init --recursive ./vendor/
```

## Update

Check if your architecture is supported.

```
./Updater/ASUpdater.py -h
```

Clone Capstone's LLVM fork and build `llvm-tblgen`

```
git clone https://github.com/capstone-engine/llvm-capstone
cd llvm-capstone
git checkout auto-sync
mkdir build
cd build
# You can also build the "Release" version
cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug ../llvm
cmake --build . --target llvm-tblgen --config Debug
cd ../../
```
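To check that the build produced the tool, you can look for the binary in the build tree. This sketch assumes the usual LLVM layout with the Ninja generator, where tools end up in `<build-dir>/bin`; that location is an assumption here and may differ with other generators. `find_llvm_tblgen` is a hypothetical helper, not part of the updater.

```python
import os
from pathlib import Path


def find_llvm_tblgen(build_dir):
    """Return the path to an executable llvm-tblgen in build_dir/bin, or None.

    Assumption: the build places tools in <build_dir>/bin.
    """
    candidate = Path(build_dir) / "bin" / "llvm-tblgen"
    if candidate.is_file() and os.access(candidate, os.X_OK):
        return candidate
    return None


# Example: check the build directory created by the commands above.
print(find_llvm_tblgen("llvm-capstone/build"))
```

A result of `None` means the binary is missing or not executable at the assumed location.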

Run the updater

```
./Updater/ASUpdater.py -a <ARCH>
```

## Post-processing steps

This update translates some LLVM C++ files to C.
Because the translation is not perfect (maybe it will be some day),
you will get build errors if you try to compile Capstone.

The last step to finish the update is to fix those build errors by hand.

## Developer

### Overview of updated files

This is a rough overview of which files of an architecture are updated and where they come from.

**Files originating from LLVM** (Automatically updated)

These files are LLVM source files which were translated from C++ to C.
Not every architecture uses all of the files listed below,
but these are the most common.

- `<ARCH>Disassembler.*`: Bytes to `MCInst` decoder.
- `<ARCH>InstPrinter.*` or `<ARCH>AsmPrinter.*`: `MCInst` to asm string printer.
- `<ARCH>BaseInfo.*`: Commonly used functions and definitions.

`*.inc` files are generated exclusively by LLVM TableGen backends.

`*.inc` files for the LLVM component are named like this:
- `<ARCH>Gen*.inc` (note: no `CS` in the name)

Additionally, we generate more details for Capstone with `llvm-tblgen`,
such as enums and operand details.

They are also saved to `*.inc` files, but have `CS` in the name to distinguish them from the LLVM-generated files.

- `<ARCH>GenCS*.inc`
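
The naming convention above is enough to tell the two kinds of `.inc` files apart by file name alone. The following is a small sketch; `classify_inc` is a hypothetical helper, and the file names in the example are only illustrations of the pattern.

```python
def classify_inc(filename, arch):
    """Classify a file name by the <ARCH>Gen*.inc / <ARCH>GenCS*.inc convention.

    Returns "capstone" for <ARCH>GenCS*.inc, "llvm" for other <ARCH>Gen*.inc
    files, and "other" for everything else.
    """
    if not filename.endswith(".inc"):
        return "other"
    # Check the more specific prefix first: every GenCS name also starts with Gen.
    if filename.startswith(f"{arch}GenCS"):
        return "capstone"
    if filename.startswith(f"{arch}Gen"):
        return "llvm"
    return "other"


# Illustrative names only; real file names depend on the architecture.
for name in ("AArch64GenAsmWriter.inc", "AArch64GenCSMappingInsn.inc", "AArch64Mapping.c"):
    print(name, "->", classify_inc(name, "AArch64"))
```

Note that the check is purely name-based: an LLVM-generated file whose name happened to start with `<ARCH>GenCS` would be misclassified.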

**Capstone module files** (Not automatically updated)

These files are written by us:

- `<ARCH>DisassemblerExtension.*`: All kinds of functions which are needed by the LLVM component, but could not be generated or translated.
- `<ARCH>Mapping.*`: Binding code between the architecture module and the LLVM files. This is also where the detail is set.
- `<ARCH>Module.*`: Interface to the Capstone core.

### Update procedure

1. Run the `ASUpdater.py` script.
2. Compare the functions in `<ARCH>DisassemblerExtension.*` to LLVM (search for the function names in the LLVM root)
   and update them if necessary.
3. Try to build Capstone and fix the build errors.
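
For step 2, searching the LLVM tree for a function name can be scripted. The sketch below is one possible approach, not part of the updater: `find_function_mentions` is a hypothetical helper, the default file suffixes are an assumption, and `decodeExampleOperand` is a placeholder name.

```python
import re
from pathlib import Path


def find_function_mentions(llvm_root, function_name, suffixes=(".cpp", ".h")):
    """Return sorted (path, line_number) pairs where function_name occurs as a whole word."""
    pattern = re.compile(rf"\b{re.escape(function_name)}\b")
    hits = []
    for path in Path(llvm_root).rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # Replace undecodable bytes instead of failing on odd encodings.
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path, lineno))
    return sorted(hits)


# Example (placeholder function name, path from the clone step):
# for path, lineno in find_function_mentions("llvm-capstone/llvm", "decodeExampleOperand"):
#     print(f"{path}:{lineno}")
```

The whole-word match keeps longer names that merely contain the search term out of the results.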

### Update details

**LLVM file translation**

For details about the C++ to C translation of the LLVM files, refer to `CppTranslator/README.md`.

**Generated .inc files**

Documentation about the `.inc` file generation is in the [llvm-capstone](https://github.com/capstone-engine/llvm-capstone) repository.

**Troubleshooting**

- If some features aren't generated and are missing in the `.inc` files, make sure they are defined as `AssemblerPredicate` in the `.td` files.

  Correct:
  ```
  def In32BitMode  : Predicate<"!Subtarget->isPPC64()">,
    AssemblerPredicate<(all_of (not Feature64Bit)), "64bit">;
  ```
  Incorrect:
  ```
  def In32BitMode  : Predicate<"!Subtarget->isPPC64()">;
  ```
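
To find candidates for this problem, you can scan the text of a `.td` file for `Predicate` definitions that lack an `AssemblerPredicate`. The following is a rough heuristic sketch, not a TableGen parser: it assumes each `def` ends at the first `;`, and `predicates_without_assembler_predicate` is a hypothetical helper name.

```python
import re

# Simplification: a record is "def <Name> : <body> ;" and the body holds no ';'.
_DEF_RE = re.compile(r"\bdef\s+(\w+)\s*:(.*?);", re.DOTALL)
# \b prevents a match inside "AssemblerPredicate<".
_PREDICATE_RE = re.compile(r"\bPredicate\s*<")


def predicates_without_assembler_predicate(td_text):
    """Return names of defs that use Predicate<...> but no AssemblerPredicate."""
    missing = []
    for name, body in _DEF_RE.findall(td_text):
        if _PREDICATE_RE.search(body) and "AssemblerPredicate" not in body:
            missing.append(name)
    return missing


incorrect = 'def In32BitMode  : Predicate<"!Subtarget->isPPC64()">;'
print(predicates_without_assembler_predicate(incorrect))
```

Not every reported name is an error, since some predicates may intentionally have no assembler counterpart; treat the output as a list of places to look.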

**Formatting**

- If you make changes to the `CppTranslator`, please format the files with `black`:
  ```
  source ./.venv/bin/activate
  pip3 install black
  python3 -m black --line-length=120 CppTranslator/*/*.py
  ```
