fix formatting & links

master
PoroCYon 3 years ago
parent 7d2e9cda74
commit 8a425344da
  1. code.md (5 changed lines)
  2. explain-dot-md.md (3 changed lines)
  3. explain/crt.md (7 changed lines)
  4. explain/rtld.md (13 changed lines)
  5. explain/tinyelf-arm.md (52 changed lines)
  6. tools.md (23 changed lines)

--- a/code.md
+++ b/code.md
@@ -4,7 +4,8 @@
 * [liner (intro template)](https://github.com/shizmob/liner)
 * [more GL init stuff](https://github.com/blackle/Linux-OpenGL-Examples/)
-* [even smaller GL init if you only need a single fullscreen fragment shader, still works with default Ubuntu](https://github.com/blackle/Clutter-1k/)
+* [even smaller GL init if you only need a single fullscreen fragment shader,
+still works with default Ubuntu](https://github.com/blackle/Clutter-1k/)
 ### Basic synth setup
@@ -30,4 +31,4 @@
 ### Other stuff
-* [Tetris clone, in 2k](https://github.com/donnerbrenn/Tetris2k)
+* [Tetris clone, in 2k](https://github.com/donnerbrenn/Tetris2k)

--- a/explain-dot-md.md
+++ b/explain-dot-md.md
@@ -4,7 +4,8 @@
 * [Syscalls](/explain/syscalls)
 * [Creating small static binaries, ELF header
-hacks](https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html) ([notes on doing the same, on ARM](/explain/tinyelf-arm))
+hacks](https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html)
+([notes on doing the same, on ARM](/explain/tinyelf-arm))
 * [Process creation](/explain/proc)
 * [Dynamic linking](/explain/rtld)
 * [C runtime](/explain/crt)

--- a/explain/crt.md
+++ b/explain/crt.md
@@ -10,9 +10,12 @@ function:
 1. aligns the stack
 2. reads `argc`, `argv`, `environ` and [the auxiliary vector
-](https://refspecs.linuxfoundation.org/LSB_1.3.0/IA64/spec/auxiliaryvector.html) from the stack (see the links to LWN in [Process creation](https://linux.weeaboo.software/explain/proc) for stack content details)
+](https://refspecs.linuxfoundation.org/LSB_1.3.0/IA64/spec/auxiliaryvector.html)
+from the stack (see the links to LWN in [Process creation](/explain/proc)
+for stack content details)
 3. sets up TLS, locales, and other crap
-4. calls `__libc_start_user`, which first calls the constructors of all global (C++) variables, then calls `main`, and `exit` automatically as
+4. calls `__libc_start_user`, which first calls the constructors of all global
+(C++) variables, then calls `main`, and `exit` automatically as
 well.
 Of course, it's huge, but we can provide our own: step 3 can be skipped, 1 and 2
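
As an aside to the hunk above, here is a minimal Python sketch of the auxiliary vector mentioned in step 2. A real `_start` reads it straight off the stack; this sketch instead decodes the same tag/value pairs from a byte buffer, such as the contents of `/proc/self/auxv`, and assumes a little-endian Linux system. `parse_auxv` and `read_own_auxv` are hypothetical helper names; the tag numbers are the ones defined in `<elf.h>`.

```python
import struct

# A few tag values from <elf.h>; the real list is longer.
AT_NULL, AT_PHDR, AT_PAGESZ, AT_ENTRY = 0, 3, 6, 9


def parse_auxv(raw: bytes, word_size: int = 8) -> dict:
    """Decode a little-endian auxiliary vector into {tag: value}.

    Each entry is a pair of machine words (tag, value); the vector
    ends at the first AT_NULL tag.
    """
    fmt = "<QQ" if word_size == 8 else "<II"
    entries = {}
    for tag, value in struct.iter_unpack(fmt, raw):
        if tag == AT_NULL:
            break
        entries[tag] = value
    return entries


def read_own_auxv() -> dict:
    """Read this process's auxv via procfs (Linux only, assumption)."""
    with open("/proc/self/auxv", "rb") as f:
        return parse_auxv(f.read(), struct.calcsize("P"))
```

On a Linux box, `read_own_auxv().get(AT_PAGESZ)` should give the page size, and `AT_ENTRY` the entry point of the running executable (here, the Python interpreter).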

--- a/explain/rtld.md
+++ b/explain/rtld.md
@@ -4,11 +4,14 @@
 Dynamic linking is the process of loading code and data from shared libraries.
 For an overview, see [this
 ](https://0x00sec.org/t/linux-internals-dynamic-linking-wizardry/1082) and [
-this page](https://0x00sec.org/t/linux-internals-the-art-of-symbol-resolution/1488), or [this talk by Matt Godbolt](https://www.youtube.com/watch?v=dOfucXtyEsU).
+this page](https://0x00sec.org/t/linux-internals-the-art-of-symbol-resolution/1488),
+or [this talk by Matt Godbolt](https://www.youtube.com/watch?v=dOfucXtyEsU).
 For a series of very in-depth articles, see [this page
 ](https://www.airs.com/blog/archives/38) etc.
-[Here](https://sourceware.org/glibc/wiki/Debugging/Loader_Debugging)'s a page on how to debug ld.so, and [here](https://www.gnu.org/software/hurd/glibc/startup.html)'s one on the glibc startup process (GNU Hurd, but mostly applicable to Linux as well).
+[Here](https://sourceware.org/glibc/wiki/Debugging/Loader_Debugging)'s a page
+on how to debug ld.so, and [here](https://www.gnu.org/software/hurd/glibc/startup.html)'s
+one on the glibc startup process (GNU Hurd, but mostly applicable to Linux as well).
 ### The interesting bits
@@ -36,8 +39,8 @@ symbols, and performs the required relocations to glue all code together.
 However, there's one entry, `DT_DEBUG`, which isn't documented (the docs say
 "for debugging purposes"). What it actually does, is that the dynamic linker
 places a pointer to its `r_debug` struct in the value field. This behavior
-is mostly portable (as in, it works on at least glibc, musl, bionic and FreeBSD). If
-you look at your system's `link.h` file (eg. in `/usr/include`), you can see
+is mostly portable (as in, it works on at least glibc, musl, bionic and FreeBSD).
+If you look at your system's `link.h` file (eg. in `/usr/include`), you can see
 the contents of this struct. The second field is a pointer to the root
 `link_map`. More about this one later.
@@ -58,7 +61,7 @@ phdr. We can use the latter to traverse all the symbol tables and figure out
 what the address of every symbol is. This is what **bold** and **dnload** do,
 except they save a hash of the names of the required symbols, instead of the
 symbol names themselves, computes the hashes of the symbol names from the
-`link_map`, and compare thise.
+`link_map`, and compare those.
 However, there are a few ways to save even more bytes:
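
As an aside to the last hunk, the hash-instead-of-name lookup it describes can be sketched in Python. This is only an illustration under stated assumptions: the hash here is djb2, picked for brevity, and is not claimed to be what bold or dnload use; `exported` is a plain dict standing in for the symbol tables reached through the `link_map` chain; `djb2` and `resolve` are hypothetical names.

```python
def djb2(name: bytes) -> int:
    """32-bit djb2 string hash (an assumed stand-in hash function)."""
    h = 5381
    for byte in name:
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def resolve(wanted_hashes: set, exported: dict) -> dict:
    """Map each wanted hash to the address of the symbol that matches it.

    `exported` maps symbol name (bytes) to address and stands in for
    every symbol found while walking the loaded libraries.
    """
    found = {}
    for name, address in exported.items():
        h = djb2(name)
        if h in wanted_hashes:
            found[h] = address
    return found


# The binary only has to store the 4-byte hashes, not the names.
wanted = {djb2(b"puts"), djb2(b"exit")}
libc_like = {b"puts": 0x1000, b"exit": 0x2000, b"malloc": 0x3000}
addresses = resolve(wanted, libc_like)
```

The saving comes from storing one fixed-size hash per import instead of a NUL-terminated name plus the usual relocation and symbol entries; the cost is the hashing loop and the risk of collisions.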

--- a/explain/tinyelf-arm.md
+++ b/explain/tinyelf-arm.md
@@ -1,30 +1,50 @@
 # Tiny ELF binaries on ARM
-Looks like breadbox didn't want to go down *this* rabbithole. So we'll have to do it instead.
+Looks like breadbox didn't want to go down *this* rabbithole. So we'll have to
+do it instead.
-* Target platform: OABI is dead, so EABI it is. Anything running Linux seems to support `thumb`, `half` and `fastmult`, and usually `edsp` as well.
+* Target platform: OABI is dead, so EABI it is. Anything running Linux seems to
+support `thumb`, `half` and `fastmult`, and usually `edsp` as well.
 * So what will the minimum target CPU be? ARMv6T (RPI1)? ARMv5TE?
 * All ARM instructions are 4 bytes wide
-* There's a "Thumb" mode, with reduced instructions (eg. no free shifts, no 3-operand instructions) where all instructions are 2 bytes wide
+* There's a "Thumb" mode, with reduced instructions (eg. no free shifts, no
+3-operand instructions) where all instructions are 2 bytes wide
 * Switch to Thumb like this: `add lr, pc, #1; bx lr`
-* On older ARMs, it was possible to directly write to `pc` and switch mode, but this doesn't seem to be possible on ARMv6 anymore.
-* The `mov` opcode doesn't accept arbitrary immediate values, so you sometimes have to spill values to a "constant pool", or be creative with assignments and register write-backs in load/store ops
-* Null bytes decode to `andeq r0, r0` in ARM mode, or to `movs r0, r0` in Thumb mode, both are no-ops, unlike in x86 where null bytes cause a segfault.
-* Instruction encoding is relatively sane, so you can predict what low-value 32-bit ints will decode to so you can treat them as (almost-)no-ops.
-* It's a RISC, so you don't have one-byte-instructions, flexible addressing modes or stringops, but there are a few useful parts in ARM:
+* On older ARMs, it was possible to directly write to `pc` and switch mode,
+but this doesn't seem to be possible on ARMv6 anymore.
+* The `mov` opcode doesn't accept arbitrary immediate values, so you sometimes
+have to spill values to a "constant pool", or be creative with assignments and
+register write-backs in load/store ops
+* Null bytes decode to `andeq r0, r0` in ARM mode, or to `movs r0, r0` in Thumb
+mode, both are no-ops, unlike in x86 where null bytes cause a segfault.
+* Instruction encoding is relatively sane, so you can predict what low-value
+32-bit ints will decode to so you can treat them as (almost-)no-ops.
+* It's a RISC, so you don't have one-byte-instructions, flexible addressing
+modes or stringops, but there are a few useful parts in ARM:
 * `pc`, `sp` etc. behave as regular registers
-* You can shift one operand of an instruction by a constant value for free, it doesn't cost any bytes. (ARM mode only) This can also be used to do fixed-point multiplications etc. (eg. `add r0, r0, lsr #1` for `r0*1.5`)
+* You can shift one operand of an instruction by a constant value for free,
+it doesn't cost any bytes. (ARM mode only) This can also be used to do
+fixed-point multiplications etc. (eg. `add r0, r0, lsr #1` for `r0*1.5`)
 * `ldmia`/`stmia` are great for copying stuff around
-* [`e_machine` (and `e_type`) seem to be the only checked header fields](https://code.woboq.org/linux/linux/arch/arm/kernel/elf.c.html) (`e_entry` alignment checks are normal, because if it wouldn't be aligned, the code would segfault on entry.)
-* Of course, `e_entry`, `e_phoff`, `e_phnum` need to contain the right values, and `e_phentsize` and `e_ehsize` need to be correct as well.
-* `phdr` parsing etc. is done architecture-independently, so the same tricks should be usable here as well.
-* Turns out it's even more relaxed than x86 when messing with `p_paddr`, `p_padding` and `p_flags`. It seems to be the case that the kernel & CPU will happily let you execute code in read-write pages.
-* Apparently the kernel doesn't look at the immediate field of `swi` and `bkpt` instructions __if it's configured as EABI-only__ (which we assume).
-* [Dynamic linking stuff](https://linux.weeaboo.software/explain/rtld#dynamic-linking_arm)
+* [`e_machine` (and `e_type`) seem to be the only checked header fields
+](https://code.woboq.org/linux/linux/arch/arm/kernel/elf.c.html) (`e_entry`
+alignment checks are normal, because if it wouldn't be aligned, the code
+would segfault on entry.)
+* Of course, `e_entry`, `e_phoff`, `e_phnum` need to contain the right
+values, and `e_phentsize` and `e_ehsize` need to be correct as well.
+* `phdr` parsing etc. is done architecture-independently, so the same tricks
+should be usable here as well.
+* Turns out it's even more relaxed than x86 when messing with `p_paddr`,
+`p_padding` and `p_flags`. It seems to be the case that the kernel & CPU
+will happily let you execute code in read-write pages.
+* Apparently the kernel doesn't look at the immediate field of `swi` and
+`bkpt` instructions __if it's configured as EABI-only__ (which we assume).
+* [Dynamic linking stuff](/explain/rtld#dynamic-linking_arm)
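
As an aside to the list above, two of its points can be checked with a small Python model. It assumes the classic ARM-mode data-processing immediate (an 8-bit constant rotated right by an even amount) and does not model alternatives such as `mvn` or the `movw` instruction on newer cores. `arm_mov_immediate_ok` and `times_one_and_a_half` are hypothetical names.

```python
MASK32 = 0xFFFFFFFF


def rol32(value: int, amount: int) -> int:
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def arm_mov_immediate_ok(value: int) -> bool:
    """True if `value` fits a plain ARM-mode `mov rX, #imm`.

    The encoding stores an 8-bit constant plus a 4-bit field giving a
    right-rotation of 0, 2, ..., 30 bits. Rotating the value left by
    the same amount undoes that, so it must then fit in 8 bits.
    """
    value &= MASK32
    return any(rol32(value, rot) < 0x100 for rot in range(0, 32, 2))


def times_one_and_a_half(r0: int) -> int:
    """Model of `add r0, r0, lsr #1`: r0 + (r0 >> 1), kept to 32 bits."""
    r0 &= MASK32
    return (r0 + (r0 >> 1)) & MASK32
```

Values for which `arm_mov_immediate_ok` returns `False` (for example `0x12345678`) are the ones that end up in a constant pool or have to be built from several instructions. Note that the shift truncates, so `times_one_and_a_half` rounds down for odd inputs.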
 ### Minimal ELF Poc
-Not *that* minimal :) (But it should be able to show you which fields can be bogus quite clearly.)
+Not *that* minimal :) (But it should be able to show you which fields can be
+bogus quite clearly.)
 ```
 gcc -c -o tiny.o tiny.S

--- a/tools.md
+++ b/tools.md
@@ -10,10 +10,15 @@
 * shell dropping: `cp $0 /tmp/M;(sed 1d $0|lzcat)>$_;exec $_;`
 * ~~[fishypack
-](https://bitbucket.org/blackle_mori/cenotaph4soda/src/master/packer/?at=master): In-memory `memfd`/`execveat`-based decompressor~~ (Deprecated in favor of vondehi)
-* [vondehi](https://gitlab.com/PoroCYon/vondehi): Next-generation in-memory `memfd`/`execveat`-based decompressor (See list of known bugs)
-* [autovndh](https://gitlab.com/snippets/1800243): script that bruteforces all possible gzip/lzma/xz/... options and automatically places a vondehi decompression stub at the beginning
-* [oneKpaq](https://github.com/temisu/oneKpaq): Generic PAQ-based (de)compressor for 32-bit x86
+](https://bitbucket.org/blackle_mori/cenotaph4soda/src/master/packer/?at=master):
+In-memory `memfd`/`execveat`-based decompressor~~ (Deprecated in favor of vondehi)
+* [vondehi](https://gitlab.com/PoroCYon/vondehi): Next-generation in-memory
+`memfd`/`execveat`-based decompressor (See list of known bugs)
+* [autovndh](https://gitlab.com/snippets/1800243): script that bruteforces
+all possible gzip/lzma/xz/... options and automatically places a vondehi
+decompression stub at the beginning
+* [oneKpaq](https://github.com/temisu/oneKpaq): Generic PAQ-based (de)compressor
+for 32-bit x86
 ### synths
@@ -21,8 +26,12 @@
 * [4klang](https://www.pouet.net/prod.php?which=53398), [ForkedKlang
 ](https://www.pouet.net/topic.php?which=11312). Note that the replayer code
 is 32-bit only! You'll need some hacks to make it work on 64-bit.
-* [Clinkster](https://www.pouet.net/prod.php?which=61592), replayer is 32-bit only as well.
-* [Oidos](https://www.pouet.net/prod.php?which=69524), replayer is once more 32-bit only.
+* [Clinkster](https://www.pouet.net/prod.php?which=61592), replayer is 32-bit
+only as well.
+* [Oidos](https://www.pouet.net/prod.php?which=69524), replayer is once more
+32-bit only.
 * [Axiom](https://github.com/monadgroup/axiom/) is "supposed" to work
-See [here](/code#example-code_basic-synth-setup) for some synth example setup code, together with some Linux-specific patches to get some of the replayers to work.
+See [here](/code#example-code_basic-synth-setup) for some
+synth example setup code, together with some Linux-specific patches to get some
+of the replayers to work.
