@@ -816,6 +816,205 @@ This plugin can limit the number of Instructions Per Second that are executed::
816 816 The lower the number the more accurate time will be, but the less efficient the plugin.
817 817 Defaults to ips/10
818 818
819+ Uftrace
820+ .......
821+
822+ ``contrib/plugins/uftrace.c``
823+
824+ This plugin generates a binary trace compatible with
825+ `uftrace <https://github.com/namhyung/uftrace>`_.
826+
827+ The plugin supports aarch64 and x64, and works in both user and system mode.
828+ This makes it possible to trace a system boot, which is usually not possible.
829+
830+ In user mode, the memory mapping is copied directly from ``/proc/self/maps`` at
831+ the end of execution. Uftrace should be able to retrieve symbols by itself,
832+ without any additional step.
833+ In system mode, the default memory mapping is empty, and you can generate
834+ one (and the associated symbols) using ``contrib/plugins/uftrace_symbols.py``.
835+ Symbols must be present in the ELF binaries.
836+
837+ The plugin tracks the call stack through frame pointer analysis. Thus, your
838+ program and its dependencies must be compiled using ``-fno-omit-frame-pointer
839+ -mno-omit-leaf-frame-pointer``. In 2024, `Ubuntu and Fedora enabled frame
840+ pointers by default again on x64
841+ <https://www.brendangregg.com/blog/2024-03-17/the-return-of-the-frame-pointers.html>`_.
842+ On aarch64, this is less of a problem, as frame pointers are usually part of
843+ the ABI, except for leaf functions. That's true for user space applications,
844+ but not necessarily for bare metal code. You can read this `section
845+ <uftrace_build_system_example>` to easily build a system with frame pointers.
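
As a minimal sketch of the flags above, assuming a GCC or Clang toolchain is
available as ``cc`` (or through ``$CC``). ``FP_CFLAGS`` and ``hello.c`` are
hypothetical names used only for illustration::

  # Assumption: GCC or Clang; FP_CFLAGS and hello.c are illustrative names.
  FP_CFLAGS="-g -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer"
  printf 'int main(void) { return 0; }\n' > hello.c
  # Keep frame pointers in all functions, including leaf functions.
  ${CC:-cc} $FP_CFLAGS -o hello hello.c

The same flags need to be applied to the libraries the program depends on,
otherwise the call stack may be incomplete when execution goes through them.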
846+
847+ When tracing long scenarios (> 1 min), the generated trace can become very large,
848+ making it hard to extract data from it. In this case, a simple solution is to
849+ trace execution while generating a timestamped output log using
850+ ``qemu-system-aarch64 ... | ts "%s"``. Then, ``uftrace --time-range=start~end``
851+ can be used to restrict the trace to only this part of the execution.
852+
853+ Performance-wise, the overhead compared to normal TCG execution is around 5x to 15x.
854+
855+ .. list-table:: Uftrace plugin arguments
856+   :widths: 20 80
857+   :header-rows: 1
858+
859+   * - Option
860+     - Description
861+   * - trace-privilege-level=[on|off]
862+     - Generate separate traces for each privilege level (Exception Level +
863+       Security State on aarch64, Rings on x64).
864+
865+ .. list-table:: uftrace_symbols.py arguments
866+   :widths: 20 80
867+   :header-rows: 1
868+
869+   * - Option
870+     - Description
871+   * - elf_file [elf_file ...]
872+     - path to an ELF file. Use /path/to/file:0xdeadbeef to add a mapping offset.
873+   * - --prefix-symbols
874+     - prepend binary name to symbols
875+
876+ Example user trace
877+ ++++++++++++++++++
878+
879+ As an example, we can trace QEMU itself running git::
880+
881+   $ ./build/qemu-aarch64 -plugin \
882+       build/contrib/plugins/libuftrace.so \
883+       ./build/qemu-aarch64 /usr/bin/git --help
884+
885+   # and generate a chrome trace directly
886+   $ uftrace dump --chrome | gzip > ~/qemu_aarch64_git_help.json.gz
887+
888+ For convenience, this trace is available as `qemu_aarch64_git_help.json.gz
889+ <https://fileserver.linaro.org/s/N8X8fnZ5yGRZLsT/download/qemu_aarch64_git_help.json.gz>`_.
890+ Download it and open it on https://ui.perfetto.dev/. You can zoom in/out
891+ using the :kbd:`W`, :kbd:`A`, :kbd:`S`, :kbd:`D` keys.
892+ Some sequences taken from this trace:
893+
894+ - Loading program and its interpreter
895+
896+ .. image:: https://fileserver.linaro.org/s/fie8JgX76yyL5cq/preview
897+   :height: 200px
898+
899+ - open syscall
900+
901+ .. image:: https://fileserver.linaro.org/s/rsXPTeZZPza4PcE/preview
902+   :height: 200px
903+
904+ - TB creation
905+
906+ .. image:: https://fileserver.linaro.org/s/GXY6NKMw5EeRCew/preview
907+   :height: 200px
908+
909+ It's usually better to use ``uftrace record`` directly. However, tracing
910+ binaries through qemu-user can be convenient when you don't want to recompile
911+ them (``uftrace record`` requires instrumentation), as long as symbols are
912+ present.
913+
914+ Example system trace
915+ ++++++++++++++++++++
916+
917+ A full trace example (chrome trace, from the instructions below) generated from a
918+ system boot can be found `here
919+ <https://fileserver.linaro.org/s/WsemLboPEzo24nw/download/aarch64_boot.json.gz>`_.
920+ Download it and open it on https://ui.perfetto.dev/. You can see the code
921+ executed at all privilege levels, and zoom in/out using the
922+ :kbd:`W`, :kbd:`A`, :kbd:`S`, :kbd:`D` keys. Below are some sequences
923+ taken from this trace:
924+
925+ - First two stages of the boot sequence in Arm Trusted Firmware (EL3 and S-EL1)
926+
927+ .. image:: https://fileserver.linaro.org/s/kkxBS552W7nYESX/preview
928+   :height: 200px
929+
930+ - U-Boot initialization (until code relocation, after which we can't track it)
931+
932+ .. image:: https://fileserver.linaro.org/s/LKTgsXNZFi5GFNC/preview
933+   :height: 200px
934+
935+ - Stat and open syscalls in kernel
936+
937+ .. image:: https://fileserver.linaro.org/s/dXe4MfraKg2F476/preview
938+   :height: 200px
939+
940+ - Timer interrupt
941+
942+ .. image:: https://fileserver.linaro.org/s/TM5yobYzJtP7P3C/preview
943+   :height: 200px
944+
945+ - Poweroff sequence (from kernel back to firmware, NS-EL2 to EL3)
946+
947+ .. image:: https://fileserver.linaro.org/s/oR2PtyGKJrqnfRf/preview
948+   :height: 200px
949+
950+ Build and run system example
951+ ++++++++++++++++++++++++++++
952+
953+ .. _uftrace_build_system_example:
954+
955+ Building a full system image with frame pointers is not trivial.
956+
957+ We provide a `simple way <https://github.com/pbo-linaro/qemu-linux-stack>`_ to
958+ build an aarch64 system, combining Arm Trusted Firmware, U-Boot, the Linux kernel
959+ and a Debian userland. It's based on containers (``podman`` only) and
960+ ``qemu-user-static (binfmt)`` to make sure it's easily reproducible and does not
961+ depend on the machine where you build it.
962+
963+ You can follow the exact same instructions for an x64 system, combining edk2,
964+ Linux, and Ubuntu, simply by switching to the
965+ `x86_64 <https://github.com/pbo-linaro/qemu-linux-stack/tree/x86_64>`_ branch.
966+
967+ To build the system::
968+
969+   # Install dependencies
970+   $ sudo apt install -y podman qemu-user-static
971+
972+   $ git clone https://github.com/pbo-linaro/qemu-linux-stack
973+   $ cd qemu-linux-stack
974+   $ ./build.sh
975+
976+   # system can be started using:
977+   $ ./run.sh /path/to/qemu-system-aarch64
978+
979+ To generate a uftrace trace of a system boot with this setup::
980+
981+   # run true and poweroff the system
982+   $ env INIT=true ./run.sh path/to/qemu-system-aarch64 \
983+       -plugin path/to/contrib/plugins/libuftrace.so,trace-privilege-level=on
984+
985+   # generate symbols and memory mapping
986+   $ path/to/contrib/plugins/uftrace_symbols.py \
987+       --prefix-symbols \
988+       arm-trusted-firmware/build/qemu/debug/bl1/bl1.elf \
989+       arm-trusted-firmware/build/qemu/debug/bl2/bl2.elf \
990+       arm-trusted-firmware/build/qemu/debug/bl31/bl31.elf \
991+       u-boot/u-boot:0x60000000 \
992+       linux/vmlinux
993+
994+   # inspect trace with
995+   $ uftrace replay
996+
997+ Uftrace can filter the trace and dump flamegraphs or a chrome trace.
998+ The chrome trace is particularly useful to visualize the boot process::
999+
1000+   $ uftrace dump --chrome > boot.json
1001+   # Open your browser, and load boot.json on https://ui.perfetto.dev/.
1002+
1003+ Long chrome traces can't be opened easily, so it can be useful to
1004+ generate them around a particular point of execution only::
1005+
1006+   # execute qemu and timestamp output log
1007+   $ env INIT=true ./run.sh path/to/qemu-system-aarch64 \
1008+       -plugin path/to/contrib/plugins/libuftrace.so,trace-privilege-level=on |&
1009+       ts "%s" | tee exec.log
1010+
1011+   $ grep 'Run /init' exec.log
1012+   1753122320 [ 11.834391] Run /init as init process
1013+   # init was launched at 1753122320
1014+
1015+   # generate trace around init execution (2 seconds):
1016+   $ uftrace dump --chrome --time-range=1753122320~1753122322 > init.json
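
The manual steps above can be scripted. The following is a minimal sketch,
assuming ``exec.log`` was produced with ``ts "%s"`` as shown, so that the first
field of each line is the epoch timestamp. The ``time_range_after`` helper is
hypothetical and is not part of uftrace or QEMU::

  # Hypothetical helper: print "start~end" for the first log line matching
  # a pattern, where end = start + duration (in seconds).
  time_range_after() {
      pattern=$1
      duration=$2
      log=$3
      # The first space-separated field is the timestamp added by ts "%s".
      start=$(grep -m1 -- "$pattern" "$log" | cut -d' ' -f1)
      # Fail if the pattern was not found in the log.
      [ -n "$start" ] || return 1
      echo "${start}~$((start + duration))"
  }

  # Usage, mirroring the example above:
  # uftrace dump --chrome \
  #     --time-range="$(time_range_after 'Run /init' 2 exec.log)" > init.json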
1017+
819 1018 Other emulation features
820 1019 ------------------------
821 1020