llvm::DenseMap
- clang-check (syntax checking)
- automatic fixing of compile errors (clang-fixit)
- automatic code formatting (clang-format)
- Strength reduction
- Inlining
- Constant folding
- Constant propagation
- Common subexpression elimination
- Dead code elimination
- Instruction selection
- Loop-invariant code motion
- Peephole optimizations
- Tail call elimination
- Loop unrolling
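To make one of the optimizations above concrete, here is a minimal sketch of constant folding over a toy expression tree. The `Expr` type and the `fold_constants` function are hypothetical names invented for this note; they are not LLVM data structures, and signed overflow is ignored for brevity.

```c
#include <stddef.h>

/* Hypothetical miniature expression tree, invented for this sketch. */
typedef enum { EXPR_CONST, EXPR_VAR, EXPR_ADD, EXPR_MUL } ExprKind;

typedef struct Expr {
    ExprKind kind;
    long value;          /* meaningful only when kind == EXPR_CONST */
    struct Expr *lhs;    /* operands of EXPR_ADD / EXPR_MUL */
    struct Expr *rhs;
} Expr;

/* Fold constant subtrees in place, bottom-up. Returns the same node. */
Expr *fold_constants(Expr *e) {
    if (e == NULL || e->kind == EXPR_CONST || e->kind == EXPR_VAR)
        return e;
    fold_constants(e->lhs);
    fold_constants(e->rhs);
    if (e->lhs->kind == EXPR_CONST && e->rhs->kind == EXPR_CONST) {
        long a = e->lhs->value;
        long b = e->rhs->value;
        e->value = (e->kind == EXPR_ADD) ? a + b : a * b;
        e->kind = EXPR_CONST;
        e->lhs = e->rhs = NULL;   /* the node no longer refers to its operands */
    }
    return e;
}
```

For the tree `x + (2 * 3)`, the multiplication node becomes the constant 6 while the addition is left alone because `x` is not known at compile time.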
LTO <==> a good optimization measure
property wrapper =====> revisit and re-understand
***
public, private, open, fileprivate
Reading file-related information
Only use the URL(string:) initializer if the string you pass is a valid URL beginning with a URL scheme
=====> linking
- Basic DefaultConstructible MoveConstructible CopyConstructible MoveAssignable CopyAssignable Destructible
- Type Properties
TriviallyCopyable
TrivialType
StandardLayoutType
- PODType scalar type class type(struct,union,class) an array of such type
- Library-wide EqualityComparable Allocator FunctionObject Callable Predicate
- Iterator LegacyIterator LegacyInputIterator LegacyOutputIterator LegacyForwardIterator LegacyBidirectionalIterator LegacyRandomAccessIterator
Note: Data Layout
Data Member Pointers Member Function Pointers Non-POD Class Types
>> VTableContextBase <<
c++11 c++14 google_abseil facebook_folly c++ technicals =====> state of the art
OpenSSL library integration ====> EVP
javascript python lua ====> scripting integration
packet processing library
||
For UI design, use only =====> JUCE (important)
function_ref =====> specific usage details
basic block
including some files for specific machine targets ======> use the default for now
Code generation
- one in-memory compiler IR
- an on-disk bitcode representation (suitable for fast loading by a JIT compiler)
- a human readable assembly language
- Identifiers global(‘@’) local(‘%’) [named value + unnamed value + constants]
Global variables: memory is allocated at compilation time instead of run time; global variable definitions must be initialized
- zeroext
- signext
- inreg
- byval
- byref
- preallocated
- inalloca
- sret
- align
- noalias
- nocapture
- nofree
- nest
- returned
- nonnull
- dereferenceable
- dereferenceable_or_null
- swiftself
- swifterror
- immarg
- noundef
- alignstack
- private current_module
- internal
- available_externally
- linkonce
- weak
- common
- appending
- extern_weak
- linkonce_odr, weak_odr
- external
- ccc
- fastcc
- default
- hidden
- protected
- dllimport
- dllexport
- localdynamic
- initialexec
- localexec
omit
allow arbitrary code to be inserted prior to the function body
- alignstack
- allocsize
- unordered
- monotonic
- acquire
- release
- acq_rel
- seq_cst
- Void Type
- Function Type
- First Class Type
…
Specialized Metadata Nodes: DICompileUnit DIFile DIBasicType DISubroutineType DIDerivedType DICompositeType DISubrange DIEnumerator DITemplateTypeParameter DITemplateValueParameter DINamespace DIGlobalVariable DIGlobalVariableExpression DISubprogram DILexicalBlock DILexicalBlockFile DILocation DILocalVariable DIExpression DIArgList DIFlags DIObjCProperty DIImportedEntity DIMacro DIMacroFile DILabel
Other Metadatas: omit ==> use for learning
- dsymutil ==> processes DWARF debug info
- llc
input format:
- llvm assembly language format
- llvm bitcode format
- lli
- terminator instructions: ret, br (conditional vs. unconditional), switch, indirectbr, invoke (call), callbr, resume
- binary instructions fneg(unary operation)
- bitwise binary instructions
- memory instructions
- others
LLVM 10.0.0 =====> rebuild
Comment by Vladimir N. Makarov: Muchnick’s book is a fat one. Muchnick’s book is rather encyclopedia of optimizations and can be considered as collection of articles with many details (sometimes too many). But some themes (like RA and scheduling) are described not deep.
Comment by Joe Buck: Also, as has been mentioned, many of his algorithms are buggy (I think it came from describing them all in his own artificial language that he had no compiler for). I suppose that if you really understand his text, you can debug his algorithms.
Comment by Steven Bosscher: Muchnick is also famous for its >150 A4 pages of errata, especially the 1st and 2nd print. I really wouldn’t recommend it to you unless you’re looking for a compiler algorithms cook book.
Comment by Dan Towner: Many of the algorithm examples leave crucial details poorly or incompletely explained. For example, some algorithms reference functions which have English-language description of their implementations, which could be interpreted in one of several ways. Despite this shortcoming however, this remains my preferred book on compilers, as it contains enough information to provide an introduction to parts of the compiler I may be unfamiliar with.
Comment by Vladimir N. Makarov: Although the book volume is small, this is not an appetizer. This is practically description of Morgan’s integral approach for building optimizing compilers. The book contains very detail algorithms of all passes of the proposed compiler back-end. A very interesting book to read about RA but his proposed complicated approach (combined global/local/FAT/RA) is doubtful. I’ve tried it and found not working well for gcc. Scheduling and software pipelining description is weak too.
Comment by Steven Bosscher: This is my favorite book. If you’ve read the Dragon book and this one, you’re well under way to being a compiler expert. I agree with Vlad about the contents of the book, but it is the only fairly comprehensive introduction text I know of that deals with LCM and SSA at a level that even I can understand ;-)
Comment by Vladimir N. Makarov: It is close to their course in Rice University. A good book to start study compiler from parsing to code generation and basic optimizations. But if you are familiar with the compilers, you probably don’t find interesting thoughts and approaches.
Comment by Vladimir N. Makarov: Another good book to start to study compilers from parser to code generation and basic optimizations. I especially like the version in ML (Modern compiler implementation in ML).
Comment by Steven Bosscher: The version in ML is the best of the three. The other two look too much like “had to do this”-books where algorithms are translated from ML, which makes them look very unnatural in C/Java.
Comment by Vladimir N. Makarov: Personally I don’t like it because it is based on outdated (although classical) book. I attached a review of this book which I wrote more than year ago (when the book was not ready).
Comment by Steven Bosscher: This one is old, but it is a classic. The 1st edition should be on every compiler engineer’s book shelf, just because. I have never seen the 2nd edition myself.
Comment by Vladimir N. Makarov: It is book to study more advanced (not basic) optimizations like dependence analysis, loop optimizations, inter-procedural optimizations.
Comment by Vladimir N. Makarov: I am waiting for Fischer’s book. I like his lectures but I am afraid using Java for this book can hurt the book.
Comment by Steven Bosscher: is another good introduction text, especially if you’re interested in various parsing techniques.
CRC Press 2003. Up to page 916.
Comment by J.C.: Good topics:
- Scalar Compiler Optimizations on the Static Single Assignment (SSA) Form and the Flow Graph by Y.N. Srikant. Pages 99 .. 140.
- Register Allocation (RA) by K. Gopinath. Pages 461 .. 529.
- Instruction Selection Using Tree Parsing by Priti Shankar. Pages 565 .. 599.
- Instruction Scheduling by R. Govindarajan. Pages 631 .. 678.
- Optimizations for Object-Oriented Languages by Andreas Krall and Nigel Horspool. Pages 219 .. 244.
- Program Slicing by G.B. Mund, D. Goswami and Rajib Mall. Pages 269 .. 291.
- Automatic Generation of Code Optimizers from Formal Specifications by Vineeth Kumar Paleri. Pages 61 .. 97.
- Data Flow Analysis by Uday P. Khedker. Pages 1 .. 59.
Comment by Vladimir N. Makarov: Thanks for reminding. I know about this book but I did not read it. It looks very interesting but it is expensive one. I think about buying it because it looks promising for deeper study but I have some doubts because it looks like some articles from the book are available on Internet (like software pipelining algorithms overview by Vicki Alan etc).
Comment by Sebastian Pop: If you like maths, a short book provides more formal background than what you can find in classical compiler literature.
Comment by Sebastian Pop: classical book for a math audience that strangely don’t get “outdated”.
Comment by Vladimir N. Makarov: If you don’t want to be compiler savvy but want to understand the compiler, I’d recommend Appel’s, Cooper’s, Morgan’s book in the same priority.
Comment by Dan Towner: not exactly a compiler book in the sense of other books listed here, but a very valuable resource for anyone writing back-ends or low-level optimisation passes. This book describes how fundamental arithmetic and logic operations can be used to perform bit/byte rearrangement, overflow checks, fast division, multiplication, computing square roots, and much more. A fascinating and useful book.
Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman. Compilers. Principles, Techniques, and Tools.
Addison Wesley; 2nd ed. (August 2006)
Comment by Vladimir N. Makarov: Review_of_the_second_addition_of_the_Dragon_Book.
The main executable overrides the symbols in the shared library.
Position independent code will call non-static functions via the Procedure Linkage Table or PLT. This PLT does not exist in .o files. In a .o file, use of the PLT is indicated by a special relocation. When the program linker processes such a relocation, it will create an entry in the PLT. It will adjust the instruction such that it becomes a PC-relative call to the PLT entry. PC-relative calls are inherently position independent and thus do not require a relocation entry themselves. The program linker will create a relocation for the PLT entry which tells the dynamic linker which symbol is associated with that entry.
This process reduces the number of dynamic relocations in the shared library from one per function call to one per function called. Further, PLT entries are normally relocated lazily by the dynamic linker. On most ELF systems this laziness may be overridden by setting the LD_BIND_NOW environment variable when running the program. However, by default, the dynamic linker will not actually apply a relocation to the PLT until some code actually calls the function in question. This also speeds up startup time, in that many invocations of a program will not call every possible function. This is particularly true when considering the shared C library, which has many more function calls than any typical program will execute.
In order to make this work, the program linker initializes the PLT entries to load an index into some register or push it on the stack, and then to branch to common code. The common code calls back into the dynamic linker, which uses the index to find the appropriate PLT relocation, and uses that to find the function being called. The dynamic linker then initializes the PLT entry with the address of the function, and then jumps to the code of the function. The next time the function is called, the PLT entry will branch directly to the function.
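PLT ====> GOT identifier + address (on the first call =====> dynamic linker <logic ===> address resolution>) PLT ====> first symbol lookup =====> the GOT entries are all initialized to point back ====> at the first few instructions of the PLT. After those first PLT instructions have executed, the last instruction jumps straight =====> to the LD linker =====> which searches the relevant libraries for the function name and its corresponding address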
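The lazy-binding idea described above can be imitated in plain C with a function pointer that starts out pointing at a resolver stub. This is only a simulation under stated assumptions: the names `square_slot`, `square_stub`, `call_square` and `lazy_resolve_count` are invented for the sketch, and a real PLT/GOT is machine code emitted by the linker, not C.

```c
/* The "real" function that the dynamic linker would eventually locate. */
static int square(int x) { return x * x; }

/* Counts how many times the simulated resolution step ran. */
static int resolve_count = 0;

static int square_stub(int x);   /* forward declaration */

/* GOT-like slot: initially points at the stub, like an unresolved entry. */
static int (*square_slot)(int) = square_stub;

/* Stands in for the common PLT code plus the dynamic linker's symbol lookup. */
static int square_stub(int x) {
    resolve_count++;
    square_slot = square;        /* patch the slot with the real address */
    return square_slot(x);       /* then continue into the real function */
}

/* Callers always go through the slot, like a call through a PLT entry. */
int call_square(int x) { return square_slot(x); }

int lazy_resolve_count(void) { return resolve_count; }
```

The first `call_square` pays for the resolution; every later call goes through the patched slot directly, which mirrors the "next time the function is called" behaviour in the text.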
Select CODE32, press ALT+G, then choose T, change the value to 0x1 so the code is uniform, then press the C key ====> OK
Note: character sets
- Storage
- Align
- UB(undefined behavior)
- Environment
- Translation Environment
- Execution Environment
- Trigraph Sequence: ??= #, ??( [, ??/ \, ??) ], ??' ^, ??< {, ??! |, ??> }, ??- ~
- memory
- automatic storage duration
- static storage duration
- common token
- keyword
  auto      double    int       struct
  break     else      long      switch
  case      enum      register  typedef
  char      extern    return    union
  const     float     short     unsigned
  continue  for       signed    void
  default   goto      sizeof    volatile
  do        if        static    while
- identifier
Scope:
function, file, block, function prototype
Linkage:
external, internal, none
Storage durations of objects:
static
automatic
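A small sketch may help tie scope, linkage and storage duration together. The identifiers `calls_total`, `next_id` and `calls_so_far` are made up for illustration.

```c
/* File scope, internal linkage (because of `static`), static storage duration. */
static int calls_total = 0;

/* File scope, external linkage: visible to other translation units. */
int next_id(void) {
    static int id = 0;   /* block scope, no linkage, static storage duration */
    int step = 1;        /* block scope, no linkage, automatic storage duration */
    calls_total++;
    id += step;          /* `id` keeps its value between calls; `step` does not */
    return id;
}

/* File scope, external linkage. */
int calls_so_far(void) { return calls_total; }
```

Calling `next_id()` twice returns 1 and then 2, because `id` lives for the whole program run while `step` is recreated on every call.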
type category:
- qualified type
- const
- volatile
- unqualified type(raw type)
Compatible type and Composite type:
- volatile
- constant
- float constant
- Integer constant
- Enumeration constant
- Character constant
- string-literal
- operator
- punctuator
- keyword
- preprocessing token
- header-name
- identifier
- pp-number
- character-element
- string-literal
- operator
- punctuator
Notes: Constraints <===>
Note: tricky
- primary-expression
- postfix-operators
- constant expression
- Initialization
- Declarator
- size of integral types <limits.h>
- float types <float.h>
- Physical source file multibyte characters are mapped in the source set
- Trigraph sequences: ??= #, ??) ], ??! |, ??( [, ??' ^, ??> }, ??/ \, ??< {, ??- ~
- Some Limits for Translation
- Scope of identifier
- file
- block
- function
- function prototype
- linkage of identifier
- external
- internal
- none
- name spaces of identifier
- storage durations of objects
- static
- automatic
- allocated
- Types
- object types
- function types
- incomplete types
Classification:
Single Type:
- char
- int
- float
- complex
- enumeration
- void(incomplete type)
Derived Type:
- array type
- structure type
- union type
- function type
- pointer type
Arithmetic types and Pointer types = scalar types
Array types and Structure types = aggregate types
qualified types = {const,volatile,restrict}
unqualified types
implicit conversion vs explicit conversion
- Arithmetic Operands
- Other Operands
lvalue
{
- expression with one object type or an incomplete type(not including void)
- modifiable lvalue = not array type | does not have an incomplete type | does not have a const-qualified type | not embedded in structure or union
} function designator void
Syntax: token ==> keyword identifier constant string-literal punctuator
preprocessing-token: header-name identifier pp-number character-constant string-literal punctuator
auto enum restrict unsigned break extern return void case float short volatile char for signed while const goto sizeof _Bool continue if static _Complex default inline struct _Imaginary do int switch double long typedef else register union
unsigned-suffix: u U long-suffix: l L long-long-suffix: ll / LL
- Primary Expressions
- postfix operators
- array subscripting
- function calls
- Structure and Union members
- Postfix increment and decrement operators
- Compound literals
- Unary Operators
- prefix increment and decrement operators
- address and indirection operators
- Unary arithmetic operators
- sizeof(operator)
- cast operators
- Multiplicative Operators
- Additive Operators
- Bitwise shift Operators
- Relational Operators
- Equality Operators
- BitWise AND Operator
- BitWise exclusive OR Operator
- BitWise inclusive OR Operator
- Logical AND Operator
- Logical OR Operator
- Conditional Operator
- Assignment Operators
- simple assignment
- compound assignment
- Comma Operator
- Storage-Class specifiers {typedef,extern,static,auto,register}
- Type Specifiers
- Structure and Union Specifiers
- Enumeration Specifiers
- Tags(incomplete type)
- Type Qualifiers const volatile restrict
- Function Specifiers
- Declarators
function or object(scope,storage duration,type)
- pointer declarators
- Array declarators
- Function declarators(prototypes)(return type limit: not function type or array type)
- Type Names
- Type Definitions
- Initialization(is different from c++(initialization lists))
<statement + block + full expression>
- labeled-statement
- compound-statement
- expression-statement
- selection-statement
- iteration-statement
- jump-statement
- Core C Runtime Library (linux)
crt1.o: provides the ‘_start’ symbol that the runtime linker (ld.so.1) jumps to; used only when building executables
crti.o + crtn.o: provide the prologue and epilogue (.init + .fini); used for executables and shared objects
- jmp | branch
- manually implement loop unrolling by interleaving two syntactic constructs of C: do-while and the switch statement (Duff’s device)
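A minimal sketch of Duff's device as a byte copy, unrolled eight times. The function name `duff_copy` is an assumption of this note, and unlike the original device (which wrote to a single output register) this variant advances both pointers.

```c
#include <stddef.h>

/* Copy `count` bytes from `from` to `to`, with the loop body unrolled 8x. */
void duff_copy(char *to, const char *from, size_t count) {
    if (count == 0)
        return;                     /* the do-while body would otherwise run once */
    size_t n = (count + 7) / 8;     /* number of passes through the loop */
    switch (count % 8) {            /* jump into the middle of the unrolled body */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

The `switch` handles the leftover `count % 8` bytes on the first pass by entering the `do-while` body part-way through; every later pass copies a full eight bytes. Modern compilers usually unroll simple loops themselves, so this is mainly of historical and educational interest.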
- scanning
- parsing
- parse-tree generation
top-down parser: scans input left to right, produces output of the left-most bits first, looks ahead at most 1 symbol at a time
bottom-up parser: scans input left to right, produces output of the right-most bits first, looks ahead at most 1 symbol at a time
a subset of LR(1) grammars, requiring a smaller parsing table
- unlimited backtracking
- uses more memory
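As an illustration of the top-down approach with one symbol of lookahead and no backtracking, here is a small recursive-descent evaluator. The grammar (sums and products of non-negative integers with parentheses) and all names such as `Parser` and `evaluate` are assumptions chosen for this sketch.

```c
#include <ctype.h>

/* Parser state: the next unread character is the single lookahead symbol. */
typedef struct { const char *p; int ok; } Parser;

static long parse_expr(Parser *ps);

static void skip_spaces(Parser *ps) {
    while (*ps->p == ' ') ps->p++;
}

/* factor := NUMBER | '(' expr ')' */
static long parse_factor(Parser *ps) {
    skip_spaces(ps);
    if (*ps->p == '(') {
        ps->p++;
        long v = parse_expr(ps);
        skip_spaces(ps);
        if (*ps->p == ')') ps->p++; else ps->ok = 0;
        return v;
    }
    if (!isdigit((unsigned char)*ps->p)) { ps->ok = 0; return 0; }
    long v = 0;
    while (isdigit((unsigned char)*ps->p))
        v = v * 10 + (*ps->p++ - '0');
    return v;
}

/* term := factor ('*' factor)* */
static long parse_term(Parser *ps) {
    long v = parse_factor(ps);
    for (skip_spaces(ps); *ps->p == '*'; skip_spaces(ps)) {
        ps->p++;
        v *= parse_factor(ps);
    }
    return v;
}

/* expr := term ('+' term)* */
static long parse_expr(Parser *ps) {
    long v = parse_term(ps);
    for (skip_spaces(ps); *ps->p == '+'; skip_spaces(ps)) {
        ps->p++;
        v += parse_term(ps);
    }
    return v;
}

/* Returns 1 and stores the value on success, 0 on a syntax error. */
int evaluate(const char *text, long *out) {
    Parser ps = { text, 1 };
    long v = parse_expr(&ps);
    skip_spaces(&ps);
    if (!ps.ok || *ps.p != '\0') return 0;
    *out = v;
    return 1;
}
```

Each grammar rule becomes one function, and every decision is made by inspecting only the current character, which is what makes the grammar suitable for this style of parser.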
- traditional linear compiler pipeline
- incremental/interactive/query-driven
- Simply Typed lambda Calculus interpreter with lexically scoped variables and capture-avoiding substitution
- tradeoffs of names vs De Bruijn indices vs De Bruijn levels vs locally nameless representation
- Hindley-Milner type inference with constraint solving (Algorithm W) vs. bidirectional type checking algorithms
- advances in dependent types,refinement types,subtyping…
- natural deduction notation
- various IRs - SSA,CPS,ANF
- implement things like liveness and escape analyses
- Fixpoint/lattice/abstract-interpretation techniques
- Loop-invariant code motion
- Sketch the implementation of various common compiler optimizations
- dead code elimination
- inlining
- common subexpression elimination
- program transformation techniques
- polyhedral model
- automatic vectorisation method
- strength reduction
- induction variable elimination
- explain how register allocation works with a greedy or heuristic graph-colouring approach
- instruction selection and calling conventions
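For the register allocation item above, a greedy graph-colouring pass can be sketched as follows. The `InterferenceGraph` layout, the fixed `MAX_VREGS` bound and the function names are hypothetical; production allocators add heuristics for node ordering, spill cost and coalescing that are left out here.

```c
#include <stdbool.h>

#define MAX_VREGS 16

/* Interference graph over virtual registers (toy representation). */
typedef struct {
    int n;                              /* number of virtual registers */
    bool edge[MAX_VREGS][MAX_VREGS];    /* edge[a][b]: a and b are live at the same time */
} InterferenceGraph;

void add_interference(InterferenceGraph *g, int a, int b) {
    g->edge[a][b] = g->edge[b][a] = true;
}

/*
 * Greedy colouring: visit virtual registers in index order and give each one
 * the lowest physical register not already used by an interfering neighbour.
 * assignment[v] receives 0..k-1, or -1 when v has to be spilled.
 * Returns the number of spilled virtual registers.
 */
int allocate_registers(const InterferenceGraph *g, int k, int assignment[]) {
    int spills = 0;
    for (int v = 0; v < g->n; v++) {
        assignment[v] = -1;
        for (int reg = 0; reg < k && assignment[v] < 0; reg++) {
            bool taken = false;
            for (int u = 0; u < v; u++)
                if (g->edge[v][u] && assignment[u] == reg)
                    taken = true;
            if (!taken)
                assignment[v] = reg;
        }
        if (assignment[v] < 0)
            spills++;
    }
    return spills;
}
```

With a triangle of mutually interfering values and only two physical registers, one value cannot be coloured and is marked for spilling; with three registers everything fits.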