Commit c51dcaf

Merge branch 'tc-l2' into dev
2 parents 80b48c5 + 607c4fd commit c51dcaf

File tree

20 files changed

+450
-285
lines changed

README.md

Lines changed: 44 additions & 28 deletions
Original file line number | Diff line number | Diff line change
@@ -22,24 +22,32 @@
2222

2323

2424
## Overview
25-
The TreeCore processors are the riscv64 cores developed under the [Open Source Chip Project by University (OSCPU)](https://github.com/OSCPU). OSCPU was initiated by ICTCAS(**_Institute of computing Technology, Chinese Academy of Sciences_**), which aims to make students use all open-source toolchain to design, develop open-source chips by themselves. It also can be called "One Life, One Chip" project in Chinese which has achieved two season. Now Season 3 is in progress in 2021.
25+
The TreeCore processors are riscv cores developed under the [Open Source Chip Project by University (OSCPU)](https://github.com/OSCPU). OSCPU was initiated by ICT, CAS (**_Institute of Computing Technology, Chinese Academy of Sciences_**) and aims to have students design chips by themselves using fully open-source toolchains. In Chinese it is also called the "One Life, One Chip" project, which has completed two seasons. Season 3 is now in progress (**_2021.7-2022.1_**).
2626

27-
Now the TreeCore has two version, TreeCoreL1(**_TreeCore Learning 1_**) and TreeCoreL2(**_TreeCore Learning 2_**). The TreeCore project is aim to help students to develop a series of riscv processor by step-to-step materials, So not just for high performance. Not like textbooks exhibit the all the knowledges in one time. TreeCore start a very simple model. provide necessary new concepts or knowledge you need to learn.
27+
TreeCore currently has two versions: TreeCoreL1 (**_TreeCore Learning 1_**) and TreeCoreL2 (**_TreeCore Learning 2_**). The TreeCore project aims to help students develop a series of riscv processors through step-by-step materials, so it is not just about high performance. Unlike textbooks, which present all the knowledge at once, TreeCore starts from a very simple model and introduces new concepts and knowledge only as you need them.
2828

29+
> NOTE: now the TreeCoreL2 is under phase.
2930
3031
## Motivation
31-
I heard the word **_RISCV_** first time in the second semester of my junior year(that is, the summer of 2016). My roommate participated in the pilot class of "Computer Architecture" organized by the college, and **their task was to design a simple soft-core CPU based on the RISCV instruction set**. At that time, I only knew that it was an open source RISC instruction set launched by the University of Berkeley. I felt that it was similar to the MIPS, so I didn't take it too seriously. But what is unexpected is that after just a few period of development, the RISCV has been supported by many Internet and semiconductor giants around the world, and more and more research institutions, start-ups begin to design their own proprietary processors based on it. Although now the performance and application of RISCV are still limited, **I believe RISCV will usher in a revolution that can change the old pattern in someday**.
32+
I first heard the word '**_riscv_**' in my sophomore year (that is, the summer of 2016). My roommate took part in the pilot class of **_Computer Architecture_**, and their final assignment was to **design a simple soft-core riscv processor**. At that time, I only knew it was an open-source RISC ISA launched by UC Berkeley. What surprised me is that, after only a short time, riscv has been supported by many semiconductor giants and research institutions. Although the performance of riscv is still limited now, **I believe riscv will usher in a revolution that changes the old pattern someday**.
3233

33-
The ancients once said: **it’s always shallow on paper, and you must do it yourself**. For the learn of the computer architecture, there is no better way to realize it from scratch. So I started to collect materials from the Internet, and I found the learning threshold and cost is very high. In addition, in order to pursue the performance, some open-source CPU cores are very complex(such as using mulit-pipelines, multi-core processing, out-of-order execution technology, etc), it is very difficult for beginners to get started. So I decided to design a series of open source processors from scratch, which has **simple, understandable architecture, high-quality code with step-to-step tutorial**.
34+
The best way to learn processor design is to implement a processor from scratch. When I searched online, I found that the learning threshold and cost are very high. In addition, to pursue high performance, some open-source riscv cores are very complex (using dynamic branch prediction, multi-core processing, out-of-order execution, etc.), which makes them very difficult for beginners to learn. So I decided to design a series of open-source processors from scratch, with a **simple, understandable architecture and high-quality code with step-by-step tutorials**.
3435

35-
I hope it can become a ABC project like Arduino and make more processor enthusiasts or computer related specialized students enter into the computer architecture field. In the future, under the mutual promotion of the software and hardware ecosystem, I believe more people will like CPU development and be willing to spend time on it.
36+
I hope it can become an ABC project like Arduino that brings more processor enthusiasts and students in computer-related majors into the computer architecture field. In the future, under the mutual promotion of the software and hardware ecosystems, I believe more people will enjoy processor design and be willing to spend time on it.
3637

3738
## Feature
38-
TreeCoreL1(**under development**)
39-
* 64-bits single period riscv core
40-
* written by verilog
39+
TODO (image): introduce the three types of processor and the timeline.
4140

42-
TreeCoreL2
41+
TODO: introduce the plan, such as the target each type of core needs to meet, and the timeline.
42+
43+
**TreeCoreL1**
44+
* 64-bit FSM
45+
* written in chisel3
46+
47+
In fact, TreeCoreL1 is not quite a processor; it only supplies a basic implementation of the Turing machine model: 'loop + '.
48+
IMG!!!!
49+
50+
**TreeCoreL2**
4351
* 64-bit single-issue, five-stage pipeline riscv core
4452
* written in chisel3
4553
* supports the RISCV integer(I) instruction set
@@ -48,23 +56,22 @@ TreeCoreL2
4856
* supports dynamic branch prediction
4957
* can boot rt-thread
5058
* developed with a fully open-source toolchain
59+
asdafafaadsfsafa
60+
IMG!!!!!!!!!!!!!!!
5161

52-
TreeCoreL3(**under development**)
62+
63+
**TreeCoreL3(_under development_)**
64+
65+
**TreeCoreL4(_under development_)**
5366
* 64-bit five-stage pipeline riscv core
54-
* written by chisel3
55-
* support RV64IMAC instruction set
56-
* supports machine mode privilege levels
57-
* supports AXI4 inst and mem acess
58-
* supports ICache, DCache(directed-map)
59-
* can boot rt-thread, xv6 and linux
60-
* develop under all open-source toolchain
67+
68+
6169

6270
## Develop Schedule
6371
The development schedule is currently recorded in a **Tencent Document**. Click the [schedule table](https://docs.qq.com/sheet/DY3lORW5Pa3pLRFpT?newPad=1&newPadType=clone&tab=BB08J2) link to view it.
6472

65-
## Datapath Diagram
66-
6773
### Memory Map
74+
To be compatible with the SoC test, all types of TreeCore share the same memory map range:
6875

6976
| Range | Description |
7077
| ------------------------- | --------------------------------------------------- |
@@ -81,7 +88,7 @@ Now, the develop schedule is recorded by the **Tencent Document**. You can click
8188
#### Configuration
8289

8390
## Usage
84-
91+
adsfadfasdfasf
8592
### Environment Setup
8693
> NOTE: All of the components are installed under the linux operating system. To guarantee compatibility and stability, I strongly recommend using `ubuntu 20.04 LTS`.
8794
@@ -134,25 +141,29 @@ $ make nemuBuild
134141
$ make dramsim3Build
135142
```
136143

137-
### Recursive test
138-
When you modify the processor design, you
144+
### Compile testcases
139145
```bash
140146
$ make riscvTestBuild
141147
$ make cpuTestBuild
148+
$ make amTestBuild
149+
```
150+
> NOTE: you need enough memory to compile the
151+
152+
### Recursive test
153+
When you modify the processor design, you
154+
```bash
155+
$ make unit-tests
142156
```
157+
IMG!!!!!!!!!
143158

144159
### Software test
145160
```bash
146-
$ make amTestBuild
161+
$ make
147162
```
148163

149164
### SoC test
150165

151-
### Hardware test
152-
153-
- #### Hardware configuration
154-
155-
- #### Function test
166+
### Customize new core project
156167

157168
## Summary
158169

@@ -163,3 +174,8 @@ $ make amTestBuild
163174
## License
164175
All of the TreeCore code is released under the [GPL-3.0 License](LICENSE).
165176

177+
## Acknowledgement
178+
179+
180+
## Reference
181+

report/tc_l2.md

Lines changed: 113 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,113 @@
1+
# One Life, One Chip Season 3 Project Report
2+
3+
## About Me
4+
5+
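I am Miao Yuchi (缪宇驰), student ID 20210324. I am a graduate student at the Institute of Precision Guidance and Control, School of Astronautics, Northwestern Polytechnical University, and will graduate in June 2022. I received my bachelor's degree in 2018 from the same school, majoring in Detection, Guidance and Control Technology. My current research covers space science exploration with micro and small satellites, on-board computer design, and motion planning and simulation for space robots on the surfaces of small celestial bodies. I am currently participating in one National Natural Science Foundation of China project and have published one paper in a domestic journal. My awards include a first-class graduate scholarship. I am skilled in FPGA board-level circuit design, development and debugging. I am passionate about the open-source software and hardware movement and develop open-source tooling software in my spare time; see my [personal github page](https://github.com/maksyuki).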
6+
7+
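I had never actually designed a processor core before; taking part in One Life, One Chip Season 3 was the first time I implemented a complete processor.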
8+
9+
## Project Overview
10+
11+
- Repository: [tree-core-cpu](https://github.com/microdynamics-cpu/tree-core-cpu)
12+
- Development language: chisel
13+
- License: GPL-3.0
14+
15+
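TreeCoreL2 is an open-source, single-issue, five-stage pipelined processor core that supports RV64I. It fetches instructions and accesses memory over an axi4 bus, supports dynamic branch prediction (BTB, PHT, GHR), and handles exceptions and interrupts in machine privilege mode. It can boot rt-thread in both the difftest and soc simulation environments.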
16+
17+
## Microarchitecture Design
18+
<p align="center">
19+
<img src="https://raw.githubusercontent.com/microdynamics-cpu/tree-core-cpu-res/main/treecore-l2-arch.drawio.svg"/>
20+
<p align="center">
21+
TreeCoreL2 overall dataflow diagram
22+
</p>
23+
</p>
24+
25+
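The TreeCoreL2 microarchitecture uses the classic five-stage pipeline. Instruction-fetch and memory-access requests are gathered by a crossbar and converted to a custom axi-like bus, **data exchange (dxchg)**; a bridge then converts the dxchg protocol to axi4 and performs arbitration. The following sections focus on the implementation of four parts: **instruction fetch**, **execution**, **memory access**, and the **crossbar & axi4 bridge**.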
26+
27+
### Instruction Fetch Unit
28+
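The main job of the instruction fetch unit is to compute the pc for the next cycle and send a read request on the axi bus. A multiplexer selects the pc, in descending priority, from the values of `mtvec`, `mepc`, `jump target`, `branch predict target`, and `pc + 4`. The BPU uses a two-level predictor based on global history, with the following parameters:
29+
1. Global History Register (GHR): bit width = 5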
30+
2. Pattern History Table(PHT): size = 32
31+
3. Branch Target Buffer(BTB): bit width = 64, size = 32
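
The pc priority selection described before the parameter list can be sketched as a small behavioural model. This is only an illustration in Python, not the chisel implementation: the function name `next_pc` and the convention that an inactive source is passed as `None` are assumptions made for the example.

```python
from typing import Optional

MASK64 = (1 << 64) - 1  # the pc is 64 bits wide on RV64


def next_pc(pc: int,
            mtvec: Optional[int] = None,
            mepc: Optional[int] = None,
            jump_target: Optional[int] = None,
            predict_target: Optional[int] = None) -> int:
    """Select the next pc; a source counts as active when it is not None.

    (Hypothetical helper: the None convention stands in for the valid
    signals a hardware mux would use.)
    """
    # Ordered from highest to lowest priority, as listed in the text.
    for candidate in (mtvec, mepc, jump_target, predict_target):
        if candidate is not None:
            return candidate & MASK64
    # No redirect is active: fall through to the sequential pc.
    return (pc + 4) & MASK64
```

For example, `next_pc(0x80000000, jump_target=0x80000100, predict_target=0x80000200)` picks the jump target, because a resolved jump outranks a prediction.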
32+
33+
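Each time, the GHR receives the taken/not-taken outcome of a branch from the EXU and uses it to update its shift register; the updated value is then sent to the PHT and XORed with the current pc (**_gshare_**). The result is the address used to look up the corresponding PHT entry, and the PHT updates itself each time with the post-execution branch information from the EXU. Each BTB line records a 1-bit jump flag, a 64-bit pc and a 64-bit tgt value. The 1-bit jump flag indicates whether the recorded instruction is an unconditional jump.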
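
A minimal software model of this gshare scheme is sketched below, using the sizes from the parameter list (5-bit GHR, 32-entry PHT, 32-line BTB). Several details are assumptions, because the text does not state them: 2-bit saturating counters in the PHT, pc bits [6:2] as the index bits, a direct-mapped BTB, and updating the PHT entry selected by the history as it was before the new outcome is shifted in. The class names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

GHR_BITS = 5
PHT_SIZE = 1 << GHR_BITS  # 32 entries
BTB_SIZE = 32


class Gshare:
    def __init__(self) -> None:
        self.ghr = 0                 # global history shift register
        self.pht = [1] * PHT_SIZE    # assumed 2-bit counters, weakly not-taken

    def _index(self, pc: int) -> int:
        # gshare: XOR the global history with low pc bits (assumed pc[6:2]).
        return ((pc >> 2) ^ self.ghr) & (PHT_SIZE - 1)

    def predict(self, pc: int) -> bool:
        return self.pht[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        # Train the counter first, then shift the outcome into the history.
        idx = self._index(pc)
        if taken:
            self.pht[idx] = min(3, self.pht[idx] + 1)
        else:
            self.pht[idx] = max(0, self.pht[idx] - 1)
        self.ghr = ((self.ghr << 1) | int(taken)) & (PHT_SIZE - 1)


@dataclass
class BtbLine:
    jump: bool  # True when the recorded instruction is an unconditional jump
    pc: int
    tgt: int


class Btb:
    def __init__(self) -> None:
        self.lines: List[Optional[BtbLine]] = [None] * BTB_SIZE

    def lookup(self, pc: int) -> Optional[BtbLine]:
        line = self.lines[(pc >> 2) % BTB_SIZE]
        return line if line is not None and line.pc == pc else None

    def update(self, pc: int, tgt: int, jump: bool) -> None:
        self.lines[(pc >> 2) % BTB_SIZE] = BtbLine(jump, pc, tgt)
```

In this model a branch that is always taken needs a few updates before it is predicted taken, because the history register has to saturate before the same PHT entry is trained repeatedly.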
34+
35+
<p align="center">
36+
<img src="https://raw.githubusercontent.com/microdynamics-cpu/tree-core-cpu-res/main/treecore-l2-ifu.drawio.svg"/>
37+
<p align="center">
38+
Main part of the instruction fetch unit
39+
</p>
40+
</p>
41+
42+
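Because instruction fetch and memory access in TreeCoreL2 currently do not use a cache, the processor core spends many clock cycles waiting for axi responses, so dynamic branch prediction yields only a small ipc improvement.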
43+
<p align="center">
44+
<img src="https://raw.githubusercontent.com/microdynamics-cpu/tree-core-cpu-res/main/treecore-l2-ipc.png"/>
45+
<p align="center">
46+
A small performance improvement from branch prediction
47+
</p>
48+
</p>
49+
50+
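### Execution Unit
51+
The execution unit mainly performs arithmetic and logic operations and computes the target addresses of branch instructions. A multiply/divide unit (MDU) and an acceleration compute unit (ACU) were also designed to accelerate matrix multiplication and division, but because of my own schedule I could not get the cache working in time, so the MDU and ACU were not integrated into the submitted version. Finally, the execution unit also implements the CSR registers, which are used to handle environment-call exceptions and interrupts.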
52+
53+
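### Memory Access Unit
54+
The memory access unit integrates the LSU and the CLINT. The LSU generates the read/write control signals needed for memory access (size, wmask, etc.). The CLINT takes in the generated control signals: if the access address lies within `0x0200_0000 - 0x0200_ffff`, it handles the access itself; otherwise it passes the control signals through.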
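
The address check and the write-mask generation can be illustrated with a short behavioural sketch. The CLINT range comes from the text; everything else is an assumption for the example: the function names, an 8-byte data bus with one mask bit per byte lane, and the rejection of misaligned stores.

```python
CLINT_BASE = 0x0200_0000
CLINT_END = 0x0200_FFFF  # inclusive upper bound, from the text


def is_clint(addr: int) -> bool:
    return CLINT_BASE <= addr <= CLINT_END


def route(addr: int) -> str:
    """Return which block serves the access: the CLINT or the external bus."""
    return "clint" if is_clint(addr) else "bus"


def write_mask(addr: int, size: int) -> int:
    """Byte-enable mask for a `size`-byte store on an assumed 8-byte bus."""
    if size not in (1, 2, 4, 8):
        raise ValueError("size must be 1, 2, 4 or 8 bytes")
    offset = addr & 0x7  # byte lane of the first byte within the bus word
    if offset % size != 0:
        raise ValueError("misaligned access")
    # `size` consecutive ones, shifted to the starting byte lane.
    return ((1 << size) - 1) << offset
```

A 4-byte store to an address ending in `0x4` therefore enables the upper four byte lanes (`0xF0`), which is the kind of address mask calculation that alignment mistakes tend to break.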
55+
56+
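### Crossbar & Axi4 Bridge
57+
The crossbar merges instruction-fetch and memory-access requests into a single custom axi-like bus, **data exchange (dxchg)**. dxchg is in fact very close to axi-lite, but I defined a custom bus with future extension in mind. The axi4 bridge converts the crossbar's dxchg bus interface into a standard axi4 bus and implements an arbiter that arbitrates between instruction-fetch and memory-access requests.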
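
As a rough illustration of the arbitration step, the sketch below grants one of two pending requests per round. The text does not say which policy the arbiter uses, so the fixed priority of memory access over instruction fetch is purely an assumption, and `DxchgReq` with its fields is a hypothetical stand-in for the real dxchg signals.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DxchgReq:
    # Hypothetical request record; the real dxchg signal list is not given.
    addr: int
    write: bool = False
    data: int = 0


def arbitrate(fetch: Optional[DxchgReq],
              mem: Optional[DxchgReq]) -> Optional[Tuple[str, DxchgReq]]:
    """Grant one request; None means the requester is idle this round."""
    # Assumed policy: a load/store wins over a fetch, so an instruction
    # already in the pipeline is not starved by younger fetches.
    if mem is not None:
        return ("mem", mem)
    if fetch is not None:
        return ("fetch", fetch)
    return None
```

A round-robin policy would fit the same interface; only the body of `arbitrate` would change.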
58+
59+
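## Project Structure and References
60+
The layout of the TreeCore repository borrows from how [riscv-sodor](https://github.com/ucb-bar/riscv-sodor) and [oscpu-framework](https://github.com/OSCPU/oscpu-framework) organize their code, and uses make as the build tool. The Makefile also has template parameters, so it can support independent development of several different processors. `make [target]` can directly download and configure the dependencies, generate and modify the verilog files for different platforms (difftest and soc), run regression tests, and so on.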
61+
62+
<p align="center">
63+
<img src="https://raw.githubusercontent.com/microdynamics-cpu/tree-core-cpu-res/main/treecore-l2-make.png"/>
64+
<p align="center">
65+
Regression-test targets implemented with custom make functions
66+
</p>
67+
</p>
68+
69+
1. TreeCore's implementation and testing also depend on many projects, including:
70+
- [chisel3](https://github.com/chipsalliance/chisel3)
71+
- [verilator](https://github.com/verilator/verilator)
72+
- [NEMU](https://gitee.com/oscpu/NEMU)
73+
- [DRAMsim3](https://github.com/OpenXiangShan/DRAMsim3)
74+
- [difftest](https://gitee.com/oscpu/difftest)
75+
- [Abstract Machine](https://github.com/NJU-ProjectN/abstract-machine)
76+
- [ysyxSoC](https://github.com/OSCPU/ysyxSoC)
77+
- [riscv-tests](https://github.com/NJU-ProjectN/riscv-tests)
78+
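2. The immediate-extension module partly follows the implementation in the [NutShell processor](https://github.com/OSCPU/NutShell)
79+
3. The pipeline structure and the arrangement of functional units partly follow the [Hummingbird E203](https://github.com/riscv-mcu/e203_hbirdv2)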
80+
81+
## Summary
82+
83+
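### Reflections
84+
First, I sincerely thank all the teachers and teaching assistants of One Life, One Chip Season 3 for their continued hard work. Last year I was lucky enough to travel to ShanghaiTech University for the RISCV China Summit, where the series of talks on the XiangShan processor was a real treat. When I heard that the next season of One Life, One Chip would be open to university students nationwide, as a third-year master's student close to graduation I felt how rare the opportunity was and signed up without hesitation. Writing and debugging the code made me relearn a lot, for example memory address alignment. I remember first hearing the term "address alignment" in 2013, when I was learning C as a freshman and the teacher introduced it while explaining the union type. I did not study it in depth at the time, which meant that when I first debugged the axi arbitration I kept getting the address mask calculation wrong and lost a lot of time. Taking part in Season 3 was also a considerable challenge, because it requires independent development and learning many new things and tools in a very short time, which none of my earlier course labs had demanded. Because I come from a different major, my computer architecture background was weak and I had to learn much of it from scratch. I also had to juggle research work, thesis experiments and job hunting, so time was tight, and failing to fix a bug for a long stretch sometimes left me frustrated and lost. Still, compared with before I joined, I did grow in a tangible way. Through Season 3 I implemented a complete processor core, even if it is not yet perfect. I learned chisel, verilator, difftest and many other open-source processor development tools along with the agile development ideas behind them, and deepened my understanding of how software and hardware work together. The [schedule table](https://docs.qq.com/sheet/DY3lORW5Pa3pLRFpT?newPad=1&newPadType=clone&tab=BB08J2) from that time also records every step of my development and debugging. The joy of a bug finally being fixed after rounds of google -> books -> coding -> debugging is something I will never forget.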
85+
86+
<p align="center">
87+
<img src="https://raw.githubusercontent.com/microdynamics-cpu/tree-core-cpu-res/main/treecore-l2-schedule.png"/>
88+
<p align="center">
89+
TreeCoreL2 development schedule table
90+
</p>
91+
</p>
92+
93+
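### Organizing Documentation
94+
In addition, while watching the course videos and writing and debugging code, I wanted to make the material easier to review and digest, so I recorded the pitfalls I ran into and the questions raised by classmates in the qq group, paired them with answers, and compiled them into an FAQ document. The document currently has nearly 37,000 characters, 202 images and 126 pages. When I have time I will keep supplementing, revising and updating it.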
95+
96+
<p align="center">
97+
<img src="https://raw.githubusercontent.com/microdynamics-cpu/tree-core-cpu-res/main/treecore-l2-guide.png"/>
98+
<p align="center">
99+
The compiled FAQ document
100+
</p>
101+
</p>
102+
103+
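### An Idea from Development: Designing a Joint Waveform and Regression-Test Debugging Tool
104+
Differential testing with difftest quickly pinpoints the failing instruction, but it cannot show complete multi-cycle signal changes as intuitively as a waveform. I am considering a tool in which difftest triggers an event when it detects a mismatching instruction; once the event reaches the waveform component, it jumps directly to the corresponding clock cycle and shows the nearby waveform, which makes debugging easier. Later on, I would consider parsing the waveform directly, like a logic analyzer in the embedded field, so that information such as operands and stalls can be annotated on the waveform.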
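
The core of this idea can be sketched in a few lines: find the first commit where the design and the reference model disagree, then turn its clock cycle into a window for the waveform viewer. This is a hypothetical illustration only. The `Commit` record, both function names and the window sizes are assumptions; it does not use any real difftest or waveform-viewer API.

```python
from typing import List, NamedTuple, Optional, Tuple


class Commit(NamedTuple):
    # Hypothetical per-instruction commit record.
    cycle: int   # clock cycle at which the design committed the instruction
    pc: int
    rd: int      # destination register index
    value: int   # value written to rd


def first_mismatch(dut: List[Commit], ref: List[Commit]) -> Optional[Commit]:
    """Return the first design commit that differs from the reference."""
    for d, r in zip(dut, ref):
        # The reference model has no notion of cycles, so `cycle` is ignored.
        if (d.pc, d.rd, d.value) != (r.pc, r.rd, r.value):
            return d
    return None


def wave_window(cycle: int, before: int = 20, after: int = 5) -> Tuple[int, int]:
    """Cycle range to display around the failing commit."""
    return (max(0, cycle - before), cycle + after)
```

The mismatch event would carry `wave_window(bad.cycle)`, and the waveform component would scroll to that range instead of the user searching for the cycle by hand.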
105+
106+
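## Plan
107+
**TreeCoreL2**, currently under development, is the second version of the TreeCore processor core series. It has basically met its design goals, and I will keep optimizing the code. The third version (**TreeCoreL3**) and the fourth version (**TreeCoreL4**) will pursue higher performance and are the processors I plan to enter in One Life, One Chip Seasons 4 and 5. **TreeCoreL3** will build on the previous core, adding support for RV64IMAC instructions, cache and mmu and deepening the pipeline, so that it can boot rt-thread, xv6 and linux. **TreeCoreL4** will build on **TreeCoreL3** to implement floating-point arithmetic and multi-issue, further improving processor performance.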
108+
109+
For TreeCoreL2:
110+
- Keep improving the current TreeCoreL2 microarchitecture, using more chisel features to simplify the implementation
111+
- Port the processor core to an Anlogic fpga
112+
113+

rtl/Makefile

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -277,7 +277,7 @@ socPrevBuild: chiselBuild socTopModify
277277
socBuild: changeTargetToSoCTop chiselBuild socNameCheck
278278
@cp $(BUILD_DIR)/soc/ysyx_210324.v ../../oscpu/projects/soc/vsrc/
279279

280-
socSubmit:
280+
socTest:
281281
@cp $(BUILD_DIR)/soc/ysyx_210324.v ../../oscpu-submit/projects/soc/vsrc/
282282

283283
socRun:
@@ -307,5 +307,5 @@ cleanAll: cleanBuild cleanMillOut cleanDepRepo
307307
simpleTestBuild riscvTestBuild cpuTestBuild amTestBuild coremarkTestBuild \
308308
dhrystoneTestBuild microbenchTestBuild fecmuxTestBuild demoTest \
309309
simpleRecursiveTest riscvRecursiveTest cpuRecursiveTest unit-test \
310-
socTopModify socNameCheck socLintCheck socPrevBuild socBuild socSubmit socRun\
310+
socTopModify socNameCheck socLintCheck socPrevBuild socBuild socTest socRun\
311311
cleanBuild cleanMillOut cleanDepRepo cleanAll
