benchmarks: readd README.md

2020-09-28 13:42:17 +08:00 · 2020-09-28 13:42:17 +08:00 · a1edccd647
commit a1edccd647
parent fc7b5f832b
3 changed files with 301 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@ -6,6 +6,7 @@
 !*.S
 !Makefile
 !README
 !README.md
 !LICENSE
 .*
 _*
--- a/benchmarks/coremark/README.md
+++ b/benchmarks/coremark/README.md
@ -0,0 +1,231 @@
 # Coremark
 '''
 File: CoreMark
 Topic: Welcome
 Copyright © 2009 EEMBC All rights reserved.
 CoreMark is a trademark of EEMBC and EEMBC is a registered trademark of the Embedded Microprocessor Benchmark Consortium.
 CoreMark's primary goals are simplicity and providing a method for testing only a processor's core features.
 For more information about EEMBC's comprehensive embedded benchmark suites, please see www.eembc.org.
 Topic: Building and running
 	Download the release files from www.coremark.org.
 	You can verify the download using the coremark_<version>.md5 file
 	> md5sum -c coremark_<version>.md5
 	Unpack the distribution (tar -vzxf coremark_<version>.tgz && tar -vzxf coremark_<version>_docs.tgz) 
 	then change to the coremark_<version> folder.
 	To build and run the benchmark, type 
 	> make
 	Full results are available in the files run1.log and run2.log.
 	CoreMark result can be found in run1.log.
 	For self hosted Linux or Cygwin platforms, a simple make should work.
 	Cross Compile:
 	For cross compile platforms please adjust <core_portme.mak>, <core_portme.h> (and possibly <core_portme.c>) 
 	according to the specific platform used.
 	When porting to a new platform, it is recommended to copy one of the default port folders 
 	(e.g. mkdir <platform> && cp linux/* <platform>), adjust the porting files, and run 
 	> make PORT_DIR=<platform>
 	Systems without make:
 	The following files need to be compiled:
 	- <core_list_join.c> 
 	- <core_main.c> 
 	- <core_matrix.c> 
 	- <core_state.c>	
 	- <core_util.c>	
 	- <PORT_DIR>/<core_portme.c>
 	For example
 	> gcc -O2 -o coremark.exe core_list_join.c core_main.c core_matrix.c core_state.c core_util.c simple/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=1000 
 	> ./coremark.exe > run1.log
 	The above will compile the benchmark for a performance run and 1000 iterations. Output is redirected to run1.log.
 	Make targets:
 	run - Default target, creates run1.log and run2.log.
 	run1.log - Run the benchmark with performance parameters, and output to run1.log
 	run2.log - Run the benchmark with validation parameters, and output to run2.log
 	run3.log - Run the benchmark with profile generation parameters, and output to run3.log
 	compile - compile the benchmark executable 
 	link - link the benchmark executable
 	check - test MD5 of sources that may not be modified
 	clean - clean temporary files
 	ITERATIONS: 
 	By default, the benchmark will run for between 10 and 100 seconds.
 	To override, use ITERATIONS=N
 	> make ITERATIONS=10 
 	Will run the benchmark for 10 iterations. 
 	It is recommended to set a specific number of iterations in certain situations e.g.:
 	- Running with a simulator
 	- Measuring power/energy
 	- Timing cannot be restarted
 	Minimum required run time: 
 	Results are only valid for reporting if the benchmark ran for at least 10 secs!
 	XCFLAGS:
 	To add compiler flags from the command line, use XCFLAGS e.g.
 	> make XCFLAGS="-g -DMULTITHREAD=4 -DUSE_FORK=1"
 	o CORE_DEBUG
 	Define to compile for a debug run if you get incorrect CRC.
 	> make XCFLAGS="-DCORE_DEBUG=1"
 	o Parallel Execution
 	Use XCFLAGS=-DMULTITHREAD=N where N is number of threads to run in parallel.
 	Several implementations are available to execute in multiple contexts,
 	or you can implement your own in <core_portme.c>.
 	> make XCFLAGS="-DMULTITHREAD=4 -DUSE_PTHREAD" 
 	Above will compile the benchmark for execution on 4 cores, using POSIX Threads API.
 	REBUILD:
 	To force rebuild, add the flag REBUILD to the command line
 	> make REBUILD=1
 	Check core_portme.mak for more important options.
 	Run parameters for the benchmark executable:
 	Coremark executable takes several parameters as follows (if main accepts arguments).
 	1st - A seed value used for initialization of data.
 	2nd - A seed value used for initialization of data.
 	3rd - A seed value used for initialization of data.
 	4th - Number of iterations (0 for auto : default value)
 	5th - Reserved for internal use. 
 	6th - Reserved for internal use. 
 	7th - For malloc users only, override the size of the input data buffer.
 	The run target from make will run coremark with 2 different data initialization seeds.
 	Alternative parameters: 
 	If not using malloc or command line arguments are not supported, the buffer size
 	for the algorithms must be defined via the compiler define TOTAL_DATA_SIZE.
 	TOTAL_DATA_SIZE must be set to 2000 bytes (default) for standard runs.
 	The default for such a target when testing different configurations could be ...
 	> make XCFLAGS="-DTOTAL_DATA_SIZE=6000 -DMAIN_HAS_NOARGC=1"
 Topic: Documentation
 	When you unpack the documentation (tar -vzxf coremark_<version>_docs.tgz) a docs folder will be created.
 	Check the file docs/html/index.html and the website http://www.coremark.org for more info.
 Topic: Submitting results
 	CoreMark results can be submitted on the web.
 	Open a web browser and go to http://www.coremark.org/benchmark/index.php?pg=benchmark
 	Select the link to add a new score and follow the instructions.
 Topic: Run rules
 	What is and is not allowed.
 	Required:
 	1 - The benchmark needs to run for at least 10 seconds.
 	2 - All validation must succeed for seeds 0,0,0x66 and 0x3415,0x3415,0x66, 
 		buffer size of 2000 bytes total.
 		o If not using command line arguments to main:
 		> make XCFLAGS="-DPERFORMANCE_RUN=1" REBUILD=1 run1.log
 		> make XCFLAGS="-DVALIDATION_RUN=1" REBUILD=1 run2.log
 	3 - If using profile guided optimization, profile must be generated using seeds of 8,8,8,
 		and buffer size of 1200 bytes total.
 		> make XCFLAGS="-DTOTAL_DATA_SIZE=1200 -DPROFILE_RUN=1" REBUILD=1 run3.log
 	4 - All source files must be compiled with the same flags.
 	5 - All data type sizes must match size in bits such that:
 		o ee_u8 is an 8-bit data type.
 		o ee_s16 is a 16-bit data type.
 		o ee_u16 is a 16-bit data type.
 		o ee_s32 is a 32-bit data type.
 		o ee_u32 is a 32-bit data type.
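Rule 5 can be checked at compile time. The following is a minimal sketch, assuming a C11 compiler and assuming the port maps the `ee_*` types onto `<stdint.h>` fixed-width types. The real typedefs live in core_portme.h and differ between platforms; `BITS_OF` and `ee_types_ok` are hypothetical helpers that are not part of CoreMark.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>

/* Assumed typedefs for illustration; a real port defines them in core_portme.h. */
typedef uint8_t  ee_u8;
typedef int16_t  ee_s16;
typedef uint16_t ee_u16;
typedef int32_t  ee_s32;
typedef uint32_t ee_u32;

/* Width of a type in bits. */
#define BITS_OF(T) (sizeof(T) * CHAR_BIT)

/* The build fails if any type has the wrong width. */
static_assert(BITS_OF(ee_u8)  == 8,  "ee_u8 must be 8 bits");
static_assert(BITS_OF(ee_s16) == 16, "ee_s16 must be 16 bits");
static_assert(BITS_OF(ee_u16) == 16, "ee_u16 must be 16 bits");
static_assert(BITS_OF(ee_s32) == 32, "ee_s32 must be 32 bits");
static_assert(BITS_OF(ee_u32) == 32, "ee_u32 must be 32 bits");

/* Runtime form of the same check: returns 1 when every width matches. */
int ee_types_ok(void)
{
    return BITS_OF(ee_u8)  == 8  &&
           BITS_OF(ee_s16) == 16 && BITS_OF(ee_u16) == 16 &&
           BITS_OF(ee_s32) == 32 && BITS_OF(ee_u32) == 32;
}
```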
 	Allowed:
 	- Changing number of iterations
 	- Changing toolchain and build/load/run options
 	- Changing method of acquiring a data memory block
 	- Changing the method of acquiring seed values
 	- Changing implementation in core_portme.c
 	- Changing configuration values in core_portme.h
 	- Changing core_portme.mak
 	Not allowed:
 	- Changing any source file other than core_portme* (use make check to validate)
 Topic: Reporting rules
 	How to report results on a data sheet?
 	CoreMark 1.0 : N / C [/ P] [/ M]
 	N - Number of iterations per second (with seeds 0,0,0x66, size=2000)
 	C - Compiler version and flags
 	P - Parameters such as data and code allocation specifics
 		- This parameter *may* be omitted if all data was allocated on the heap in RAM.
 		- This parameter *may not* be omitted when reporting CoreMark/MHz
 	M - Type of parallel execution (if used) and number of contexts
 		This parameter may be omitted if parallel execution was not used.
 	e.g. 
 	> CoreMark 1.0 : 128 / GCC 4.1.2 -O2 -fprofile-use / Heap in TCRAM / FORK:2 
 	or
 	> CoreMark 1.0 : 1400 / GCC 3.4 -O4 
 	If reporting scaling results, the results must be reported as follows:
 	CoreMark/MHz 1.0 : N / C / P [/ M]
 	P - When reporting scaling results, memory parameter must also indicate memory frequency:core frequency ratio.
 		- If the core has cache and cache frequency to core frequency ratio is configurable, that must also be included.
 	e.g.
 	> CoreMark/MHz 1.0 : 1.47 / GCC 4.1.2 -O2 / DDR3(Heap) 30:1 Memory 1:1 Cache
 Topic: Log File Format
 	The log files have the following format
 (start example)
 2K performance run parameters for coremark.	(Run type)
 CoreMark Size    	: 666					(Buffer size)
 Total ticks			: 25875					(platform dependent value)
 Total time (secs) 	: 25.875000				(actual time in seconds)
 Iterations/Sec 		: 3864.734300			(Performance value to report)
 Iterations			: 100000				(number of iterations used)
 Compiler version	: GCC3.4.4				(Compiler and version)	
 Compiler flags		: -O2					(Compiler and linker flags)
 Memory location		: Code in flash, data in on chip RAM
 seedcrc				: 0xe9f5				(identifier for the input seeds)
 [0]crclist			: 0xe714				(validation for list part)
 [0]crcmatrix		: 0x1fd7				(validation for matrix part)
 [0]crcstate			: 0x8e3a				(validation for state part)
 [0]crcfinal			: 0x33ff				(iteration dependent output)
 Correct operation validated. See readme.txt for run and reporting rules.  (*Only when run is successful*)
 CoreMark 1.0 : 6508.490622 / GCC3.4.4 -O2 / Heap 						  (*Only on a successful performance run*)		
 (end example)
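The "Iterations/Sec" value in the example follows from the other fields: 100000 iterations over 25.875 seconds is about 3864.7343 iterations per second. A minimal sketch of that arithmetic is shown below. The function names are hypothetical, and the tick rate of 1000 ticks per second is an assumption inferred from the example (25875 ticks reported as 25.875 seconds); the real rate is platform dependent.

```c
#include <assert.h>

/* Iterations per second = iterations / elapsed seconds. */
double iterations_per_sec(unsigned long iterations,
                          unsigned long ticks,
                          unsigned long ticks_per_sec)
{
    assert(ticks > 0 && ticks_per_sec > 0);
    double secs = (double)ticks / (double)ticks_per_sec;
    return (double)iterations / secs;
}

/* A result is reportable only if the run lasted at least 10 seconds. */
int run_long_enough(unsigned long ticks, unsigned long ticks_per_sec)
{
    return ticks >= 10UL * ticks_per_sec;
}
```

With the example values, `iterations_per_sec(100000, 25875, 1000)` reproduces the "Iterations/Sec" line, and `run_long_enough(25875, 1000)` reports that the run satisfies the 10-second rule.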
 Topic: Legal
 See LICENSE.txt or the word document file under docs/LICENSE.doc.
 For more information on your legal rights to use this benchmark, please see
 http://www.coremark.org/download/register.php?pg=register	
 Topic: Credits
 Many thanks to all of the individuals who helped with the development or testing of CoreMark including (Sorted by company name)
 o Alan Anderson, ADI
 o Adhikary Rajiv, ADI
 o Elena Stohr, ARM
 o Ian Rickards, ARM
 o Andrew Pickard, ARM
 o Trent Parker, CAVIUM
 o Shay Gal-On, EEMBC
 o Markus Levy, EEMBC
 o Ron Olson, IBM
 o Eyal Barzilay, MIPS
 o Jens Eltze, NEC
 o Hirohiko Ono, NEC
 o Ulrich Drees, NEC
 o Frank Roscheda, NEC
 o Rob Cosaro, NXP
 o Shumpei Kawasaki, RENESAS
 '''
--- a/benchmarks/microbench/README.md
+++ b/benchmarks/microbench/README.md
@ -0,0 +1,69 @@
 # MicroBench
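 Benchmark programs for testing CPU correctness and performance. Requirements on AbstractMachine:
 1. The TRM and IOE APIs must be implemented.
 2. The benchmarks must still run when every IOE implementation is left empty. With a correctly implemented `AM_TIMER_UPTIME`, correct timing statistics are reported. If it is not implemented (it returns `0`), correctness testing still works.
 3. Output goes through `putch(ch)`.
 4. The heap area `heap` must be initialized (it may be empty). If `heap.start == heap.end`, that is, an empty heap was allocated, only benchmarks that do not use the heap can run. Each benchmark declares its required heap size in advance, and benchmarks for which the heap is too small are skipped.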
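The heap rule in item 4 can be sketched as follows. This is an illustration only: `Area` is a hypothetical stand-in for the AbstractMachine heap descriptor, and `heap_bytes` and `can_run` are made-up helper names, not MicroBench functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the heap descriptor with start/end pointers. */
typedef struct {
    void *start;
    void *end;
} Area;

/* Bytes available in the heap; 0 when start == end (empty heap). */
size_t heap_bytes(Area heap)
{
    assert((char *)heap.end >= (char *)heap.start);
    return (size_t)((char *)heap.end - (char *)heap.start);
}

/* Returns 1 if a benchmark needing `required` bytes can run, 0 if it is skipped. */
int can_run(Area heap, size_t required)
{
    return required <= heap_bytes(heap);
}
```

Under this sketch an empty heap still admits benchmarks whose requirement is 0 bytes, which matches the `queen` entry in the benchmark table.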
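 ## Usage
 The same set of programs comes in three data sizes: test, train and ref.
 test uses a very small data size, is meant for testing, and is neither timed nor scored.
 train uses a medium data size, is suitable for studying microarchitectural behavior in a simulation environment, and is timed but not scored.
 ref uses a large data size, is meant for measuring CPU performance, and is both timed and scored.
 The ref data size runs by default. To run the test data size, use
 ```bash
 make ARCH=native run mainargs=test
 ```
 To run the train data size, use
 ```bash
 make ARCH=native run mainargs=train
 ```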
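 ## Scoring
 Each benchmark records its running time in microseconds as measured on `REF_CPU`. A benchmark's score is its speed relative to `REF_CPU`; a processor exactly as fast as the reference processor scores `REF_SCORE=100000`.
 The overall score is the average of all benchmark scores.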
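A minimal sketch of the scoring rule follows. The README only states that the score is relative to `REF_CPU` and that equal speed yields `REF_SCORE`; the exact formula `REF_SCORE * ref_usec / usec`, the integer rounding, and the function names `bench_score` and `overall_score` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define REF_SCORE 100000UL

/* Score relative to REF_CPU: the same time gives REF_SCORE,
 * half the time gives twice REF_SCORE. */
unsigned long bench_score(unsigned long ref_usec, unsigned long usec)
{
    assert(usec > 0);
    return (unsigned long)((unsigned long long)REF_SCORE * ref_usec / usec);
}

/* Overall score: arithmetic mean of the per-benchmark scores. */
unsigned long overall_score(const unsigned long *scores, size_t n)
{
    assert(n > 0);
    unsigned long long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += scores[i];
    return (unsigned long)(sum / n);
}
```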
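 ## Available benchmarks
 | Name  | Description                                                      | Heap usage (ref) |
 | ----- | ---------------------------------------------------------------- | ---------------- |
 | qsort | Quicksort on an array of random integers                         | 640KB            |
 | queen | N-queens problem solved with bit operations                      | 0                |
 | bf    | Brainf**k interpreter running a quicksort of the input string    | 32KB             |
 | fib   | Fibonacci sequence f(n)=f(n-1)+…+f(n-m) computed with matrices   | 256KB            |
 | sieve | Sieve of Eratosthenes for finding primes                         | 2MB              |
 | 15pz  | A* search solving the 4x4 sliding puzzle                         | 2MB              |
 | dinic | Dinic's algorithm for maximum flow on a bipartite graph          | 1MB              |
 | lzip  | Lzip data compression                                            | 4MB              |
 | ssort | Suffix sorting with the Skew algorithm                           | 4MB              |
 | md5   | MD5 checksum of a long random string                             | 16MB             |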
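 ## Adding a benchmark `foo`
 Create a directory named `foo` under `src/` and put the source files in it.
 Each benchmark must implement three functions:
 * `void bench_foo_prepare();`: does the preparation work, such as seeding the random number generator and allocating memory for arrays. The runtime environment does not guarantee the initial values of global variables or the heap, so every piece of global data the benchmark uses must be initialized explicitly.
 * `void bench_foo_run();`: actually runs the benchmark. Only this function is timed.
 * `int bench_foo_validate();`: validates the benchmark result. Returns 1 if correct and 0 if wrong.
 Add the corresponding `def` entry to `BENCHMARK_LIST` in `benchmark.h`, following the format of the existing benchmarks.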
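The three functions could look like the following skeleton. This is a sketch only: the workload (summing an array), the constant `FOO_N`, and the variables `foo_data` and `foo_sum` are hypothetical, and the block is kept free of MicroBench headers so it compiles on its own.

```c
#include <stdint.h>

#define FOO_N 1000u                /* hypothetical problem size */

static uint32_t foo_data[FOO_N];   /* input data */
static uint32_t foo_sum;           /* result produced by the run */

/* Prepare: initialize every global explicitly, since initial
 * values are not guaranteed by the runtime environment. */
void bench_foo_prepare(void)
{
    for (uint32_t i = 0; i < FOO_N; i++)
        foo_data[i] = i + 1;
    foo_sum = 0;
}

/* Run: the only function that is timed. */
void bench_foo_run(void)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < FOO_N; i++)
        sum += foo_data[i];
    foo_sum = sum;
}

/* Validate: 1 if the result is correct, 0 otherwise. */
int bench_foo_validate(void)
{
    /* Sum of 1..FOO_N in closed form. */
    return foo_sum == FOO_N * (FOO_N + 1u) / 2u;
}
```

Keeping all setup in the prepare function means the timed run measures only the workload itself.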
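 ## Library functions available to benchmarks
 Although klib provides some functions, different klib implementations would lead to different performance results.
 MicroBench therefore ships a few simple built-in library functions:
 * `bench_memcpy(void *dst, const void *src, size_t n)`: memory copy.
 * `bench_srand(uint seed)`: seeds the random number generator with seed.
 * `bench_rand()`: returns a random number in the range 0..32767.
 * `bench_alloc`/`bench_free`: memory allocation/release. Release is currently a no-op.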
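To illustrate the behavior described above, here is a sketch of a generator that returns values in 0..32767 and of an allocator whose release is a no-op. This is not MicroBench's actual implementation: the `sketch_*` names are hypothetical, the generator is the sample linear congruential generator from the C standard, and the bump allocator with 8-byte size rounding is an assumed design.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t sketch_seed = 1;

/* Seed the generator. */
void sketch_srand(uint32_t seed)
{
    sketch_seed = seed;
}

/* Linear congruential generator; returns a value in 0..32767. */
uint32_t sketch_rand(void)
{
    sketch_seed = sketch_seed * 1103515245u + 12345u;
    return (sketch_seed / 65536u) % 32768u;
}

static char *sketch_hbrk;   /* next free byte */
static char *sketch_hend;   /* one past the last heap byte */

/* Point the allocator at the region [start, end). */
void sketch_heap_init(void *start, void *end)
{
    sketch_hbrk = (char *)start;
    sketch_hend = (char *)end;
}

/* Bump allocation: round the size up to 8 bytes and advance the pointer.
 * Returns NULL when the remaining heap is too small. */
void *sketch_alloc(size_t size)
{
    size = (size + 7) & ~(size_t)7;
    if ((size_t)(sketch_hend - sketch_hbrk) < size)
        return NULL;
    void *p = sketch_hbrk;
    sketch_hbrk += size;
    return p;
}

/* Release is a no-op, so memory is never reused. */
void sketch_free(void *ptr)
{
    (void)ptr;
}
```

A fixed, built-in generator keeps the input data identical across platforms for a given seed, which is what makes scores comparable.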