Coprocessor Overview
What is a coprocessor?
When you use an Arm processor other than the Cortex-M series, you need to understand the coprocessor registers, because the coprocessor is what configures processor functions such as the caches and the MMU. Coprocessor registers are not mapped into the memory space; they are read and written with dedicated instructions. Because they occupy no memory addresses, the entire 4 GB address space remains available. Some coprocessor registers can be accessed only in privileged mode, so check the “Technical Reference Manual (*1)” for your processor.
- There are 16 coprocessors defined.
- The CP15 (Coprocessor 15) provides system control functions.
- CP11 (Coprocessor 11) supports double-precision floating-point operations.
- CP10 (Coprocessor 10) supports single-precision floating-point operations as well as the extensions common to both the VFP and Advanced SIMD architectures.
(*1) See the English “Cortex-A9 Technical Reference Manual Revision: r4p1” or the Japanese “Cortex-A9 Technical Reference Manual Revision: r2p2”.
Coprocessor Access Instructions
The coprocessor is accessed with the MRC and MCR instructions. Some coprocessor registers can be accessed only in privileged mode, so use these instructions with care.
MRC instruction
The MRC instruction reads a coprocessor register into a processor register.
MRC{cond} coproc, #opcode1, Rd, CRn, CRm{, #opcode2}
Parameter | Description |
---|---|
cond | Optional condition code. |
coproc | Name of the coprocessor that executes the instruction. For coprocessor CP15, specify p15. |
opcode1 | 3-bit coprocessor-specific opcode. |
opcode2 | Optional 3-bit coprocessor-specific opcode. |
Rd | Destination processor register. r15 (PC) cannot be specified. |
CRn | Coprocessor register. |
CRm | Coprocessor register. |
MCR instruction
The MCR instruction writes a processor register to a coprocessor register.
MCR{cond} coproc, #opcode1, Rd, CRn, CRm{, #opcode2}
Parameter | Description |
---|---|
cond | Optional condition code. |
coproc | Name of the coprocessor that executes the instruction. For coprocessor CP15, specify p15. |
opcode1 | 3-bit coprocessor-specific opcode. |
opcode2 | Optional 3-bit coprocessor-specific opcode. |
Rd | Source processor register. r15 (PC) cannot be specified. |
CRn | Coprocessor register. |
CRm | Coprocessor register. |
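To see how these operands end up in the instruction word, here is a minimal C sketch that packs them into the 32-bit Arm-state (A32) encoding. The function name `cp_encode` is made up for this illustration, and the bit positions come from the Arm architecture's MRC/MCR encoding rather than from this article. It runs on any host, so you can use it to cross-check a disassembly listing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the A32 encoding of MRC or MCR.
   is_read = 1 selects MRC (coprocessor register -> Arm register),
   is_read = 0 selects MCR (Arm register -> coprocessor register). */
uint32_t cp_encode(int is_read, uint32_t cond, uint32_t coproc,
                   uint32_t opcode1, uint32_t rd,
                   uint32_t crn, uint32_t crm, uint32_t opcode2)
{
    /* r15 (PC) is excluded, as described in the tables above. */
    assert(cond <= 0xF && coproc <= 15 && rd <= 14);
    assert(opcode1 <= 7 && opcode2 <= 7 && crn <= 15 && crm <= 15);

    return (cond << 28)                    /* condition code            */
         | (0xEu << 24)                    /* fixed bits 27:24 = 0b1110 */
         | (opcode1 << 21)                 /* opcode1                   */
         | ((is_read ? 1u : 0u) << 20)     /* 1 = MRC, 0 = MCR          */
         | (crn << 16)                     /* CRn                       */
         | (rd << 12)                      /* Rd                        */
         | (coproc << 8)                   /* coprocessor number        */
         | (opcode2 << 5)                  /* opcode2                   */
         | (1u << 4)                       /* fixed bit 4               */
         | crm;                            /* CRm                       */
}
```

With the "always" condition code (0xE), `cp_encode(1, 0xE, 15, 0, 0, 1, 0, 0)` corresponds to `MRC p15,0,r0,c1,c0,0` and gives 0xEE110F10.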
How to access the coprocessor
If you use the Arm compiler to access the coprocessor, write the access either as inline assembly using the “__asm” keyword or as a separate assembly language routine.
Example of accessing the coprocessor in the C/C++ compiler
With inline assembly, you write the assembly instructions inside an __asm{…} block. A single __asm{…} block can hold all of your assembly instructions.
    void enable_caches(void)
    {
        unsigned long reg;
        __asm {
            MRC p15,0,reg,c1,c0,0       // Read SCTLR (System Control Register)
            ORR reg,reg,#(0x1 << 12)    // Enable the instruction cache (SCTLR.I[12])
            ORR reg,reg,#(0x1 << 2)     // Enable the data cache (SCTLR.C[2])
            ORR reg,reg,#(0x1 << 11)    // Enable program flow prediction (SCTLR.Z[11])
            MCR p15,0,reg,c1,c0,0       // Write SCTLR
            //==================================================================
            // Enable Cortex-A9 data prefetching
            //==================================================================
            MRC p15,0,reg,c1,c0,1       // Read ACTLR (Auxiliary Control Register)
            ORR reg,reg,#(0x1 << 2)     // Enable data prefetching (ACTLR.DP[2])
            MCR p15,0,reg,c1,c0,1       // Write ACTLR
        }
    }
Example of accessing the coprocessor from assembly language
If you write an assembly routine that follows the AAPCS (Procedure Call Standard for the Arm Architecture), you can call it from C/C++ as a function and have it execute the coprocessor access instructions.
            EXPORT enable_caches
    ; Enable the caches and branch prediction
    enable_caches FUNCTION
            MRC p15,0,r0,c1,c0,0        ; Read SCTLR
            ORR r0,r0,#(0x1 << 12)      ; Enable the instruction cache (SCTLR.I[12])
            ORR r0,r0,#(0x1 << 2)       ; Enable the data cache (SCTLR.C[2])
            ORR r0,r0,#(0x1 << 11)      ; Enable program flow prediction (SCTLR.Z[11])
            MCR p15,0,r0,c1,c0,0        ; Write SCTLR
    ;==================================================================
    ; Enable data prefetching
    ;==================================================================
            MRC p15,0,r0,c1,c0,1        ; Read ACTLR
            ORR r0,r0,#(0x1 << 2)       ; Enable data prefetching (ACTLR.DP[2])
            MCR p15,0,r0,c1,c0,1        ; Write ACTLR
            BX lr
            ENDFUNC
Coprocessor configuration example
When initializing the Cortex-A9 processor, you need to set up coprocessor 15 (CP15). This article explains the following settings.
- Invalidating the TLB (Translation Lookaside Buffer) and branch predictor array
- Invalidating the instruction/data caches
- VFP/NEON access permissions and enable settings
- Changing the vector address
- Enabling the caches, MMU, and branch prediction
Invalidating the TLB and branch predictor array
The contents of the TLB and branch predictor are not initialized at reset, so they must be invalidated during initialization.
【sample program】
    ;==================================================================
    ; Invalidate the TLB and branch predictor array
    ;==================================================================
            MOV r0,#0               ; Set the initial value
            MCR p15,0,r0,c8,c7,0    ; Invalidate the entire TLB (TLBIALL)
            MCR p15,0,r0,c7,c5,6    ; Invalidate the branch predictor array (BPIALL)
TLBIALL (TLB Invalidate All)
- Invalidates the entire unified TLB. If separate instruction and data TLBs are implemented, both are invalidated at the same time.
- The value of the argument register <Rd> is ignored. Can be executed in privileged mode only.
【Register Access Instructions】
MCR p15,0,<Rd>,c8,c7,0 ; Invalidate the entire TLB
BPIALL (Branch Predictor Invalidate All)
- Invalidates the entire branch predictor array.
- The value of the argument register <Rd> is ignored. Can be executed in privileged mode only.
【Register Access Instructions】
MCR p15,0,<Rd>,c7,c5,6 ; Invalidate the branch predictor array
Invalidating the instruction/data caches
The caches are not invalidated at reset, so the L1 data cache and L1 instruction cache must be invalidated during initialization. The data cache has to be invalidated line by line for every set and way, so the sample reads the cache geometry from CCSIDR (Cache Size Identification Register) and uses it to drive the invalidation loop.
【sample program】
            MOV r0,#0               ; Set the initial value
            MCR p15,0,r0,c7,c5,0    ; Invalidate the entire instruction cache (ICIALLU)
    ;
    ; Invalidate the data cache
    ;
            MOV r10,#0              ; Select the L1 data cache
            MCR p15,2,r10,c0,c0,0   ; Write the selection to CSSELR
            ISB                     ; Instruction synchronization barrier before reading CCSIDR
            MRC p15,1,r1,c0,c0,0    ; Read CCSIDR
            AND r2,r1,#7            ; Get the cache line size (0b001 = 8 words/line)
            ADD r2,r2,#4            ; Shift amount of the set number in DCISW
            LDR r4,=0x3FF           ; Mask for the maximum way number
            ANDS r4,r4,r1,LSR #3    ; r4 = number of ways - 1
            CLZ r5,r4               ; Shift amount of the way number in DCISW
            LDR r7,=0x7FFF          ; Mask for the maximum set number
            ANDS r7,r7,r1,LSR #13   ; r7 = number of sets - 1
                                    ; 0x7F=16 KB / 0xFF=32 KB / 0x1FF=64 KB
    Loop2
            MOV r9,r4               ; r9 = way number, counting down from ways - 1
    Loop3
            ORR r11,r10,r9,LSL r5   ; Combine the cache level and the way number
            ORR r11,r11,r7,LSL r2   ; Add the set number
            MCR p15,0,r11,c7,c6,2   ; Invalidate the data cache line by set/way (DCISW)
            SUBS r9,r9,#1           ; Way number - 1
            BGE Loop3               ; Repeat for each way
            SUBS r7,r7,#1           ; Set number - 1
            BGE Loop2               ; Repeat for each set
ICIALLU (Instruction Cache Invalidate All)
- Invalidates the entire instruction cache and flushes the branch target cache.
- The value of the argument register <Rd> is ignored. Can be executed in privileged mode only.
【Register Access Instructions】
MCR p15,0,<Rd>,c7,c5,0 ; Write ICIALLU
CSSELR (Cache Size Selection Register)
- Selects which CCSIDR is read.
- Can be read and written in privileged mode only.
【Register Access Instructions】
MRC p15,2,<Rd>,c0,c0,0 ; Read CSSELR
MCR p15,2,<Rd>,c0,c0,0 ; Write CSSELR
【register format】
Bits | Name | Features |
---|---|---|
3:1 | Level | Cache level to select, encoded as level-1. 0b000: level 1 cache, up to 0b110: level 7 cache. The Cortex-A9 processor has only a level 1 cache, so this field is 0b000. |
0 | InD | Instruction/data bit. 0: data or unified cache. 1: instruction cache. |
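The two fields in the table can be combined into a CSSELR value as in the following small C sketch. The function name `csselr_value` is hypothetical, and the code only computes the value; the actual write still has to be done with the MCR instruction shown above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compose the value to write to CSSELR.
   cache_level is the human-readable level (1..7); the register field
   stores level - 1 in bits 3:1. is_instruction goes into the InD bit. */
uint32_t csselr_value(unsigned cache_level, unsigned is_instruction)
{
    assert(cache_level >= 1 && cache_level <= 7);
    assert(is_instruction <= 1);
    return ((uint32_t)(cache_level - 1) << 1) | is_instruction;
}
```

For the Cortex-A9 L1 data cache this gives 0, which is why the sample program simply loads 0 before writing CSSELR; the L1 instruction cache would be 1.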
CCSIDR (Cache Size Identification Register)
- Provides information about the architecture of the cache.
- A CCSIDR is implemented for each cache that the processor can access.
- CSSELR selects which CCSIDR is read.
- Can be read in privileged mode only.
【Register Access Instructions】
MRC p15,1,<Rd>,c0,c0,0 ; Read CCSIDR
【register format】
Bits | Name | Features |
---|---|---|
31 | WT | Indicates write-through support. 0=Not supported. 1=Supported. |
30 | WB | Indicates write-back support. 0=Not supported. 1=Supported. |
29 | RA | Indicates read-allocation support. 0=Not supported. 1=Supported. |
28 | WA | Indicates write-allocation support. 0=Not supported. 1=Supported. |
27:13 | NumSets | Indicates (number of sets in the cache)-1. A value of 0 means 1 set. |
12:3 | Associativity | Indicates (associativity of the cache)-1. A value of 0 means an associativity of 1. |
2:0 | LineSize | (log2(number of words in the cache line))-2. For a 4-word line: log2(4)=2, so LineSize=0. For an 8-word line: log2(8)=3, so LineSize=1. |
- When the value for the instruction cache is read, bits [31:28] are 0b0010.
- When the value for the data cache is read, bits [31:28] are 0b0111.
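As a rough illustration of how the CCSIDR fields translate into a cache geometry, here is a minimal C sketch that decodes a raw register value. The struct and function names are made up for this example. The test value 0x701FE019 is what the field definitions above produce for a 32 KB, 4-way data cache with 8-word lines; treat it as a worked example rather than a value guaranteed for your device.

```c
#include <stdint.h>

/* Hypothetical container for the decoded cache geometry. */
struct cache_geometry {
    uint32_t line_bytes;   /* bytes per cache line */
    uint32_t ways;         /* associativity        */
    uint32_t sets;         /* number of sets       */
    uint64_t total_bytes;  /* line_bytes * ways * sets */
};

/* Decode a raw CCSIDR value using the field layout in the table above. */
struct cache_geometry ccsidr_decode(uint32_t ccsidr)
{
    struct cache_geometry g;
    uint32_t line_size = ccsidr & 0x7u;           /* LineSize, bits 2:0 */

    /* words per line = 2^(LineSize + 2); 4 bytes per word adds 2 more */
    g.line_bytes  = 1u << (line_size + 4);
    g.ways        = ((ccsidr >> 3) & 0x3FFu) + 1;   /* Associativity + 1 */
    g.sets        = ((ccsidr >> 13) & 0x7FFFu) + 1; /* NumSets + 1       */
    g.total_bytes = (uint64_t)g.line_bytes * g.ways * g.sets;
    return g;
}
```

Decoding 0x701FE019 yields 32-byte lines, 4 ways, and 256 sets, which multiply out to 32768 bytes.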
DCISW (Data Cache Invalidate by Set/Way)
- Invalidates the data cache line specified by set/way.
- Can be written in privileged mode only.
【Register Access Instructions】
MCR p15,0,<Rd>,c7,c6,2 ; Invalidate a data cache line by set/way
【Cortex-A9 register format】
Name | Features |
---|---|
Level | Cache level to operate on, minus 1 |
Set | The set number to be operated on |
Way | Way number to be operated on |
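The table does not give bit positions because they depend on the cache geometry. The sample program earlier derives them from CCSIDR: the set number is shifted left by LineSize+4, the way number by CLZ(ways-1), and the level sits in bits 3:1. The following C sketch mirrors that calculation so it can be checked on a host machine. The names `clz32` and `setway_operand` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros of a 32-bit value; returns 32 for 0,
   which matches the behavior of the Arm CLZ instruction. */
unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Hypothetical helper: build the operand for a set/way operation
   (DCISW or DCCSW) from a raw CCSIDR value. level is 1-based. */
uint32_t setway_operand(uint32_t ccsidr, unsigned level,
                        uint32_t set, uint32_t way)
{
    uint32_t max_way   = (ccsidr >> 3) & 0x3FFu;    /* ways - 1 */
    uint32_t max_set   = (ccsidr >> 13) & 0x7FFFu;  /* sets - 1 */
    unsigned set_shift = (ccsidr & 0x7u) + 4;       /* log2(line bytes) */
    unsigned way_shift = clz32(max_way);
    uint32_t value;

    assert(level >= 1 && level <= 7);
    assert(set <= max_set && way <= max_way);

    value = ((uint32_t)(level - 1) << 1) | (set << set_shift);
    if (way_shift < 32)          /* a direct-mapped cache has no way field */
        value |= way << way_shift;
    return value;
}
```

For a CCSIDR of 0x701FE019 (4 ways, 256 sets, 32-byte lines), the last line of the cache (set 255, way 3) gives 0xC0001FE0: the way number occupies bits 31:30 and the set number bits 12:5.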
DCCSW (Data Cache Clean by Set/Way)
- Cleans the data cache line specified by set/way.
- Can be written in privileged mode only.
【Register Access Instructions】
MCR p15,0,<Rd>,c10,c6,2 ; Clean a data cache line by set/way
【Cortex-A9 register format】
Name | Features |
---|---|
Level | Cache level to operate on, minus 1 |
Set | The set number to be operated on |
Way | Way number to be operated on |
VFP/NEON access permissions and enable settings
At reset, access to VFP/NEON is denied and the units are disabled, so the initialization code must grant access permissions and then enable them.
The sample program gives coprocessor 10 (CP10) and coprocessor 11 (CP11) full access and enables the Advanced SIMD and VFP extensions.
【sample program】
    ;=====================================================================
    ; NEON/VFP access permissions
    ; CP10/CP11 full access settings
    ;=====================================================================
            MRC p15,0,r0,c1,c0,2    ; Read CPACR
            ORR r0,r0,#(0xF << 20)  ; Give CP10/CP11 full access
            MCR p15,0,r0,c1,c0,2    ; Write CPACR
            ISB                     ; Instruction synchronization barrier
    ;=====================================================================
    ; Enable NEON/VFP
    ;=====================================================================
            MOV r0,#0x40000000      ; EN bit (FPEXC[30])
            VMSR FPEXC,r0           ; Write the EN bit to the floating-point exception register
CPACR (Coprocessor Access Control Register)
- Sets the access permissions for coprocessors CP10 and CP11, and restricts use of the Advanced SIMD extension and of the VFP registers.
- Can be read and written in privileged mode only.
【Register Access Instructions】
MRC p15,0,<Rd>,c1,c0,2 ; Read CPACR
MCR p15,0,<Rd>,c1,c0,2 ; Write CPACR
【register format】
Bits | Name | Content |
---|---|---|
31 | ASEDIS | Disables Advanced SIMD extension functionality. 0=No instructions are made undefined by this bit. 1=All instruction encodings that the Arm Architecture Reference Manual identifies as part of the Advanced SIMD extension but not as VFPv3 instructions are undefined. |
30 | D32DIS | Disables use of VFP registers D16 through D31. 0=No instructions are made undefined by this bit. 1=All instruction encodings that the Arm Architecture Reference Manual identifies as VFPv3 instructions are undefined if they access any of registers D16 through D31. |
23:22 | cp11 | Access permission for coprocessor CP11. 0b00=Access denied (any access raises an Undefined Instruction exception). 0b01=Privileged access only (a user mode access raises an Undefined Instruction exception). 0b10=Reserved. 0b11=Full access. |
21:20 | cp10 | Access permission for coprocessor CP10. 0b00=Access denied (any access raises an Undefined Instruction exception). 0b01=Privileged access only (a user mode access raises an Undefined Instruction exception). 0b10=Reserved. 0b11=Full access. |
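The cp10 and cp11 fields can be manipulated as in the following C sketch, which computes the new CPACR value from the old one. The enum and function names are hypothetical, and the helper is deliberately limited to the two coprocessors covered in this article.

```c
#include <assert.h>
#include <stdint.h>

/* Access permission encodings from the table above (0b10 is reserved). */
enum cp_access {
    CP_DENIED    = 0x0,
    CP_PRIV_ONLY = 0x1,
    CP_FULL      = 0x3
};

/* Hypothetical helper: return cpacr with the 2-bit access field of
   coprocessor cp (10 or 11) replaced by the requested permission. */
uint32_t cpacr_set_access(uint32_t cpacr, unsigned cp, enum cp_access access)
{
    unsigned shift;

    assert(cp == 10 || cp == 11);
    assert(access == CP_DENIED || access == CP_PRIV_ONLY || access == CP_FULL);

    shift = cp * 2;                      /* cp10 -> bits 21:20, cp11 -> bits 23:22 */
    cpacr &= ~(0x3u << shift);           /* clear the old permission */
    cpacr |= (uint32_t)access << shift;  /* set the new permission   */
    return cpacr;
}
```

Granting full access to both coprocessors starting from 0 gives 0x00F00000, the same bits that the sample program sets with `ORR r0,r0,#(0xF << 20)`.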
FPEXC (Floating Point Exception Register)
- FPEXC provides global enable/disable control of advanced SIMD and VFP extensions.
【Register Access Instructions】
VMRS{cond} Rd, extsysreg ; Read a NEON/VFP system register into an Arm register
VMSR{cond} extsysreg, Rd ; Write an Arm register to a NEON/VFP system register
【register format】
Bits | Name | Content |
---|---|---|
31 | EX | Exception bit. A status bit that indicates how much state must be saved to record the state of the Advanced SIMD and VFP system. 0=Only D0 to D15, D16 to D31 (if implemented), FPSCR, and FPEXC need to be saved. 1=There is additional state that context-switching code must handle. |
30 | EN | Enables or disables the Advanced SIMD and VFP extensions. 0=The Advanced SIMD and VFP extensions are disabled (default value at reset). 1=The Advanced SIMD and VFP extensions are enabled. |
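The sample program writes the constant 0x40000000 to FPEXC, which sets EN and clears every other bit. If you would rather keep the other bits as they are, a read-modify-write is one option. The following C sketch shows only the bit arithmetic; the macro and function names are hypothetical, and reading and writing the register itself still requires VMRS/VMSR.

```c
#include <stdint.h>

#define FPEXC_EX (1u << 31)  /* exception bit (see the table above) */
#define FPEXC_EN (1u << 30)  /* global enable for Advanced SIMD and VFP */

/* Hypothetical helper: return the FPEXC value with EN set, other bits kept. */
uint32_t fpexc_enable(uint32_t fpexc)
{
    return fpexc | FPEXC_EN;
}

/* Hypothetical helper: nonzero if the extensions are enabled. */
int fpexc_is_enabled(uint32_t fpexc)
{
    return (fpexc & FPEXC_EN) != 0;
}
```

Starting from the reset value 0, `fpexc_enable(0)` produces the same 0x40000000 that the sample program loads with MOV.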
Changing the vector address
The exception vector address can be changed by writing an address to VBAR (Vector Base Address Register).
If the V bit of SCTLR is 1, the vectors cannot be remapped with VBAR.
【sample program】
    ;===================================================================
    ; Use VBAR to change the vector address
    ;===================================================================
            LDR r0,=Vectors         ; Load the start address of the vector table
            MCR p15,0,r0,c12,c0,0   ; Write VBAR
VBAR (Vector Base Address Register)
- When the Security Extensions are implemented and the high exception vectors are not selected, VBAR holds the exception base address for exceptions that are not handled in monitor mode.
- The high exception vectors always have a base address of 0xFFFF_0000 and are not affected by VBAR.
- Can be read and written in privileged mode only.
【Register Access Instructions】
MRC p15,0,<Rd>,c12,c0,0 ; Read VBAR
MCR p15,0,<Rd>,c12,c0,0 ; Write VBAR
【register format】
Bits | Name | Content |
---|---|---|
31:5 | Vector base address | Sets the vector base address on a 32-byte boundary. |
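To make the interaction between SCTLR.V and VBAR concrete, here is a minimal C sketch that computes where the processor fetches a given exception vector. The function and enum names are hypothetical, and the vector offsets are the standard Arm exception vector offsets, which this article does not list. The sketch ignores monitor-mode exceptions, which VBAR does not cover.

```c
#include <stdint.h>

#define SCTLR_V (1u << 13)  /* 1 = high exception vectors selected */

/* Standard Arm exception vector offsets (assumed, not from this article). */
enum vector_offset {
    VEC_UNDEF          = 0x04,
    VEC_SVC            = 0x08,
    VEC_PREFETCH_ABORT = 0x0C,
    VEC_DATA_ABORT     = 0x10,
    VEC_IRQ            = 0x18,
    VEC_FIQ            = 0x1C
};

/* Hypothetical helper: address of the vector entry for one exception.
   With SCTLR.V = 1 the base is fixed at 0xFFFF0000 and VBAR is ignored. */
uint32_t vector_address(uint32_t sctlr, uint32_t vbar, enum vector_offset off)
{
    uint32_t base = (sctlr & SCTLR_V) ? 0xFFFF0000u : vbar;
    return base + (uint32_t)off;
}
```

For example, with VBAR set to 0x80000000 and SCTLR.V clear, an IRQ is taken at 0x80000018; with SCTLR.V set, the same IRQ is taken at 0xFFFF0018 regardless of VBAR.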
Enabling the caches, MMU, and branch prediction
The caches and the MMU (see Part 12) are enabled through SCTLR.
【sample program】
            MRC p15,0,r0,c1,c0,0        ; Read SCTLR
            ORR r0,r0,#(0x1 << 12)      ; Enable the instruction cache (SCTLR.I[12])
            ORR r0,r0,#(0x1 << 2)       ; Enable the data cache (SCTLR.C[2])
            ORR r0,r0,#(0x1 << 11)      ; Enable program flow prediction (SCTLR.Z[11])
            ORR r0,r0,#(0x1 << 0)       ; Enable the MMU (SCTLR.M[0])
            MCR p15,0,r0,c1,c0,0        ; Write SCTLR
    ;==================================================================
    ; Enable data prefetching
    ;==================================================================
            MRC p15,0,r0,c1,c0,1        ; Read ACTLR
            ORR r0,r0,#(0x1 << 2)       ; Enable data prefetching (ACTLR.DP[2])
            MCR p15,0,r0,c1,c0,1        ; Write ACTLR
SCTLR (System Control Register)
- Configures various processor settings such as the caches and the MMU.
- Can be read and written in privileged mode only.
【Register Access Instructions】
MRC p15,0,<Rd>,c1,c0,0 ; Read SCTLR
MCR p15,0,<Rd>,c1,c0,0 ; Write SCTLR
【register format】
Bits | Name | Access | Content |
---|---|---|---|
30 | TE | Banked | Thumb exception enable. |
29 | AFE | Banked | Access flag enable bit. |
28 | TRE | Banked | Controls the TEX remap feature in the MMU. |
27 | NMFI | Read-only | Non-maskable FIQ support. |
25 | EE | Banked | Determines the value of the CPSR.E bit on exception entry. 0=Little-endian. 1=Big-endian. |
17 | HA | - | RAZ/WI (reads as zero, writes ignored). |
14 | RR | Writable in Secure state only | Selects the replacement strategy for the cache, BTAC, and micro TLBs. 0=Random replacement (default value at reset). 1=Round-robin replacement. Read-only in Non-secure state. |
13 | V | Banked | Selects the base address of the exception vectors. 0=Normal exception vectors at 0x00000000. 1=High exception vectors at 0xFFFF0000 (cannot be remapped). |
12 | I | Banked | Enables the instruction cache. 0=Instruction cache disabled (default value at reset). 1=Instruction cache enabled. |
11 | Z | Banked | Enables program flow prediction. 0=Program flow prediction disabled (default value at reset). 1=Program flow prediction enabled. |
10 | SW | Banked | SWP/SWPB enable bit. 0=SWP and SWPB are undefined (default value at reset). 1=SWP and SWPB operate normally. |
2 | C | Banked | Enables the data cache. 0=Data cache disabled (default value at reset). 1=Data cache enabled. |
1 | A | Banked | Enables alignment fault checking for unaligned data accesses. 0=Alignment fault checking disabled (default value at reset). 1=Alignment fault checking enabled. |
0 | M | Banked | Enables the MMU. 0=MMU disabled (default value at reset). 1=MMU enabled. |
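The four ORR instructions in the sample program set bits I, C, Z, and M. As a quick cross-check of the combined mask, here is a C sketch using named constants. The macro and function names are hypothetical; only the bit positions are taken from the table above.

```c
#include <stdint.h>

#define SCTLR_M (1u << 0)   /* MMU enable                     */
#define SCTLR_C (1u << 2)   /* data cache enable              */
#define SCTLR_Z (1u << 11)  /* program flow prediction enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable       */

/* Hypothetical helper: return sctlr with the caches, branch prediction,
   and MMU enabled, leaving all other bits untouched. */
uint32_t sctlr_enable_caches_mmu(uint32_t sctlr)
{
    return sctlr | SCTLR_I | SCTLR_C | SCTLR_Z | SCTLR_M;
}
```

The combined mask is 0x1805. Because the helper only ORs bits in, other settings such as the V bit (bit 13) keep whatever value was read from the register.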
ACTLR (Auxiliary Control Register)
- Can be read and written in privileged mode only.
【Register Access Instructions】
MRC p15,0,<Rd>,c1,c0,1 ; Read ACTLR
MCR p15,0,<Rd>,c1,c0,1 ; Write ACTLR
【register format】
Bits | Name | Content |
---|---|---|
9 | Parity | Parity checking support (if implemented). 0=Disabled (default value at reset). 1=Enabled. |
8 | 1-Way Allocation | Enables allocation in one cache way only. 0=Disabled (default value at reset). |
7 | EXCL | Exclusive cache bit. 0=Disabled (default value at reset). 1=Enabled. |
6 | SMP | Indicates whether the Cortex-A9 processor takes part in coherency. |
3 | Write full line of zeros | Enables write full line of zeros mode. 0 at reset. |
2 | L1 prefetch enable | Data-side prefetching. 0=Disabled (default value at reset). 1=Enabled. |
1 | L2 prefetch hint enable | Enables prefetch hints. |
0 | FW | Cache and TLB maintenance broadcast. 0=Disabled (default value at reset). 1=Enabled. |
In this article, I focused on the coprocessor registers that need to be set up during initialization. In everyday programming you're unlikely to touch them (laughs), but some of these registers can help you track down the cause when an exception occurs, so I recommend reading the manual.