1
0
Files
2022-09-29 17:59:04 +03:00
..
2022-09-29 17:59:04 +03:00
2022-09-29 17:59:04 +03:00
2022-09-29 17:59:04 +03:00
2022-09-29 17:59:04 +03:00
2022-09-29 17:59:04 +03:00

#ident "ide/godzilla/ecc/README: $Revision 1.1$"

			ECC testing of Heart-R10k
			=========================


This file documents the ecc_errgen and ecc_heartecc functions (ecc.c):

	----------		----------
	|	 |		|	 |
	|  R10k	 |		| Heart  |
	|	 |<------------>|        |<------> Memory
	|	 |		|	 |
	----------		----------

The system includes only the microprocessor, the external agent 
(Heart), and memory (part of level1 diags).


Using the SD/DB_ErrGen bits in HEART:
====================================
The tests are implemented by the ecc_errgen function.

Test 1:
    Hardware tested:
        a) Bad ECC generator from HEART to memory
        b) Single bit ECC reporting in R10k (if such reporting exists)
    Procedure:
        1. set SB_ErrGen in HEART, Enable ECC reporting in R10k
        2. Write memory
        3. read sync register (make sure write gets to memory)
        4. read memory (on read R10k should detect bad ECC)
	5. Inspect Bad Memory Data registers etc.

Test 2: (similar to test 1)
    Hardware tested:
        a) Bad ECC generator from HEART to memory
        b) Double bit ECC reporting in R10k (if such reporting exists)
    Procedure:
        1. set DB_ErrGen in HEART, Enable ECC reporting in R10k
        2. Write memory
        3. read sync register (make sure write gets to memory)
        4. read memory (on read R10k should detect bad ECC)
	5. Inspect Bad Memory Data registers etc.

Using the propagation of bad data through the processor (R10000 only):
======================================================================
The tests are implemented by the ecc_heartecc function.

Test 3 is NOT implemented. It is believed that SB error data
cannot propagate through R10k.
Test 4 is implemented: it does not use he interrupt but 
polls the exception cause register.
The "read sync register" function is missing. (XXX)

Test 3:
    Hardware tested:
        a) ECC checker/detector for processor requests in HEART
	b) Processor single-bit ECC error reporting in HEART
    Procedure:
	1) Put single-bit ECC-error'd data in a R10k register (see below) ***
        2) set GlobalECCen in HEART
        3) set CorSysERE in HEART, unmask approriate bit in HEART mask register
	4) enable ECC reporting in R10k
        5) write bad register to memory location
          (HEART should correct data and capture address)
        6) Wait for CorSysErr Interrupt
        7) service interrupt (includes inspecting PBErrMisc and PBErrSysAddr)
	8) read sync register (make sure write gets to memory)
        9) Read memory
          (on read there should be no error since the data was fixed on write)

	Repeat 1-8 except ...
	3) clear CorSysERE in HEART
	6) Poll CorSysErr bit in HEART Exception cause register

Two tests for double bit ECC
Test4: (similar to test 2)
    Hardware tested:
        a) ECC checker/detector for processor requests in HEART
	b) Processor ECC double-bit error reporting in HEART
    Procedure:
	1) Put double-bit ECC-error'd data in a R10k register (see below) ***
        2) set GlobalECCen in HEART
        3) set NCorSysERE in HEART, unmask approriate bit in HEART mask
register
	4) enable ECC reporting in R10k
        5) write bad register to memory location
          (HEART will discard the data but capture the address correctly)
        6) Wait for NCorSysErr Interrupt
        7) service interrupt (includes inspecting PBErrMisc and PBErrSysAddr)
	8) read sync register (make sure write gets to memory)
        9) Read memory
          (the data should the old memory data since the write was squashed)

	Repeat 1-8 except ...
	3) clear CorSysERE in HEART
	6) Poll CorSysErr bit in HEART Exception cause register

------------------------------------------------------------

*** To put bad ECC data in a R10k register
    1) set SB_ErrGen/DB_ErrGen in HEART, clear ECC mode in R10k
    2) Write memory
    3) read sync register
    4) Disable ECC in R10k so it does not correct the read response
    5) read memory. This puts single bit ECC error'd data in R10k register
    6) clear SB_ErrGen /DB_ErrGen in HEART