Discussion:
cache invalidate in user space
m***@yahoo.com
2006-03-15 16:11:50 UTC
Does anybody know how I can invalidate the contents of the cache from
user space (that is, from a user process/thread)? Thanks.
Anton Ertl
2006-03-15 17:48:31 UTC
There are semi-architected instructions called dcb* and icb* that deal
with cache lines. E.g., here's how I ensure that the I-cache does not
contain stale lines:

#include <stddef.h>
#include <sys/types.h>

/* the name is from an AIX (4.3) call (thanks to Dan Prener
   <***@watson.ibm.com> for this information) */
void _sync_cache_range(caddr_t addr, size_t size)
{
    size_t cache_block_size=32;
    /* round addr down to the start of its cache block */
    caddr_t p=(caddr_t)(((long)addr)&-cache_block_size);

    /* this works for a single-processor PPC 604e, but may have
       portability problems for other machines; the ultimate solution is
       a system call, because the architecture is pretty shoddy in this
       area */
    /* the "memory" clobber keeps the compiler from moving stores past
       the cache instructions */
    for (; p < (addr+size); p+=cache_block_size)
        asm volatile("dcbst 0,%0\n sync\n icbi 0,%0"::"r"(p):"memory");
    asm volatile("sync\n isync":::"memory"); /* PPC 604e needs the
                                                additional sync according
                                                to Tim Olson */
}

The sync between the dcbst and the icbi ensures that the I-cache is
not reloaded from memory before the D-cache has stored its data to
memory. For larger blocks, it is faster to do all the dcbsts, then
one sync, then all the icbis.

Despite the comment about the non-portability, this seems to work on
all PPCs I have tried.

- anton
--
M. Anton Ertl Some things have to be seen to be believed
***@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html
m***@yahoo.com
2006-03-16 15:46:22 UTC
If I use malloc or posix_memalign, do I have to translate the address
from virtual to physical before calling the cache function, and if so,
how? Thanks.
Anton Ertl
2006-03-16 16:58:16 UTC
No, the cache-control instructions work with virtual addresses.
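
To illustrate the point, here is a minimal sketch in which the pointer
returned by posix_memalign is passed to the cache instructions unchanged.
The names flush_line_range and fill_and_sync are hypothetical, the 32-byte
block size is the value used earlier in the thread, and the non-PowerPC
branch (the GCC/Clang builtin __builtin___clear_cache) is my assumption,
there only so the sketch compiles elsewhere.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign */
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: per-block dcbst/sync/icbi, as in the loop posted
   earlier in the thread. Takes an ordinary (virtual) pointer. */
static void flush_line_range(char *addr, size_t size)
{
#if defined(__powerpc__) || defined(__ppc__)
    const uintptr_t block = 32;   /* assumed block size, from the thread */
    uintptr_t p = (uintptr_t)addr & ~(block - 1);
    uintptr_t end = (uintptr_t)addr + size;

    for (; p < end; p += block)
        __asm__ volatile("dcbst 0,%0\n\tsync\n\ticbi 0,%0"
                         : : "r"(p) : "memory");
    __asm__ volatile("sync\n\tisync" : : : "memory");
#else
    /* Assumption: fallback for other targets, not part of the thread. */
    __builtin___clear_cache(addr, addr + size);
#endif
}

/* Hypothetical example: allocate, write, flush. Returns 0 on success. */
static int fill_and_sync(size_t size)
{
    void *buf = NULL;

    if (posix_memalign(&buf, 32, size) != 0)
        return -1;
    memset(buf, 0x60, size);              /* stand-in for generated code */
    flush_line_range((char *)buf, size);  /* no address translation */
    free(buf);
    return 0;
}
```

The call site never sees a physical address; the MMU translates the
effective address for dcbst and icbi just as it does for loads and stores.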

- anton
Mikael Pettersson
2006-03-16 15:20:30 UTC
Post by Anton Ertl
The sync between the dcbst and the icbi ensures that the I-cache is
not reloaded from memory before the D-cache has stored its data to
memory. For larger blocks, it is faster to do all the dcbsts, then
one sync, then all the icbis.
Despite the comment about the non-portability, this seems to work on
all PPCs I have tried.
The version I wrote (for use in the runtime system for the HiPE JIT
compiler for Erlang/OTP) does the dcbsts first, then a sync, then the
icbis, and finally a sync;isync. It's been known to work on 603ev, 750,
G4s, and a POWER4.

I suspect only the weird embedded chips might need something else.
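
For readers who want to see that ordering spelled out, here is a rough
sketch of it: all the dcbsts, one sync, all the icbis, then sync;isync.
This is not the HiPE code; the name sync_icache_batched, the line_size
parameter, and the returned block count are hypothetical, and the
non-PowerPC branch (the GCC/Clang builtin __builtin___clear_cache) is an
assumption added only so the sketch compiles on other targets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: make [addr, addr+size) coherent for instruction
   fetch. line_size must be a power of two; 32 is the value used earlier
   in the thread. A stride smaller than the real block size only causes
   redundant operations. Returns the number of blocks covered. */
static size_t sync_icache_batched(void *addr, size_t size, size_t line_size)
{
    uintptr_t start = (uintptr_t)addr & ~(uintptr_t)(line_size - 1);
    uintptr_t end = (uintptr_t)addr + size;
    uintptr_t p;
    size_t blocks = 0;

    if (size == 0)
        return 0;

#if defined(__powerpc__) || defined(__ppc__)
    /* 1. push every modified data-cache block out to memory */
    for (p = start; p < end; p += line_size)
        __asm__ volatile("dcbst 0,%0" : : "r"(p) : "memory");
    /* 2. one sync: wait until all of those stores have completed */
    __asm__ volatile("sync" : : : "memory");
    /* 3. invalidate the matching instruction-cache blocks */
    for (p = start; p < end; p += line_size)
        __asm__ volatile("icbi 0,%0" : : "r"(p) : "memory");
    /* 4. wait for the invalidations, then discard prefetched instructions */
    __asm__ volatile("sync\n\tisync" : : : "memory");
#else
    /* Assumption: fallback for other targets, not part of the thread. */
    __builtin___clear_cache((char *)addr, (char *)addr + size);
#endif

    for (p = start; p < end; p += line_size)
        blocks++;
    return blocks;
}
```

Compared with the per-block loop, this issues one sync for the whole range
instead of one per block, which is where the saving on larger ranges comes
from.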
--
Mikael Pettersson (***@csd.uu.se)
Computing Science Department, Uppsala University