11

If I have a pointer and I care about memory access performance, I would like to check whether the next operation on it will trigger a page fault. If it will, an algorithm could reorder its loop operations to minimize page faults.

Is there any portable (or non-portable Linux/Windows) way to check whether accessing a particular memory address will trigger a page fault?

AndresR
  • Why don't you just use this algorithm all the time? It would also presumably maximize cache usage, which would still improve performance. – Cody Gray - on strike Jun 11 '16 at 13:47
  • Definitely no portable way, there's nothing for this in the C++ library. And very unlikely in practice; the necessary structures that the operating system kernel uses to manage virtual memory must, of course, be in protected kernel space. I don't immediately see an obvious security issue with read-only access; however this is such an esoteric chunk of data, it's unlikely that any OS will expend any effort to expose this data. – Sam Varshavchik Jun 11 '16 at 13:48
  • [There is in Windows and you don't want to use it](https://blogs.msdn.microsoft.com/oldnewthing/20060927-07/?p=29563/). – nwp Jun 11 '16 at 13:50
  • No, that Windows function does not check if accessing the pointer will trigger a page fault; it checks whether the pointer is pointing to a mapped page, or not. The two are not the same. – Sam Varshavchik Jun 11 '16 at 13:53
  • @CodyGray I want to design such an algorithm, but I need to know if the page I'm accessing is in physical memory. That is the question here, _Is that possible?_ – AndresR Jun 11 '16 at 14:07
  • @SamVarshavchik I can see how the data needs to be protected, but I also see possible benefits of knowing that information. I thought that maybe some OS had that implemented, at least for the current process's memory. – AndresR Jun 11 '16 at 14:09
  • To paraphrase Cody Gray's suggestion, why don't you implement an algorithm that will always prefetch data ahead of time? Isn't that the most straightforward way to deal with this situation? (This isn't portable either, but at least it is possible.) – IInspectable Jun 11 '16 at 14:14
  • The most basic issue is that, if such a check existed, its promise would be quite worthless. The page could be unmapped a nanosecond later. If it matters, then you just make sure it is never unmapped: mlock() in Linux and VirtualLock() in Windows. – Hans Passant Jun 11 '16 at 15:11
  • To add to @HansPassant's post, that's a race condition. The "test and next-operation" would have to be an atomic operation - and that would give quite a performance hit. – cdarke Jun 11 '16 at 16:08
  • @cdarke Well, as only performance is to be lost, there is no actual need for atomic operation enforcement, I guess (especially if that defeats the initial purpose of the optimization). – AndresR Jun 11 '16 at 17:03
  • Exactly. @hanspassant is certainly correct that there are no guarantees, but it is an optimization, not a precondition, so it only has to be correct most of the time. However, polling for status is much higher overhead than notification, and frequent kernel calls could well be more costly than the benefit. See my answer for another approach. – rici Jun 11 '16 at 17:46
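
The locking approach mentioned in the comments above (`mlock()` on Linux) can be sketched roughly as follows. This is only a minimal sketch: the helper names `pin_in_ram` and `unpin` are made up for illustration, and the call can fail when the process's locked-memory limit (`RLIMIT_MEMLOCK`) is too low, so the return value has to be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: ask the kernel to keep [buf, buf + len) resident.
 * Returns 0 on success, -1 on failure with errno set (commonly ENOMEM or
 * EPERM when the locked-memory limit is too low). */
int pin_in_ram(const void *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return mlock(buf, len);
}

/* Hypothetical helper: release the lock so the pages may be paged out again. */
int unpin(const void *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return munlock(buf, len);
}
```

Locking only makes sense for a working set that comfortably fits in RAM; pinning large regions takes memory away from everything else on the machine.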

4 Answers

9

About ten years ago, Emery Berger proposed a VM-aware garbage collection strategy which required the application to know which pages were present in memory. For testing purposes, he and his students produced a kernel patch which notified the application of paging events using real-time signals, allowing the garbage collector to maintain its own database of resident pages. (Although that seems like duplication of effort, it is a lot more efficient than multiple system calls in order to obtain information every time it is needed.)

You can find information about this interesting research on his research page.

As far as I know, there is no implementation of this patch for a recent Linux kernel, but it would always be possible to resurrect it.

rici
  • This is really interesting! I wonder if the results they claim are actually that impressive: _"By performing in-memory garbage collections, BC can speed up Java programs by orders of magnitude (up to 41X)."_ In that case I can see no reason why Linux development didn't add this functionality to recent kernels. Do you think there is some security concern? – AndresR Jun 11 '16 at 15:59
  • No, I don't think there was ever any security concern. I have an opinion about why the idea wasn't accepted, but suffice it to say that no one on the Linux team was sufficiently motivated to champion the idea. – rici Jun 11 '16 at 17:40
  • It speeds up 10-year-old Java programs that used an ineffective garbage-collection strategy. I'm not sure how that is applicable to C or C++, or any non-garbage collected language, for that matter. (Not to take away from the answer though; very interesting historical information.) – Cody Gray - on strike Jun 12 '16 at 04:59
  • @CodyGray: C and C++ **are** garbage collected languages. [Everybody thinks about garbage collection the wrong way](https://blogs.msdn.microsoft.com/oldnewthing/20100809-00/?p=13203): *"Garbage collection is simulating a computer with an infinite amount of memory."* Your point is still valid, though. – IInspectable Jun 12 '16 at 13:52
  • @IInspectable C and C++ don't actually do that, though. A virtual memory subsystem might, but that would be a feature of the operating system, not the language. I understand the point Raymond is making, certainly, and have said precisely the same thing many times before about finalizers, but I don't think it is reasonable to say that a "null garbage collector" (i.e., the absence of a garbage collector) is a type of garbage collector and therefore C and C++ are garbage collected because they have a null garbage collector. :-) – Cody Gray - on strike Jun 12 '16 at 15:01
  • @CodyGray: In C and C++ it is usually the heap manager that simulates an infinite amount of memory (in collaboration with the OS). This is not a *"null garbage collector"*. It's just one, that is always explicitly triggered (by calls to `free`/`delete`/etc.). – IInspectable Jun 12 '16 at 15:27
5

On Linux there is a mechanism, see man proc:

/proc/[pid]/pagemap This file shows the mapping of each of the process's virtual pages into physical page frames or swap area. It contains one 64-bit value for each virtual page, with the bits set as follows:

  • 63 If set, the page is present in RAM.
  • 62 If set, the page is in swap space
  • ...

For example,

$ sudo hexdump -e '/0 "%08_ax "' -e '/8 "%016X" "\n"' /proc/self/pagemap 
00000000 0600000000000000
*
00002000 A6000000000032FE
00002008 A60000000007F3A6
00002010 A600000000094560
00002018 A60000000008D0C0
00002020 A60000000009EBE6
00002028 A6000000000C8E87
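
As a rough sketch of how a program could look up a single address itself, the following reads the one 64-bit entry for the page and tests bit 63. The function name `page_is_present` is made up for illustration, and the code assumes a Linux system where the process is allowed to read its own `/proc/self/pagemap`. The result is only a snapshot, since the page may be evicted right after the check.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the page holding addr is present in RAM,
 * 0 if it is not, and -1 if /proc/self/pagemap cannot be read (for example
 * on a hardened kernel or a non-Linux system). */
int page_is_present(const void *addr)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);

    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;

    /* One 64-bit entry per virtual page, indexed by virtual page number. */
    uint64_t entry = 0;
    off_t offset = (off_t)((uintptr_t)addr / (uintptr_t)page_size)
                   * (off_t)sizeof entry;
    ssize_t n = pread(fd, &entry, sizeof entry, offset);
    close(fd);
    if (n != (ssize_t)sizeof entry)
        return -1;

    return (int)((entry >> 63) & 1);  /* bit 63: page present in RAM */
}
```

Opening the file on every call keeps the sketch short; code that polls many addresses would keep the descriptor open and read a range of entries in one `pread`.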
meuh
  • This looks really promising. I wonder how quick it would be to keep polling this in order to make decisions. Is it accessible without root access? – AndresR Jun 11 '16 at 17:06
  • I'm not sure you'd want to use this mechanism to do what you wanted; I just listed it for completeness. You seem to need to be root: even though the proc file seems to be owned by me, if I try to open it I get the error code `Operation not permitted`. – meuh Jun 11 '16 at 18:16
  • On my desktop, non-root users/processes can access their own `/proc/self/pagemap`. (mode=0600, and I did actually try with hexdump). There's probably a kernel setting for it. – Peter Cordes Sep 07 '18 at 17:54
  • @Peter I think it is standard that you can access your own pagemap and maybe only some types of specially hardened systems can't. – BeeOnRope Sep 08 '18 at 22:04
5

I wrote the page-info library to do this on Linux. It uses the pagemap file under the covers, so it won't be portable to other OSes.

Some information is restricted to root users, but you should be able to get the information about page presence (whether it is in RAM or not) without being root. Quoting from the README:

So [as a non-root user] you can determine if a page is present, swapped out, its soft-dirty status, whether it is exclusive and whether it is a file mapping, but not much more. On older kernels, you can also get the physical frame number (the pfn) field, which is essentially the physical address of the page (shifted right by 12).

The performance isn't exactly optimized for querying large ranges, as it does a separate read for each page, but a PR to improve this would be gratefully accepted.

BeeOnRope
3

No. There is no portable way to check whether a given address is currently in physical memory or swapped out. In fact, I don't think either Linux or Windows offers a facility to check this in a non-portable way. (Of course, in Linux you could write it yourself.)

As others have said in the comments, you also want to check whether the data is in cache or not (access from physical memory is much slower than from cache).

Your best bet is to reorder the loop to minimize page faults (that is, to maximize locality of reference) anyway.
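
To illustrate what reordering for locality can look like, here is a small sketch with two traversals of the same row-major matrix. The function names are made up for illustration. Both return the same sum, but the first walks memory sequentially, while the second jumps `cols` elements per step, so for large matrices consecutive accesses may land on different pages.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a rows x cols matrix stored in row-major order, walking memory
 * sequentially so each page is finished before the next one is touched. */
long sum_row_order(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    long total = 0;
    for (size_t r = 0; r < rows; ++r)
        for (size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

/* Same result, but the inner loop strides by cols elements, so with
 * large rows each access may touch a different page. */
long sum_column_order(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    long total = 0;
    for (size_t c = 0; c < cols; ++c)
        for (size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

How much the order matters depends on the matrix size relative to the cache and to available RAM, so it is worth measuring on the actual workload.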

  • Related: [Is it possible to “abort” when loading a register from memory rather the triggering a page fault?](https://stackoverflow.com/q/52221575). Other than a transactional-memory abort, x86 asm doesn't have a *non*-portable way to query the current HW page table except by attempting a load and having it page-fault. I'm not aware of other ISAs having an instruction that just checks the TLB or page table either, but it would be possible (and nice for writing a VM-aware GC that defers looking at data in unmapped pages until it's done with data in mapped pages). – Peter Cordes Sep 07 '18 at 18:00