[stu@server ~]$ ps -p 12720 -o pcpu,cutime,cstime,cmin_flt,cmaj_flt,rss,size,vsize
%CPU - - - - RSS SZ VSZ
28.8 - - - - 3262316 1333832 8725584
ps -p 7429 -o pcpu,cutime,cstime,cmin_flt,cmaj_flt,rss,size,vsize
As you can seem by the ps output (edited into your answer), the 3GB number is accurate. (I started graphing over time after noticing large numbers from top and ps on long running instances.) It is what surprised me--something seems wrong if my RSS is 5 times the heap size. Hence this SO question.
I noticed that you mentioned having hundreds to low thousands of threads. The stacks for these threads will also add to the RSS, although it shouldn't be much. Assuming that the threads have a shallow call depth (typical for app-server handler threads), each should only consume a page or two of physical memory, even though there's a half-meg commit charge for each.
In your case, I don't think it's an issue. The graph appears to show 3 meg consumed, not 3 gig as you write in the text. That's really small, and is unlikely to be causing paging.
Normally, VSZ is slightly > SZ, and very much > RSS. This output indicates a very unusual situation.
RSS represents pages that are actively in use -- for Java, it's primarily the live objects in the heap, and the internal data structures in the JVM. There's not much that you can do to reduce its size except use fewer objects or do less processing.
RSS represents the number of pages resident in RAM -- the pages that are actively accessed. With Java, the garbage collector will periodically walk the entire object graph. If this object graph occupies most of the heap space, then the collector will touch every page in the heap, requiring all of those pages to become memory-resident. The GC is very good about compacting the heap after each major collection, so if you're running with a partial heap, there most of the pages should not need to be in RAM.
RSS, as noted, is the resident set size: the pages in physical memory. SZ holds the number of pages writable by the process (the commit charge); the manpage describes this value as "very rough". VSZ holds the size of the virtual memory map for the process: writable pages plus shared pages.
Regarding the 3M RSS size - yeah, that seemed too low for a Tomcat process (I checked my box, and have one at 89M that hasn't been active for a while). However, I don't necessarily expect it to be > heap size, and I certainly don't expect it to be almost 5 times heap size (you use -Xmx640) -- it should at worst be heap size + some per-app constant.
So what else is happening in your system? Is it a situation where you have lots of Tomcat servers, each consuming 3M of RSS? You're throwing in a lot of GC flags, do they indicate the process is spending most of its time in GC? Do you have a database running on the same machine?
The GC times look OK. I continue to monitor them. Like I said, the io wait time is increasing. And I can see that the system file cache shrinks to a very small number, compared to when the JVM is not sucking up huge swaths of real memory.
The RSS value is 3GB, not 3MB. The graph is in Kilo Bytes. 3 mega of KB = 3GB. I will update question for clarity. (Besides, one would logically expect real memory to be > java heap size. 3MB is a fraction of 400MB.)
Which causes me to suspect your numbers. So, rather than a graph over time, please run the following to get a snapshot (replace 7429 by whatever process ID you're using):