RFC: Keep a cache of frequently accessed files#517
RFC: Keep a cache of frequently accessed files#517nephros wants to merge 43 commits intosailfishos-patches:masterfrom
Conversation
take() gives ownership to the caller. object() does not.
- Try a Q_OBJECT macro for a template class... - Don't be a template class - Become Object-based, add contructor - Implement a constructor
|
One problem with evaluating this is getting a proper benchmark. Suggestions are highly welcome of a test scenario which can somewhat reliably So the whole thing is currently a "should theoretically improve things" thing. |
|
A comment about the logic. Without this change, the communication goes something like this:
With this change, the part of the Daemon becomes:
As the cache holds a list of unpatched files, it will continue to grow. we QCache also has logic to evaluate which entries are useful to keep in the This way, it should over time become a list of "hot", i.e. frequently accessed See the QCache docs about those internals. |
|
Thanks for the detailed write-up and for starting working on this! Indeed, I remember people having races at initialization that only happen when patchmanager is installed so this is indeed of interest for the whole system. In this phase, since I did not look at the code yet, my only questions are:
Thanks! |
The idea is to have a list of "hot false positives", i.e. files that are accessed by almost all processes (like libc, /etc/passwd, and so on) but are usually not patched, and return early (quickly, in minimal time) for those. And QCache with its auto-eviction logic (as well as a Bloom filter with its "maybe in set/definitely not in set" characteristic) isn't suitable for a list of patched files I think. Such a list would have to be persistent and managed by us.
Not at the moment, but this could (really should) be done, every time the "list of active patches" changes. |
|
Oh, and don't get scared by the long list of commits, the actual changes are not that many. |
b100dian
left a comment
There was a problem hiding this comment.
Again, thanks for trying out this in-memory-do-not-touch-the-storage cache.
I have a few comments on the code, and one suggestion for testing:
That is, a simple find /usr -type f -exec head -n1 {} \; could probably be run a couple of times with FSCache on and off to see if there are differences.
If /usr is too big for the cache then maybe some narrower path.
Also, I would be interested in the memory usage - since its a number of hashed keys times one allocated object - what would that mean to fragmentation in the end (that last part) ?
| payload = fakePath.toLatin1(); | ||
| bool passAsIs = true; | ||
| if ( | ||
| (!m_failed) // return unaltered for failed |
There was a problem hiding this comment.
Oh boy, looking up where m_failed is set reminds me we should rewrite patchmanager in a simpler way :))
Lipstick crash? Journal match ? 😮
| insert(entry, new QObject(), HOTCACHE_COST_STRONG); | ||
| } | ||
| } | ||
| // they may be wrong, so use a higher cost than default |
There was a problem hiding this comment.
Not sure I follow why do we pre-load a set of sonames if we're going to make them be the first out anyways.
| QString list; | ||
| list + "Filter Stats:" | ||
| + "\n===========================" | ||
| + "\n Hotcache entries:: .............." + size() |
There was a problem hiding this comment.
really curious some stats after a booted system and after some days of usage, how do they look like ?
There was a problem hiding this comment.
Statistics have been approved a bit in the latest commit, and you can now trigger logging (and a string reply) from DBus:
busctl call org.SfietKonstantin.patchmanager /org/SfietKonstantin/patchmanager org.SfietKonstantin.patchmanager statistics| xargs echo -e
Example output:
Patchmanager version: 3.2.12+obs.20251103104027.132.g5670c8e
Applied Patches: 20
Patched files: 97
Watched files: 50
Filter Stats:
===========================
Hotcache entries:: ..............126
Hotcache cost: ..................259/5000
Hotcache hit/miss: ..............1284/106 (0.00%)
===========================
Hotcache top entries: ...........
/usr/share/patchmanager/patches/patch-keyboard-no-vibration/patch.json
/usr/share/patchmanager/patches/patch-named-sims/patch.json
/usr/share/patchmanager/patches/patch-email-headers/unified_diff.patch
/usr/share/patchmanager/patches/patch-icon-filter/patch.json
/etc/passwd
/usr/lib64/libblkid.so.1.1.0
There was a problem hiding this comment.
Cool! what happens if you run find / -type f -exec grep something {} \; with the cache?
(honest question, I do this all the times ;D)
There was a problem hiding this comment.
(I should make my own build but I am away from any personal computer for the week)
There was a problem hiding this comment.
Cool! what happens if you run
find / -type f -exec grep something {} \;with the cache?(honest question, I do this all the times ;D)
You need plocate ;)
So it goes to something like this.
busctl call "org.SfietKonstantin.patchmanager" /org/SfietKonstantin/patc
hmanager org.SfietKonstantin.patchmanager statistics b true | xargs echo -e|head
s Patchmanager version: 3.2.12+obs.20251103192017.134.g49ef3e1
Applied Patches: 20
Patched files: 97
Watched files: 50
Filter Stats:
===========================
Hotcache entries:: ..............1512
Hotcache cost: ..................3033/5000
Hotcache hit/miss: ..............59162/1506 (0.98%)
===========================
There was a problem hiding this comment.
Btw, don't actually do find /, that can crash or hang the device (because of stuff in /proc or /sys or so).
There was a problem hiding this comment.
By the way, you MUST run find as non-root, otherwise the preload lib will do nothing :)
Co-authored-by: Vlad G. <vlad@grecescu.net>
e94e2e6 to
ec2aae2
Compare
| bool insert(const QString &key, quint8 value = 1, int cost = HOTCACHE_COST_DEFAULT); | ||
|
|
||
| // override QCache::contains() | ||
| bool contains(const QString &key) const; |
There was a problem hiding this comment.
Is our compiler so old? Pretty sure override keyword was working
| class PatchManagerFilter : public QObject, public QCache<QString, quint8> | ||
| { | ||
| Q_OBJECT | ||
| Q_PROPERTY(bool active READ active WRITE setActive NOTIFY activeChanged) |
There was a problem hiding this comment.
this looks a little bit over specified compared to the others ;) Maybe its needed
|
Note-to-self: The Settings switch currently isn't capable of disabling the cache on-the-fly. This should be fixed, if only to compare behaviour with and without the cache. |
RFC enhancement: Have the daemon keep track of files which are asked about by the preload lib, but are not patched.
This is the overwhelming majority of requests in normal runtime scenarios.
The idea is that calls to QFileInfo::exists() are minimized, and file names or unpatched files returned as soon as possible.
(Open question: Can we be smarter than the QFileInfo implementation and the kernel fs cacche? ;) )
The original idea was presented in
#431 (comment)
which suggests a Bloom filter to handle these file paths.
I experimented with bloom filters, but found the resulting logic too hard for
my small brain to handle.
So this implementation abuses Qt's QCache,
but uses a class which means the actual cache could be replaced with something better later.
@b100dian if you have the time, maybe look over the idea and review it.