[CLEAN] Inactive website visitors to free DB space of ~35GB #128

@arnaudlayec

Context

The database is currently large: a .dump file generated by pg_dump is 22GB, and once restored it takes ~39GB on the PostgreSQL server.
This issue concerns only the DB, not the filestore, which is about 24GB uncompressed.

Problem description

While the size of the DB does not seem to have any impact on performance, cleaning it up would still be worthwhile.
The DB size is also a drag on collaboration between Internal Working Groups, as it can slow down developers' work:

  • sharing the DB takes a long time (upload/download over the Internet)
  • restoring the DB on a personal computer is sometimes not possible (the SSD can be too small), and backing it up can take a long time

Findings

Using the module database_size, we found that the table website_track is 22GB with 80M (million) indexed rows, and website_visitor is 13GB.
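
As a rough sanity check on these figures, the arithmetic below adds up the two tables and derives an average footprint per website_track row. It assumes decimal units (1 GB = 10**9 bytes), which is my assumption and not something database_size states; the variable names are only illustrative.

```python
# Figures quoted above; decimal units (1 GB = 10**9 bytes) are an assumption.
GB = 10**9
website_track_bytes = 22 * GB
website_visitor_bytes = 13 * GB
track_rows = 80_000_000

# Combined size of the two tables, i.e. an upper bound on what a cleanup could free.
combined_gb = (website_track_bytes + website_visitor_bytes) / GB

# Average reported size per website_track row.
avg_bytes_per_track_row = website_track_bytes / track_rows

print(f"Combined size of both tables: {combined_gb:.0f} GB")
print(f"Average size per website_track row: {avg_bytes_per_track_row:.0f} bytes")
```

The combined figure is where the ~35GB in the title comes from. It is an upper bound, since visitors that are still active would be kept.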

Moreover, a native CRON, "Website Visitor : clean inactive visitors", exists but is disabled.
It calls the Python method quoted below, which unlinks website.visitor records; the unlink then cascades to website.track.

    def _cron_unlink_old_visitors(self, batch_size=1000, limit=None):
        """ Unlink inactive visitors (see '_inactive_visitors_domain' for
        details).

        Visitors were previously archived but we came to the conclusion that
        archived visitors have very little value and bloat the database for no
        reason. """
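
To illustrate what the batch_size and limit arguments in that signature suggest, here is a minimal standalone sketch of batched deletion. It is not Odoo's implementation: unlink_in_batches and its unlink callback are hypothetical names, and the per-batch commit mentioned in the comment is an assumption.

```python
from itertools import islice


def unlink_in_batches(record_ids, unlink, batch_size=1000, limit=None):
    """Pass ids to `unlink` in chunks of `batch_size`, up to `limit` ids in total.

    Hypothetical helper: `unlink` is any callable that accepts a list of ids.
    Returns the number of ids handed to `unlink`.
    """
    # Cap the total number of ids when a limit is given.
    ids = iter(record_ids if limit is None else islice(record_ids, limit))
    done = 0
    while True:
        batch = list(islice(ids, batch_size))
        if not batch:
            break
        # A real job would presumably commit after each batch (assumption),
        # so that a long run on 80M rows does not hold one huge transaction.
        unlink(batch)
        done += len(batch)
    return done


# Usage: record the batches instead of deleting anything.
deleted_batches = []
total = unlink_in_batches(range(2500), deleted_batches.append, batch_size=1000)
print(total, [len(batch) for batch in deleted_batches])
```

Working in small batches matters here because the first run after re-enabling the CRON would have a very large backlog to clear.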

Proposal

By default, inactive visitors are those older than 60 days. This threshold can be controlled by the Global Parameter website.visitor.live.days, which is not currently set for the OCA.
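
A minimal sketch of how such a threshold translates into a cutoff date, assuming the parameter holds a number of days and falls back to 60 when unset. The function inactive_cutoff and its argument names are hypothetical, and the sketch does not cover which visitor field Odoo compares against the cutoff.

```python
from datetime import datetime, timedelta

DEFAULT_LIVE_DAYS = 60  # default quoted above


def inactive_cutoff(now, live_days_param=None):
    """Return the datetime before which a visitor would count as inactive.

    `live_days_param` stands in for the value of website.visitor.live.days
    (None when the parameter is not set, as is currently the case for the OCA).
    """
    days = int(live_days_param) if live_days_param is not None else DEFAULT_LIVE_DAYS
    return now - timedelta(days=days)


now = datetime(2025, 1, 1)
print(inactive_cutoff(now))         # parameter unset: 60-day default applies
print(inactive_cutoff(now, "365"))  # parameter set to keep one year of visitors
```

If some history is wanted for statistics, raising the parameter before enabling the CRON would keep a longer window while still clearing the oldest records.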

Unless there is a desire to run statistics / BI on this data, I suggest simply re-activating the native Odoo CRON as an easy way to clean up the database.

Next steps

  1. We should perhaps look further into OCA modules like database_size and database_cleanup, which could tell us more.
  2. We could also dig into why the filestore is 24GB, and whether this is legitimate.

Alongside my work on RFQ2 (membership process improvement), I can continue with this "quick-win" approach.
