This wiki is obsolete, see the NorduGrid web pages for up to date information.

Controldir


Problem: controldir processing is slow when the directory holds many files (more than ~50k)

Benchmarking is required. Check notes on performance link?

It was decided that within the project there is room only for small optimizations, not for big changes.

Solution

Several solutions have been proposed:

1. Aleksandr: create a file structure that follows the ID name

  • PRO:
    • should improve performance by about 1/3 when reading directories with lots of files
  • CONS:
    • not easy for scanning tools, needs an index file
    • not easy for a sysadmin to inspect the controldir
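
A minimal sketch of what proposal 1 could look like, assuming the job ID is split into fixed-width chunks that become nested directories. The function name `sharded_path`, the chunk `width`, the nesting `depth`, and the example paths are all hypothetical and not taken from the actual implementation:

```python
from pathlib import Path


def sharded_path(controldir: str, job_id: str, suffix: str,
                 width: int = 3, depth: int = 3) -> Path:
    """Map a job ID to a nested path so that no single directory grows huge.

    Hypothetical layout: the first `depth` chunks of `width` characters
    become directory levels, the rest of the ID becomes the last directory.
    """
    if len(job_id) <= width * depth:
        raise ValueError("job ID too short for the requested sharding")
    # Leading chunks of the ID become intermediate directories.
    parts = [job_id[i * width:(i + 1) * width] for i in range(depth)]
    # Whatever is left of the ID names the per-job directory.
    rest = job_id[width * depth:]
    return Path(controldir).joinpath(*parts, rest, suffix)
```

With such a layout, a scanning tool can no longer list one flat directory, which is why the index file mentioned in the CONS would be needed.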

2. Andrii: try to stat files before opening

  • PRO: might increase performance
  • CONS: must hardcode a fixed time window for deciding whether a file has changed (e.g. 5 mins?); might not be flexible enough
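
One possible reading of proposal 2, sketched under the assumption that a file is only opened when its modification time falls inside a fixed window. The helper name `read_if_recent` and the default of 300 seconds are illustrative choices, not part of the existing code:

```python
import os
import time


def read_if_recent(path, window_s=300.0, now=None):
    """Return the file content only if it was modified within `window_s` seconds.

    A single stat() call is much cheaper than open()+read(), so files that
    have not changed recently are skipped. Returns None for skipped files.
    """
    now = time.time() if now is None else now
    st = os.stat(path)
    if now - st.st_mtime > window_s:
        return None  # assumed unchanged: do not open the file
    with open(path, "rb") as fh:
        return fh.read()
```

The `window_s` parameter is exactly the hardcoded value the CONS warns about: too small and changes are missed, too large and nothing is saved.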

3. Andrii: Try to rerun the test by mounting the controldir with noatime

  • PRO: might discover that setting atime is the bottleneck
  • CONS: remounting the controldir is not always possible. Maybe using bind mounts?
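
Before re-running the test for proposal 3, it may help to check whether the filesystem holding the controldir is already mounted with noatime. The sketch below parses text in the Linux `/proc/mounts` format; the function name `mount_options_for` and the example paths are hypothetical, and mount points containing escaped spaces are not handled:

```python
def mount_options_for(path, mounts_text):
    """Return the option set of the mount that contains `path`.

    `mounts_text` is text in /proc/mounts format:
    device mount_point fstype options dump pass
    The longest matching mount point wins; on ties the later line wins.
    """
    best_len, best_opts = -1, set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        mount_point, opts = fields[1], fields[3]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if len(mount_point) >= best_len:
                best_len, best_opts = len(mount_point), set(opts.split(","))
    return best_opts
```

On a Linux host this could be called as `mount_options_for("/srv/controldir", open("/proc/mounts").read())` and the result checked for `"noatime"`.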

4. Iegven: try to use a FUSE driver for databases: https://git.kernel.org/cgit/fs/fuse/dbfs.git/

  • PRO:
    • might help testing of database
    • transparent to current system, no need to rework code
  • CONS:
    • requires experimenting with some database as a backend; how to streamline deployment is also unclear

Tasks

Florido will re-test infoprovider performance using solutions 2 and 3.

Depending on the results, other solutions might be tested.

Blocker: it would be good to have the performance suite available to run those tests.
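
Until the performance suite is available, a rough micro-benchmark along these lines could give a first impression of how directory size affects scanning time. This is only a sketch: the file naming pattern and the helper names `make_files` and `time_scan` are made up for illustration and do not reproduce the real controldir contents or the infoprovider's access pattern:

```python
import os
import time


def make_files(directory, n):
    """Create n empty files with a hypothetical job-like naming pattern."""
    for i in range(n):
        with open(os.path.join(directory, f"job.{i}.status"), "w"):
            pass


def time_scan(directory):
    """Scan a directory, stat every regular file, and time the whole pass.

    Returns (number_of_files, elapsed_seconds).
    """
    start = time.perf_counter()
    count = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                entry.stat()
                count += 1
    return count, time.perf_counter() - start
```

Running `make_files` with, say, 1000, 10000 and 50000 files in fresh temporary directories and comparing the `time_scan` results on mounts with and without noatime would be one way to approach the re-test.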