Performance & Profiling script¶
Note
The performance script now has its own Sphinx extension: Sphinx-Performance.
This extension is based on the script described here, but is more powerful and better maintained. It can also be used for general performance analysis of Sphinx and its extensions.
The performance of Sphinx-Needs can be tested with a script called performance_test.py, located in the /performance folder of the checked-out GitHub repository.
The performance can be tested with different numbers of needs, needtables and dummies (simple rst code that is not related to Sphinx-Needs).
Test series¶
To start a series of tests with some predefined values, run python performance_test.py series
Running 8 test configurations.
* Running on 5 pages with 50 needs, 5 needtables, 5 dummies per page. Using 1 cores.
Duration: 8.05 seconds
* Running on 5 pages with 50 needs, 5 needtables, 5 dummies per page. Using 4 cores.
Duration: 6.75 seconds
* Running on 1 pages with 50 needs, 5 needtables, 5 dummies per page. Using 1 cores.
Duration: 2.16 seconds
* Running on 1 pages with 50 needs, 5 needtables, 5 dummies per page. Using 4 cores.
Duration: 2.36 seconds
* Running on 5 pages with 10 needs, 1 needtables, 1 dummies per page. Using 1 cores.
Duration: 2.39 seconds
* Running on 5 pages with 10 needs, 1 needtables, 1 dummies per page. Using 4 cores.
Duration: 2.34 seconds
* Running on 1 pages with 10 needs, 1 needtables, 1 dummies per page. Using 1 cores.
Duration: 1.69 seconds
* Running on 1 pages with 10 needs, 1 needtables, 1 dummies per page. Using 4 cores.
Duration: 1.70 seconds
RESULTS:
  runtime      pages       needs      needs    needtables    dummies    parallel
  seconds    overall    per page    overall       overall    overall       cores
---------  ---------  ----------  ---------  ------------  ---------  ----------
     8.05          5          50        250            25         25           1
     6.75          5          50        250            25         25           4
     2.16          1          50         50             5          5           1
     2.36          1          50         50             5          5           4
     2.39          5          10         50             5          5           1
     2.34          5          10         50             5          5           4
     1.69          1          10         10             1          1           1
      1.7          1          10         10             1          1           4
Overall runtime: 27.45 seconds.
You can also modify the details and fix some values by setting various parameters.
Run python performance_test.py series --help to get an overview:
Usage: performance_test.py series [OPTIONS]

  Generate and start a series of tests.

Options:
  --profile TEXT        Activates profiling for given area
  --needs INTEGER       Number of maximum needs.
  --needtables INTEGER  Number of maximum needtables.
  --dummies INTEGER     Number of standard rst dummies.
  --pages INTEGER       Number of additional pages with needs.
  --parallel INTEGER    Number of parallel processes to use. Same as -j for
                        sphinx-build
  --keep                Keeps the temporary src and build folders
  --browser             Opens the project in your browser
  --snakeviz            Opens snakeviz view for measured profiles in browser
  --debug               Prints more information, incl. sphinx build output
  --basic               Use only default config of Sphinx-Needs (e.g. no extra
                        options)
  --help                Show this message and exit.
If --needs, --pages or --parallel is given multiple times, one performance test is executed for each combination of the given values.
Example: python performance_test.py series --needs 1 --needs 10 --pages 1 --pages 10 --parallel 1 --parallel 4 --needtables 0 --dummies 0
This sets 2 values for needs, 2 for pages and 2 for parallel, so in the end 8 test configurations are run (2 needs x 2 pages x 2 parallel = 8).
Running 8 test configurations.
* Running on 1 pages with 1 needs, 0 needtables, 0 dummies per page. Using 1 cores.
Duration: 1.53 seconds
* Running on 1 pages with 1 needs, 0 needtables, 0 dummies per page. Using 4 cores.
Duration: 1.64 seconds
* Running on 10 pages with 1 needs, 0 needtables, 0 dummies per page. Using 1 cores.
Duration: 1.96 seconds
* Running on 10 pages with 1 needs, 0 needtables, 0 dummies per page. Using 4 cores.
Duration: 2.01 seconds
* Running on 1 pages with 10 needs, 0 needtables, 0 dummies per page. Using 1 cores.
Duration: 1.91 seconds
* Running on 1 pages with 10 needs, 0 needtables, 0 dummies per page. Using 4 cores.
Duration: 1.93 seconds
* Running on 10 pages with 10 needs, 0 needtables, 0 dummies per page. Using 1 cores.
Duration: 2.94 seconds
* Running on 10 pages with 10 needs, 0 needtables, 0 dummies per page. Using 4 cores.
Duration: 2.48 seconds
RESULTS:
  runtime      pages       needs      needs    needtables    dummies    parallel
  seconds    overall    per page    overall       overall    overall       cores
---------  ---------  ----------  ---------  ------------  ---------  ----------
     1.53          1           1          1             0          0           1
     1.64          1           1          1             0          0           4
     1.96         10           1         10             0          0           1
     2.01         10           1         10             0          0           4
     1.91          1          10         10             0          0           1
     1.93          1          10         10             0          0           4
     2.94         10          10        100             0          0           1
     2.48         10          10        100             0          0           4
Overall runtime: 16.41 seconds.
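The expansion of repeated options into test configurations can be sketched as a Cartesian product. This is only an illustration of the 2 x 2 x 2 arithmetic, not the code of performance_test.py; the function name expand_configs and the dictionary keys are assumptions made for this example.

```python
from itertools import product


def expand_configs(needs, pages, parallel):
    """Return one configuration per combination of the given values.

    Hypothetical helper for illustration; performance_test.py may
    build its configurations differently.
    """
    configs = []
    for n, p, j in product(needs, pages, parallel):
        configs.append(
            {
                "needs_per_page": n,
                "pages": p,
                "parallel": j,
                # The "needs overall" column is pages * needs per page.
                "needs_overall": n * p,
            }
        )
    return configs


configs = expand_configs(needs=[1, 10], pages=[1, 10], parallel=[1, 4])
print(f"Running {len(configs)} test configurations.")
for config in configs:
    print(config)
```

With the values from the example command this yields 8 configurations, in the same order as the runs listed above.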
Parallel execution¶
New in version 0.7.1.
As you may have noticed, parallel execution on multiple cores can lower the runtime.
Parallel execution uses the -j option of sphinx-build. It mostly pays off when dozens or hundreds of files need to be read and written. In that case Sphinx starts several workers to process these files in parallel.
If the project contains only a few files, the benefit is not really measurable.
Here is an example of a 500-page project, built once on 1 core and once on 8 cores. Using 8 cores reduces the build time by ~40%.
  runtime s    pages #    needs per page    needs #    needtables #    dummies #    parallel cores
-----------  ---------  ----------------  ---------  --------------  -----------  ----------------
     169.46        500                10       5000               0         5000                 1
     103.08        500                10       5000               0         5000                 8
Used command: python performance_test.py series --needs 10 --pages 500 --dummies 10 --needtables 0 --parallel 1 --parallel 8
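The ~40% figure follows directly from the two runtimes in the table. A short check, using only the numbers shown above (the helper name runtime_reduction is made up for this example):

```python
def runtime_reduction(baseline_s, parallel_s):
    """Return the fraction of build time saved relative to the baseline."""
    return (baseline_s - parallel_s) / baseline_s


single_core = 169.46  # seconds, 500 pages on 1 core
eight_cores = 103.08  # seconds, 500 pages on 8 cores

reduction = runtime_reduction(single_core, eight_cores)
speedup = single_core / eight_cores

print(f"Build time reduced by {reduction:.1%}")
print(f"Speedup factor: {speedup:.2f}x")
```

The reduction comes out at roughly 39%, which is far from the ideal factor of 8. This fits the statement above that only the reading and writing of files is spread across workers.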
Parallel execution can be used by any documentation build; just pass the -j option.
Example, which uses 4 processes in parallel: sphinx-build -j 4 -b html . _build/html
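When a build is started from a Python script, the same option can be passed through. The following is a minimal sketch, assuming Sphinx is installed for the current interpreter. build_command and run_build are hypothetical helpers; python -m sphinx is the module form of sphinx-build.

```python
import subprocess
import sys


def build_command(src_dir, out_dir, jobs=4, builder="html"):
    """Assemble a sphinx-build call that uses `jobs` parallel processes."""
    return [
        sys.executable, "-m", "sphinx",
        "-j", str(jobs),
        "-b", builder,
        str(src_dir), str(out_dir),
    ]


def run_build(src_dir, out_dir, jobs=4):
    """Run the build and raise CalledProcessError if Sphinx fails."""
    subprocess.run(build_command(src_dir, out_dir, jobs), check=True)


# Only print the command here, so the snippet runs without a project.
print(" ".join(build_command(".", "_build/html", jobs=4)))
```

Calling run_build(".", "_build/html") inside a documentation folder would then start the actual build.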
Used rst template¶
For all performance tests the same rst-template is used:
index¶
Performance test
================
Config
------
:dummies: {{dummies}}
:needs: {{needs}}
:needtables: {{needtables}}
:keep: {{keep}}
:browser: {{browser}}
:debug: {{debug}}
Content
-------
.. contents::
.. toctree::

   {% for page in range(pages) %}
   page_{{page}}
   {% endfor -%}
pages¶
{{ title}}
{{ "=" * title|length }}
Test Data
---------
Dummies
~~~~~~~
Amount of dummies: **{{dummies}}**
{% for n in range(dummies) %}
**Dummy {{n}}**
.. note:: This is dummy {{n}}
And some **dummy** *text* for dummy {{n}}
{% endfor %}
Needs
~~~~~
Amount of needs: **{{needs}}**
{% for n in range(needs) %}
.. req:: Test Need Page {{ page }} {{n}}
   :id: R_{{page}}_{{n}}
   {% if not basic %}:number: {{n}}{% endif %}
   :links: R_{{page}}_{{needs-n-1}}
{% endfor %}
Needtable
~~~~~~~~~
Amount of needtables: **{{needtables}}**
{% if basic %}
.. needtable::
   :show_filters:
   :columns: id, title, number, links
{% else %}
{% for n in range(needtables) %}
.. needtable::
   :show_filters:
   :filter: int(number)**3 > 0 or len(links) > 0
   :columns: id, title, number, links
{% endfor %}
{% endif %}
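The :links: line in the page template connects need n to need needs-n-1, so the first need on a page links to the last one, the second to the second-to-last, and so on. The following sketch reproduces only that ID arithmetic in plain Python. It does not use Jinja, and need_links is a name chosen for this example.

```python
def need_links(page, needs):
    """Map each need ID on a page to the ID it links to.

    Mirrors the template expressions R_{{page}}_{{n}} and
    R_{{page}}_{{needs-n-1}}.
    """
    return {
        f"R_{page}_{n}": f"R_{page}_{needs - n - 1}"
        for n in range(needs)
    }


links = need_links(page=0, needs=4)
for source, target in links.items():
    print(f"{source} -> {target}")
```

Every need therefore has exactly one outgoing link. With an odd number of needs, the middle need links to itself.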
Profiling¶
With the option --profile NAME, profiling can be activated for a specific code area.
Currently supported are:
NEEDTABLE: Profiles the needtable processing (incl. printing)
NEED_PROCESS: Profiles the need processing (without printing)
NEED_PRINT: Profiles the need painting (creating final nodes)
If this option is used, a profile folder is created in the current working directory, containing a profile file named <NAME>.prof. This file contains cProfile stats information.
--profile can be used several times.
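The generated .prof files can also be inspected without extra tools, using the pstats module from the Python standard library. The sketch below profiles a stand-in function and writes NEEDTABLE.prof into a temporary directory so that it runs on its own. In a real project you would point pstats.Stats at the file inside the profile folder instead.

```python
import cProfile
import io
import pstats
import tempfile
from pathlib import Path


def busy_work():
    """Stand-in workload so the example has something to profile."""
    return sum(i * i for i in range(10_000))


with tempfile.TemporaryDirectory() as tmp:
    prof_file = Path(tmp) / "NEEDTABLE.prof"

    # Profile the stand-in function and dump the stats to a .prof file.
    profiler = cProfile.Profile()
    profiler.runcall(busy_work)
    profiler.dump_stats(str(prof_file))

    # Load the file and list the 5 most expensive calls by cumulative time.
    report = io.StringIO()
    stats = pstats.Stats(str(prof_file), stream=report)
    stats.sort_stats("cumulative").print_stats(5)

print(report.getvalue())
```

Sorting by "cumulative" puts the functions that dominate total runtime at the top, which is usually the first thing to look at in a slow build.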
These profiles can also be created outside the performance test, for any documentation project.
Simply set an environment variable called NEEDS_PROFILING to the needed profiles.
Example for Linux: export NEEDS_PROFILING=NEEDTABLE,NEED_PRINT
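When the build is driven from Python, or on platforms without export, the variable can be set for a single build process. This is a minimal sketch, assuming the comma-separated format shown above. profiling_env is a hypothetical helper, and the sphinx-build call is left commented out so the snippet runs without a documentation project.

```python
import os


def profiling_env(profiles, base_env=None):
    """Return a copy of the environment with NEEDS_PROFILING set."""
    env = dict(os.environ if base_env is None else base_env)
    env["NEEDS_PROFILING"] = ",".join(profiles)
    return env


env = profiling_env(["NEEDTABLE", "NEED_PRINT"])
print(env["NEEDS_PROFILING"])

# To use it for a build (requires Sphinx and a documentation project):
# import subprocess, sys
# subprocess.run(
#     [sys.executable, "-m", "sphinx", "-b", "html", ".", "_build/html"],
#     env=env, check=True,
# )
```

Passing a copy of the environment keeps the setting local to that one build and leaves the calling shell untouched.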
Analysing profile¶
Use the --snakeviz option together with --profile <NAME> to automatically open a graphical analysis of the generated profile file.
For this, snakeviz must be installed: pip install snakeviz
Example:
python performance_test.py series --needs 10 --pages 10 --profile NEEDTABLE --profile NEED_PROCESS --snakeviz

Measurements¶
The measurements were performed with the following setup:
Sphinx-Needs 0.7.0, with 1 core, as parallel builds are not supported by this version.
Sphinx-Needs 0.7.1, with 1 core.
Sphinx-Needs 0.7.1, with 4 cores.
Test details | 0.7.0 with 1 core | 0.7.1 with 1 core | 0.7.1 with 4 cores |
---|---|---|---|
30 pages with overall 1500 needs and 30 needtables | 55.02 s | 36.81 s | 34.31 s |
100 pages with overall 10,000 needs and 100 needtables | 6108.26 s | 728.82 s | 564.76 s |