Statistics
Latest revision as of 22:19, 4 June 2020
We keep some statistics on this wiki.
For starters, we use the HitCounters extension to re-enable MediaWiki's internal statistics and show page hit counters in the page footer.
To enhance that, we create statistics from the wiki's web server log files using GoAccess. We anonymize the log files and filter them so that the stats cover only the actual wiki pages, with crawlers excluded. The numbers therefore reflect what people read, not the full load on the server. The report is available at https://wiki.munichmakerlab.de/report.html
Technical details
- Web server logs are rotated weekly
- We take the current log plus the 12 most recent rotated logs (roughly three months of traffic), apply some basic grep filtering, and run the result through GoAccess with some additional filter options set. This looks like so:
    zcat -f ${LOGFILE} ${LOGFILE}.{1..12} \
      | grep "/wiki/" \
      | grep -v load.php \
      | goaccess - \
        -o ${OUTPUT} \
        --log-format=COMBINED \
        --anonymize-ip \
        --ignore-crawlers \
        --http-protocol=no \
        --http-method=no \
        --all-static-files \
        --ignore-panel=HOSTS \
        --geoip-database=${GEODB} \
        --no-progress
- The GoAccess report is generated once per day, every night at 2 am.
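To see what the two grep filters in the pipeline above keep and drop, here is a minimal sketch that runs them over three made-up requests in combined log format. The sample lines, the IP addresses, and the function names `sample_log` and `filter_wiki_pages` are assumptions for illustration only; they are not taken from the real server logs or from the actual report script.

```shell
# Hypothetical sample requests in combined log format (not real traffic).
sample_log() {
  cat <<'EOF'
203.0.113.7 - - [04/Jun/2020:21:58:01 +0200] "GET /wiki/Statistics HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
203.0.113.7 - - [04/Jun/2020:21:58:02 +0200] "GET /load.php?modules=site.styles HTTP/1.1" 200 830 "https://wiki.munichmakerlab.de/wiki/Statistics" "Mozilla/5.0"
203.0.113.9 - - [04/Jun/2020:21:59:10 +0200] "GET /favicon.ico HTTP/1.1" 200 318 "-" "Mozilla/5.0"
EOF
}

# The same two filters as in the pipeline: keep lines that mention /wiki/
# anywhere, then drop lines that mention load.php.
filter_wiki_pages() {
  grep "/wiki/" | grep -v load.php
}

# Print the lines that would be handed on to GoAccess.
sample_log | filter_wiki_pages
```

With this sample input, only the first request is printed. The `load.php` request passes the first filter because its referrer contains `/wiki/`, and is then dropped by the second one, which is presumably why that second filter is there. The favicon request never matches `/wiki/` and is dropped right away.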