Statistics: Difference between revisions

From The Munich Maker Lab's Wiki

Latest revision as of 22:19, 4 June 2020

We keep some statistics on this wiki.

For starters, we use the HitCounter extension to re-enable MediaWiki's internal statistics and show a hit counter in the footer of each page.

To enhance that, we create statistics based on the wiki's web server log files using GoAccess. We anonymize the log files and filter them so that the stats cover only actual wiki pages, with crawlers excluded. The numbers therefore do not represent the full load on the server, but what people actually read. The report is available at https://wiki.munichmakerlab.de/report.html

Technical details

  • Web server logs are rotated weekly
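  • We take the current log file plus the twelve most recent rotated ones (roughly three months of traffic, given weekly rotation), apply some basic grep filtering, and then run the result through GoAccess with some additional filter options set. This looks like so: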
zcat -f ${LOGFILE} ${LOGFILE}.{1..12} \
  | grep "/wiki/" \
  | grep -v load.php \
  | goaccess - \
    -o ${OUTPUT} \
    --log-format=COMBINED \
    --anonymize-ip \
    --ignore-crawlers \
    --http-protocol=no \
    --http-method=no \
    --all-static-files \
    --ignore-panel=HOSTS \
    --geoip-database=${GEODB} \
    --no-progress
  • The GoAccess report is regenerated once per day, every night at 2 a.m.
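
To see what the two grep stages in the pipeline keep and drop, they can be tried in isolation on a few made-up log lines, without GoAccess installed. This is only a minimal sketch: it assumes the combined log format, and the sample requests, the IP addresses, and the function names sample_log and filter_wiki_pages are hypothetical, not part of the actual setup.

```shell
#!/usr/bin/env bash

# Hypothetical sample lines in combined log format (invented for illustration).
sample_log() {
  cat <<'EOF'
203.0.113.5 - - [04/Jun/2020:10:00:00 +0200] "GET /wiki/Main_Page HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
203.0.113.5 - - [04/Jun/2020:10:00:01 +0200] "GET /load.php?modules=site.styles HTTP/1.1" 200 2048 "https://wiki.munichmakerlab.de/wiki/Main_Page" "Mozilla/5.0"
203.0.113.9 - - [04/Jun/2020:10:00:02 +0200] "GET /favicon.ico HTTP/1.1" 200 318 "-" "Mozilla/5.0"
203.0.113.9 - - [04/Jun/2020:10:00:03 +0200] "GET /wiki/Statistics HTTP/1.1" 200 4096 "-" "Mozilla/5.0"
EOF
}

# The same two filters as in the pipeline above:
# keep lines mentioning /wiki/, then drop anything that mentions load.php.
filter_wiki_pages() {
  grep "/wiki/" | grep -v load.php
}

# Print only the lines that would be handed on to GoAccess.
sample_log | filter_wiki_pages
```

In this sample, the load.php request passes the first grep only because its referrer contains /wiki/, which is why the second grep is needed. The favicon request never matches /wiki/ and is dropped by the first stage.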