mwoffliner is a tool for making a local HTML snapshot of any recent
online MediaWiki instance. It goes through all articles (or a
selection, if specified) and writes the HTML and pictures to a local
directory. It has mainly been tested against Wikimedia projects such as
Wikipedia and Wiktionary, but it should also work with any recent
MediaWiki.
To use mwoffliner, you need a recent version of Node.js and a POSIX
system (such as GNU/Linux). There are also a few other dependencies,
described below.
Most of the instructions are given for a Debian-based OS.
First, install Node.js:
$ curl -sL https://deb.nodesource.com/setup_6.x | sudo -E bash -
$ sudo apt-get install -y nodejs
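Once Node.js is installed, a short script can confirm which version is in use. The sketch below assumes a minimum major version of 6, which is only inferred from the setup_6.x installer used above and is not a documented requirement. The helper name nodeMajorVersion is made up for this illustration.

```javascript
// Hypothetical helper: extract the major version from a string like "6.10.0".
function nodeMajorVersion(version) {
  return parseInt(version.split('.')[0], 10);
}

// Assumption: major version 6 is the minimum, inferred from setup_6.x above.
const MINIMUM_MAJOR = 6;

const major = nodeMajorVersion(process.versions.node);
if (major < MINIMUM_MAJOR) {
  console.error('Node.js ' + process.versions.node + ' may be too old for mwoffliner');
} else {
  console.log('Node.js ' + process.versions.node + ' looks recent enough');
}
```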
mwoffliner post-processes the downloaded images, so the following
binaries are required: jpegoptim, advdef (shipped in the advancecomp
package), gifsicle, pngquant and imagemagick.
$ sudo apt-get install jpegoptim advancecomp gifsicle pngquant imagemagick
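To verify that the image tools are reachable before starting a long run, a small check along these lines may help. It is only a sketch: findOnPath is a hypothetical helper, and it assumes that ImageMagick is invoked through its convert command, which the text above does not state.

```javascript
const fs = require('fs');
const path = require('path');

// Executables named in the text above. Assumption: ImageMagick is reached
// through its "convert" command; adjust if your setup uses another one.
const REQUIRED = ['jpegoptim', 'advdef', 'gifsicle', 'pngquant', 'convert'];

// Hypothetical helper: return the first executable file with this name
// found in the given search path, or null if there is none.
function findOnPath(name, searchPath) {
  const dirs = (searchPath || '').split(path.delimiter).filter(Boolean);
  for (const dir of dirs) {
    const candidate = path.join(dir, name);
    try {
      fs.accessSync(candidate, fs.constants.X_OK);
      if (fs.statSync(candidate).isFile()) {
        return candidate;
      }
    } catch (err) {
      // Not in this directory, or not executable: try the next one.
    }
  }
  return null;
}

const missing = REQUIRED.filter((name) => findOnPath(name, process.env.PATH) === null);
console.log(missing.length === 0
  ? 'All image tools found'
  : 'Missing image tools: ' + missing.join(', '));
```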
mwoffliner is designed to write the snapshots in the ZIM archive file
format. See http://www.openzim.org/ for more details.
FIXME: These instructions are insufficient to build zimwriterfs. Please follow the instructions in https://github.com/openzim/zimwriterfs/blob/master/docker/Dockerfile
$ sudo apt-get install liblzma-dev libmagic-dev zlib1g-dev libgumbo-dev libzim-dev libicu-dev
$ git clone https://github.com/openzim/zimwriterfs.git
$ cd zimwriterfs
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install
zimwriterfs can also be installed by following its official installation documentation: https://raw.githubusercontent.com/openzim/zimwriterfs/master/README.md
Redis is a software daemon that stores large quantities of key=value
pairs. mwoffliner uses it as a cache.
You can install it from source:
$ wget http://download.redis.io/releases/redis-3.2.8.tar.gz
$ tar xzf redis-3.2.8.tar.gz
$ cd redis-3.2.8
$ make
or directly from the package repository:
$ sudo apt-get install redis-server
Here are the important parts of the configuration (/etc/redis/redis.conf):
unixsocket /dev/shm/redis.sock
unixsocketperm 777
save ""
appendfsync no
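To confirm that Redis is listening on the Unix socket configured above, one option is to send it a PING using only Node's built-in net module. This is a sketch rather than how mwoffliner itself talks to Redis. The helpers isPong and pingRedis are hypothetical, and the check assumes that no password is set on the Redis instance.

```javascript
const net = require('net');

// Socket path taken from the redis.conf fragment above.
const SOCKET_PATH = '/dev/shm/redis.sock';

// Hypothetical helper: true when a raw Redis reply is the "+PONG" status line.
function isPong(reply) {
  return reply.toString().trim() === '+PONG';
}

// Hypothetical helper: send an inline PING and resolve to true or false.
function pingRedis(socketPath) {
  return new Promise((resolve) => {
    const socket = net.createConnection({ path: socketPath });
    // Give up if nothing happens on the socket for two seconds.
    socket.setTimeout(2000, () => { socket.destroy(); resolve(false); });
    socket.on('connect', () => socket.write('PING\r\n'));
    socket.on('data', (data) => { socket.end(); resolve(isPong(data)); });
    // A missing socket file or a refused connection ends up here.
    socket.on('error', () => resolve(false));
  });
}

pingRedis(SOCKET_PATH).then((ok) => {
  console.log(ok ? 'Redis answered on ' + SOCKET_PATH : 'No Redis reply on ' + SOCKET_PATH);
});
```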
We also recommend using a DNS cache such as nscd.
Then install mwoffliner itself and its dependencies:
$ sudo npm -g install mwoffliner
or, if you do not want to install it as root:
$ npm install mwoffliner
When you are done with the installation, you can start mwoffliner. There are two ways to launch it.
If installed globally as root (so it is in the $PATH):
mwoffliner
otherwise:
node ./node_modules/mwoffliner/bin/mwoffliner.script.js
Run without arguments, either command prints its usage.
If you want to run mwoffliner the npm way, you must define some npm
scripts in package.json. Add, for example, the following scripts
section to your package.json:
"scripts": {
"mwoffliner": "mwoffliner",
"create_archive": "mwoffliner --mwUrl=https://en.wikipedia.org/ [email protected]",
"create_mywiki_archive": "mwoffliner --mwUrl=https://my.wiki.url/ [email protected]"
}
Now you can run mwoffliner through npm:
$ npm run mwoffliner -- --mwUrl=https://en.wikipedia.org/ [email protected]
The first "--" tells npm to pass the arguments that follow to the
mwoffliner script.
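The arguments forwarded after "--" reach the script as plain strings of the form --key=value. To illustrate how such strings map onto a key/value object, here is a minimal sketch. parseArguments is a hypothetical helper written for this example and is not mwoffliner's own argument parser.

```javascript
// Hypothetical helper, not mwoffliner's own parser: turn ['--key=value', ...]
// into { key: 'value', ... }. Arguments without the "--key=value" shape are skipped.
function parseArguments(argv) {
  const parameters = {};
  for (const arg of argv) {
    const match = /^--([^=]+)=(.*)$/.exec(arg);
    if (match) {
      parameters[match[1]] = match[2];
    }
  }
  return parameters;
}

// process.argv.slice(2) holds the arguments given to this script.
console.log(parseArguments(process.argv.slice(2)));
```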
To use mwoffliner as a library, include this snippet in a .js file of your project:
const mwoffliner = require('./lib/mwoffliner.lib.js')
const parameters = {
  mwUrl: 'https://en.wikipedia.org/',
  adminEmail: '[email protected]'
}
mwoffliner.execute(parameters)
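Before handing a parameters object to the library, it can be worth checking that the values are plausible. The sketch below is an assumption-laden illustration: checkParameters is a hypothetical helper, the email address is a placeholder, and the code relies on the global URL class available in Node.js 10 and later. It does not reflect any validation that mwoffliner performs itself.

```javascript
// Hypothetical helper: return a list of problems found in the parameters object.
function checkParameters(parameters) {
  const problems = [];
  try {
    // The URL constructor throws on strings it cannot parse as an absolute URL.
    new URL(parameters.mwUrl);
  } catch (err) {
    problems.push('mwUrl is not a valid URL');
  }
  // Very loose check: a string with at least one character before an "@".
  if (typeof parameters.adminEmail !== 'string' || parameters.adminEmail.indexOf('@') < 1) {
    problems.push('adminEmail does not look like an email address');
  }
  return problems;
}

// Placeholder values for illustration only.
const candidate = { mwUrl: 'https://en.wikipedia.org/', adminEmail: 'admin@example.org' };
const problems = checkParameters(candidate);
console.log(problems.length === 0 ? 'Parameters look plausible' : problems.join('\n'));
```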