Setting Up a Local Musicbrainz Mirror
This guide shows how to set up a local MusicBrainz mirror on a Debian-based Linux distro. It is untested elsewhere, but it should work on other Linux distros, usually just by installing the required packages with your distro's package manager.
The following is pulled from the official guide at: https://github.com/metabrainz/musicbrainz-server/blob/master/INSTALL with some notes on some error-prone parts.
Also see: http://pastebin.com/raw.php?i=a99EqPzb for a more condensed version of the steps involved.
The MusicBrainz Server is the web frontend to the MusicBrainz Database, and is accessible at http://musicbrainz.org.
This document explains the steps necessary to set up your own MusicBrainz Server. If you require any assistance with these instructions, please feel free to contact us via the information given at the bottom of this document.
- A Unix based operating system
The MusicBrainz development team uses a mix of Ubuntu and Debian, but Mac OS X will work just fine if you're prepared to potentially jump through some hoops. If you are running Windows, we recommend you set up an Ubuntu virtual machine.
This document will assume you are using Ubuntu for its instructions.
- Perl (at least version 5.10.1)
Perl comes bundled with most Linux operating systems. You can check your installed version of Perl with:
perl -v
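As a rough sketch, the version check can be scripted so that it says plainly whether Perl meets the 5.10.1 minimum. The helper name version_at_least is made up for illustration, and it assumes GNU sort with the -V (version sort) option is available.

```shell
# Hypothetical helper: succeeds when version $2 is at least version $1.
# Assumes GNU sort, which provides -V (version sort).
version_at_least() {
    required=$1
    actual=$2
    lowest=$(printf '%s\n%s\n' "$required" "$actual" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

# Ask Perl for its own version (for example 5.14.2) and compare with 5.10.1.
if command -v perl >/dev/null 2>&1; then
    perl_version=$(perl -e 'printf "%vd", $^V')
    if version_at_least 5.10.1 "$perl_version"; then
        echo "Perl $perl_version is new enough"
    else
        echo "Perl $perl_version is older than 5.10.1"
    fi
fi
```

The same helper could be pointed at other minimum versions in this list, such as PostgreSQL 8.4.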
- PostgreSQL (at least version 8.4)
PostgreSQL is required, along with its development libraries. To install using packages run the following, replacing 8.x with the latest version.
sudo apt-get install postgresql-8.x postgresql-server-dev-8.x postgresql-contrib
Alternatively, you may compile PostgreSQL from source, but then make sure to also compile the cube extension found in contrib/cube. The database import script will take care of installing that extension into the database when it creates the database for you.
- Git
The MusicBrainz development team uses Git for their DVCS. To install Git, run the following:
sudo apt-get install git-core
- Memcached
By default the MusicBrainz server requires a Memcached server running on the same server with default settings. You can change the memcached server name and port or configure other datastores in lib/DBDefs.pm.
(Note: The recommended way to set this up is to create a musicbrainz user with a home directory in /home/musicbrainz. On Debian, run "adduser musicbrainz", which sets up /home/musicbrainz automatically. Switch to this user with "su musicbrainz", run "cd ~", and follow the steps below. You'll also need to add musicbrainz to the sudoers file so it can run sudo commands: run "visudo" and, under "# User privilege specification", add the line "musicbrainz ALL=(ALL) ALL".)
- Download the source code.
git clone git://git.musicbrainz.org/musicbrainz-server.git
cd musicbrainz-server
- Modify the server configuration file.
cp lib/DBDefs.pm.sample lib/DBDefs.pm
Fill in the appropriate values for MB_SERVER_ROOT and WEB_SERVER.
Determine what type of server this will be using REPLICATION_TYPE:
(NOTE: If you're just going to be using this as a Headphones mirror, you can set it as RT_SLAVE)
a) RT_SLAVE (mirror server)
A mirror server will always be in sync with the master database at http://musicbrainz.org by way of an hourly replication packet. Mirror servers do not allow any local editing; after the initial data import, the only changes allowed are loading the next replication packet in turn.
Mirror servers will have their WikiDocs automatically kept up to date.
If you are not setting up a mirror server for development purposes, make sure to set DB_STAGING_SERVER to 0.
b) RT_STANDALONE
A standalone server is recommended if you are setting up a server for development purposes. Standalone servers do not accept replication packets, so bringing one up to date with the master database requires manually importing a new database dump. Local editing is available, but keep in mind that none of your changes will be pushed up to http://musicbrainz.org.
Stand alone servers will need to manually download and update their WikiDoc transclusion table:
wget -O root/static/wikidocs/index.txt http://musicbrainz.org/static/wikidocs/index.txt
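To double-check which replication type ended up in your configuration, something like the following sketch may help. The helper name replication_type is invented here, and it assumes the setting appears in lib/DBDefs.pm as an uncommented line shaped like "sub REPLICATION_TYPE { RT_SLAVE }"; adjust the pattern if your copy of the file is laid out differently.

```shell
# Hypothetical helper: print the replication type named in a DBDefs.pm file.
# Assumes an uncommented line shaped like: sub REPLICATION_TYPE { RT_SLAVE }
replication_type() {
    sed -n 's/^[[:space:]]*sub[[:space:]]\{1,\}REPLICATION_TYPE[[:space:]]*{[[:space:]]*\(RT_[A-Z]*\).*/\1/p' "$1" | head -n 1
}
```

From the musicbrainz-server directory, "replication_type lib/DBDefs.pm" would then print RT_SLAVE for a mirror server, or nothing if no matching line is found.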
The fundamental thing that needs to happen here is all the dependency Perl modules get installed, somewhere where your server can find them. There are many ways to make this happen, and the best choice will be very site-dependent. MusicBrainz ships with support for Carton, a Perl package manager, which will allow you to have the exact same dependencies as our production servers. Carton also manages everything for you, and lets you avoid polluting your system installation with these dependencies.
The steps below outline how to set up MusicBrainz server with Carton.
- Prerequisites
Before you get started you will actually need to have Carton installed as MusicBrainz does not yet ship with an executable. There are also a few development headers that will be needed when installing dependencies. Run the following steps as a normal user on your system.
sudo apt-get install libxml2-dev libpq-dev libexpat1-dev libdb-dev memcached libyaml-perl
(NOTE: In order to install all the Carton dependencies, all of these packages are required: build-essential git-core libssl-dev libxml2-dev memcached libexpat-dev postgresql-8.4 postgresql-server-dev-8.4 postgresql-contrib liblocal-lib-perl libossp-uuid-perl)
sudo cpan Carton
NOTE: This installs Carton at the system level. If you prefer to install it in your home directory, use local::lib: http://search.cpan.org/perldoc?local::lib
- Install dependencies
To install the dependencies for MusicBrainz server, first make sure you are in the MusicBrainz source code directory and run the following:
cat Makefile.PL | grep ^requires > cpanfile
carton install --deployment
The following three libraries failed to install, so install them manually:
sudo cpan Hash::Merge
sudo cpan JSON::Syck
sudo cpan Net::CoverArtArchive::CoverArt
Note that if you've previously used this command in the musicbrainz folder it will not always upgrade all packages to their correct version. If you're having trouble running musicbrainz, run "rm -rf local" in the musicbrainz directory to remove all packages previously installed by carton, and then run the above step again.
- Install PostgreSQL Extensions
Before you start, you need to install the PostgreSQL Extensions on your database server. To build the musicbrainz_unaccent extension run these commands:
cd postgresql-musicbrainz-unaccent
make
sudo make install
cd ..
To build our collate extension you will need libicu and its development files. To install these, run:
sudo apt-get install libicu-dev
With libicu installed, you can build and install the collate extension by running:
cd postgresql-musicbrainz-collate
make
sudo make install
cd ..
Note: If you are using Ubuntu 11.10, the collate extension currently does not work with gcc 4.6 and needs to be built with an older version such as gcc 4.4. To do this, run "sudo apt-get install gcc-4.4" and then run the following:
cd postgresql-musicbrainz-collate
CC=gcc-4.4 make -e
sudo make install
cd ..
- Setup PostgreSQL authentication
For normal operation, the server only needs to connect from one or two OS users (whoever your web server / crontabs run as), to one database (the MusicBrainz Database), as one PostgreSQL user. The PostgreSQL database name and user name are given in DBDefs.pm (look for the "READWRITE" key). For example, if you run your web server and crontabs as "www-user", the following configuration recipe may prove useful:
NOTE: On Debian 6.0 with PostgreSQL 8.4, you'll find pg_hba.conf in /etc/postgresql/8.4/main. Instead of allowing only the musicbrainz user to access the database, you can allow all local connections and make sure PostgreSQL listens only on localhost.
In /etc/postgresql/8.4/main:
Edit postgresql.conf:
listen_addresses = 'localhost'
Edit pg_hba.conf:
# "local" is for Unix domain socket connections only
local all all trust
host all all 127.0.0.1/32 trust
For the "www-user" example, the recipe is:
# in pg_hba.conf (Note: The order of lines is important!):
local musicbrainz_db musicbrainz ident map=mb_map
# in pg_ident.conf:
mb_map www-user musicbrainz
Alternatively, if you are running a server for development purposes and don't require any special access permissions, the following configuration in pg_hba.conf will suffice (make sure to insert this line before any other permissions):
local all all trust
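The ordering matters because PostgreSQL uses the first pg_hba.conf record that matches a connection and ignores the rest. A small sketch for confirming that the trust line really is the first active rule follows. The helper name trust_rule_is_first is hypothetical, and the path you pass in depends on your install (the Debian 6.0 location mentioned earlier is one possibility).

```shell
# Hypothetical helper: succeed only if the first active rule in a
# pg_hba.conf file (ignoring blank lines and comments) is
# "local all all trust".
trust_rule_is_first() {
    # $1 = $1 makes awk rebuild the line with single spaces between fields.
    first=$(awk '!/^[ \t]*(#|$)/ { $1 = $1; print; exit }' "$1")
    [ "$first" = "local all all trust" ]
}
```

For example, "trust_rule_is_first /etc/postgresql/8.4/main/pg_hba.conf && echo ok" prints ok only when no other rule comes before the trust line. Reading the file may require sudo.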
- Create the databases
You have two options when it comes to databases. You can either opt for a clean database with just the schema (useful for developers with limited disk space), or you can import a full database dump.
a) Use a clean database
To use a clean database, all you need to do is run:
carton exec ./admin/InitDb.pl -- --createdb --clean
b) Import an NGS database dump
The easiest way to import the database is to use a database dump. These dumps are provided twice a week and are available here:
ftp://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/
To get going, you need at least the mbdump.tar.bz2, mbdump-editor.tar.bz2 and mbdump-derived.tar.bz2 archives, but you can grab whichever dumps suit your needs. Assuming the dumps have been downloaded to /tmp/dumps/ you can import them with:
carton exec ./admin/InitDb.pl -- --createdb --import /tmp/dumps/mbdump*.tar.bz2 --echo
--echo just gives a bit more feedback in case something goes wrong; you may leave it off. Remember to change the paths to your mbdump*.tar.bz2 files if they are not in /tmp/dumps/.
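Since the import takes a while, it can be worth checking that the three required archives are actually in place before starting. The following is only a sketch; check_dumps is a made-up helper name, and the directory is whatever you downloaded the dumps to.

```shell
# Hypothetical helper: confirm the three required archives are present
# in a dump directory. Prints each missing file and fails if any is absent.
check_dumps() {
    dump_dir=$1
    missing=0
    for name in mbdump.tar.bz2 mbdump-editor.tar.bz2 mbdump-derived.tar.bz2; do
        if [ ! -f "$dump_dir/$name" ]; then
            echo "missing: $dump_dir/$name"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}
```

One way to use it is to chain it in front of the import, so the import only starts when nothing is missing: "check_dumps /tmp/dumps && carton exec ./admin/InitDb.pl -- --createdb --import /tmp/dumps/mbdump*.tar.bz2 --echo".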
NOTE: on a fresh postgresql install you may see the following error:
CreateFunctions.sql:33: ERROR: language "plpgsql" does not exist
To resolve this, log in to PostgreSQL as the "postgres" user (or any other PostgreSQL user with SUPERUSER privileges) and load the "plpgsql" language into the database with the following command:
postgres=# CREATE LANGUAGE plpgsql;
- Start the development server
You should now have everything ready to run the development server!
The development server is a lightweight HTTP server that gives good debug output and is much more convenient than having to set up a standalone server. Just run:
carton exec -- plackup -Ilib -r
Visiting http://your.machines.ip.address:5000 should now present you with your own running instance of the MusicBrainz Server.
- Troubleshooting
If you have any difficulties, please feel free to contact ocharles or warp in #musicbrainz-devel on irc.freenode.net, or email the developer mailing list at musicbrainz-devel [at] lists.musicbrainz.org.
If you find any bugs, please report them on http://tickets.musicbrainz.org.
Good luck, and happy hacking!
To load the replication changes manually, you'll need to run:
carton exec -- ./admin/replication/LoadReplicationChanges
Alternatively, you can use the following script which can start/stop the server, and load the replication changes.
http://paste.pocoo.org/show/555245/ (if that doesn't work try: http://paste2.org/p/2042937)
Save it as "/usr/bin/mbcontrol", and run:
chmod a+x /usr/bin/mbcontrol
which will make it executable.
Run it as the musicbrainz user, making sure you have write access to /var/run/musicbrainz and /var/log/musicbrainz.
Usage:
mbcontrol start/stop (start and stop the server)
mbcontrol hourly (load the replication changes)
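The write-access requirement mentioned above is easy to get wrong, so here is a rough sketch for checking it as the musicbrainz user. The helper name check_writable is invented for illustration.

```shell
# Hypothetical helper: report which of the given directories are missing
# or not writable by the current user. Succeeds only if all are writable.
check_writable() {
    status=0
    for dir in "$@"; do
        if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
            echo "not writable: $dir"
            status=1
        fi
    done
    return "$status"
}
```

Running "check_writable /var/run/musicbrainz /var/log/musicbrainz" before the first "mbcontrol start" lists any directory that still needs to be created or have its ownership changed.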
To have the script run automatically, you can stick a line into the musicbrainz user's crontab:
crontab -e
and add this line, which will run "mbcontrol hourly" at 10 minutes past the hour every hour:
10 * * * * /usr/bin/mbcontrol hourly
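The five leading fields of that line are minute, hour, day of month, month, and day of week, so "10 * * * *" means minute 10 of every hour. If you would rather add the entry from a script than through an editor, a sketch along these lines may work. The helper name with_mbcontrol_entry is hypothetical; it only filters text and does not touch the crontab itself.

```shell
# Hypothetical helper: read an existing crontab on stdin and write it back
# out with the hourly mbcontrol entry present exactly once, at the end.
with_mbcontrol_entry() {
    entry='10 * * * * /usr/bin/mbcontrol hourly'
    # Drop any existing copy of the exact line; grep exits non-zero when
    # nothing is left to print, which is fine here.
    grep -vxF "$entry" || true
    echo "$entry"
}
```

As the musicbrainz user, "crontab -l 2>/dev/null | with_mbcontrol_entry | crontab -" would then install the entry, and running it a second time leaves the crontab unchanged.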
If you have trouble with the script not running, it could be because carton is not in the crontab path. If that's the case, you'll need to specify the path to carton in the mbcontrol script:
CARTON=/usr/local/bin/carton