Docvert


Table of Contents
=================

* Requirements
* Install + Security

Requirements
=============================================================

*	PHP 5 or later, with the "ZLIB" and "XSL" extensions.


Optional Libraries
-------------------------------------------------------------
*	If you want to convert Microsoft Word documents then
	you'll need to choose a Microsoft Word to OpenDocument
	converter.

	Your options are as follows (in order of preference)

	Option 1) "PyODConverter via OOo Server" depends on
		* OpenOffice.org 2.3 or later (NOTE version. 2.3)
		* python-uno (Python UNO libraries)

	Option 2) "OpenOffice.org Stand-alone" depends on,		
		* OpenOffice.org 1.9.122 or later (NOTE version. 1.9+)
		* Xvfb  (on Linux/BSD/OSX)

	Option 3)
		* Abiword 2.4.2 or later.

*	If you want to convert bitmap images like BMP, JPG,
	PNG, GIF you'll need PHP GD ("php5-gd").

*	If your documents files have diagrams or graphs in WMF
	or EMF (windows metafile format) and you want to convert
	them to PNG/JPEG/SVG then you'll need to install
	"libwmf".

		Windows:
		http://gnuwin32.sourceforge.net/packages/libwmf.htm
		and grab "Complete package, except sources"

		Linux:
		your distribution's packaging, and find
			libwmf0.2-7 or later,
			libwmf-bin
		http://wvware.sourceforge.net/libwmf.html

*	If you want to validate documents against a schema
	you'll need

		Linux:
		Trang from your distro's package manager or
		<http://thaiopensource.com/relaxng/trang.html>

		Windows:
		...I don't know. If you could suggest a good
		schema validator with command line support
		I'll support it.

*	If your documents have SVG images and you want to
	convert them to PNG/JPEG/GIF, then you'll need to
	install "svgrlib".

		Windows:
		http://www.gimp.org/~tml/gimp/win32/downloads.html

		Linux:
		your distribution's packaging, and find
			librsvg2-2
			librsvg-bin
		http://librsvg.sourceforge.net/download/

*	If you're using "Document Generation" you'll need PHP-Tidy
	("php5-tidy").

Install + Security
=============================================================

Step 1.	Host the Docvert software on a web server.

Step 2.	Ensure that /writable is writable.

	And,
		Linux/OSX Users:
			Make /etc/docvert and ensure that
			/etc/docvert is writable to the web
			server user (typically "www-data").

		Windows Users:
			Ensure that
			/core/config/windows-specific/config/
			is writable.

			Ensure that this directory is not
			browsable from the web. Verify that
			it's not by browsing to this
			directory in your web browser
			Eg. ensure you can't browse to (for example)
			http://localhost/docvert/core/config/windows-specific/config/

	Now secure Docvert's admin page by opening up
	Docvert in your web browser and browse to
	the admin page (the tab on the top-right).
	On the admin page, set the password.

Step 3. Do you want to convert Word Documents or just
	OpenDocument files?

	(If you just want to convert OpenDocument files then
	skip this stage ... advance to Step 4)

	Still reading?

	Well you'll need to choose a 'Microsoft Word to OpenDocument'
	converter. You have the choice of,

		1) PyODConverter via OOo Server
		2) OpenOffice.org Stand-alone
		3) Abiword

	Choose now! (I strongly recommend the first one)

	1) PyODConverter via OOo Server
	-------------------------------
	
		This requires Python, python-uno, and
		OpenOffice.org 2.3+ running in server mode
		(which means it's listening on a port).

		Docvert includes its own highly customised version of
		PyODConverter so there's no need to download a copy.

			For Linux/OSX users you'll need to allow
			your web-server user to start/stop
			OOo Server so go to the admin page and under
			the "Run as User" heading enter the username
			to run OpenOffice.org as.

			Then ensure that the admin page's
			"Super User Method" is "sudo",
			and then you'll need to configure permissions
			with the "visudo" command (from the command line)
			to add some lines to sudoers,
			Add this line,

	www-data ALL=(NAMEOFUSER) NOPASSWD: /var/www/docvert/core/config/unix-specific/openoffice.org-server-init.sh

			Replace NAMEOFUSER with... the name of the user but
			leave the brackets () because sudoers wants them.
			And replace www-data with the web server user if it's
			something else.
			For example, it might look like this,

	www-data ALL=(docvert) NOPASSWD: /var/www/docvert/core/config/unix-specific/openoffice.org-server-init.sh

			if you wanted to run OOo Server under
			the system user "docvert".


	2) Standalone OpenOffice.org (DEPRECATED)
	-----------------------------------------

		Stage i)
			Windows:
				In "core/config/windows-specific" edit
				these files,
					"convert-using-openoffice.org.bat"

				In "core/config/unix-specific" edit
			Linux/OSX:
				these files,
					"convert-using-openoffice.org.sh"

			...to point at the OpenOffice.org executable
			(typically "oowriter" or "oowriter2").

				For Linux/OSX users you'll need to allow
				your web-server user to run this script
				so go to the admin page and under the
				"Run as User" heading enter the username
				to run OpenOffice.org as (by default it
				will run as root), then you'll need to
				configure permissions with the "visudo"
				command to add some lines to sudoers,

				Add this line (read below for which
				bits to edit)

	www-data ALL=(NAMEOFUSER) NOPASSWD: /var/www/docvert/core/config/unix-specific/convert-using-openoffice.org.sh

				/var/www/vhosts[...].sh are the paths to your
				scripts. In the case of Abiword, point to Abiword
				instead.
				Replace NAMEOFUSER with.. the name of the user but
				leave the brackets () because sudoers wants them.
				And replace www-data with the web server user if it's
				something else.

				Note: As a security measure it's a good idea to
				ensure that the webserver user cannot edit this file
				and change it to run any command with root privledges.
				Do this by removing write permissions for all files
				under the "core/config/unix-specific" directory.

		Stage ii)

			In OpenOffice.org we'll need to tell it to trust Docvert's
			macros. We do this by adding "core/config/trusted-macros"
			to the list of trusted macros within OpenOffice.org.

			Open up Docvert in your web browser and browse to the Admin
			page (there's a tab on the top right). If it's complaining
			about the "/writable" directory not being writable then
			make that directory writable, create an admin password.

			Now get on the same computer as the web server.

				If you've got your Web Server as a Windows Service
				it may be supress OOo from appearing so go to
				Windows Services and the properties window for your
				web server program. Go to the "Log On" tab, ensuring
				that "Allow service to interact with desktop" is
				ticked and then restart the service.
				(Later, if you want to supress OOo from appearing
				during conversions, as you probably do, then untick
				that checkbox and then restart the service.)

			Now click the "Setup OpenOffice.org" button on the admin page
			to start OpenOffice.org.

				If there are problems here check Step 2 again, and
				follow the instructions on screen which will try
				and detect common configuration mistakes. Also try
				the troubleshooting.txt	guide.
				You'll need to be able to start OpenOffice.org before
				continuing.

			Once you've got OpenOffice.org started go to TOOLS | OPTIONS
			| SECURITY and "macro security". Add the directory
			"core/config/trusted-macros" to the trusted list.

			Exit OpenOffice.org and test that it's worked by clicking
			on "Setup OpenOffice.org" again. Ensure there are no dialog
			windows that appear when doing this.

			If there are still dialog windows and you think you've followed
			Stage ii correctly then read the troubleshooting.txt file, or
			post on the Docvert mailing list.

	3) Abiword
	----------

		Edit the core/config Abiword file for...

		Linux/OSX:
			In "core/config/unix-specific" edit
			this file,
				"convert-using-abiword.sh"
		Windows:
			In "core/config/windows-specific" edit
			this file,
				"convert-using-abiword.bat"

			For Linux/OSX users you'll need to
			configure permissions with the "visudo"
			command to add some lines to sudoers,
			Add this line

	www-data ALL=(NAMEOFUSER) NOPASSWD: /var/www/docvert/core/config/unix-specific/convert-using-abiword.py

			Replace NAMEOFUSER with.. the name of the user but
			leave the brackets () because sudoers wants them.
			And replace www-data with the web server user if it's
			something else.

Step 4.	That's it.

	...well that's enough to convert, but there are additional
	features you can set up too such as,

	All of these involve editing the appropriate files under
	"/core/config" to point at conversion libraries.
	
	Schema Validation
	-----------------
		And if you want Schema Validation then edit,
			Linux/OSX,
				validate-against-schema.sh
			Windows,
				validate-against-schema.bat
		...to point to Trang.

			For Linux/OSX users you'll need to
			configure permissions with the "visudo"
			command to add some lines to sudoers,
			Add this line

	www-data ALL=(NAMEOFUSER) NOPASSWD: /var/www/docvert/core/config/unix-specific/validate-against-schema.sh


	Image Conversion
	----------------
		Edit these...
			Linux/OSX,
				convert-using-wmf2gd.sh
				convert-using-wmf2svg.sh
			Windows,
				convert-using-wmf2gd.bat
				convert-using-wmf2svg.bat


