
= Skytools FAQ =

== Skytools ==

=== What is Skytools? ===

It is bunch of database management tools we use
and various frameworks / modules they depend on.

Main components are:

==== Python scripts ====

Main tools:

   walmgr:: walshipping manager
   londiste:: replication on top of pgq
   pgqadm:: pgq administration and maintenance
   setadm:: cascaded pgq administration
 
Special scripts:
 
   bulk_loader
   cube_dispatcher
   table_dispatcher
 
Queue copy

   queue_mover:: copy queue contents from one db to another
   queue_splitter:: copy queue contents to another db splitting it into several queues
 
Operate bunch of scripts together

   scriptmgr

==== Python modules ====

   pgq
   skytools
   londiste
 
==== SQL modules ====
   
   londiste
   pgq
   pgq_node
   pgq_ext

=== Where is the code located? ===

Code layout:

 debian/
 doc/
 python/bin/
 python/londiste/   - Londiste python modules
 python/modules/    - C extension modules for Python (string handling)
 python/pgq/        - pgq and cascaded pgq python modules
 python/skytools/   - python modules for generic database scripting
 scripts/           - Special purpose python scripts (python)
 sql/londiste/      - database code for londiste (plpgsql)
 sql/pgq/           - PgQ database code (C + plpgsql)
 sql/pgq_ext/       - PgQ event/batch tracking on remote database (plpgsql)
 sql/pgq_node/      - cascaded pgq support (plpgsql)
 sql/txid/          - Obsolete txid code for Postgres 8.2 and below

== PgQ - The generic queue ==

=== Why do queue in database?  Transactional overhead? ===

1. PgQ is quite likely the fastest ACID compliant queue,
   thanks to Postgres being pretty fast despite the
   "transactional overhead".  Why use anything less robust?

2. We have lot of business logic in database.  Events created
   by business transactions need to live or die with main transaction.

3. Queue used for replication purposes needs to be transactional.

I think the reason people act surprised when they hear about queue
in database is not that they don't care about reliability
of their event transport, but that the reliable data storage
mechanism - SQL databases - did not have any way to write
performant queue.  Now thanks to the txid/snapshot technique
we have a way to write fast _and_ reliable queue,
so why (care about anything less).

=== Could you break dependancy on Python? ===

There is no dependancy on Python.  The PgQ itself is written in C / plpgsql
and it appears as bunch of SQL functions under `pgq` schema.
Thus it can be used from any language that can execute SQL queries.

There is Python helper framework that makes writing Python consumers easier.
Such framework could be written for any language.

=== Aren't the internals similar to Slony-I? ===

Yes, PgQ was created by generalizing queueing parts from Slony-I.

=== Dump-restore ===

== Londiste - The replication tool ==

=== What type of replication it does? ===

Londiste does trigger-based asynchronous single-master replication,
same as Slony-I.

In Skytools 3.x it will support merging partitions togethers,
that could be called shared-nothing multimaster replication.

=== What is the difference between Slony-I and Londiste? ===

Nothing fundamental.  Both do asynchronous replication.

Main difference is that Londiste consists of several
relatively independent parts, unlike Slony-I where
code is more tightly tied together.

At the moment Londiste loses to Slony-I featurewise,
but should be easier to use.  Hopefully we can keep
the simple UI when we catch up in features.

=== What are the limitations of Londiste ===

It does not support '.' and ',' in table, schema and column names.

