
Skytools 3 - cascaded replication
=================================

Keep old design from Skytools 2
-------------------------------

* A worker process connects to only two databases; there is no
  everybody-to-everybody communication going on.
* Worker process only pulls data from queue.
  - No pushing with LISTEN/NOTIFY is used for data transport.
  - Administrative work happens in separate process.
  - Can go down anytime, without affecting anything else.
* Relaxed attitude about tables
  - Tables can be added/removed any time.
  - Initial data sync happens table-by-table; no attempt is made to keep
    a consistent picture between tables during the initial copy.
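
The pull-only worker described above can be sketched as one function that
fetches a batch, handles its events, and then marks the batch finished.
This is a minimal sketch, not Londiste's actual worker.  It assumes a DB-API
cursor with psycopg2-style `%s` placeholders and relies on the PgQ SQL functions
`pgq.next_batch`, `pgq.get_batch_events` and `pgq.finish_batch`; the
`process_one_batch` helper and the `handle_event` callback are hypothetical.

```python
def process_one_batch(cur, queue_name, consumer_name, handle_event):
    """Pull one batch from the queue.

    Returns the number of events handled, or None when no batch is ready.
    """
    # Ask the queue for the next batch; NULL means there is nothing new yet.
    cur.execute("select pgq.next_batch(%s, %s)", (queue_name, consumer_name))
    batch_id = cur.fetchone()[0]
    if batch_id is None:
        return None

    # Fetch the events of this batch and hand them to the callback one by one.
    cur.execute(
        "select ev_id, ev_type, ev_data from pgq.get_batch_events(%s)",
        (batch_id,))
    events = cur.fetchall()
    for ev_id, ev_type, ev_data in events:
        handle_event(ev_id, ev_type, ev_data)

    # Mark the batch done only after every event was handled.
    cur.execute("select pgq.finish_batch(%s)", (batch_id,))
    return len(events)
```

Transaction handling (commit or rollback) is left to the caller.  Because the
worker only pulls, stopping it between two calls leaves the queue untouched.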

New features in Skytools 3
--------------------------

* Cascading is implemented as generic layer on top of PgQ - *Cascaded PgQ*.
  - Its goal is to keep identical copy of queue contents in several nodes.
  - Not replication-specific - can be used for any queue.
  - Advanced admin operations: switchover, failover, change-provider, pause/resume.
  - For terminology and technical details see here: set.notes.txt.

* New Londiste features:
  - Parallel copy - during initial sync several tables can be
    copied at the same time.   In 2.x the copy already happened in a separate
    process, so making it parallel was just a matter of tuning the launching/syncing logic.

  - EXECUTE command, to run an arbitrary SQL script on all nodes.  The script is executed
    in a single TX on root, and inserted as an event into the queue in the same TX.
    The goal is to emulate a DDL AFTER TRIGGER that way.
    Londiste itself does no locking and no coordination between nodes.  The assumption
    is that the DDL commands themselves do enough locking.  If more locking is needed,
    it can be added to the script.

  - Automatic table or sequence creation by importing the structure
    from the provider node.  Activated with the --create switch for add-table, add-seq.
    By default *everything* is copied, including Londiste's own triggers.
    The basic idea is that the triggers may be customized, and this way
    we avoid the need to keep track of trigger customizations.

  - Ability to merge replication queues coming from a partitioned database.
    The possibility was always there, but now PgQ also keeps track
    of batch positions, so the setup can survive loss of the merge point.

  - Londiste now uses the intelligent log-triggers by default.  The triggers
    were introduced in 2.1.x, but were not enabled by default.

* New interactive admin console.  Because long command lines are not very
  user-friendly, this is an experiment with an interactive console that puts
  heavy emphasis on tab-completion.

* New multi-database ticker.  It is possible to set up one process that
  maintains all PgQ databases in one PostgreSQL instance.  It will
  auto-detect both databases and whether they have PgQ installed.
  This also makes core PgQ usable without need for Python.

* New cascaded dispatcher script.  The previous 3 dispatcher scripts
  (bulk_loader, cube_dispatcher, table_dispatcher) shared quite
  a lot of logic for partitioning, differing only in a few details.
  So instead of porting all 3 to cascaded consuming, I merged them.
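
The single-transaction behaviour of the EXECUTE command described above can be
illustrated with a small sketch.  It is not Londiste code: SQLite stands in for
PostgreSQL so that the example runs anywhere, and the `queue_events` table, its
columns and the `execute_on_root` helper are hypothetical.  The only point is
that the script and its queue event commit or roll back together.

```python
import sqlite3


def execute_on_root(conn, script_name, statements):
    """Run the statements and queue them as one event, in a single transaction."""
    cur = conn.cursor()
    cur.execute("BEGIN")
    try:
        for sql in statements:
            cur.execute(sql)
        # Hypothetical queue table: the event carries the script text so that
        # downstream nodes could replay it when they reach this event.
        cur.execute(
            "INSERT INTO queue_events (ev_type, ev_extra1, ev_data)"
            " VALUES (?, ?, ?)",
            ("EXECUTE", script_name, ";\n".join(statements)))
        cur.execute("COMMIT")
    except Exception:
        # Undo both the script and the event if anything failed.
        if conn.in_transaction:
            cur.execute("ROLLBACK")
        raise


# isolation_level=None turns off the sqlite3 module's implicit transactions,
# so the explicit BEGIN/COMMIT above are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE queue_events (ev_type TEXT, ev_extra1 TEXT, ev_data TEXT)")
execute_on_root(conn, "add_note.sql", [
    "CREATE TABLE t (id INTEGER)",
    "ALTER TABLE t ADD COLUMN note TEXT",
])
```

If any statement in the script fails, neither the schema change nor the event
survives, so other nodes never see an event for a script that did not apply on root.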

Minor improvements
------------------

* sql/pgq: ticks now also store the last sequence position.  This made it
  possible to move most of the ticker functionality into the database.  The ticker
  daemon now just needs to call an SQL function periodically; it does not
  need to keep track of sequence positions.

* sql/pgq: Ability to enforce max number of events that one TX can insert.
  In addition to simply keeping queue healthy, it also gives a way to
  survive bad UPDATE/DELETE statements with buggy or missing WHERE clause.

* sql/pgq: If Postgres has autovacuum turned on, internal vacuuming for
  fast-changing tables is disabled.

* python/pgq: pgq.Consumer does not register consumer automatically,
  cmdline switches --register / --unregister need to be used for that.

* londiste: sequences are now pushed into the queue, instead of being pulled
  directly from the database.  This reduces load on root
  and also allows in-between nodes that do not have sequences.

* psycopg1 is not supported anymore.

* PgQ does not handle "failed events" anymore.
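
With tick bookkeeping moved into the database, as the first item above
describes, a ticker reduces to a loop that calls one SQL function and sleeps.
The sketch below only illustrates that idea; it is not the new multi-database
ticker.  It assumes a DB-API connection (for example from psycopg2) and that
`pgq.ticker()` called without arguments ticks the queues of the connected
database.  `run_ticker` and its parameters are hypothetical.

```python
import time


def run_ticker(conn, interval=1.0, max_ticks=None, sleep=time.sleep):
    """Call pgq.ticker() every `interval` seconds.

    Stops after `max_ticks` calls, or runs forever when max_ticks is None.
    Returns the number of calls made.
    """
    done = 0
    while max_ticks is None or done < max_ticks:
        cur = conn.cursor()
        # The database decides whether a new tick is needed; no sequence
        # positions are tracked on the client side.
        cur.execute("select pgq.ticker()")
        conn.commit()
        done += 1
        sleep(interval)
    return done
```

The `sleep` parameter exists only so that the loop can be exercised without
real delays.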

Open questions
--------------

* New ticker
  - Name for final executable: pgqd, pgq-ticker, or something else?
  - Should it serialize retry and maint operations on different dbs?
  - Should it serialize ticking?

* Python modules
  - Skytools 3 modules should be parallel installable with Skytools 2.
    We decided to solve this via a loader module
    (like http://faq.pygtk.org/index.py?req=all#2.4[pygtk]).
    The question is whether we should have a Skytools-specific loader or a more generic one.
    And what should the name be?

      import skytools_loader
      skytools_loader.require('3.0')

      vs

      import pkgloader
      pkgloader.require('skytools', '3.0')

* PgQ Cascading
  - There are some periodic operations that should be done on root node.
    Should we let the ticker do them or do we require a Londiste daemon there?
    Letting ticker do it means initial setup is slightly easier.
    But requiring separate daemon means user has guarantee that
    switchover (downgrading root to branch) works without need to
    start the replication daemon.

* Londiste EXECUTE command
  - Should the scripts have the ability to inform Londiste of the tables
    they operate on, so that nodes that do not have such tables
    can ignore the script?

* Newadm:
  - Good name for it.

* Is there a good reason not to drop the following modules:
  - logtriga(), pgq.logtriga() - non-automatic triggers
  - cube_dispatcher, table_dispatcher, bulk_loader - they are merged into queue_loader
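
For the loader-module question above, one possible shape of the generic
variant is sketched here.  Everything in it is an assumption rather than a
decided design: it supposes that each version is installed into a directory
named `<package>-<version>` (for example `skytools-3.0`) below some entry of
`sys.path`, and that `require` only has to put that directory first.  The
`search_path` parameter is hypothetical and exists to make the sketch easy to try.

```python
import os
import sys


def require(pkg, reqver, search_path=None):
    """Put the directory that holds `pkg` version `reqver` first on sys.path.

    Returns the directory that was selected; raises ImportError if no
    directory named '<pkg>-<reqver>' is found.
    """
    wanted = "%s-%s" % (pkg, reqver)
    bases = list(sys.path if search_path is None else search_path)
    for base in bases:
        # An empty sys.path entry means the current directory.
        candidate = os.path.join(base or ".", wanted)
        if os.path.isdir(candidate):
            if candidate in sys.path:
                sys.path.remove(candidate)
            sys.path.insert(0, candidate)
            return candidate
    raise ImportError("%s version %s not found" % (pkg, reqver))
```

A script would call `require('skytools', '3.0')` before `import skytools`.
This sketch matches the version string exactly; a real loader would probably
also need to accept newer minor versions.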

Further reading
---------------

* Skytools 3 todo list: TODO.txt
* Newadm design and todo list: qadmin.txt
* Technical notes about cascading: set.notes.txt
* Notes for contributors: devnotes.txt

* Python API docs:
  - http://skytools.projects.postgresql.org/skytools-3.0/api/[skytools, pgq, londiste modules]

* Database API docs:
  - http://skytools.projects.postgresql.org/skytools-3.0/pgq/[PgQ]
  - http://skytools.projects.postgresql.org/skytools-3.0/pgq_node/[Cascaded PgQ]
  - http://skytools.projects.postgresql.org/skytools-3.0/pgq_coop/[Cooperative PgQ]
  - http://skytools.projects.postgresql.org/skytools-3.0/londiste/[Londiste]

* Londiste Demo:
  - http://skytools.projects.postgresql.org/skytools-3.0/demo.html[]

