Skip to main content

Subscribe, Acquire, and Re-Advertise products.

Project description

==================
MetPX-Sarracenia
==================

[ Français_ ]

Overview
--------

MetPX-Sarracenia is a data duplication or distribution pump that leverages
existing standard technologies (web servers and the `AMQP <http://www.amqp.org>`_
brokers) to achieve real-time message delivery and end to end transparency
in file transfers. Data sources establish a directory structure which is carried
through any number of intervening pumps until they arrive at a client. The
client can provide explicit acknowledgement that propagates back through the
network to the source. Whereas traditional file switching is a point-to-point
affair where knowledge is only between each segment, in Sarracenia, information
flows from end to end in both directions.

At it's heart, sarracenia exposes a tree of web accessible folders (WAF), using
any standard HTTP server (tested with apache). Weather applications are soft
real-time, where data should be delivered as quickly as possible to the next
hop, and minutes, perhaps seconds, count. The standard web push technologies,
ATOM, RSS, etc... are actually polling technologies that when used in low latency
applications consume a great deal of bandwidth an overhead. For exactly these
reasons, those standards stipulate a minimum polling interval of five minutes.
Advanced Message Queueing Protocol (AMQP) messaging brings true push to
notifications, and makes real-time sending far more efficient.

homepage: http://metpx.sf.net

Man Pages online: http://metpx.sourceforge.net/#sarracenia-documentation


Sarracenia is an initiative of Shared Services Canada ( http://ssc-spc.gc.ca )
in response to internal needs of the Government of Canada.


.. _Français:

Survol
------

MetPX-Sarracenia est un engin de copie et de distribution de données qui utilise
des technologies standards (tel que les services web et le courtier de messages
AMQP) afin d'effectuer des transferts de données en temps réel tout en permettant
une transparence de bout en bout. Alors que chaque commutateur Sundew est unique
en soit, offrant des configurations sur mesure et permutations de données multiples,
Sarracenia cherche à maintenir l'intégrité de la structure des données, tel que
proposée et organisée par la source, à travers tous les noeuds de la chaîne,
jusqu'à destination. Le client peut fournir des accusés de réception qui se
propagent en sens inverse jusqu'à la source. Tandis qu'un commutateur traditionnel
échange les données de point à point, Sarracenia permet le passage des données
d'un bout à l'autre du réseau, tant dans une direction que dans l'autre.

Sarracenia, à sa plus simple expression, expose une arborescence de dossiers disponibles
sur la toile ("Web Accessible Folders"). Le temps de latence est une composante
névralgique des applications météo: les minutes, et parfois les secondes, sont comptées.
Les technologies standards, telles que ATOM et RSS, sont des technologies qui consomment
beaucoup de bande passante et de ressouces lorsqu'elles doivent répondre à ces contraintes.
Les standards limitent la fréquence maximale de vérification de serveur à cinq minutes.
Le protocol de séquencement de messages avancés (Advanced Message Queuing Protocol,
AMQP) est une approche beaucoup plus efficace pour la livraison d'annonces de
nouveaux produits.

Sarracenia est une initiative de Services Partagés Canada ( http://ssc-spc.gc.ca ) en réponse à des besoins interne du gouvernement du Canada.

portail: http://metpx.sf.net




===============
Release Notes
===============

lists all changes between versions.

**git repo(unreleased)**

**2.17.10a1**

* cleanup/declare/setup actions (all programs): no exit, log with configname
* sr_subscribe/sr_sarra/sr_sender : do_task plugin (initialised to proper module for now)
* sr_subscribe: headers' source and from_cluster forced when source_from_exchange
* sr_subscribe: add substitution for ${DR} ${PDR} ${YYYYMMDD} ${SOURCE} ${HH}
* sr_subscribe log ignore message when already in cache
* sr_subscribe: events option is consider to perform link and delete messages
* sr_subscribe: modified to be a base class instantiated from most programs
* sr_subscribe: integration of restore_queue, process report_daemons, save/restore
* sr_subscribe: help module : treats sr_shovel,sr_winnow,sr_sarra cases
* sr_sender: for R and L messages skip offset/length setting in module set_local()
* sr_shovel: caching optional default to False
* sr_config: some save,restore and cache defaults
* sr_config: inflight supports duration_from_str (for sr_watch/post)
* sr_config: duration_from_str time suffix [sS] [mM] [hH] [dD] [wW] where applicable
* sr_config: module configure cleans up extended options (proper reload)
* sr_config: option -headers to add,delete or reset user's key,value pair in message headers
* sr_ftp,sr_sftp: connect/reconnect resets cdir (current dir)
* sr_ftp,sr_sftp,sr_http: standardisation, http exception (no hang)
* sr_ftp,sr_sftp,sr_http: fix Eric's os.getcwd bug, add preventive fp.flush and os.fsync
* msg_total.py: plugin skip total byte increment when no partstr in message
* sr_message: move support with oldname/newname (impact watch,post,subscribe,sarra,sender to come)
* sr_message: srcpath turned to baseurl, set_notice(baseurl,relpath) ***impact all programs
* sr_message: trim_headers for user added headers key,value pair ***impact all programs
* sr_cache: module cache.check_msg ... process correctly message without parts (sum L and R)
* sr_audit,sr speed up through class instantiation and direct broker connection
* sr_audit fix permissions for source and subscribe users
* sr_amqp,sr_pika: cleanup skip removal of exchanges xpublic,xreport,xwinnow*
* sr_util: startup_args catches -help when only args given
* flow_test: several changes to make it more reliable.

**2.17.09a1**

* FIXME: do old cache files need to be deleted during upgrade? update RELEASE_NOTES
* expire DEFAULT CHANGED: 7 days -> 5 minutes. Avoiding pump overloading turns out to be critical.
* new plugin msg_to_clusters, simplified replacement of inter-cluster routing logic.
* sr_watch, returned to recursive formulation of sr_watch, reduces overhead substantially.
* flow_test now includes ftp download test.
* flow_test now uses sr_audit, queues and exchanges extant now tested.
* flow_test now waits for queues to drain (so it works more often.)
* fix (bug# 88) for sr_audit creating report queues with no consumers.
* sr_poll and plugin/poll_script.py post with parent.post (srcpath,relpath instead of url)
* flow_templates under poll|post|watch modified not to generate errors in flow logs
* flow_templates shovel t_dd[12].conf reject .*citypage.* to avoid errors in flow logs
* plugin/msg_by_user.py now considers msg.report_user for v02.report messages (correct error in flow logs)
* flow_check.sh shows classified list of errors in log or report No error found
* sr_poster unused in sr_poll, sr_winnow, sr_sender, sr_shovel
* sr_winnow, sr_subscribe supports caching on messages
* sr_config post_url option equivalent to url
* sr_subscribe support posting if post_broker is set (and other post options)
* plugin heartbeat_cache : cache clean/save + stats if cache_stat = True
* all program consuming... calls heartbeat_check themselves
* move hearbeat code from sr_consumer to sr_config
* cache is cleaned every heartbeat.


**2.17.08a1**

* sr_pika tested with flow stuff...
* sr uses .config/sarra/post directory ... check for option sleep to call sr_cpost
* throttle use better time function
* sr_message topic without filename
* sr_http timeout + self test
* sr_sftp self test works
* sr_sftp/sr_ftp call self.close on download or send problems
* sr_sftp minimal credentials based on SSH configs being ok
* sr_sftp read/uses ~/.ssh/config if needed/provided
* sr_sender sftp/ftp bugfix now honours *mirror true* default. was ignored before.
* sr_cache same algorithm as the C implementation
* getting rid of cluster routing logic, gateway_for/, to be implemented with plugins.
* debian packaging for C.
* C posting library, including sr_cpost that replicates post and watch is complete.
* C libc shim that calls C posting library complete.
* getting rid of random checksums (L & R -> SHA512 digest.)

**2.17.07a4**

* changed *chmod* interpretation. Was obsolete in favour of umask, now an option to override umask.
* bug fixes for chmod not being done in a number of situations where it was required.

**2.17.07a3**

* on_heartbeat support added to sr_watch.

**2.17.07a2**

* on_post plugins were broken in 2.17.07a1
* on_heartbeat now defaults to heartbeat_log as one would expect, and documented both.

**2.17.07a1**

* sr_sarra bug fix os_.exit
* All sarra programs have standard invoke : pgm [args] action config... old way still supported (MG)
* sr_util defines a function startup_args to parse sarra program arguments (MG)
* sr_audit --users : makes sure exchanges/queues configured on pump are setup (MG)
* all programs manage exchanges/queues through action 'cleanup','declare','setup' (MG)
* sr_poll nows supports http (MG)
* sr_poll start posting without parts when it has no clue for size (MG)
* on_html_page added in config and sr_poll with default http_page.py (MG)
* on_watch added in config and sr_watch (MG)
* sr_http.py now has a valid class sr_http (used in sr_poll) (MG)
* mode bits limited to the last four digits (upper digits non portable anyways.)
* C implementation of libsrshim, libsarra, sr_cpost, and sr_subjsondump in C (not packaged yet.)
* fixed bogus error message from backward compatibility plugins.
* added mtime check to sarra and sr_subscribe so that if of new file is <= file_on_disk, then don't download.

**2.17.06a3**

* git repo url was wrong. Thanks Canadian Tire!
* compatibility editing local_file (full path) now results in setting new_dir and new_file.
* still harmonizing sender vs. subsribe api senders use parent.new_file, subs use parent.msg.new_file
* fixed sender using ftp broken by error message referring to *remote_urlstr* ( replaced by *new_urlstr* )
* files were created as public write because umask was overridden. Dunno why it was there in the first place.
* strip fixed in sr_subscribe.
* flatten fixed in sr_config.
* crasher bug when sr_sender doesn´t have a post_broker.

**2.17.06a2**

* added chmod_log for log files, which were defaulting to public writable... no idea why, set default to 600.
* changed posting default for to_clusters from ALL to the hostname of the broker.
* moved accept/reject processing into sr_poster.post, so automatically honoured when using plugin scripts that call it.
* fix bug#86 DESTFNSCRIPT in one accept would be used by subsequent ones.
* fix bug#51 now use new_path, rather than local_path in consumers, and remote_path in senders. all can use same plugins.
includes warnings for existing plugins to change their variable names, old ones should still work, just prompt warning in log.


**2.17.06a1**

* Added default value of 'ALL' for *to_clusters* of and *gateway_for* to make those options... optional.
* Adding *preserve_time* option (default: True), to have mtime from source reflected in files written.
* Adding *preserve_mode* option (default: True) the move mode bits from source reflected in files written.
* deprecating *interface* setting, code from Jun. one less thing to set. Now scans all interfaces for *vip*
* polling script should still sleep for *sleep* seconds if the script fails. busyloop is bad.
* added download_dd plugin, which does multiple process copies (striping individual files.)
* documentation improvement: made *blocksize* the main partitioning option, *parts* is developer only.
there was an error in that usage of parts actually referred partially to blocksize
* fixed blocksize=1 to mean send entire file, not 1 byte blocks.
* fix bug#66 for sr_sender to put the actual file name on the destination (after destfn, etc...)
* sr_sarra: suppressedn excessive messages about who has vip in debug mode.
* sr_sarra: fixed -strip. Did not work at all before.
* added the poll_script.py plugin as an example for sr_poll.
* fix from Eric for wrong permissions in sr_sftp.
* removed useless import in line_mode.py plugin which breaks it on python 3.2
* fix from Eric for wrong permissions in sr_ftp. (bug #84)
* added version strings to components log and usage outputs.
* added sr_poll to flow_test (from Daniel)
* some re-organizing of code in sr_watch.
* implement 0400 default permission mask in sr_poll.
* note on how to encode special characters in passwords in credentials.conf
* some plugin improvements from Dominic Racette.

**2.17.03a5**

* added sr_watchb... the old implementation as a backup in case the new sr_watch is busted.
* attempted fix for sr_watch permission denied issue. Reformulated how recursion is done.
now it just queues up issues for later.

**2.17.03a4**

* attempted fix for bug #79 (.tmp file stay when download fails.) not tested.
* added 's', SHA512 checksum support.
* after a shovel has restored a queue from a save file, it now exits.
* on repeated saves, the json save files came out different for the same messages.
Fixed by adding sort_keys=True to dumps. now save of same files is bitwise identical.
* added 'attempt' setting to make the number of retries programmable.
* fixed on_line plugins being broken in sr_poll.
* fixed 'reject' not working in sr_poll.
* added -save_file option to shovel and sender to allow arbitrary locations for save files.
* report_daemons False option setting now stops report routing shovels from starting.
* added file_age.py to plugins examples.

**2.17.03a3**

* added sr_log2save a little filter to extract reloadable messages from log files.

**2.17.03a2**
* release of a1 broke in the middle, had to use a new tag.

**2.17.03a1**

* feature #61: save/restore Deal with large queues on brokers by persisting to disk.
* bug #77: fixed. crash on file deletion when inflight is numeric.
* feature #61, sr_sender -save/-restore to avoid broker queues implemented.
* bug #78: fixed. posting symlinks now works.
* bug #76: fixed. sr_audit will now only start if the admin option is set in default.conf
only need one sr_audit for each pump. having more isn't a problem, but dozens are stupid.
for deployment to a cluster, need to run on hundreds of nodes, stop running hundreds of useless instances.
* sr_watch now indicates the exchange being published to on startup.
* feature #56: system startup (init file and/or systemd service) now installed with package. might be a bit shaky...
* bug (not submitted) problem with truncation on sftp sender, missing argument.
* developer: flow test improvement: added verification of content sent by sr_sender.
* bug (not submitted) all DESTFNSCRIPT are broken in last release. Fixed now.
* sr_subscribe with no directory spec was broken. default to pwd as one would expect. Fixed now.
* changed build-dep from python-docutils -> python3-docutils.

**2.17.02a1**

* Summary: added some understanding of symbolic links.
* sr_watch will be faster in many cases, many improvements.
* sr_post now accepts normal file specifications (more than 1, and relative paths)
* Any component can now use vip/interface for active/passive. Cluster configurations more flexible.
* programming: can have more than one plugin for on_*, they now stack sequentially.
* programming: do_download plugin examples added for use of wget or scp.
* other small improvements.
*
* Details:
* Added symbolic link processing (sr_watch, sr_post, sr_sarra, sr_subscribe, sr_sender)
Caveat: links are mirrored as-is. Likely the wrong thing to do for absolute ones. Suggestions bug#70 welcome.
* sr_post: now works with relative paths, and * etc... can post multiple files and/or directories at once.
* sr_post: simplified partitioning options: blocksize eliminated, replaced by 'parts'
* sr_post: parts 0 - autocompute part size, 1- always send files in a single part, <sz> used a fixed size.
* sr_watch: events keywords changed: modified->modify, created->create, deleted->delete.
* sr_watch: event keyword for links: link - mirror symbolic links
* sr_watch: added inflight xx to ignore files until they have not been modified for > xx seconds.
* sr_watch: symbolic link processing significantly changes paths produced, as realpath no longer used.
This should be perceived as an improvement (paths look more familiar).
* sr_watch: enabled inotify observer (can be hundreds of times faster to notice a change in a large tree.)
* sr_watch: added *force_polling* toggle option to allow user selection of slower method (polling observer)
* sr_watch: added *follow_symlinks* toggle option.
* sr_watch: process groups of events with a single cache lock/unlock. Provides 4-10x speedup.
* sr_watch: added 'realpath' option. Earlier versions use 'realpath' all the time, which changes
paths read significantly when directories are symbolically linked. So default was changed to not do that.
Can obtain old behaviour by spcifying this option (listed as a developer option.)
* plugins: are now stackable, when on_message encountered it is added to the list of plugins,
rather than replacing a single one.
* plugins: added alternate downloading examples: (download_scp, download_wget, msg_download )
This is used to invoke high speed xfer mechanism, such as bbcp.
* sum 0: the sum 0 algorithm is changed to produce random checksum, rather than constant 0 to improve load balancing.
* sr_audit: changed 'role' directive to 'declare' to allow declaration of things beside users. See following line:
* sr_audit: added 'declare exchange' to permit creation of exchanges.
* developer: flow test improvement: essentially re-written to improve reliability, and shorten.
* developer: flow test improvement: now checks every item, rather than sampling, results more reassurring.
* developer: flow test improvement: cumulative status (of all tests.)
* developer: flow test improvement: compare actual downloads vs. watch.
* developer: flow test improvement: programmable number of items to collect before verifying.
* feature #59: #!/usr/bin/python3 -> #!/usr/bin/env python3 ... harmless...
* feature #56: started. systemd support file begun, more testing required.
* feature #54: done. added Active/passive options to all components (vip & interface support.)
* feature #53: done. sr_watch 'inflight' implements mtime work.
* feature #52: done. plugin-stacking.
* bug #74: workaround ( sr_post to an ssl broker prints scary (but harmless) message after succeeding, messge suppressed. )
* bug #73: sr_sender overwriting files with shorter new versions leaves old content) fixed.
General bug fix for over-writing of files when new shorter than old (sftp mostly)
* bug #72: fixed ( sr_sender -strip now works. )
* bug #71: fixed ( sr_audit user creation )
* bug #70: started ( sr_watch symbolic link handling ) mitigated. Unclear if really fixed.
* bug #68: fixed ( sr_sarra part of flow test improvements above.)
* bug #67: fixed ( config files always parsed twice. )
* bug #45: fixed ( sr_sarra will not delete local files )

**2.16.11a4**

* Added moving of log directory from var/log -> log, and replacement of var directory with a symlink.
* Added setting of passwords by default for broker users by sr_audit.
* Added --reset flag interpretation by sr_audit so that permissions can be updated easily for all users.
So now when upgrading after 'log' -> 'report' transition, just do:

``sr_audit --reset True --users foreground``

and it will overwrite all the permission regexp's of the broker users.
If someone has funny permissions, that could be a problem.
* Added 'set_passwords' flag to sr_audit, defaulting to True.
if set to false, users are given blank passwords.... not sure if this is useful.
trying to understand what to do with this in the case of LDAP based users.
* Added creation of send directory to flow_setup.sh
* un-commented the over-ride default exchange for reporting in tsource2send.conf...
it still needs overriding.
* Corrected the regexp permission masks to allow sources to write to any
exchange that starts with xs_<user>... rather than just specifically that source.
* Corrected the regexp permissions to allow reading by subs from same.
* Reverted patch in sarra that broke download URL's.
* Add old log exchanges to sr_audit for compatibility with pre-transition clients.
* Changed test of sender to compare against the ones watch, rather than subscriber.
* Added measurable test to flow test for sender.
* Adding sr_watch to flow_test.
* Added sr_sender to flow test.
* Removing '/var' so log files are in the normal place now.
* Optimizing the flow_test script (so it's shorter, more straightforward and regular.)
* Documentation cleanup

**2.16.11a3**

* Fixing a cosmetic but ugly bug. Caused by the URL fix
* Add unready list to prevent posting unreadable files

**2.16.11a2**

* fix bug #61: change outputs to better present URL's in logs.
* just naming of some routines that were imported from sundew, add prefix ``metpx_``...
* fix bug #54: Adds interpretation of sundew-specific delivery options to sr_subscribe.

**2.1611a1**

* Another String too long fix.
* Potential fix for bug #55 (chdir)

**2.16.10a2**

* Fix issue #42 (header length in AMQP)
* Numerous doc changes

**2.16.10a1**

* Fixes to self test suite
* Added calls to the usage strings on a bunch of components
* Added centralized time format conversion in sr_util
* Added sr_report(1) manual page.
* Bugfix for headers too long.
* Patch to sr_poll to prevent crashing with post_exchange_split.
* Tentative fix for bug #50 improper requirement of write permissions
* Process headers dynamically
* Documentation Updates.

**2.16.08a1**

* Major Change: Changed "log" to "report" in all components.
* Added test case for sr_sender
* Documentation Update

**2.16.07a3**

* Ian's fix for sr_sender borked with post_exchange_split.
* Jun's fix for chmod and chmod_dir to be octal.

**2.16.07a2**

* Fixed typos that broke the package install in debian

**2.16.07a1**

* Added post_exchange_split config option (allows multiple cooperating sr_winnow instances) code, test suite mode, and documentation.
* fix logger output to file (bug #39 on sf)
* sr_amqp: Modified truncated exponential backoff to use multiplication instead of a table. So can modify max interval far more easily. Also values are better.
* nicer formatting of sleep debug print.
* sr_post/sr_watch: added atime and mtime to post. (FR #41)
* sr_watch: handle file rename in watch directory (addresses bug #40)
* sr_watch: fix for on_post trigger to be called after filtering events.
* sr_sender: Added chmod_dir support (bug #28)
* plugin work: Made 'script incorrect' message more explicit about what is wrong in the script.
* plugin work: word smithery, replaced 'script' by 'plugin' in execfile. so the messages refer to 'plugin' errors.
* Added plugin part_check, which verbosely checks checksums,
* plugin work: Added dmf_renamers, modified for current convention, and word smithery in programmers guide.
* Tested (de-bugged) the missing file_rxpipe plugin, added it to the default list.
* Documentation improvements: sundew compatibility options to sr_subscribe.
* Documentation improvements: moving code from subscriber to programming guide.
* Added a note for documenting difference between senders and subscription clients in the message plugins.
* Made reference to credentials.conf more explicit in all the command line component man pages. (Ian didn't understand he needed it... was not obvious.)
* Moved information about how to access credentials from plugin code from subscriber guide to programming guide.
* Turned a bit of the sr_watch man page into a CAVEAT section.
* Added a note about how file renaming is (poorly) handled at the moment.
* Test suite: removing overwrites of config files from test_sr_watch.sh
* Test suite: Continuing the quest: getting rid of passwords in debug output,
* Test suite: adding explicit mention of exchange wherever possible.
* Fixed self-test to authenticate to broker as tfeed, but look for messages from tsource.

**v2.16.05a2**

* plugins improved.
* sr_winnow fixed.
* stop printing passwords in log files.
* beginnings of flow_test implemented. ( self-testing configuration with multiple components fed.)

**v2.16.05a1**

* something about log message settings and permissions.
* reviewing log message generation (older versions too voluble.)
* setting a plugin to None removes it.
* moved logging mostly into plugins to make it more modular.
* added permission of user to read own exchange.
* added plugin examples to subscriber guide.
* working through Michel's self-tests, trying to get them to work.
* Added Programmer Guide.
* sr_sender modified to use truncated exponential backoff (to avoid hammering sites when they are down.)
* some credits.

**v2.16.03a10**

* documentation fixes.
* fixed sr_audit which had been broken.
* added 'foreground' to start/stop/status in usage statements.
* Daluma input on sr_watch.
* stop sr_audit from downloading rabbitmqadmin into cwd.
* Michel retired :-)

**v2.16.01a8**

* for earlier releases, please consult git log.

**v2.16.01a3**

**v2.16.01a2**

**v2.16.01a1**

**v2.15.12a4**

**v2.15.12a3**

**v2.15.12a2**

**v2.15.12a1**

* first version with all components extant.
* Build/tag process introduced.
* until now, had just been using master branch in git.

**0.0.1**
* development began in 2013.

* Initial release


Michel Grenier <michel.grenier@IamRetiredNow.ca> (Retired)
dd_subscribe, sr_subscribe, sr_sarra, sr_post,
All of the code, until 2016/03.

Jun Hu <jun.hu3@canada.ca>
Documentation Diagrams, lead on deployments (head tester!)

Peter Silva <peter.silva@canada.ca>
Project Manager & Evangelist. A lot of Documentation, and Review of Docs.
Architect? Much discussion with Michel. Small bug fixes.
wrote most (all?) plugins included with package.

Khosrow Ebrahimpour <khosrow.ebrahimpour@canada.ca>
Packaging & Process (Debian, Launchpad, some pypi, the vagrant self-test)

Daluma Sen <Daluma.Sen@canada.ca>
sr_watch, and worked on sr_post as well for caching.

Murray Rennie <Murray.Rennie@canada.ca>
sr_winnow, worked on that with Michel.



Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

metpx_sarracenia-2.17.10a1-py3-none-any.whl (203.7 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page