
# Openstack Nova HA monitor package

Openstack Nova HA monitor monitors the state of compute nodes; when any of them fails, it
initiates a compute node takeover on an HA node.
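
The failure-detection flow can be sketched in a few lines. This is a minimal, hypothetical example, assuming the `python-consul` client; the K/V key names follow the layout documented below, while `node_is_healthy()`, the threshold value, and the node name are illustrative only, not part of the package.

```python
import consul

FAIL_THRESHOLD = 3  # illustrative value; the real threshold comes from the monitor's configuration


def node_is_healthy(node):
    """Placeholder for the real compute-node health check (e.g. an ssh or ping probe)."""
    return False


def check_compute_node(c, node):
    """One monitoring interval for a single compute node (sketch, not the actual implementation)."""
    fails_key = f"ha_cluster/compute_nodes/{node}/internal/number_fails"
    failed_key = f"ha_cluster/compute_nodes/{node}/status/failed"

    _, data = c.kv.get(fails_key)
    fails = int(data["Value"]) if data and data["Value"] else 0

    if node_is_healthy(node):
        c.kv.put(fails_key, "0")        # healthy again: reset the failure counter
        return

    fails += 1
    c.kv.put(fails_key, str(fails))     # check failed: increase the counter

    if fails >= FAIL_THRESHOLD:
        c.kv.put(failed_key, "True")    # threshold reached: mark the node as failed


if __name__ == "__main__":
    check_compute_node(consul.Consul(), "compute-01")
```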

## Documentation

### Consul K/V store values

_ha_cluster/compute_nodes/{{ compute-node }}/status/failed_
- fail status of the compute node. If this is set to "True", the node is considered failed
and will be recovered. Once this status becomes "True", it can only be reset to "False"
manually by an administrator (see the sketch after this list).
_ha_cluster/compute_nodes/{{ compute-node }}/status/recovered_
- this value becomes "True" once an HA node attempts to recover this node. HA nodes only
try to recover nodes which have "status/failed" set to "True" and "status/recovered"
set to "False". Once this becomes "True", only an administrator can set it back to "False".
_ha_cluster/compute_nodes/{{ compute-node }}/internal/number_fails_
- counter of HA Monitor check interval failures. The HA Monitor periodically checks the
health of the compute node and increases this counter. When the counter reaches a defined
threshold, the HA Monitor sets "status/failed" to "True".
_ha_cluster/compute_nodes/{{ compute-node }}/internal/runtime_config_
- runtime local config of the compute node. This config contains all
required information for performing the recovery.
_ha_cluster/compute_nodes/{{ compute-node }}/config/bmc/user_
- iLO/IPMI user name used to manage this compute-node remotely
_ha_cluster/compute_nodes/{{ compute-node }}/config/bmc/password_
- iLO/IPMI user password used to manage this compute-node remotely
_ha_cluster/compute_nodes/{{ compute-node }}/config/bmc/ip_
- iLO/IPMI IP address used to manage this compute-node remotely
_ha_cluster/compute_nodes/{{ compute-node }}/config/ssh/user_
- test user login required by the ssh check
_ha_cluster/compute_nodes/{{ compute-node }}/config/ssh/password_
- test user password required by the ssh check
_ha_cluster/compute_nodes/{{ compute-node }}/general_
- empty directory for now
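
Because `status/failed` and `status/recovered` can only be cleared manually, resetting a repaired compute node could look roughly like the sketch below. It again assumes the `python-consul` client; clearing `internal/number_fails` as part of the reset is an assumption, and `compute-01` is a hypothetical node name.

```python
import consul


def reset_compute_node(node, host="127.0.0.1"):
    """Administrator action: clear the failure flags after the node has been repaired (sketch)."""
    c = consul.Consul(host=host)
    base = f"ha_cluster/compute_nodes/{node}"
    c.kv.put(f"{base}/status/failed", "False")      # node is no longer considered failed
    c.kv.put(f"{base}/status/recovered", "False")   # allow a future recovery attempt
    c.kv.put(f"{base}/internal/number_fails", "0")  # assumption: restart the failure counter as well


reset_compute_node("compute-01")
```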

_ha_cluster/ha_nodes/{{ ha-node }}/status/in_use_
- When an HA node starts recovering a compute node, it sets this value to "True"
to prevent the situation where two compute nodes fail at the same time and the HA node
tries to recover both simultaneously. This value is updated inside a critical section
(see the sketch after this list).
_ha_cluster/ha_nodes/{{ ha-node }}/general_
- empty directory for now
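
The `status/in_use` flag is updated inside a critical section. One way to achieve that with Consul is a check-and-set write keyed on the entry's `ModifyIndex`, as in the sketch below; it assumes the `python-consul` client and is not necessarily how the package implements the critical section.

```python
import consul


def try_claim_ha_node(c, ha_node):
    """Atomically flip status/in_use from "False" to "True" with a check-and-set write (sketch)."""
    key = f"ha_cluster/ha_nodes/{ha_node}/status/in_use"
    _, data = c.kv.get(key)
    if data is None or data["Value"] == b"True":
        return False  # key missing or the HA node is already busy with another recovery
    # cas=ModifyIndex makes the put fail if another monitor updated the key in the meantime
    return c.kv.put(key, "True", cas=data["ModifyIndex"])


if try_claim_ha_node(consul.Consul(), "ha-01"):
    print("HA node claimed, safe to start the compute node takeover")
```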

## Development

### Running unit tests

```console
pytest -v -s
```

### Running integration tests

Integration tests are implemented in the `tests/integration` directory. They are based on Vagrant, which will start 2 VMs.


# TODO

1) Split the Consul-based cluster monitor from the NOVA HA recovery part.

2) Create integration tests.
