[Linux-ha-jp] STONITH resource repeatedly stops and restarts


N.Miyamoto fj508****@aa*****
Sat, 8 Oct 2011 17:31:59 JST


Thank you as always for your support.
This is Miyamoto.

I have run into a problem where the STONITH resources repeatedly stop and restart.
Could you tell me the cause of this and how to work around it?

The test environment is as follows:
pacemaker-1.0.10-1.4.el5 + corosync-1.2.5-1.3.el5

Steps to reproduce:
(1) crm < stonith-setup2.txt.node001
    (a rough sketch of what a file like this might look like follows after step (5))

(2) crm_mon -f
    Confirm that the resources are Started.

    Online: [ node001 node002 ]
    
     Resource Group: rscgroup
         mntrsc1 (ocf::heartbeat:Filesystem):    Started node001
         mntrsc2 (ocf::heartbeat:Filesystem):    Started node001
         mgrrsc  (lsb:mgrrsc):   Started node001
         viprsc  (ocf::heartbeat:IPaddr2):       Started node001
     Resource Group: stonith-node001
         stonith-node001-1 (stonith:external/stonith-helper):      Started node002
         stonith-node001-2 (stonith:external/ipmi):        Started node002
         stonith-node001-3 (stonith:meatware):     Started node002
    
    Migration summary:
    * Node node001:
    * Node node002:

(3) crm < stonith-setup2.txt.node002

(4) crm_mon -f
    ☆ stonith-node001 stops (it seems), and stonith-node002 starts.
    
    Online: [ node001 node002 ]
    
     Resource Group: rscgroup
         mntrsc1 (ocf::heartbeat:Filesystem):    Started node001
         mntrsc2 (ocf::heartbeat:Filesystem):    Started node001
         mgrrsc  (lsb:mgrrsc):   Started node001
         viprsc  (ocf::heartbeat:IPaddr2):       Started node001
     Resource Group: stonith-node002
         stonith-node002-1 (stonith:external/stonith-helper):      Started node001
         stonith-node002-2 (stonith:external/ipmi):        Started node001
         stonith-node002-3 (stonith:meatware):     Started node001
    
    Migration summary:
    * Node node001:
    * Node node002:

(5) The cluster then keeps alternating between the states in (2) and (4), i.e. (2) → (4) → (2) → (4) → ..., at intervals of roughly 1 minute 20 seconds.
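
(Reference) The stonith-setup2.txt.* files themselves are not reproduced here. As a rough idea only, the fencing group for node001 loaded in step (1) would typically look something like the sketch below; the resource names follow the crm_mon output above, but every parameter value (addresses, credentials, timeouts, the location constraint id) is a placeholder and not the actual file contents.

    # Hypothetical sketch of stonith-setup2.txt.node001 -- fencing devices FOR node001,
    # so they are expected to run on node002. All parameter values are placeholders.
    configure
    primitive stonith-node001-1 stonith:external/stonith-helper \
        params priority="1" stonith-timeout="180s" hostlist="node001" \
               dead_check_target="172.25.0.1" standby_wait_time="15" \
        op start   interval="0s"  timeout="60s" \
        op monitor interval="60s" timeout="60s" \
        op stop    interval="0s"  timeout="60s"
    primitive stonith-node001-2 stonith:external/ipmi \
        params priority="2" hostname="node001" ipaddr="172.25.1.1" \
               userid="admin" passwd="xxxxx" interface="lan" \
        op start   interval="0s"  timeout="60s" \
        op monitor interval="10s" timeout="60s" \
        op stop    interval="0s"  timeout="60s"
    primitive stonith-node001-3 stonith:meatware \
        params priority="3" stonith-timeout="600s" hostlist="node001" \
        op start   interval="0s"  timeout="60s" \
        op monitor interval="10s" timeout="60s" \
        op stop    interval="0s"  timeout="60s"
    group stonith-node001 stonith-node001-1 stonith-node001-2 stonith-node001-3
    location loc-stonith-node001 stonith-node001 -inf: node001
    commit

stonith-setup2.txt.node002 is the mirror image of this (fencing devices for node002, constrained away from node002).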

Log output when the problem occurs:
Oct 08 17:05:59 node001 crmd: [4423]: WARN: action_timer_callback: Timer popped (timeout=20000, abort_level=0, complete=false)
Oct 08 17:05:59 node001 crmd: [4423]: ERROR: print_elem: Aborting transition, action lost: [Action 10]: In-flight (id: stonith-node002_delete_0, loc: node001, priority: 0)
Oct 08 17:05:59 node001 crmd: [4423]: info: abort_transition_graph: action_timer_callback:486 - Triggered transition abort (complete=0) : Action lost
Oct 08 17:05:59 node001 crmd: [4423]: info: update_abort_priority: Abort priority upgraded from 0 to 1000000
Oct 08 17:05:59 node001 crmd: [4423]: info: update_abort_priority: Abort action done superceeded by restart
Oct 08 17:05:59 node001 crmd: [4423]: WARN: cib_action_update: rsc_op 10: stonith-node002_delete_0 on node001 timed out
Oct 08 17:05:59 node001 crmd: [4423]: WARN: find_xml_node: Could not find primitive in rsc_op.
Oct 08 17:05:59 node001 crmd: [4423]: info: run_graph: ====================================================
Oct 08 17:05:59 node001 crmd: [4423]: notice: run_graph: Transition 55 (Complete=14, Pending=0, Fired=0, Skipped=9, Incomplete=0, Source=/var/lib/pengine/pe-input-1.bz2): Stopped
Oct 08 17:05:59 node001 crmd: [4423]: info: te_graph_trigger: Transition 55 is now complete
Oct 08 17:05:59 node001 crmd: [4423]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
Oct 08 17:05:59 node001 crmd: [4423]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
Oct 08 17:05:59 node001 crmd: [4423]: info: do_pe_invoke: Query 935: Requesting the current CIB: S_POLICY_ENGINE
Oct 08 17:05:59 node001 crmd: [4423]: info: do_pe_invoke_callback: Invoking the PE: query=935, ref=pe_calc-dc-1318061159-605, seq=130536, quorate=1
Oct 08 17:05:59 node001 pengine: [4422]: notice: unpack_config: On loss of CCM Quorum: Ignore
Oct 08 17:05:59 node001 pengine: [4422]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Oct 08 17:05:59 node001 pengine: [4422]: WARN: unpack_nodes: Blind faith: not fencing unseen nodes
Oct 08 17:05:59 node001 pengine: [4422]: info: determine_online_status: Node node001 is online
Oct 08 17:05:59 node001 pengine: [4422]: info: determine_online_status: Node node002 is online
Oct 08 17:05:59 node001 pengine: [4422]: notice: group_print:  Resource Group: rscgroup
Oct 08 17:05:59 node001 pengine: [4422]: notice: native_print:      mntrsc1 (ocf::heartbeat:Filesystem):    Started node001
Oct 08 17:05:59 node001 pengine: [4422]: notice: native_print:      mntrsc2 (ocf::heartbeat:Filesystem):    Started node001
Oct 08 17:05:59 node001 pengine: [4422]: notice: native_print:      mgrrsc  (lsb:mgrrsc):   Started node001
Oct 08 17:05:59 node001 pengine: [4422]: notice: native_print:      viprsc  (ocf::heartbeat:IPaddr2):       Started node001
Oct 08 17:05:59 node001 pengine: [4422]: notice: group_print:  Resource Group: stonith-node001
Oct 08 17:05:59 node001 pengine: [4422]: notice: native_print:      stonith-node001-1 (stonith:external/stonith-helper):      Started node002
Oct 08 17:05:59 node001 pengine: [4422]: notice: native_print:      stonith-node001-2 (stonith:external/ipmi):        Started node002
Oct 08 17:05:59 node001 pengine: [4422]: notice: native_print:      stonith-node001-3 (stonith:meatware):     Started node002
Oct 08 17:05:59 node001 pengine: [4422]: notice: group_print:  Resource Group: stonith-node002
Oct 08 17:05:59 node001 pengine: [4422]: notice: native_print:      stonith-node002-1 (stonith:external/stonith-helper):      Stopped
Oct 08 17:05:59 node001 pengine: [4422]: notice: native_print:      stonith-node002-2 (stonith:external/ipmi):        Stopped
Oct 08 17:05:59 node001 pengine: [4422]: notice: native_print:      stonith-node002-3 (stonith:meatware):     Stopped
Oct 08 17:05:59 node001 pengine: [4422]: notice: check_rsc_parameters: Forcing restart of stonith-node001 on node002, type changed: external/ipmi -> <null>
Oct 08 17:05:59 node001 pengine: [4422]: notice: check_rsc_parameters: Forcing restart of stonith-node001 on node002, class changed: stonith -> <null>
Oct 08 17:05:59 node001 pengine: [4422]: notice: DeleteRsc: Removing stonith-node001 from node002
Oct 08 17:05:59 node001 pengine: [4422]: ERROR: unpack_operation: Specifying on_fail=fence and stonith-enabled=false makes no sense
Oct 08 17:05:59 node001 pengine: [4422]: ERROR: unpack_operation: Specifying on_fail=fence and stonith-enabled=false makes no sense
Oct 08 17:05:59 node001 pengine: [4422]: ERROR: unpack_operation: Specifying on_fail=fence and stonith-enabled=false makes no sense
Oct 08 17:05:59 node001 pengine: [4422]: ERROR: unpack_operation: Specifying on_fail=fence and stonith-enabled=false makes no sense
Oct 08 17:05:59 node001 pengine: [4422]: ERROR: unpack_operation: Specifying on_fail=fence and stonith-enabled=false makes no sense
Oct 08 17:05:59 node001 pengine: [4422]: notice: RecurringOp:  Start recurring monitor (60s) for stonith-node002-1 on node001
Oct 08 17:05:59 node001 pengine: [4422]: notice: RecurringOp:  Start recurring monitor (10s) for stonith-node002-2 on node001
Oct 08 17:05:59 node001 pengine: [4422]: notice: RecurringOp:  Start recurring monitor (10s) for stonith-node002-3 on node001
Oct 08 17:05:59 node001 pengine: [4422]: ERROR: unpack_operation: Specifying on_fail=fence and stonith-enabled=false makes no sense
Oct 08 17:05:59 node001 pengine: [4422]: notice: LogActions: Leave resource mntrsc1 (Started node001)
Oct 08 17:05:59 node001 pengine: [4422]: notice: LogActions: Leave resource mntrsc2 (Started node001)
Oct 08 17:05:59 node001 pengine: [4422]: notice: LogActions: Leave resource mgrrsc  (Started node001)
Oct 08 17:05:59 node001 pengine: [4422]: notice: LogActions: Leave resource viprsc  (Started node001)
Oct 08 17:05:59 node001 pengine: [4422]: notice: LogActions: Restart resource stonith-node001-1       (Started node002)
Oct 08 17:05:59 node001 pengine: [4422]: notice: LogActions: Restart resource stonith-node001-2       (Started node002)
Oct 08 17:05:59 node001 pengine: [4422]: notice: LogActions: Restart resource stonith-node001-3       (Started node002)
Oct 08 17:05:59 node001 pengine: [4422]: notice: LogActions: Start stonith-node002-1  (node001)
Oct 08 17:05:59 node001 pengine: [4422]: notice: LogActions: Start stonith-node002-2  (node001)
Oct 08 17:05:59 node001 pengine: [4422]: notice: LogActions: Start stonith-node002-3  (node001)
Oct 08 17:05:59 node001 crmd: [4423]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Oct 08 17:05:59 node001 crmd: [4423]: info: unpack_graph: Unpacked transition 56: 23 actions in 23 synapses
Oct 08 17:05:59 node001 crmd: [4423]: info: do_te_invoke: Processing graph 56 (ref=pe_calc-dc-1318061159-605) derived from /var/lib/pengine/pe-input-2.bz2
Oct 08 17:05:59 node001 crmd: [4423]: info: te_pseudo_action: Pseudo action 9 fired and confirmed
Oct 08 17:05:59 node001 crmd: [4423]: info: te_rsc_command: Initiating action 10: delete stonith-node001_delete_0 on node002
Oct 08 17:05:59 node001 crmd: [4423]: info: te_rsc_command: Initiating action 31: stop stonith-node001-3_stop_0 on node002
Oct 08 17:05:59 node001 crmd: [4423]: info: te_pseudo_action: Pseudo action 42 fired and confirmed
Oct 08 17:05:59 node001 crmd: [4423]: info: te_rsc_command: Initiating action 36: start stonith-node002-1_start_0 on node001 (local)
Oct 08 17:05:59 node001 crmd: [4423]: info: do_lrm_rsc_op: Performing key=36:56:0:76d16842-4a6f-4ae1-908b-890f2c3926c1 op=stonith-node002-1_start_0 )
Oct 08 17:05:59 node001 lrmd: [4420]: info: rsc:stonith-node002-1:152: start
Oct 08 17:05:59 node001 lrmd: [28668]: info: Try to start STONITH resource <rsc_id=stonith-node002-1> : Device=external/stonith-helper
Oct 08 17:05:59 node001 stonithd: [4418]: info: Cannot get parameter run_dead_check from StonithNVpair
Oct 08 17:05:59 node001 stonithd: [4418]: info: Cannot get parameter run_quorum_check from StonithNVpair
Oct 08 17:05:59 node001 stonithd: [4418]: info: Cannot get parameter run_standby_wait from StonithNVpair
Oct 08 17:05:59 node001 stonithd: [4418]: info: Cannot get parameter check_quorum_wait_time from StonithNVpair
Oct 08 17:05:59 node001 crmd: [4423]: info: match_graph_event: Action stonith-node001-3_stop_0 (31) confirmed on node002 (rc=0)
Oct 08 17:05:59 node001 crmd: [4423]: info: te_rsc_command: Initiating action 29: stop stonith-node001-2_stop_0 on node002
Oct 08 17:05:59 node001 crmd: [4423]: info: match_graph_event: Action stonith-node001-2_stop_0 (29) confirmed on node002 (rc=0)
Oct 08 17:05:59 node001 crmd: [4423]: info: te_rsc_command: Initiating action 27: stop stonith-node001-1_stop_0 on node002
Oct 08 17:05:59 node001 crmd: [4423]: info: match_graph_event: Action stonith-node001-1_stop_0 (27) confirmed on node002 (rc=0)
Oct 08 17:05:59 node001 crmd: [4423]: info: te_pseudo_action: Pseudo action 35 fired and confirmed
Oct 08 17:05:59 node001 pengine: [4422]: info: process_pe_message: Transition 56: PEngine Input stored in: /var/lib/pengine/pe-input-2.bz2
Oct 08 17:05:59 node001 pengine: [4422]: info: process_pe_message: Configuration ERRORs found during PE processing.  Please run "crm_verify -L" to identify issues.
Oct 08 17:06:00 node001 stonithd: [4418]: info: stonith-node002-1 stonith resource started
Oct 08 17:06:00 node001 lrmd: [4420]: debug: stonithRA plugin: provider attribute is not needed and will be ignored.
Oct 08 17:06:00 node001 crmd: [4423]: info: process_lrm_event: LRM operation stonith-node002-1_start_0 (call=152, rc=0, cib-update=936, confirmed=true) ok
Oct 08 17:06:00 node001 crmd: [4423]: info: match_graph_event: Action stonith-node002-1_start_0 (36) confirmed on node001 (rc=0)
Oct 08 17:06:00 node001 crmd: [4423]: info: te_rsc_command: Initiating action 37: monitor stonith-node002-1_monitor_60000 on node001 (local)
Oct 08 17:06:00 node001 crmd: [4423]: info: do_lrm_rsc_op: Performing key=37:56:0:76d16842-4a6f-4ae1-908b-890f2c3926c1 op=stonith-node002-1_monitor_60000 )
Oct 08 17:06:00 node001 lrmd: [4420]: info: rsc:stonith-node002-1:153: monitor
Oct 08 17:06:00 node001 crmd: [4423]: info: te_rsc_command: Initiating action 38: start stonith-node002-2_start_0 on node001 (local)
Oct 08 17:06:00 node001 crmd: [4423]: info: do_lrm_rsc_op: Performing key=38:56:0:76d16842-4a6f-4ae1-908b-890f2c3926c1 op=stonith-node002-2_start_0 )
Oct 08 17:06:00 node001 lrmd: [4420]: info: rsc:stonith-node002-2:154: start
Oct 08 17:06:00 node001 lrmd: [28734]: info: Try to start STONITH resource <rsc_id=stonith-node002-2> : Device=external/ipmi
Oct 08 17:06:00 node001 crmd: [4423]: info: process_lrm_event: LRM operation stonith-node002-1_monitor_60000 (call=153, rc=0, cib-update=937, confirmed=false) ok
Oct 08 17:06:00 node001 stonithd: [4418]: info: stonith-node002-2 stonith resource started
Oct 08 17:06:00 node001 crmd: [4423]: info: match_graph_event: Action stonith-node002-1_monitor_60000 (37) confirmed on node001 (rc=0)
Oct 08 17:06:00 node001 lrmd: [4420]: debug: stonithRA plugin: provider attribute is not needed and will be ignored.
Oct 08 17:06:00 node001 crmd: [4423]: info: process_lrm_event: LRM operation stonith-node002-2_start_0 (call=154, rc=0, cib-update=938, confirmed=true) ok
Oct 08 17:06:00 node001 crmd: [4423]: info: match_graph_event: Action stonith-node002-2_start_0 (38) confirmed on node001 (rc=0)
Oct 08 17:06:00 node001 crmd: [4423]: info: te_rsc_command: Initiating action 39: monitor stonith-node002-2_monitor_10000 on node001 (local)
Oct 08 17:06:00 node001 crmd: [4423]: info: do_lrm_rsc_op: Performing key=39:56:0:76d16842-4a6f-4ae1-908b-890f2c3926c1 op=stonith-node002-2_monitor_10000 )
Oct 08 17:06:00 node001 lrmd: [4420]: info: rsc:stonith-node002-2:155: monitor
Oct 08 17:06:00 node001 crmd: [4423]: info: te_rsc_command: Initiating action 40: start stonith-node002-3_start_0 on node001 (local)
Oct 08 17:06:00 node001 crmd: [4423]: info: do_lrm_rsc_op: Performing key=40:56:0:76d16842-4a6f-4ae1-908b-890f2c3926c1 op=stonith-node002-3_start_0 )
Oct 08 17:06:00 node001 lrmd: [4420]: info: rsc:stonith-node002-3:156: start
Oct 08 17:06:00 node001 lrmd: [28772]: info: Try to start STONITH resource <rsc_id=stonith-node002-3> : Device=meatware
Oct 08 17:06:00 node001 stonithd: [4418]: info: parse config info info=node002
Oct 08 17:06:00 node001 stonithd: [4418]: info: stonith-node002-3 stonith resource started
Oct 08 17:06:00 node001 lrmd: [4420]: debug: stonithRA plugin: provider attribute is not needed and will be ignored.
Oct 08 17:06:00 node001 crmd: [4423]: info: process_lrm_event: LRM operation stonith-node002-3_start_0 (call=156, rc=0, cib-update=939, confirmed=true) ok
Oct 08 17:06:00 node001 crmd: [4423]: info: match_graph_event: Action stonith-node002-3_start_0 (40) confirmed on node001 (rc=0)
Oct 08 17:06:00 node001 crmd: [4423]: info: te_pseudo_action: Pseudo action 43 fired and confirmed
Oct 08 17:06:00 node001 crmd: [4423]: info: te_rsc_command: Initiating action 41: monitor stonith-node002-3_monitor_10000 on node001 (local)
Oct 08 17:06:00 node001 crmd: [4423]: info: do_lrm_rsc_op: Performing key=41:56:0:76d16842-4a6f-4ae1-908b-890f2c3926c1 op=stonith-node002-3_monitor_10000 )
Oct 08 17:06:00 node001 lrmd: [4420]: info: rsc:stonith-node002-3:157: monitor
Oct 08 17:06:00 node001 crmd: [4423]: info: process_lrm_event: LRM operation stonith-node002-3_monitor_10000 (call=157, rc=0, cib-update=940, confirmed=false) ok
Oct 08 17:06:00 node001 crmd: [4423]: info: match_graph_event: Action stonith-node002-3_monitor_10000 (41) confirmed on node001 (rc=0)
Oct 08 17:06:01 node001 crmd: [4423]: info: process_lrm_event: LRM operation stonith-node002-2_monitor_10000 (call=155, rc=0, cib-update=941, confirmed=false) ok
Oct 08 17:06:01 node001 crmd: [4423]: info: match_graph_event: Action stonith-node002-2_monitor_10000 (39) confirmed on node001 (rc=0)
Oct 08 17:06:38 node001 cib: [4419]: info: cib_stats: Processed 80 operations (3000.00us average, 0% utilization) in the last 10min
Oct 08 17:07:19 node001 crmd: [4423]: WARN: action_timer_callback: Timer popped (timeout=20000, abort_level=0, complete=false)
Oct 08 17:07:19 node001 crmd: [4423]: ERROR: print_elem: Aborting transition, action lost: [Action 10]: In-flight (id: stonith-node001_delete_0, loc: node002, priority: 0)
Oct 08 17:07:19 node001 crmd: [4423]: info: abort_transition_graph: action_timer_callback:486 - Triggered transition abort (complete=0) : Action lost
Oct 08 17:07:19 node001 crmd: [4423]: info: update_abort_priority: Abort priority upgraded from 0 to 1000000
Oct 08 17:07:19 node001 crmd: [4423]: info: update_abort_priority: Abort action done superceeded by restart
Oct 08 17:07:19 node001 crmd: [4423]: WARN: cib_action_update: rsc_op 10: stonith-node001_delete_0 on node002 timed out
Oct 08 17:07:19 node001 crmd: [4423]: WARN: find_xml_node: Could not find primitive in rsc_op.
Oct 08 17:07:19 node001 crmd: [4423]: info: run_graph: ====================================================
Oct 08 17:07:19 node001 crmd: [4423]: notice: run_graph: Transition 56 (Complete=14, Pending=0, Fired=0, Skipped=9, Incomplete=0, Source=/var/lib/pengine/pe-input-2.bz2): Stopped
Oct 08 17:07:19 node001 crmd: [4423]: info: te_graph_trigger: Transition 56 is now complete
Oct 08 17:07:19 node001 crmd: [4423]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
Oct 08 17:07:19 node001 crmd: [4423]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
Oct 08 17:07:19 node001 crmd: [4423]: info: do_pe_invoke: Query 942: Requesting the current CIB: S_POLICY_ENGINE
Oct 08 17:07:19 node001 crmd: [4423]: info: do_pe_invoke_callback: Invoking the PE: query=942, ref=pe_calc-dc-1318061239-616, seq=130536, quorate=1
Oct 08 17:07:19 node001 pengine: [4422]: notice: unpack_config: On loss of CCM Quorum: Ignore
Oct 08 17:07:19 node001 pengine: [4422]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Oct 08 17:07:19 node001 pengine: [4422]: WARN: unpack_nodes: Blind faith: not fencing unseen nodes
Oct 08 17:07:19 node001 pengine: [4422]: info: determine_online_status: Node node001 is online
Oct 08 17:07:19 node001 pengine: [4422]: info: determine_online_status: Node node002 is online
Oct 08 17:07:19 node001 pengine: [4422]: notice: group_print:  Resource Group: rscgroup
Oct 08 17:07:19 node001 pengine: [4422]: notice: native_print:      mntrsc1 (ocf::heartbeat:Filesystem):    Started node001
Oct 08 17:07:19 node001 pengine: [4422]: notice: native_print:      mntrsc2 (ocf::heartbeat:Filesystem):    Started node001
Oct 08 17:07:19 node001 pengine: [4422]: notice: native_print:      mgrrsc  (lsb:mgrrsc):   Started node001
Oct 08 17:07:19 node001 pengine: [4422]: notice: native_print:      viprsc  (ocf::heartbeat:IPaddr2):       Started node001
Oct 08 17:07:19 node001 pengine: [4422]: notice: group_print:  Resource Group: stonith-node001
Oct 08 17:07:19 node001 pengine: [4422]: notice: native_print:      stonith-node001-1 (stonith:external/stonith-helper):      Stopped
Oct 08 17:07:19 node001 pengine: [4422]: notice: native_print:      stonith-node001-2 (stonith:external/ipmi):        Stopped
Oct 08 17:07:19 node001 pengine: [4422]: notice: native_print:      stonith-node001-3 (stonith:meatware):     Stopped
Oct 08 17:07:19 node001 pengine: [4422]: notice: group_print:  Resource Group: stonith-node002
Oct 08 17:07:19 node001 pengine: [4422]: notice: native_print:      stonith-node002-1 (stonith:external/stonith-helper):      Started node001
Oct 08 17:07:19 node001 pengine: [4422]: notice: native_print:      stonith-node002-2 (stonith:external/ipmi):        Started node001
Oct 08 17:07:19 node001 pengine: [4422]: notice: native_print:      stonith-node002-3 (stonith:meatware):     Started node001
Oct 08 17:07:19 node001 pengine: [4422]: notice: check_rsc_parameters: Forcing restart of stonith-node002 on node001, type changed: external/ipmi -> <null>
Oct 08 17:07:19 node001 pengine: [4422]: notice: check_rsc_parameters: Forcing restart of stonith-node002 on node001, class changed: stonith -> <null>
Oct 08 17:07:19 node001 pengine: [4422]: notice: DeleteRsc: Removing stonith-node002 from node001
Oct 08 17:07:19 node001 pengine: [4422]: ERROR: unpack_operation: Specifying on_fail=fence and stonith-enabled=false makes no sense
Oct 08 17:07:19 node001 pengine: [4422]: ERROR: unpack_operation: Specifying on_fail=fence and stonith-enabled=false makes no sense
Oct 08 17:07:19 node001 pengine: [4422]: ERROR: unpack_operation: Specifying on_fail=fence and stonith-enabled=false makes no sense
Oct 08 17:07:19 node001 pengine: [4422]: ERROR: unpack_operation: Specifying on_fail=fence and stonith-enabled=false makes no sense
Oct 08 17:07:19 node001 pengine: [4422]: notice: RecurringOp:  Start recurring monitor (60s) for stonith-node001-1 on node002
Oct 08 17:07:19 node001 pengine: [4422]: notice: RecurringOp:  Start recurring monitor (10s) for stonith-node001-2 on node002
Oct 08 17:07:19 node001 pengine: [4422]: notice: RecurringOp:  Start recurring monitor (10s) for stonith-node001-3 on node002
Oct 08 17:07:19 node001 pengine: [4422]: ERROR: unpack_operation: Specifying on_fail=fence and stonith-enabled=false makes no sense
Oct 08 17:07:19 node001 pengine: [4422]: ERROR: unpack_operation: Specifying on_fail=fence and stonith-enabled=false makes no sense
Oct 08 17:07:19 node001 pengine: [4422]: notice: LogActions: Leave resource mntrsc1 (Started node001)
Oct 08 17:07:19 node001 pengine: [4422]: notice: LogActions: Leave resource mntrsc2 (Started node001)
Oct 08 17:07:19 node001 pengine: [4422]: notice: LogActions: Leave resource mgrrsc  (Started node001)
Oct 08 17:07:19 node001 pengine: [4422]: notice: LogActions: Leave resource viprsc  (Started node001)
Oct 08 17:07:19 node001 pengine: [4422]: notice: LogActions: Start stonith-node001-1  (node002)
Oct 08 17:07:19 node001 pengine: [4422]: notice: LogActions: Start stonith-node001-2  (node002)
Oct 08 17:07:19 node001 pengine: [4422]: notice: LogActions: Start stonith-node001-3  (node002)
Oct 08 17:07:19 node001 pengine: [4422]: notice: LogActions: Restart resource stonith-node002-1       (Started node001)
Oct 08 17:07:19 node001 pengine: [4422]: notice: LogActions: Restart resource stonith-node002-2       (Started node001)
Oct 08 17:07:19 node001 pengine: [4422]: notice: LogActions: Restart resource stonith-node002-3       (Started node001)
Oct 08 17:07:19 node001 crmd: [4423]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Oct 08 17:07:19 node001 crmd: [4423]: info: unpack_graph: Unpacked transition 57: 23 actions in 23 synapses
Oct 08 17:07:19 node001 crmd: [4423]: info: do_te_invoke: Processing graph 57 (ref=pe_calc-dc-1318061239-616) derived from /var/lib/pengine/pe-input-3.bz2
Oct 08 17:07:19 node001 crmd: [4423]: info: te_pseudo_action: Pseudo action 33 fired and confirmed
Oct 08 17:07:19 node001 crmd: [4423]: info: te_rsc_command: Initiating action 27: start stonith-node001-1_start_0 on node002
Oct 08 17:07:19 node001 crmd: [4423]: info: te_pseudo_action: Pseudo action 9 fired and confirmed
Oct 08 17:07:19 node001 crmd: [4423]: info: te_rsc_command: Initiating action 10: delete stonith-node002_delete_0 on node001 (local)
Oct 08 17:07:19 node001 crmd: [4423]: WARN: find_xml_node: Could not find primitive in rsc_op.
Oct 08 17:07:19 node001 crmd: [4423]: ERROR: crm_abort: do_lrm_invoke: Triggered asser****@lrm*****:1285 : xml_rsc != NULL
Oct 08 17:07:19 node001 crmd: [4423]: info: te_rsc_command: Initiating action 41: stop stonith-node002-3_stop_0 on node001 (local)
Oct 08 17:07:19 node001 lrmd: [4420]: info: cancel_op: operation monitor[157] on stonith::meatware::stonith-node002-3 for client 4423, its parameters: CRM_meta_interval=[10000] on_fail=[restart] stonith-timeout=[600s] hostlist=[node002] CRM_meta_on_fail=[restart] CRM_meta_timeout=[30000] crm_feature_set=[3.0.1] priority=[3] CRM_meta_name=[monitor]  cancelled
Oct 08 17:07:19 node001 crmd: [4423]: info: do_lrm_rsc_op: Performing key=41:57:0:76d16842-4a6f-4ae1-908b-890f2c3926c1 op=stonith-node002-3_stop_0 )
Oct 08 17:07:19 node001 lrmd: [4420]: info: rsc:stonith-node002-3:158: stop
Oct 08 17:07:19 node001 crmd: [4423]: info: process_lrm_event: LRM operation stonith-node002-3_monitor_10000 (call=157, status=1, cib-update=0, confirmed=true) Cancelled
Oct 08 17:07:19 node001 lrmd: [29829]: info: Try to stop STONITH resource <rsc_id=stonith-node002-3> : Device=meatware
Oct 08 17:07:19 node001 crmd: [4423]: info: process_lrm_event: LRM operation stonith-node002-3_stop_0 (call=158, rc=0, cib-update=943, confirmed=true) ok
Oct 08 17:07:19 node001 crmd: [4423]: info: match_graph_event: Action stonith-node002-3_stop_0 (41) confirmed on node001 (rc=0)
Oct 08 17:07:19 node001 crmd: [4423]: info: te_rsc_command: Initiating action 39: stop stonith-node002-2_stop_0 on node001 (local)
Oct 08 17:07:19 node001 lrmd: [4420]: info: cancel_op: operation monitor[155] on stonith::external/ipmi::stonith-node002-2 for client 4423, its parameters: CRM_meta_interval=[10000] ipaddr=[172.25.1.2] on_fail=[restart] interface=[lan] CRM_meta_on_fail=[restart] CRM_meta_timeout=[30000] crm_feature_set=[3.0.1] priority=[2] CRM_meta_name=[monitor] hostname=[node002] passwd=[admin00] userid=[admin]  cancelled
Oct 08 17:07:19 node001 crmd: [4423]: info: do_lrm_rsc_op: Performing key=39:57:0:76d16842-4a6f-4ae1-908b-890f2c3926c1 op=stonith-node002-2_stop_0 )
Oct 08 17:07:19 node001 lrmd: [4420]: info: rsc:stonith-node002-2:159: stop
Oct 08 17:07:19 node001 crmd: [4423]: info: process_lrm_event: LRM operation stonith-node002-2_monitor_10000 (call=155, status=1, cib-update=0, confirmed=true) Cancelled
Oct 08 17:07:19 node001 lrmd: [29831]: info: Try to stop STONITH resource <rsc_id=stonith-node002-2> : Device=external/ipmi
Oct 08 17:07:19 node001 crmd: [4423]: info: process_lrm_event: LRM operation stonith-node002-2_stop_0 (call=159, rc=0, cib-update=944, confirmed=true) ok
Oct 08 17:07:19 node001 crmd: [4423]: info: match_graph_event: Action stonith-node002-2_stop_0 (39) confirmed on node001 (rc=0)
Oct 08 17:07:19 node001 crmd: [4423]: info: te_rsc_command: Initiating action 37: stop stonith-node002-1_stop_0 on node001 (local)
Oct 08 17:07:19 node001 lrmd: [4420]: info: cancel_op: operation monitor[153] on stonith::external/stonith-helper::stonith-node002-1 for client 4423, its parameters: CRM_meta_interval=[60000] standby_wait_time=[15] stonith-timeout=[180s] hostlist=[node002] CRM_meta_on_fail=[restart] CRM_meta_timeout=[30000] standby_check_command=[/usr/sbin/crm_resource -r rscgroup -W | grep -q `hostnamcrm_feature_set=[3.0.1] priority=[1] CRM_meta_name=[monitor] dead_check_target=[172.25.0.2]  cancelled
Oct 08 17:07:19 node001 crmd: [4423]: info: do_lrm_rsc_op: Performing key=37:57:0:76d16842-4a6f-4ae1-908b-890f2c3926c1 op=stonith-node002-1_stop_0 )
Oct 08 17:07:19 node001 lrmd: [4420]: info: rsc:stonith-node002-1:160: stop
Oct 08 17:07:19 node001 crmd: [4423]: info: process_lrm_event: LRM operation stonith-node002-1_monitor_60000 (call=153, status=1, cib-update=0, confirmed=true) Cancelled
Oct 08 17:07:19 node001 lrmd: [29833]: info: Try to stop STONITH resource <rsc_id=stonith-node002-1> : Device=external/stonith-helper
Oct 08 17:07:19 node001 crmd: [4423]: info: process_lrm_event: LRM operation stonith-node002-1_stop_0 (call=160, rc=0, cib-update=945, confirmed=true) ok
Oct 08 17:07:19 node001 crmd: [4423]: info: match_graph_event: Action stonith-node002-1_stop_0 (37) confirmed on node001 (rc=0)
Oct 08 17:07:19 node001 crmd: [4423]: info: te_pseudo_action: Pseudo action 45 fired and confirmed
Oct 08 17:07:19 node001 pengine: [4422]: info: process_pe_message: Transition 57: PEngine Input stored in: /var/lib/pengine/pe-input-3.bz2
Oct 08 17:07:19 node001 pengine: [4422]: info: process_pe_message: Configuration ERRORs found during PE processing.  Please run "crm_verify -L" to identify issues.
Oct 08 17:07:19 node001 crmd: [4423]: info: match_graph_event: Action stonith-node001-1_start_0 (27) confirmed on node002 (rc=0)
Oct 08 17:07:19 node001 crmd: [4423]: info: te_rsc_command: Initiating action 28: monitor stonith-node001-1_monitor_60000 on node002
Oct 08 17:07:19 node001 crmd: [4423]: info: te_rsc_command: Initiating action 29: start stonith-node001-2_start_0 on node002
Oct 08 17:07:19 node001 crmd: [4423]: info: match_graph_event: Action stonith-node001-2_start_0 (29) confirmed on node002 (rc=0)
Oct 08 17:07:19 node001 crmd: [4423]: info: te_rsc_command: Initiating action 30: monitor stonith-node001-2_monitor_10000 on node002
Oct 08 17:07:19 node001 crmd: [4423]: info: te_rsc_command: Initiating action 31: start stonith-node001-3_start_0 on node002
Oct 08 17:07:19 node001 crmd: [4423]: info: match_graph_event: Action stonith-node001-3_start_0 (31) confirmed on node002 (rc=0)
Oct 08 17:07:19 node001 crmd: [4423]: info: te_pseudo_action: Pseudo action 34 fired and confirmed
Oct 08 17:07:19 node001 crmd: [4423]: info: te_rsc_command: Initiating action 32: monitor stonith-node001-3_monitor_10000 on node002
Oct 08 17:07:19 node001 crmd: [4423]: info: match_graph_event: Action stonith-node001-3_monitor_10000 (32) confirmed on node002 (rc=0)
Oct 08 17:07:19 node001 crmd: [4423]: info: match_graph_event: Action stonith-node001-2_monitor_10000 (30) confirmed on node002 (rc=0)
Oct 08 17:07:20 node001 crmd: [4423]: info: match_graph_event: Action stonith-node001-1_monitor_60000 (28) confirmed on node002 (rc=0)
Oct 08 17:08:39 node001 crmd: [4423]: WARN: action_timer_callback: Timer popped (timeout=20000, abort_level=0, complete=false)

That is all. Thank you in advance for your help.

----------------------------------------------
Nobuaki Miyamoto
mail:fj508****@aa*****




