I'm using the AWS CDK to add lifecycle hooks to my Auto Scaling group (ASG) so that I'm notified by email when an instance is terminated or a new one is launched. This is what my code looks like:

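from aws_cdk import (
    aws_autoscaling as autoscaling,
    aws_autoscaling_hooktargets as autoscaling_hooktargets,
    aws_sns as sns,
)

# my_stack (the Stack) and asg (the AutoScalingGroup) are defined elsewhere.
topic = sns.Topic(my_stack, "fleet_changes")
topic_hook = autoscaling_hooktargets.TopicHook(topic)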

asg.add_lifecycle_hook(
    'instance-terminating-hook',
    lifecycle_transition=autoscaling.LifecycleTransition.INSTANCE_TERMINATING,
    lifecycle_hook_name="instance-terminating",
    notification_target=topic_hook,
    notification_metadata="INFO: An instance has been terminated"
)

asg.add_lifecycle_hook(
    'instance-launching-hook',
    lifecycle_transition=autoscaling.LifecycleTransition.INSTANCE_LAUNCHING,
    lifecycle_hook_name="instance-launching",
    notification_target=topic_hook,
    notification_metadata="INFO: A new instance has been launched"
)

This works in that I do get notifications when an instance is terminated or launched. The problem is that the instances are terminated every hour, and this stops when I remove the lifecycle hooks. This is what I see in the Auto Scaling group's events when the hooks are set:

At 2024-03-22T17:11:34Z an instance was taken out of service in response to a launch failure.

And right after:

At 2024-03-22T17:12:35Z an instance was started in response to a difference between desired and actual capacity, increasing the capacity from 1 to 2.   

I've checked the application logs in CloudWatch and haven't found anything that would indicate a bug, and there is currently little to no activity on these instances.

Any idea what could be the culprit?

Thank you.

Some helpful diagnostic steps here. Commented Mar 22, 2024 at 17:38

1 Answer

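You have to complete the lifecycle action after receiving the notification. Otherwise the instance stays in a wait state until the hook's heartbeat timeout expires (one hour by default), and the hook's default result, ABANDON unless configured otherwise, is applied; for a launch hook that terminates the instance, which would explain the hourly cycle. Use something like the following when you receive it: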

aws autoscaling complete-lifecycle-action --lifecycle-action-result CONTINUE \
  --lifecycle-hook-name my-launch-hook --auto-scaling-group-name my-asg \
  --lifecycle-action-token bcd2f1b8-9a78-44d3-8a7a-4dd07d7cf635

You can also put it in UserData to automate it. You would have to give your instance permission to do so in its instance profile.

aws autoscaling complete-lifecycle-action --lifecycle-action-result CONTINUE \
  --instance-id i-1a2b3c4d --lifecycle-hook-name my-launch-hook \
  --auto-scaling-group-name my-asg
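
If the topic already receives the hook notifications, another option is to complete the action from a subscriber. Below is a minimal sketch in Python, assuming a hypothetical Lambda function subscribed to the SNS topic and a notification body that carries the fields shown (LifecycleHookName, AutoScalingGroupName, LifecycleActionToken, EC2InstanceId). Check those names against a real message from your topic before relying on them. The function names `completion_params` and `handler` are illustrative, and the function's role would need the autoscaling:CompleteLifecycleAction permission.

```python
import json


def completion_params(sns_message, result="CONTINUE"):
    """Map a lifecycle hook notification body to complete_lifecycle_action arguments.

    Returns None when the message carries no lifecycle action token, which is
    assumed here to mean it is not a lifecycle transition (for example a test
    notification).
    """
    body = json.loads(sns_message)
    if "LifecycleActionToken" not in body:
        return None
    return {
        "LifecycleHookName": body["LifecycleHookName"],
        "AutoScalingGroupName": body["AutoScalingGroupName"],
        "LifecycleActionToken": body["LifecycleActionToken"],
        "InstanceId": body["EC2InstanceId"],
        "LifecycleActionResult": result,
    }


def handler(event, context):
    """Hypothetical Lambda entry point for the SNS subscription."""
    # Imported here so the parsing helper above can be exercised without boto3.
    import boto3

    client = boto3.client("autoscaling")
    for record in event["Records"]:
        params = completion_params(record["Sns"]["Message"])
        if params is not None:
            client.complete_lifecycle_action(**params)
```

Keeping the parsing separate from the API call makes it easy to check the mapping locally with a sample message before wiring it to the topic.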

1 Comment

Thanks for your replies. Now, that's a bit confusing because I've changed a few things since then. First I set the default action to CONTINUE, then reduced the heartbeat to 30 seconds. This backfired very badly because the instances would keep cycling, and I really don't know why. I ended up just removing the lifecycle hook for instance launch, which works fine so far. I still don't understand why I was getting the previous outcomes.