I have three EC2 instances and one RDS instance in AWS:

  • instance A - docker with nginx container - private IP address 1.2.3.4
  • instances B and C - docker with keycloak containers - private IP addresses 1.2.3.5 and 1.2.3.6
  • RDS instance running MySQL 8 - host foo.us-east-1.rds.amazonaws.com

All are in the same VPC. Instances B and C are in different subnets (different availability zones), but can communicate with each other on ports 80 and 7600.

The Keycloak containers launch without issue with the following command:

  docker run \
  --name test-node-1 \
  -e DB_PORT=3306 \
  -e PROXY_ADDRESS_FORWARDING=true \
  -e DB_VENDOR=mysql \
  -e DB_DATABASE=keycloak \
  -e DB_ADDR=foo.us-east-1.rds.amazonaws.com \
  -e KEYCLOAK_STATISTICS=all \
  -e DB_USER=keycloak \
  -e KEYCLOAK_USER=kcuser \
  -e DB_PASSWORD=... \
  -e KEYCLOAK_PASSWORD=... \
  -p 80:8080 \
  -p 7600:7600 \
  jboss/keycloak:16.1.0

Both containers launch fine, but they aren't talking to each other.

Adding the following three environment variables:

  -e JGROUPS_DISCOVERY_EXTERNAL_IP=1.2.3.5 \
  -e JGROUPS_DISCOVERY_PROTOCOL=TCPPING \
  -e JGROUPS_DISCOVERY_PROPERTIES='1.2.3.5[7600],1.2.3.6[7600]' \

causes Keycloak to crash on startup:

=========================================================================

  Using MySQL database

=========================================================================

17:01:35,028 INFO  [org.jboss.modules] (CLI command executor) JBoss Modules version 2.0.0.Final
17:01:35,124 INFO  [org.jboss.msc] (CLI command executor) JBoss MSC version 1.4.13.Final
17:01:35,134 INFO  [org.jboss.threads] (CLI command executor) JBoss Threads version 2.4.0.Final
17:01:35,267 INFO  [org.jboss.as] (MSC service thread 1-2) WFLYSRV0049: Keycloak 16.1.0 (WildFly Core 18.0.0.Final) starting
...
17:01:43,320 INFO  [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server
17:01:43,322 INFO  [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: Keycloak 16.1.0 (WildFly Core 18.0.0.Final) started in 3261ms - Started 49 of 79 services (31 services are lazy, passive or on-demand)
The batch executed successfully
17:01:43,560 INFO  [org.jboss.as] (MSC service thread 1-1) WFLYSRV0050: Keycloak 16.1.0 (WildFly Core 18.0.0.Final) stopped in 21ms
Setting JGroups discovery to TCPPING with properties {1.2.3.5[7600],1.2.3.6[7600]}

That last log line hangs for a few seconds, and then the process crashes. Note that it's the first instance that crashes (I never get as far as launching the second one), so I don't think it's a communication or firewall problem; in any case, ports 80 and 7600 are open.

I'm using the jboss/keycloak Docker image, version 16.1.0, from Docker Hub.

  • It doesn't make sense to have `-e JGROUPS_DISCOVERY_EXTERNAL_IP=1.2.3.5` on host A (which has IP 1.2.3.4). Also, keycloak.org/2019/05/keycloak-cluster-setup uses a different syntax for JGROUPS_DISCOVERY_PROPERTIES. Commented Jan 11, 2022 at 17:57
  • Instance A (IP 1.2.3.4) is only an nginx server, so it has no JGROUPS settings at all. Instance B (IP 1.2.3.5) is the first Keycloak server, which has JGROUPS_DISCOVERY_EXTERNAL_IP set to its own IP address. Commented Jan 11, 2022 at 21:04
  • I have read through keycloak.org/2019/05/keycloak-cluster-setup (although it has a warning that it may be out of date). I believe what I have posted is in line with those suggestions (namely the JGROUPS_DISCOVERY_EXTERNAL_IP, JGROUPS_DISCOVERY_PROTOCOL, and JGROUPS_DISCOVERY_PROPERTIES settings), but the server crashes when they are added. Commented Jan 11, 2022 at 21:06
  • Ok, that makes sense Commented Jan 11, 2022 at 22:02

1 Answer

The container will need a TCPPING.cli script, or the appropriate modifications made to standalone-ha.xml. The following TCPPING.cli file worked for me (mounted into the docker container with -v $(pwd)/TCPPING.cli:/opt/jboss/tools/cli/jgroups/discovery/TCPPING.cli):

embed-server --server-config=standalone-ha.xml --std-out=echo
batch

/subsystem=infinispan/cache-container=keycloak/distributed-cache=sessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=authenticationSessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=offlineSessions:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})
/subsystem=infinispan/cache-container=keycloak/distributed-cache=loginFailures:write-attribute(name=owners, value=${env.CACHE_OWNERS:2})

/subsystem=jgroups/stack=udp:remove()

/subsystem=jgroups/stack=tcp/protocol=MPING:remove()
/subsystem=jgroups/stack=tcp/protocol=$keycloak_jgroups_discovery_protocol:add(add-index=0, properties={"initial_hosts"=>$keycloak_jgroups_discovery_protocol_properties})

/subsystem=jgroups/channel=ee:write-attribute(name=stack, value="tcp")

/subsystem=jgroups/stack=tcp/transport=TCP/property=external_addr/:add(value=${env.JGROUPS_DISCOVERY_EXTERNAL_IP:127.0.0.1})

run-batch
stop-embedded-server

Note that this is different from what is recommended in https://www.keycloak.org/2019/05/keycloak-cluster-setup - specifically the line

/subsystem=jgroups/stack=tcp/protocol=$keycloak_jgroups_discovery_protocol:add(add-index=0, properties={"initial_hosts"=>$keycloak_jgroups_discovery_protocol_properties})

I also changed the JGROUPS_DISCOVERY_PROPERTIES env var to list only the first server (e.g. -e JGROUPS_DISCOVERY_PROPERTIES=1.2.3.5[7600]); each server in the cluster should only need to contact that first node in order to join.
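
Putting the pieces together, the launch command for each node might look roughly like the sketch below. It is a dry run: the helper only prints the docker run command so it can be inspected before use. The function name keycloak_run_cmd and its three arguments are hypothetical, the container name test-node-2 is made up for the second node, and the database and admin -e flags from the question are left out for brevity and would need to be added back. It assumes TCPPING.cli sits in the current directory.

```shell
#!/bin/sh
# Dry-run sketch: prints a docker run command for one Keycloak node instead of
# executing it. keycloak_run_cmd and its arguments are hypothetical names.
#   $1 = container name
#   $2 = this node's private IP (becomes JGROUPS_DISCOVERY_EXTERNAL_IP)
#   $3 = private IP of the first Keycloak node, which the others contact to join
keycloak_run_cmd() {
  name="$1"
  external_ip="$2"
  seed_ip="$3"
  # Unquoted heredoc: ${...} expands, "\$" prints a literal "$",
  # and each "\\" prints as a single trailing "\".
  cat <<EOF
docker run \\
  --name ${name} \\
  -v "\$(pwd)/TCPPING.cli:/opt/jboss/tools/cli/jgroups/discovery/TCPPING.cli" \\
  -e JGROUPS_DISCOVERY_PROTOCOL=TCPPING \\
  -e JGROUPS_DISCOVERY_EXTERNAL_IP=${external_ip} \\
  -e JGROUPS_DISCOVERY_PROPERTIES='${seed_ip}[7600]' \\
  -p 80:8080 \\
  -p 7600:7600 \\
  jboss/keycloak:16.1.0
EOF
}

# Instance B (1.2.3.5) is the first node; instance C (1.2.3.6) points at it.
keycloak_run_cmd test-node-1 1.2.3.5 1.2.3.5
keycloak_run_cmd test-node-2 1.2.3.6 1.2.3.5
```

Each node advertises its own private IP as the external address, while the discovery property names only the first node. The single quotes around the host[port] value keep the shell from treating the square brackets as a glob pattern. If the nodes do find each other, each container's log would typically include an Infinispan "Received new cluster view" line (ISPN000094) listing both members, which is a quick way to check that the cluster formed.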
