I have a Redis server and I want to implement an atomic (or pseudo-atomic) method that does the following (note that my system has multiple sessions connected to the Redis server):

  1. If some key K exists, get its value
  2. Otherwise, call SETNX with a random value generated by some function F (which generates salts)
  3. Ask Redis for the value of key K, which was just set either by the current session or by another session "simultaneously" (a short moment before the current session generated its value)

The reasons I don't want to pre-generate a value with F (before checking whether the key exists) and use it if the key doesn't exist are:

  1. I don't want to call F without justification (it might be CPU-intensive)
  2. I want to avoid the following problematic situation:
     - T1: Session 1 generates a random value VAL1
     - T2: Session 1 asks whether key K exists and gets "False"
     - T3: Session 2 generates a random value VAL2
     - T4: Session 2 asks whether key K exists and gets "False"
     - T5: Session 2 calls SETNX with VAL2 and uses VAL2 from now on
     - T6: Session 1 calls SETNX with VAL1 and uses VAL1 from now on, although the actual value of key K is VAL2
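
To make that interleaving concrete, here is a small replay of it. This is only a sketch: `FakeRedis` is a hypothetical in-memory stand-in that I wrote for illustration (it is not redis-py), and the two "sessions" are simply run in the T1 to T6 order within one function.

```python
class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (not redis-py)."""

    def __init__(self):
        self._data = {}

    def exists(self, key):
        return key in self._data

    def setnx(self, key, value):
        # Mirrors SETNX: store only if the key is absent, report whether we stored.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)


def replay_race():
    r = FakeRedis()
    K = "K"
    val1 = "VAL1"                 # T1: session 1 pre-generates its value
    s1_missing = not r.exists(K)  # T2: session 1 sees no key
    val2 = "VAL2"                 # T3: session 2 pre-generates its value
    s2_missing = not r.exists(K)  # T4: session 2 sees no key
    if s2_missing:
        r.setnx(K, val2)          # T5: session 2 stores VAL2 and uses it
    session2_uses = val2
    if s1_missing:
        r.setnx(K, val1)          # T6: SETNX does nothing, but the result is ignored
    session1_uses = val1
    return session1_uses, session2_uses, r.get(K)


session1_uses, session2_uses, stored = replay_race()
print(session1_uses, session2_uses, stored)  # VAL1 VAL2 VAL2
```

Session 1 ends up holding VAL1 while the key holds VAL2, which is exactly the mismatch I want to rule out.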

Here is the Python pseudo-code I came up with:

    import redis

    r = redis.StrictRedis(host='localhost', port=6379, db=0)

    # K is the key and F() is the salt generator described above.
    # 1. If K exists, return its value: r.get(K) if r.exists(K)
    # 2. Otherwise, if SETNX returns True, the value we sent is the stored one:
    #    else r.get(K) if r.setnx(K, F())
    # 3. Otherwise another session created K a short moment ago, so read
    #    its value: else r.get(K)
    # The last branch is redundant: "r.setnx(K, F()) or True" would also fetch
    # the latest value, but a conditional expression requires a final "else".
    value = r.get(K) if r.exists(K) else r.get(K) if r.setnx(K, F()) else r.get(K)

Is there another solution?

1 Answer

Yes, you can use WATCH for this. Here's a modified example with redis-py:

    from redis import WatchError

    def atomic_get_set(some_key):
        # r is the redis client created in the question
        with r.pipeline() as pipe:
            try:
                # put a WATCH on the key we are about to read or create
                pipe.watch(some_key)
                # after WATCHing, the pipeline is in immediate execution mode
                # until we tell it to start buffering commands again,
                # so these calls return real values
                if pipe.exists(some_key):
                    return pipe.get(some_key)
                # put the pipeline back into buffered mode with MULTI
                pipe.multi()
                pipe.set(some_key, F())
                pipe.get(some_key)
                # execute the buffered SET and GET; if no WatchError is raised,
                # they ran atomically. The last result is the GET.
                return pipe.execute()[-1]
            except WatchError:
                # another client changed some_key between the WATCH and the
                # pipeline's execution, so fetch the value they stored.
                # Use the plain client: the pipeline is back in buffered mode
                # here and pipe.get() would return the pipeline object.
                return r.get(some_key)
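
If a key is never modified or deleted once it has a value, a simpler variation may be enough: a plain GET, then SET with the NX option, then GET again. The sketch below is not the WATCH approach above, and it rests on that assumption. `FakeRedis`, `get_or_create` and `make_value` are hypothetical names used for illustration; the fake only mimics redis-py's `get` and `set(..., nx=True)` so the example runs without a server.

```python
import secrets
import threading


class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (not redis-py)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()  # emulates Redis running one command at a time

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def set(self, key, value, nx=False):
        # Follows redis-py's convention: True when stored, None when NX blocked it.
        with self._lock:
            if nx and key in self._data:
                return None
            self._data[key] = value
            return True


def get_or_create(client, key, make_value):
    value = client.get(key)
    if value is not None:                    # fast path: make_value is not called
        return value
    client.set(key, make_value(), nx=True)   # only the first writer succeeds
    return client.get(key)                   # everyone reads back the winner's value


client = FakeRedis()
results = []


def session():
    results.append(get_or_create(client, "K", lambda: secrets.token_hex(8)))


threads = [threading.Thread(target=session) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(set(results)))  # 1
```

Several sessions may still call `make_value` at the same moment when the key is missing, but only one value is stored and every session returns that one.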

12 Comments

Will this create a transaction every time we call atomic_get_set? Isn't that overhead? In my system there won't be many keys/entries, and once a key has a value it won't change, so maybe an if r.exists(some_key): return r.get(some_key) at the beginning of the function would avoid constantly creating new transactions.
We're running if pipe.exists(some_key): return pipe.get(some_key) in immediate execution mode, so there is no transaction overhead there. Really, on the Redis side everything is so fast that the bottleneck is almost always the network, and this code does everything you want in only a few network round trips. If you need to go even faster, you could write F() in Lua and run the whole thing as a single Lua call, but unless you're running this function many times a second, the savings probably aren't worth it. Don't optimize prematurely.
There is a problem with the return pipe.get(some_key) clause: it doesn't return the actual string value, but a StrictPipeline object.
Apparently, the StrictPipeline's execute method returns a list of the results of the commands that were executed in the pipeline, so instead of the lines "return pipe.get(some_key)" + "pipe.execute()", I think it should be "return pipe.get(some_key).execute()[-1]" to get the result of the last command.
The developer responded: A pipeline instance can operate in two modes. The default mode is to buffer all commands until pipeline.execute() is called. The execute method returns a list of results from the buffered commands. Optionally, you can use the pipeline.watch(key1, key2, ...) method to put the pipeline in immediate execution mode. When watching keys, pipeline commands will return results immediately until pipeline.multi() is called. The multi method will return the pipeline to the default buffered execution mode. This is what Eli suggests in his answer. It seems you forgot the watch method.
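
The two modes described in that reply can be illustrated with a toy model. `ToyPipeline` is a hypothetical class written only for this illustration; it is not redis-py's implementation, it talks to a plain dict, and it leaves out conflict detection (no WatchError). It only shows why a buffered call hands back the pipeline object while a call made after watch() returns the value.

```python
class ToyPipeline:
    """Toy model of the two pipeline modes described above; not redis-py's code."""

    def __init__(self, store):
        self._store = store
        self._buffer = []
        self._immediate = False

    def watch(self, *keys):
        self._immediate = True       # commands now run and return results right away

    def multi(self):
        self._immediate = False      # back to buffering until execute()

    def get(self, key):
        return self._run(lambda: self._store.get(key))

    def set(self, key, value):
        def do_set():
            self._store[key] = value
            return True
        return self._run(do_set)

    def _run(self, command):
        if self._immediate:
            return command()
        self._buffer.append(command)
        return self                  # buffered calls hand back the pipeline itself

    def execute(self):
        results = [command() for command in self._buffer]
        self._buffer = []
        self._immediate = False      # executing resets the pipeline to buffered mode
        return results


store = {"K": "salt"}
pipe = ToyPipeline(store)

buffered = pipe.get("K")             # default mode: nothing has run yet
print(buffered is pipe)              # True
print(pipe.execute())                # ['salt']

pipe.watch("K")
print(pipe.get("K"))                 # salt
```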