
Recently, I encountered a scheduling problem in a distributed system and I hope to get some help: for a multi-stage microservice in which two stages call the same instance, such as A-->B-->A, how should I schedule requests to maximize throughput under SLO (Service Level Objective) constraints?

This involves two challenges: (1) making the pipeline uniform in thickness, i.e., the rate at which requests flow out of the first stage (calling A), the second stage (calling B), and the third stage (calling A again) should be the same; (2) how to schedule stages that share a resource, because the first stage and the third stage...

I am confused about two things: (1) Why does making the pipelines of A and B uniform in thickness improve throughput under SLO constraints? (2) For stages that share a resource, is it optimal to schedule according to FCFS (First-Come, First-Served), regardless of whether a request belongs to the first stage or the third stage? If not, what should be done instead? I have implemented a piece of Python code to simulate my scenario, as shown below:

```python
# invoke chain: M1Handler --> M2Handler --> M1Handler
import time
from queue import Queue


class M1Handler:
    def __init__(self):
        self.phase1_queue = Queue()   # first visit to A (phase 1)
        self.phase2_queue = Queue()   # returning visit to A (phase 3)
        self.phase1_wait_times = []
        self.phase2_wait_times = []

    def process_phase(self):
        # FCFS strategy: serve whichever queue holds the older head request
        while True:
            if not self.phase1_queue.empty() or not self.phase2_queue.empty():
                if not self.phase1_queue.empty() and (
                    self.phase2_queue.empty()
                    or self.phase1_queue.queue[0].enqueue_time
                    <= self.phase2_queue.queue[0].enqueue_time
                ):
                    queue = self.phase1_queue
                else:
                    queue = self.phase2_queue

                request = queue.get()
                if queue is self.phase1_queue:
                    request.phase1_entry_time = time.time()
                    self.phase1_wait_times.append(
                        request.phase1_entry_time - request.enqueue_time)
                    time.sleep(0.6)  # simulate phase-1 processing time on A
                    request.phase1_exit_time = time.time()
                else:
                    request.phase3_entry_time = time.time()
                    self.phase2_wait_times.append(
                        request.phase3_entry_time - request.phase2_exit_time)
                    time.sleep(0.3)  # simulate phase-3 processing time on A
                    request.phase3_exit_time = time.time()


class M2Handler:
    def __init__(self):
        self.queue = Queue()
        self.wait_times = []

    def process(self):
        while True:
            if not self.queue.empty():
                request = self.queue.get()
                request.phase2_entry_time = time.time()
                self.wait_times.append(
                    request.phase2_entry_time - request.phase1_exit_time)
                time.sleep(0.9)  # simulate phase-2 processing time on B
                request.phase2_exit_time = time.time()

                # server and throughput_counter are defined in the omitted part
                server.m1_handler.phase2_queue.put(request)
                throughput_counter['m2'] += 1
```
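Regarding question (2), one alternative I would compare against FCFS is giving strict priority to returning (phase-3) requests at A, since finishing an in-flight request frees its SLO budget sooner than admitting a new one. A minimal sketch of both selection rules, assuming a `Request` with an `enqueue_time` field as in the simulator (the `pick_queue_*` names are made up for illustration):

```python
# Two queue-selection policies for the shared resource A, extracted so they
# can be swapped in process_phase(). pick_queue_fcfs reproduces the FCFS rule
# above; pick_queue_phase_priority always drains returning (phase-3) work first.
from collections import namedtuple
from queue import Queue

Request = namedtuple("Request", ["enqueue_time"])

def pick_queue_fcfs(phase1_queue, phase2_queue):
    """FCFS across both queues: serve whichever head request is older."""
    if phase1_queue.empty():
        return phase2_queue
    if phase2_queue.empty():
        return phase1_queue
    if phase1_queue.queue[0].enqueue_time <= phase2_queue.queue[0].enqueue_time:
        return phase1_queue
    return phase2_queue

def pick_queue_phase_priority(phase1_queue, phase2_queue):
    """Serve the returning (phase-3) queue whenever it has work."""
    return phase2_queue if not phase2_queue.empty() else phase1_queue

q1, q2 = Queue(), Queue()
q1.put(Request(enqueue_time=1.0))   # older request, first visit to A
q2.put(Request(enqueue_time=2.0))   # newer request, returning to A
print(pick_queue_fcfs(q1, q2) is q1)             # FCFS picks the older head
print(pick_queue_phase_priority(q1, q2) is q2)   # priority picks returning work
```

As in the original loop, the caller is expected to check that at least one queue is non-empty before calling either function.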
  • "I have implemented a piece of Python code" So why have you not tagged it with python? Commented Jul 31, 2024 at 15:34
  • "A-->B-->A" What does this indicate? Are you using some representation scheme? Does it mean do A, then do B, then redo A? Or something else? Commented Jul 31, 2024 at 15:37
  • "a multi-stage microservice that has two stages calling the same instance" Please clarify this. What does 'calling the same instance' mean? If A and B are tasks using the same resource, then the only feasible solution is to do one task and then the other when the first completes. Commented Jul 31, 2024 at 15:39
  • I only posted part of the code; I am using a representation scheme to abstract the actual scenario. Yes, how to choose whether the resource runs A or B first is what puzzles me: scheduling A means B will be queued, but they also have a dependency relationship. My goal is to maximize the throughput of this chain under the SLO constraints. The current strategy is FCFS, i.e., requests are served in arrival order regardless of which stage they belong to, even though the processing times of the two stages differ. I want to know whether FCFS is good enough, and whether there is a better scheduling algorithm. Commented Aug 2, 2024 at 9:51
  • "'A-->B-->A' What does this indicate? Are you using some representation scheme? Does it mean do A, then do B, then redo A? Or something else?" Yes: do B, then redo A. The chain has two stages that share a resource and both require processing by A, but with different processing times: A spends 100ms on the first stage and 800ms on the subsequent stage. Since the two stages share a resource, the processor can only serve requests from one stage at a time, so requests from the other stage are queued. Commented Aug 2, 2024 at 9:58
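For question (1), the "uniform thickness" intuition can be sanity-checked with a back-of-the-envelope capacity calculation. A sketch using the 0.6 s / 0.9 s / 0.3 s service times from the simulation (with these numbers A and B each need 0.9 s of work per request, so the pipeline is already balanced and neither resource caps throughput below the other):

```python
# Back-of-the-envelope bottleneck check for the A -> B -> A chain.
# A serves both the first pass (0.6 s) and the return pass (0.3 s),
# so each completed request consumes 0.9 s of A's time in total.
t_a_phase1 = 0.6   # service time of first pass on A (from the simulation)
t_b        = 0.9   # service time on B
t_a_phase3 = 0.3   # service time of return pass on A

demand_a = t_a_phase1 + t_a_phase3   # total A time per request
demand_b = t_b                       # total B time per request

# Throughput is capped by the most heavily loaded resource.
max_throughput = 1.0 / max(demand_a, demand_b)
print(f"A demand {demand_a:.1f}s, B demand {demand_b:.1f}s "
      f"-> at most {max_throughput:.2f} req/s")
```

If either stage's demand were larger, requests would pile up in front of that resource; making the per-stage flow rates equal is what keeps queues, and hence SLO-relevant waiting time, from growing at the bottleneck.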

