
Conversation


@zeze1004 zeze1004 commented Jun 6, 2024

Pull Request check-list

#3090
Please make sure to review and check all of these items:

  • Do tests and lints pass with this change?
  • Do the CI tests pass with this change (enable it first in your forked repo and wait for the github action build to finish)?
  • Is the new or changed code fully tested?
  • Is a documentation update included (if this change modifies existing APIs, or introduces new ones)?
  • Is there an example added to the examples folder (if applicable)?
  • Was the change added to CHANGES file?

NOTE: these things are not required to open a PR and can be done
afterwards / while the PR is open.

Description of change

In this PR, I experimented with using MULTI and EXEC commands to batch multiple Redis commands into a single request. My goal was to reduce the number of network round-trips and improve overall performance by minimizing network latency.
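For context, batching like this means framing each command in the RESP2 wire format and wrapping the batch in MULTI/EXEC before a single socket write. The sketch below is a simplified illustration of that packing, not the actual redis-py internals; `encode_resp_command` and `pack_transaction` are hypothetical helpers:

```python
def encode_resp_command(*args):
    """Encode one command as a RESP2 array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def pack_transaction(commands):
    """Wrap the commands in MULTI/EXEC and join them into one write buffer."""
    buf = [encode_resp_command("MULTI")]
    buf.extend(encode_resp_command(*cmd) for cmd in commands)
    buf.append(encode_resp_command("EXEC"))
    return b"".join(buf)


# Two commands become a single buffer, sent in one network write
payload = pack_transaction([("SET", "key", "value"), ("GET", "key")])
```

The per-command encoding work shown here is the "command packing overhead" discussed below.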

Observations
After running the integration tests, I noticed that while the number of network requests decreased, the total execution time actually increased. Here’s what I found:

  1. Increased BufferedReader Time

    When we send multiple commands in one MULTI/EXEC block, the server responds with all the results in one go. Reading this large combined response takes longer, which increased the time spent in the BufferedReader.

  2. Command Packing Overhead

    Packing multiple commands into a single MULTI request requires additional processing to format the data correctly. This added some overhead to the command preparation phase.

  3. Complex Response Parsing

    Parsing the combined response from EXEC also turned out to be more complex and time-consuming. Each individual command’s result had to be handled separately from the large, single response, which added to the total processing time.
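To illustrate point 3: parsing an EXEC reply means walking one large array response and slicing out each command's result. The sketch below handles only the bulk-string case and is a hypothetical simplification; a real client parses incrementally from the socket buffer and handles every reply type:

```python
def parse_exec_reply(data):
    """Parse a simplified EXEC reply: a RESP2 array of bulk strings.

    Hypothetical helper for illustration only.
    """
    if not data.startswith(b"*"):
        raise ValueError("expected an array reply")
    head, _, rest = data.partition(b"\r\n")
    count = int(head[1:])
    results = []
    for _ in range(count):
        size_line, _, rest = rest.partition(b"\r\n")
        size = int(size_line[1:])    # size_line looks like b"$5"
        results.append(rest[:size])  # slice out this command's result
        rest = rest[size + 2:]       # skip payload plus trailing \r\n
    return results


# e.g. the combined reply to a batch of SET key value; GET key
reply = parse_exec_reply(b"*2\r\n$2\r\nOK\r\n$5\r\nvalue\r\n")
```

Each loop iteration here corresponds to the per-command handling that added to the total processing time.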

Test Result
Here are the integration test results comparing the Original Logic and the Modified Logic:

[ Measuring the time for 1000 Redis connections ]

Modified Logic    Original Logic
0.243718 s        0.168276 s
import logging
import timeit
import unittest

import redis
from redis.connection import Connection

logger = logging.getLogger(__name__)


class TestRedisIntegration(unittest.TestCase):
    def setUp(self):
        self.client = redis.Redis(host='localhost', port=6379, db=0)
        self.client.set('test_key', 'test_value')
        logger.info(f"value: {self.client.get('test_key').decode('utf-8')}")
        self.client.flushdb()

    def tearDown(self):
        self.client.flushdb()

    def test_on_connect_performance_high_version(self):
        """Test on_connect method performance"""
        conn = Connection()
        conn.username = None
        conn.password = None
        conn.protocol = 2
        conn.client_name = None
        conn.lib_name = 'redis-py'
        conn.lib_version = '99.99.99'
        conn.db = 0
        conn.client_cache = False

        # Measure the execution time of 1000 on_connect calls
        connect_time_thousand = timeit.timeit(conn.on_connect, number=1000)
        logger.info(f"Measuring the time for 1000 Redis connections: {connect_time_thousand:.6f} seconds")


Conclusion
While the idea was to reduce network latency by batching commands, the extra time taken to read and parse the larger response offset these gains. It seems that for our case, the increased local processing outweighed the benefits of fewer network requests.


I’d love to get your feedback on this. Do you think there are other optimizations we should consider, or is there something I might have missed? Any insights would be greatly appreciated! cc. @chayim 🙇🏻

Thanks!

@zeze1004 zeze1004 force-pushed the connect-using-pipeline branch from 7d4f985 to e365726 Compare June 6, 2024 15:51
zeze1004 added 14 commits June 13, 2024 21:25
… and above, using MULTI command as a fallback
This update removes the condition check for connection retries, because the logic was refactored to execute the MULTI and EXEC commands within the same try block. This resolves the unintended reconnection attempts and ensures proper handling of transactional commands.
@zeze1004 zeze1004 force-pushed the connect-using-pipeline branch from d7eb12f to c4c21d0 Compare June 13, 2024 12:25
@petyaslavova
Collaborator

Hi @zeze1004, thanks for the time and effort you’ve put into this PR!

From the results of this refactoring, it seems that it doesn’t improve overall connection performance, and the original issue has since been closed as not planned.

One alternative that comes to mind related to your change is trying out pipelines — the current code sends commands in separate network writes but reads the responses in a single step. I’m not sure this would yield better results, but it could be worth exploring.
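To make that distinction concrete: a pipeline coalesces the writes as well, whereas the current code issues one network write per command. A minimal sketch with a stand-in socket (`FakeSocket` is hypothetical, used purely to count writes without a Redis server):

```python
class FakeSocket:
    """Stand-in socket that only counts sendall() calls."""

    def __init__(self):
        self.writes = 0

    def sendall(self, data):
        self.writes += 1


def send_separately(sock, payloads):
    # current behaviour: one network write per command
    for p in payloads:
        sock.sendall(p)


def send_pipelined(sock, payloads):
    # pipelined behaviour: all commands coalesced into a single write
    sock.sendall(b"".join(payloads))


commands = [b"*1\r\n$4\r\nPING\r\n"] * 3
sep_sock, pipe_sock = FakeSocket(), FakeSocket()
send_separately(sep_sock, commands)
send_pipelined(pipe_sock, commands)
```

Unlike MULTI/EXEC, this avoids the transaction bookkeeping on the server and the extra wrapping on the client, so the response stays a plain sequence of per-command replies.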

Would you be interested in giving this a try, or would you prefer to close this PR for now?

