
aiomsg's Introduction

Welcome!

Featured repositories


Machine Learning


English language tools


Smaller tools


Logging tools

In past roles I needed to examine server logs extensively, and that experience was good inspiration for a bunch of ideas around making logs a little easier to produce and work with.


Links

Mastodon

My GitHub Stats

aiomsg's People

Contributors

cjrh, dependabot-preview[bot], dependabot[bot]

Forkers

kgrsajid

aiomsg's Issues

When AT_LEAST_ONCE is enabled, the logs are misleading

ERROR:aiomsg:Peer 9e63626071a74022a1d16bc6915c5a16 is not connected. Message will be dropped.
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
ERROR:aiomsg:Peer 9e63626071a74022a1d16bc6915c5a16 is not connected. Message will be dropped.
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
ERROR:aiomsg:Peer 9e63626071a74022a1d16bc6915c5a16 is not connected. Message will be dropped.
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
ERROR:aiomsg:Peer 9e63626071a74022a1d16bc6915c5a16 is not connected. Message will be dropped.
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
ERROR:aiomsg:Peer 9e63626071a74022a1d16bc6915c5a16 is not connected. Message will be dropped.
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}
ERROR:aiomsg:Peer 9e63626071a74022a1d16bc6915c5a16 is not connected. Message will be dropped.
INFO:root:Sending message: {'n': 5}
INFO:root:Sending message: {'n': 5}

The retries do trigger, but the log message says that the message will be dropped. That would only be true if all the retries eventually failed.
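One way the log output could separate the two cases is sketched below. This is only an illustration under stated assumptions: `describe_missing_peer`, `log_missing_peer`, and the `retry_enabled` flag are hypothetical names and not part of aiomsg. `retry_enabled` stands in for "the AT_LEAST_ONCE delivery guarantee is active".

```python
import logging

logger = logging.getLogger("aiomsg-sketch")


def describe_missing_peer(identity: bytes, retry_enabled: bool) -> str:
    """Build the log text for a send to a peer that is not connected.

    Hypothetical helper: `retry_enabled` is an assumed flag meaning that
    retries are active, so the message is not lost yet.
    """
    if retry_enabled:
        return f"Peer {identity.hex()} is not connected. Message will be retried."
    return f"Peer {identity.hex()} is not connected. Message will be dropped."


def log_missing_peer(identity: bytes, retry_enabled: bool) -> None:
    # A pending retry is not yet a failure, so log it at a lower level.
    level = logging.WARNING if retry_enabled else logging.ERROR
    logger.log(level, describe_missing_peer(identity, retry_enabled))
```

With something like this, "will be dropped" would only appear when no retry is pending. A separate message could then report the case where every retry has failed.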

Add a feature for staggered reconnection logic

With a great many sockets in play, say 10000 connect sockets and 1 bind socket, a network interruption (or a slow server restart) can leave a large number of messages buffered on the connect sockets. When the bind socket becomes available again, every connect socket will try to send its backlog immediately after reconnecting. It would be good to check whether backpressure works here (it might not, because the sending code puts messages straight into a queue). It is probably also a good idea to add a feature that allows a user-specified random delay between connection attempts. If the delay for each connect socket is spread over, say, five or ten seconds, that might be enough to let all the connections come back up without overwhelming the server all at once.
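A minimal sketch of the random-delay idea follows. The names `reconnect_delay`, `reconnect_with_jitter`, and `max_jitter` are hypothetical and not aiomsg API. `connect` is assumed to be any coroutine function that performs one connection attempt.

```python
import asyncio
import random
from typing import Awaitable, Callable, Optional


def reconnect_delay(max_jitter: float, rng: Optional[random.Random] = None) -> float:
    """Pick a random delay in [0, max_jitter] seconds before reconnecting."""
    if max_jitter < 0:
        raise ValueError("max_jitter must be non-negative")
    source = rng if rng is not None else random
    return source.uniform(0.0, max_jitter)


async def reconnect_with_jitter(
    connect: Callable[[], Awaitable[None]],
    max_jitter: float = 5.0,
) -> float:
    """Sleep for a random delay, then run one connection attempt.

    Returns the delay that was used, which is handy for logging.
    """
    delay = reconnect_delay(max_jitter)
    await asyncio.sleep(delay)
    await connect()
    return delay
```

Because each connect socket draws its own delay, 10000 sockets with `max_jitter=10.0` would spread their attempts at an average of roughly 1000 per second instead of arriving all at once.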

Options to deal with a missing identity

In _sender_identity(), currently we drop the data if the intended recipient socket/connection is missing:

    def _sender_identity(self, message: bytes, identity: bytes):
        """Send directly to a peer with a distinct identity"""
        logger.debug(
            f"Sending message via identity {identity.hex()}: {message[:20]}..."
        )
        c = self._connections.get(identity)
        if not c:
            logger.error(
                f"Peer {identity.hex()} is not connected. Message will be dropped."
            )
            return

        try:
            c.writer_queue.put_nowait(message)
            logger.debug("Placed message on connection writer queue.")
        except asyncio.QueueFull:
            logger.error("Dropped msg to Connection blah, its write queue is full.")

This can be kinda irritating. You can deal with it somewhat by using the AT_LEAST_ONCE delivery guarantee, but there are some use cases where you might want to send that payload to a different peer. (It does seem a bit odd in isolation, though: you're specifying a target identity, but you're OK with sending it somewhere else if that target isn't there.)
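The "send it to a different peer" option could look roughly like the sketch below. It is an assumption-laden illustration, not aiomsg code: `pick_queue` and `allow_fallback` are hypothetical names, and the connections table is modelled as a plain dict from identity to write queue.

```python
import asyncio
from typing import Dict, Optional


def pick_queue(
    connections: Dict[bytes, asyncio.Queue],
    identity: bytes,
    allow_fallback: bool = False,
) -> Optional[asyncio.Queue]:
    """Return the write queue for `identity`, or optionally another peer's.

    Returns None when the target is missing and no fallback is allowed
    (or no peer is connected at all).
    """
    queue = connections.get(identity)
    if queue is not None:
        return queue
    if allow_fallback and connections:
        # Fall back to an arbitrary connected peer (first in insertion order).
        return next(iter(connections.values()))
    return None  # Caller decides: drop, buffer for retry, or raise.
```

Keeping the fallback opt-in preserves the current behaviour by default, so a caller who names a target identity still gets exactly that target unless they ask otherwise.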

Fix README typo

Fix the typo in the first README example:

async def main():
    async with Søcket() as sock:
        await sock.bind('127.0.0.1', 25000):
        while True:
            await s.send(time.ctime().encode())

s should be changed to sock. The trailing colon after the sock.bind(...) call is also a syntax error and should be removed.

Connection errors can also trigger OSError

Unexpected error
Traceback (most recent call last):
  File "C:\Users\caleb\temp\testaiorun\aiomsg\__init__.py", line 202, in connect_with_retry
    await new_connection()
  File "C:\Users\caleb\temp\testaiorun\aiomsg\__init__.py", line 186, in new_connection
    hostname, port, loop=self.loop, ssl=ssl_context
  File "G:\Programs\Python37\lib\asyncio\streams.py", line 77, in open_connection
    lambda: protocol, host, port, **kwds)
  File "G:\Programs\Python37\lib\asyncio\base_events.py", line 968, in create_connection
    ', '.join(str(exc) for exc in exceptions)))
OSError: Multiple exceptions: [Errno 10061] Connect call failed ('::1', 25001, 0, 0), [Errno 10061] Connect call failed ('127.0.0.1', 25001)
Unexpected error
Traceback (most recent call last):
  File "C:\Users\caleb\temp\testaiorun\aiomsg\__init__.py", line 202, in connect_with_retry
    await new_connection()
  File "C:\Users\caleb\temp\testaiorun\aiomsg\__init__.py", line 186, in new_connection
    hostname, port, loop=self.loop, ssl=ssl_context
  File "G:\Programs\Python37\lib\asyncio\streams.py", line 77, in open_connection
    lambda: protocol, host, port, **kwds)
  File "G:\Programs\Python37\lib\asyncio\base_events.py", line 968, in create_connection
    ', '.join(str(exc) for exc in exceptions)))
OSError: Multiple exceptions: [Errno 10061] Connect call failed ('::1', 25001, 0, 0), [Errno 10061] Connect call failed ('127.0.0.1', 25001)
Unexpected error
Traceback (most recent call last):
  File "C:\Users\caleb\temp\testaiorun\aiomsg\__init__.py", line 202, in connect_with_retry
    await new_connection()
  File "C:\Users\caleb\temp\testaiorun\aiomsg\__init__.py", line 186, in new_connection
    hostname, port, loop=self.loop, ssl=ssl_context
  File "G:\Programs\Python37\lib\asyncio\streams.py", line 77, in open_connection
    lambda: protocol, host, port, **kwds)
  File "G:\Programs\Python37\lib\asyncio\base_events.py", line 968, in create_connection
    ', '.join(str(exc) for exc in exceptions)))
OSError: Multiple exceptions: [Errno 10061] Connect call failed ('::1', 25001, 0, 0), [Errno 10061] Connect call failed ('127.0.0.1', 25001)
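The traceback shows a plain OSError ("Multiple exceptions: ...") being raised when the connection attempts to several resolved addresses fail. A retry loop that only catches ConnectionError subclasses would miss it. A minimal sketch of a loop that catches OSError instead is given below. The names `connect_with_retry` and `new_connection` are borrowed from the traceback, but the signature, `max_attempts`, and `delay` are assumptions for illustration and not the actual aiomsg implementation.

```python
import asyncio
from typing import Awaitable, Callable


async def connect_with_retry(
    new_connection: Callable[[], Awaitable[None]],
    max_attempts: int = 5,
    delay: float = 0.1,
) -> int:
    """Call `new_connection` until it succeeds; return the attempts used.

    OSError is the base class of ConnectionRefusedError and the other
    ConnectionError types, so catching it covers both the single-address
    failure and the combined "Multiple exceptions" failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            await new_connection()
            return attempt
        except OSError:
            if attempt == max_attempts:
                raise  # Out of attempts: let the caller see the error.
            await asyncio.sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```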

Set TCP_NODELAY and TCP_QUICKACK

See this comment.

That still irks me. The real problem is not tinygram prevention. It's ACK delays, and that stupid fixed timer. They both went into TCP around the same time, but independently. I did tinygram prevention (the Nagle algorithm) and Berkeley did delayed ACKs, both in the early 1980s. The combination of the two is awful. Unfortunately by the time I found out about delayed ACKs, I had changed jobs, was out of networking, and doing a product for Autodesk on non-networked PCs.

Delayed ACKs are a win only in certain circumstances - mostly character echo for Telnet. (When Berkeley installed delayed ACKs, they were doing a lot of Telnet from terminal concentrators in student terminal rooms to host VAX machines doing the work. For that particular situation, it made sense.) The delayed ACK timer is scaled to expected human response time. A delayed ACK is a bet that the other end will reply to what you just sent almost immediately. Except for some RPC protocols, this is unlikely. So the ACK delay mechanism loses the bet, over and over, delaying the ACK, waiting for a packet on which the ACK can be piggybacked, not getting it, and then sending the ACK, delayed. There's nothing in TCP to automatically turn this off. However, Linux (and I think Windows) now have a TCP_QUICKACK socket option. Turn that on unless you have a very unusual application.

Turning on TCP_NODELAY has similar effects, but can make throughput worse for small writes. If you write a loop which sends just a few bytes (worst case, one byte) to a socket with "write()", and the Nagle algorithm is disabled with TCP_NODELAY, each write becomes one IP packet. This increases traffic by a factor of 40, with IP and TCP headers for each payload. Tinygram prevention won't let you send a second packet if you have one in flight, unless you have enough data to fill the maximum sized packet. It accumulates bytes for one round trip time, then sends everything in the queue. That's almost always what you want. If you have TCP_NODELAY set, you need to be much more aware of buffering and flushing issues.

None of this matters for bulk one-way transfers, which is most HTTP today. (I've never looked at the impact of this on the SSL handshake, where it might matter.)

Short version: set TCP_QUICKACK. If you find a case where that makes things worse, let me know.

John Nagle
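For reference, setting the two options from Python might look like the sketch below. `tune_tcp_socket` is a hypothetical helper name. `socket.TCP_NODELAY` is widely available, while `socket.TCP_QUICKACK` is only exposed on some platforms (notably Linux), so the sketch checks for it before use.

```python
import socket
from typing import Dict


def tune_tcp_socket(
    sock: socket.socket,
    nodelay: bool = True,
    quickack: bool = True,
) -> Dict[str, bool]:
    """Set TCP_NODELAY and, where available, TCP_QUICKACK on `sock`.

    Returns which options were actually applied.
    """
    applied = {"TCP_NODELAY": False, "TCP_QUICKACK": False}
    if nodelay:
        # Disable the Nagle algorithm (tinygram prevention).
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        applied["TCP_NODELAY"] = True
    if quickack and hasattr(socket, "TCP_QUICKACK"):
        # Ask the kernel to send ACKs immediately instead of delaying them.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)
        applied["TCP_QUICKACK"] = True
    return applied
```

With asyncio streams, the underlying socket can be obtained with `writer.get_extra_info("socket")`. Two caveats apply. The Linux tcp(7) man page describes TCP_QUICKACK as not permanent, so it may need to be set again after reads. Recent asyncio versions already set TCP_NODELAY on TCP connections by default.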
