Yesterday I was working on the message queue part and found that every now and then, when I started the daemon process, it failed to create a message queue with an ENOMEM error.
What was interesting to me is that if I restart the daemon again, this does not happen. So I looked into the issue and found that when the daemon is down for a while and other clients keep sending messages, those messages accumulate in the kernel. When the daemon starts, my code simply unlinks the message queue and tries to create (mq_open) a new one with the same name. I think the messages that were in the old queue are not freed right away (mq_unlink only removes the name; the queue itself is destroyed once every process that has it open closes it), so the memory they occupy still counts against the limit and creating the new queue fails. (We can find the memory limit by calling getrlimit with RLIMIT_MSGQUEUE.)
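As a rough sketch of the getrlimit call mentioned above, something like the following reads the limit. The helper name msgqueue_limit is made up for this post; only getrlimit and RLIMIT_MSGQUEUE are real, and RLIMIT_MSGQUEUE is Linux-specific.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft and hard RLIMIT_MSGQUEUE values,
 * i.e. the number of bytes the calling user may have allocated for
 * POSIX message queues. Returns 0 on success, -1 on failure (errno set). */
int msgqueue_limit(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MSGQUEUE, &rl) == -1)
        return -1;

    *soft = rl.rlim_cur;
    *hard = rl.rlim_max;
    return 0;
}

/* Hypothetical helper: print the limits in a readable form. */
void print_msgqueue_limit(void)
{
    rlim_t soft, hard;

    if (msgqueue_limit(&soft, &hard) == -1) {
        perror("getrlimit(RLIMIT_MSGQUEUE)");
        return;
    }
    if (soft == RLIM_INFINITY)
        printf("soft limit: unlimited\n");
    else
        printf("soft limit: %llu bytes\n", (unsigned long long)soft);
    if (hard == RLIM_INFINITY)
        printf("hard limit: unlimited\n");
    else
        printf("hard limit: %llu bytes\n", (unsigned long long)hard);
}
```

Calling print_msgqueue_limit() at daemon startup should make it easier to tell whether a failing mq_open is anywhere near this limit.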
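This can happen no matter how small the message queue is. Here is a way to see how our message queue is doing: mount the mqueue filesystem and read the queue's file. The QSIZE field is the number of bytes of message data currently sitting in the queue.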
$ sudo mkdir /dev/mqueue
$ sudo mount -t mqueue none /dev/mqueue
$ sudo cat /dev/mqueue/afcollectorapi
QSIZE:1232 NOTIFY:0 SIGNO:0 NOTIFY_PID:0
So what we should do is not unlink the message queue at startup, but simply open the existing one and consume all the messages already in it.
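Here is a minimal sketch of that approach, not my actual daemon code. The function name open_existing_queue, its parameters, and the choice to just count and discard the backlog are assumptions for illustration; a real daemon would probably process each stale message instead.

```c
#include <errno.h>
#include <fcntl.h>
#include <mqueue.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical helper: open the queue `name` without unlinking it first,
 * then drain whatever is backlogged. maxmsg/msgsize are used only if the
 * queue does not exist yet and has to be created.
 * On success returns the descriptor (left in non-blocking mode) and stores
 * the number of drained messages in *drained (if non-NULL).
 * On failure returns (mqd_t)-1 with errno set. */
mqd_t open_existing_queue(const char *name, long maxmsg, long msgsize,
                          long *drained)
{
    struct mq_attr attr = {0};
    attr.mq_maxmsg = maxmsg;
    attr.mq_msgsize = msgsize;

    /* O_CREAT without O_EXCL: reuse the queue if it is already there.
     * When the queue exists, attr is ignored. */
    mqd_t q = mq_open(name, O_RDONLY | O_CREAT | O_NONBLOCK, 0600, &attr);
    if (q == (mqd_t)-1)
        return (mqd_t)-1;

    /* An existing queue may have a different message size, so ask. */
    if (mq_getattr(q, &attr) == -1) {
        int saved = errno;
        mq_close(q);
        errno = saved;
        return (mqd_t)-1;
    }

    char *buf = malloc((size_t)attr.mq_msgsize);
    if (buf == NULL) {
        mq_close(q);
        errno = ENOMEM;
        return (mqd_t)-1;
    }

    /* In non-blocking mode mq_receive fails with EAGAIN once the queue
     * is empty. A zero-length message returns 0, hence the >= 0 test. */
    long count = 0;
    while (mq_receive(q, buf, (size_t)attr.mq_msgsize, NULL) >= 0)
        count++;   /* handle or discard the stale message here */

    int saved = errno;
    free(buf);
    if (saved != EAGAIN) {
        mq_close(q);
        errno = saved;
        return (mqd_t)-1;
    }

    if (drained != NULL)
        *drained = count;
    return q;
}
```

The descriptor comes back in non-blocking mode; if the daemon wants blocking receives afterwards, it can clear O_NONBLOCK with mq_setattr. On older glibc versions this needs to be linked with -lrt.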