English Amiga Board


14 January 2023, 01:53   #1
tygre
Returning fan!
 
Join Date: Jan 2011
Location: Montréal, QC, Canada
Posts: 1,440
ARexx, WaitPort() Doesn't Return under Some Condition

Hi there!

I have this code that sends messages to the ARexx port of some module players, for example EaglePlayer. I looked at a lot of examples online, like this one, and they all do something like this:

Code:
Forbid();
if((_arexx_port = FindPort(...)) == NULL)
{
	Permit();
	goto _RETURN_ERROR;
}
PutMsg(_arexx_port, &rexx_msg->rm_Node);
Permit();
WaitPort(reply_port);
GetMsg(reply_port);
Forbid() and Permit() are around FindPort() and PutMsg(). But even if I put them around the whole block, I still have this problem: WaitPort() never returns, gets "stuck", if I quit EaglePlayer just at the right moment.

It seems that, upon quitting, EaglePlayer keeps its port open (FindPort() succeeds) but, immediately after, stops answering messages (WaitPort() never returns). Does this make sense or is it my code that's buggy?

Cheers!

Last edited by tygre; 14 January 2023 at 01:54. Reason: Typos
14 January 2023, 02:25   #2
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,068
If the server isn't shutting down atomically, then there's not much you can do. For example, it might reply to all pending messages but not Forbid() while checking that the port is empty and deleting it. A new message arriving right between the check for remaining messages and the port actually being deleted and made unavailable will then never be replied to, and will potentially hang any client, regardless of whether said client uses Forbid() (smaller chance) or not.
Your code looks safe, and the problem is on the other side, as far as I can see.
14 January 2023, 07:13   #3
Exodous
Registered User
 
Join Date: Sep 2019
Location: Leicester / England
Posts: 203
Putting a Forbid/Permit around the whole code, including the WaitPort won't work as the scheduler would never switch task to the one you're waiting on, so you would never get a reply even if it did send a reply.

However, if you know the other end of the message port may not respond, you could just use GetMsg in a timeout/delay loop, as GetMsg is effectively non-blocking: if there is a message, it gets it; if not, it returns zero.

http://amigadev.elowar.com/read/ADCD.../node035A.html

Here's a bit of pseudo code to demonstrate what I'm thinking which tries 5 times with a 200ms delay between each check. At the end of the loop, if received is "true" then there was a message, otherwise there wasn't...

Code:
count = 0
received = false
while (count < 5 and received == false)
{
  if (GetMsg(reply_port) == null)
  {
    Delay(10)    // 10 ticks = 200ms
    count = count + 1
  }
  else
  {
    received = true
  }
}
This way your program will never wait indefinitely.
14 January 2023, 07:32   #4
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,358
Quote:
Originally Posted by Exodous
Putting a Forbid/Permit around the whole code, including the WaitPort won't work as the scheduler would never switch task to the one you're waiting on, so you would never get a reply even if it did send a reply.
Normally waiting for something will 'break' the Forbid state.
14 January 2023, 08:18   #5
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,322
Quote:
Originally Posted by Exodous
Putting a Forbid/Permit around the whole code, including the WaitPort won't work as the scheduler would never switch task to the one you're waiting on, so you would never get a reply even if it did send a reply.
Err... no. That's totally safe and a not so uncommon programming pattern. As soon as the code runs into Wait() (or the Wait() implicit in WaitPort()), the Forbid() or Disable() state is broken. An exec task that is voluntarily giving up the CPU by that implicitly re-allows interrupts and task switching. The Forbid() or Disable() state will be restored as soon as the signal the task waits for arrives.
14 January 2023, 08:20   #6
Exodous
Registered User
 
Join Date: Sep 2019
Location: Leicester / England
Posts: 203
For the original code, this may or may not be the case as it isn't explicitly documented in the AutoDocs.

The AutoDocs state that calling Wait() breaks the Forbid state until the next time the task scheduler allocates time to the task which called Forbid().

For WaitPort(), they say "If necessary, the Wait() function will be called". I presume that if there is a message waiting, it won't call Wait() and will just return, and otherwise it calls Wait(). But it's making this sort of undocumented presumption that always comes back to bite you when you least expect it, and then causes hours of head scratching about why things sometimes work and sometimes don't.


Though, whilst this discussion is interesting, it's not relevant for my suggested pseudo code, as that doesn't use WaitPort and therefore must not be within a Forbid/Permit pair.


However, there are still other problems - if another task created the port, then technically it could "go away" between finding the port and attempting to use it, if the other task is scheduled to run and closes the port at that time. This means the task reading the message could just be accessing arbitrary data from memory. At best, it would be reading what was there before. At worst, it could lead to corruption and a crash.

Within my pseudo code loop, it would probably be best to do the following to prevent this:

Forbid()
FindPort()
If port found GetMsg()
Permit()
14 January 2023, 08:33   #7
Exodous
Registered User
 
Join Date: Sep 2019
Location: Leicester / England
Posts: 203
Quote:
Originally Posted by Thomas Richter
Err... no. That's totally safe and a not so uncommon programming pattern.
Thomas, whilst I appreciate you have probably seen how this works within the OS code, so can answer this with conviction, the documentation we "mere mortals" have available doesn't explicitly state under what circumstances Wait() is called when using WaitPort().

Making assumptions about how "internal" functions operate is dangerous and comes back to bite you. Therefore, based upon the documentation available, surely it cannot be recommended to use WaitPort() within a Forbid() section?


I know how things written can be misunderstood, so want to note that this is a legitimate question as, whilst it may work on all current releases, I can't see it documented anywhere that this is guaranteed to work in the way described.
14 January 2023, 09:56   #8
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,322
You do not need the OS sources for that. WaitPort() waits whenever there is no message in the port to remove, thus when waiting is necessary. Otherwise, waiting is not necessary. It is really quite simple. That is not an "assumption".... Check also the RKRMs.
14 January 2023, 10:17   #9
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,322
Quote:
Originally Posted by Exodous
For the original code, this may or may not be the case as it isn't explicitly documented in the AutoDocs.

From the RKRM:

Quote:
You can call the WaitPort() function to wait for a message to arrive at a port. This function will return the first message (it may not be the only) queued to a port. Note that your application must still call GetMsg() to remove the message from the port. If the port is empty, your task will go to sleep waiting for the first message. If the port is not empty, your task will not go to sleep. It is possible to receive a signal for a port without a message being present yet. The code processing the messages should be able to handle this. The following code illustrates WaitPort().

Quote:
Originally Posted by Exodous
The AutoDocs state that calling Wait() breaks the Forbid state until the next time the task scheduler allocates time to the task which called Forbid().
Precisely. That happens, for example, when the signal you waited on is received. Or, if you like, when the port becomes non-empty and, as a result, the task sleeping in WaitPort() receives CPU time again.

Quote:
Originally Posted by Exodous
For WaitPort(), they say "If necessary, the Wait() function will be called". I presume that if there is a message waiting, it won't call Wait() and will just return, and otherwise it calls Wait(). But it's making this sort of undocumented presumption that always comes back to bite you when you least expect it, and then causes hours of head scratching about why things sometimes work and sometimes don't.
WaitPort() is bug-free (it is rather trivial, actually). The only reasons why it would not return are that no message is ever delivered or, alternatively, that the message was already received and removed from the port before the discussed code fragment is executed. WaitPort() as part of an event loop is only useful if there is only a single port from which messages are to be retrieved.

Quote:
Originally Posted by Exodous
Though, whilst this discussion is interesting, it's not relevant for my suggested pseudo code, as that doesn't use WaitPort and therefore must not be within a Forbid/Permit pair.
That suggested pseudo-code is over-complicated and sub-optimal. It is over-complicated as it needs another library for the job (which also only runs into a Wait() at some point, and it needs the timer.device), and it is sub-optimal as it waits longer than necessary if a message arrives.

Quote:
Originally Posted by Exodous
However, there are still other problems - if another task created the port, then technically it could "go away" between finding the port and attempting to use it, if the other task is scheduled to run and closes the port at that time.
Not if there is a Forbid() before the FindPort() and no other call before the WaitPort() that may break the Forbid(). Note that AddPort() and RemPort() both call Forbid(), and thus will never be called by someone else while your task holds the Forbid(). Thus, the port cannot go away under your feet.

Quote:
Originally Posted by Exodous
This means the task reading the message could just be accessing arbitrary data from memory. At best, it would be reading what was there before. At worst, it could lead to corruption and a crash.
Not if you do it properly. That is, protect the FindPort() with a Forbid() so the CPU cannot be stolen while your task operates and continues into the wait.

The problem is not that the port goes away. It cannot. The problem is that the program handling the port does not reply to all messages before the port is removed, or that your program already removed the message beforehand and calls WaitPort() even though the message has already been delivered and removed.

Quote:
Originally Posted by Exodous
Within my pseudo code loop, it would probably be best to do the following to prevent this:

Forbid()
FindPort()
If port found GetMsg()
Permit()
That also works, but is a non-blocking version.

If you must wait for the message to return, and if a message is returned by protocol, then
Code:
Forbid();
if (port = FindPort(...)) {
 WaitPort();
 msg = GetMsg()
}
Permit();
does what it should do.
14 January 2023, 11:46   #10
a/b
Registered User
 
Join Date: Jun 2016
Location: europe
Posts: 1,068
Forbid() story aside...
If we are talking alternative approaches, other than the obvious back-to-basics polling: if you happen to have an async piece of code running periodically (an interrupt handler), you could send yourself a wake-up msg. Or you could wait on multiple sources instead, e.g. a fat Cancel button that the user could mash if the app becomes unresponsive, which would send you an intuimsg and wake you up.
Hard to tell without context, polling being the most obvious approach.
14 January 2023, 13:23   #11
thomas
Registered User
 
Join Date: Jan 2002
Location: Germany
Posts: 7,032
Quote:
Originally Posted by Thomas Richter
If you must wait for the message to return, and if a message is returned by protocol, then
Code:
Forbid();
if (port = FindPort(...)) {
 WaitPort();
 msg = GetMsg()
}
Permit();
does what it should do.
I don't see in which occasion you would use this code.

FindPort gives you the address of a foreign port, i.e. another task's port. You can only wait for messages on ports your own task has allocated. And if you allocated the port, you know its address, so you don't need to call FindPort for it.

Also I don't see why you would have to embed WaitPort and GetMsg in Forbid/Permit. When the message has returned to your reply port, you are the owner of the message. It cannot disappear between WaitPort and GetMsg.

Only foreign ports can disappear unexpectedly. Therefore FindPort and PutMsg may need Forbid. But not the wait for reply.

And if the remote port disappears before it has removed itself from the public port list, then it is a bug of the server program. I doubt there is any workaround you can add to the client program.
14 January 2023, 13:31   #12
Hedeon
Semi-Retired
 
Join Date: Mar 2012
Location: Leiden / The Netherlands
Posts: 2,049
Regarding GetMsg() and Forbid() / Permit(): is Remove() et al. also protected by Forbid() / Permit() in OS3? If you decide not to use it with GetMsg(), and a task switch occurs during the removal of a message/node and the new task also reads the same list, do nasty things happen?
14 January 2023, 13:51   #13
Exodous
Registered User
 
Join Date: Sep 2019
Location: Leicester / England
Posts: 203
Quote:
Originally Posted by Thomas Richter
From the RKRM:
....
OK, I'm not sure what I read to miss that line, but simply pointing out that I'd missed this bit would have been sufficient.

You seriously didn't need to write more than a screenful of quotes with such a condescending reply!


EDIT: In fact, why did it actually need the second response as you had responded once - it almost looks like you were deliberately trying to be antagonistic with the second response?


Apologies to OP for the slight derail.
14 January 2023, 15:03   #14
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,322
Quote:
Originally Posted by Hedeon
Regarding GetMsg() and Forbid() / Permit(): is Remove() et al. also protected by Forbid() / Permit() in OS3?
No, of course not. Remove() is just removing a node from a list, assuming that you are the exclusive user of the list. It is one of the list-related calls.
Quote:
Originally Posted by Hedeon
If you decide not to use it with GetMsg(), and a task switch occurs during the removal of a message/node and the new task also reads the same list, do nasty things happen?
You are typically not reading shared lists. You are reading from your own ports. If you want to share a list that is not a port, you need to serialize access to it, typically by a semaphore or a forbid/permit pair.
14 January 2023, 15:05   #15
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,322
Quote:
Originally Posted by thomas
I don't see in which occasion you would use this code.
Sorry, a PutMsg() to the found port is missing; it should send the message whose reply the code then exclusively waits for. That happens if you do not put in the arguments. *sigh*
14 January 2023, 18:10   #16
tygre
Returning fan!
 
Join Date: Jan 2011
Location: Montréal, QC, Canada
Posts: 1,440
Hi all!

Thank you all! I really appreciate the thorough discussion!

While I understand that the following code would be the right thing to do:

Code:
Forbid();
if(arexx_port = FindPort(...))
{
	PutMsg(arexx_port, &rexx_msg->rm_Node);
	WaitPort(reply_port);
	GetMsg(reply_port);
}
Permit();
I think that Exodous' code would be better in my case, because some external programs, like EaglePlayer, may never reply, in which case WaitPort(reply_port) would never return:

Code:
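/* Poll for the reply instead of blocking in WaitPort():
   at most 25 tries, 10 ticks (200 ms) apart, so about 5 s in total. */
LONG wait_count   = 0;
BOOL msg_received = FALSE;

while(wait_count < 25 && msg_received == FALSE)
{
	if(GetMsg(reply_port) == NULL)
	{
		Delay(10);
		wait_count++;
	}
	else
	{
		msg_received = TRUE;
	}
}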
I'm going to experiment with this code and let you know!
Cheers!

Last edited by tygre; 15 January 2023 at 20:42. Reason: Fixed broken logic! Added longer delay and explanation...
14 January 2023, 21:55   #17
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,322
If a program with an ARexx port does not reply to an ARexx command, then something is fishy about this program, I would say.

Typically, however, you would not wait for the reply from the port in the very same place. Instead, retrieving a returned Rexx message would be part of the event loop of your program, in the sense of: you first fire off the rexx command (Forbid(), FindPort(), PutMsg(), Permit()), and then in the event loop of your program, you check multiple ports for incoming messages to react upon, and the reply port of the rexx message would be just one of them. WaitPort() is then, of course, not the right answer. You rather need to wait on the signal mask of all ports combined, and then check one port after another for any incoming message.
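
To make the "wait on the combined signal mask, then check each port" idea a bit more concrete, here is a rough sketch in plain C. It is only a model, not real exec code: struct port_model and its fields are made-up stand-ins for struct MsgPort (sig_bit plays the role of mp_SigBit), and the signals argument stands for whatever Wait() on the combined mask would return. On the Amiga you would use Wait() and GetMsg() instead of the mock fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for exec's struct MsgPort (not the real layout). */
struct port_model {
    uint8_t sig_bit;  /* plays the role of mp_SigBit (0..31) */
    int     queued;   /* number of messages currently queued */
};

/* Signal mask of a single port, i.e. 1 << mp_SigBit. */
uint32_t port_mask(const struct port_model *p)
{
    assert(p != NULL && p->sig_bit < 32);
    return (uint32_t)1 << p->sig_bit;
}

/* Mask to wait on: the OR of the masks of all ports of interest. */
uint32_t combined_mask(const struct port_model *ports, int n)
{
    uint32_t mask = 0;
    for (int i = 0; i < n; i++)
        mask |= port_mask(&ports[i]);
    return mask;
}

/* Called after waking up with `signals`: drain every signalled port.
   A set bit does not guarantee a message, so handling zero is fine.
   Returns the number of messages handled. */
int drain_signalled(struct port_model *ports, int n, uint32_t signals)
{
    int handled = 0;
    for (int i = 0; i < n; i++) {
        if ((signals & port_mask(&ports[i])) == 0)
            continue;                  /* this port did not signal */
        while (ports[i].queued > 0) {  /* like GetMsg() until empty */
            ports[i].queued--;
            handled++;
        }
    }
    return handled;
}
```

The reply port of the rexx message would simply be one entry in that array, next to e.g. a window's port.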

You still have the "problem" of what to do about non-replied rexx messages then, though. If the user chooses to terminate your program, you had better check whether all rexx messages sent out have been returned, as you cannot safely delete the reply port without having retrieved all messages.
15 January 2023, 00:37   #18
tygre
Returning fan!
 
Join Date: Jan 2011
Location: Montréal, QC, Canada
Posts: 1,440
Thanks Thomas! Yes, that makes perfect sense

But then what to do if a "rogue" program doesn't reply at all, for example when I quit EaglePlayer: it won't be able to answer anymore at all... Is there a safe way for me to stop my program then, without having to wait and retrieve all messages?
15 January 2023, 01:41   #19
Thomas Richter
Registered User
 
Join Date: Jan 2019
Location: Germany
Posts: 3,322
First, I would suggest contacting the author of the program; that is the first thing you should try.

As practical advice, I would suggest that, upon exit of your program, you check which messages are still waiting to be replied back to you. If the target port still exists, you can still wait for them before quitting your task, because it could still happen that your messages will be replied to at some point, and if their reply port is then no longer present, bad things will happen when the destination port owner attempts to reply to them.

If the target port does not exist anymore (and thus there is a defect in the destination program), I would zero out the mn_ReplyPort of those messages you are still waiting on (so nothing bad happens if someone still attempts to reply to them), and then exit your program without releasing the messages. You then have a memory leak (unfortunately), but at least if someone picks up the message and attempts to reply to it, nothing bad will happen.

A message with NULL reply port will just have its node type set to NT_FREEMSG (or something like it, I forgot the precise type).
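
The effect of zeroing mn_ReplyPort can be pictured with a small model in plain C. Nothing here is real exec code: msg_model, port_model, reply_model() and abandon_model() are invented names, and the three states only stand in for what I believe are NT_MESSAGE, NT_FREEMSG and NT_REPLYMSG. On a real system the zeroing would presumably have to happen inside Forbid()/Permit(), so that a reply cannot sneak in halfway through.

```c
#include <assert.h>
#include <stddef.h>

/* Invented stand-ins; the real structures live in the exec headers. */
enum msg_state { MSG_SENT, MSG_FREE, MSG_REPLIED };

struct port_model { int queued; };

struct msg_model {
    enum msg_state     state;
    struct port_model *reply_port;  /* plays the role of mn_ReplyPort */
};

/* Models what replying to a message does, as described above. */
void reply_model(struct msg_model *m)
{
    assert(m != NULL);
    if (m->reply_port == NULL) {
        m->state = MSG_FREE;    /* nobody to reply to: just mark it free */
        return;
    }
    m->state = MSG_REPLIED;
    m->reply_port->queued++;    /* message lands on the sender's port */
}

/* Sender side, on exit: detach a message that was never replied to. */
void abandon_model(struct msg_model *m)
{
    assert(m != NULL);
    m->reply_port = NULL;
}
```

After abandon_model(), a late reply_model() only changes the state of the message and never touches the (by then deleted) reply port.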
15 January 2023, 20:47   #20
tygre
Returning fan!
 
Join Date: Jan 2011
Location: Montréal, QC, Canada
Posts: 1,440
Hi all!

Again, thanks for the help

I updated the code snippet above because I had put a || where it should be a &&.

I also increased the delay (count to 25 instead of 5) to give a chance to the other program to actually reply! With a count to 5, for example, EaglePlayer didn't have the time to reply legitimately...

I think that increasing the waiting time to max. 5s could make sense: if after 5s the other program hasn't replied, something must be wrong anyway, mustn't it?
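
For reference, the worst-case wait of the polling loop is easy to work out, assuming the usual 50 ticks per second for Delay() (the same assumption behind "10 ticks = 200ms" earlier in the thread). The helper name and the macro below are made up for illustration, and the time spent inside GetMsg() itself is ignored.

```c
#include <assert.h>

#define TICKS_PER_SEC 50L   /* assumed Delay() tick rate */

/* Worst-case time spent polling, in milliseconds:
   tries * ticks_per_try ticks, at 1000 / 50 = 20 ms per tick. */
long worst_case_wait_ms(long tries, long ticks_per_try)
{
    assert(tries >= 0 && ticks_per_try >= 0);
    return tries * ticks_per_try * (1000L / TICKS_PER_SEC);
}
```

With these assumptions, worst_case_wait_ms(5, 10) gives 1000 (the original 1s) and worst_case_wait_ms(25, 10) gives 5000 (the 5s mentioned above).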

Cheers!

Last edited by tygre; 15 January 2023 at 21:13.