Discussion:
multithreading issues
nanogui-G/ASJRsgvFgvW+L+
2007-08-29 13:58:10 UTC
Permalink
Hi,

Going through various emails from this list and experiments I'm trying
to find a way to have one thread dealing with events and many other
threads calling GrXXX functions. To my defense I can only say, that
it's not me creating those threads, but some evil application
programmers;) I'm just trying to provide the graphics framework based
on nano-X.

The experiments have been conducted on an X86 with framebuffer support
and microwindows from CVS and of course THREADSAFE=Y.

Here are my observations:

1.) There is a deadlock both with LINK_APP_INTO_SERVER and without.
Create a thread, which calls GrGetNextEvent() and tell it to listen
only to keyboard events, but you never press any key.
GrGetNextEvent() takes a global lock and since select() never returns
no other thread can draw at the same time, although I don't see any
reason why this should be forbidden.
I propose two patches, which temporarily unlock the global lock in
case there are no events in the event queue, i.e. select() is
blocking, and lock it immediately after select(). (Please see files
attached)

With those two patches both the case of Client/Server and
LINK_APP_INTO_SERVER should be covered.

Greg, do you see any problems there?

2.) From what I saw the fun with multi threaded problems starts, when
there are non void GrXXX functions called, which means, that the
server needs to return something and its replies can get mixed up for
the client threads.
The solution to this problem should be to use LINK_APP_INTO_SERVER, so
there is no funny communication between client(s) and server, but
there are direct function calls instead.

Greg, would you consider to use LINK_APP_INTO_SERVER and the attached
patches as a safe approach for the described multi threading scenario?

Regards,
Robert
Greg Haerr
2007-08-29 19:13:36 UTC
Permalink
Post by nanogui-G/ASJRsgvFgvW+L+
Going through various emails from this list and experiments I'm trying
to find a way to have one thread dealing with events and many other
threads calling GrXXX functions.

This can only work, even with your mod, only if you can guarantee
that no two threads will ever call a non-void GrXXX function
at once. If the other threads only call void GrXXX (drawing only)
functions, this likely will work, since essentially the client->server
communications is write-only.
Post by nanogui-G/ASJRsgvFgvW+L+
With those two patches both the case of Client/Server and
LINK_APP_INTO_SERVER should be covered.

At first glance, it seems OK, I need to think more deeply about
the locking mechanism and exactly when select() returns 0.

Can you send me your sample program that shows the system
not working until the patches are applied?
Post by nanogui-G/ASJRsgvFgvW+L+
2.) From what I saw the fun with multi threaded problems starts, when
there are non void GrXXX functions called, which means, that the
server needs to return something and its replies can get mixed up for
the client threads.
The solution to this problem should be to use LINK_APP_INTO_SERVER, so
there is no funny communication between client(s) and server, but
there are direct function calls instead.

I agree this is likely the answer, but the server will execute every
request on behalf of the calling thread, and that thread can't be
interrupted until the server's done processing it. I'm not sure
this can be guaranteed without more thought. Easier would be
to allow only non-voide Gr calls as discussed above. (perhaps we
need to put a list together of non-void drawing functions for
further analysis)

Regards,

Greg
nanogui-G/ASJRsgvFgvW+L+
2007-09-03 14:42:51 UTC
Permalink
Hi Greg,

Sorry for the delay.
Now I revised my patches and added also an application, which shows
the problem, as you requested and inlined it all in the mail instead
of attachments to make your mailing list server more happy.

The non void nano-x calls are done from an init function (I want to
change this later on)
Afterwards two threads are created. One, which handles the key pressed
events and another one, which draws some stuff.

Here is the demo app:

---------->

#include <stdio.h>
#include <stdlib.h>
#define MWINCLUDECOLORS
#include "nano-X.h"
#include <pthread.h>

GR_WINDOW_ID wid;
GR_GC_ID gc;
GR_SCREEN_INFO si;

void init();
void handle_events(GR_EVENT * event);
void handle_graphics();

void init()
{
GrGetScreenInfo(&si);
wid = GrNewWindow(GR_ROOT_WINDOW_ID, 0, 0, si.cols, si.rows, 1, GRAY, WHITE);
GrSelectEvents(wid, GR_EVENT_MASK_KEY_DOWN);
GrMapWindow(wid);

gc = GrNewGC();
GrSetGCForeground(gc, RED);
}

void handle_events(GR_EVENT * event)
{
while (1)
{
printf("------------------------------------------------------\n");
GrGetNextEvent(event);
if(event != NULL)
{
switch (event->type)
{
case GR_EVENT_TYPE_KEY_DOWN:
printf("KEY PRESSED \n");
break;
}
}
printf("------------------------------------------------------\n");
}
}

void handle_graphics()
{
int i;

printf("Inside handle_graphics\n");

for (i=1; i<10; i++)
{
GrFillRect(wid, gc, 250-i*20, 230, 10, 50);
GrFlush();
sleep(1);
}
}

int main(int argc,char **argv) {
GR_EVENT event;
pthread_t p_thread_events;
pthread_t p_thread_graphics;

if (GrOpen() < 0) {
fprintf(stderr,"cannot open graphics\n");
exit(1);
}

init();
pthread_create(&p_thread_events, NULL, (void *)handle_events, &event);
pthread_create(&p_thread_graphics, NULL, (void *)handle_graphics, NULL);
sleep(100);
GrClose();
}


<----------

If I don't apply any patches, my system gets stuck (does not draw)
because GrGetNextEvent() holds a lock an so the other thread can not
draw.


Here are the patches:

---------------->

Index: microwin/src/nanox/client.c
===================================================================
--- microwin.orig/src/nanox/client.c
+++ microwin/src/nanox/client.c
@@ -989,8 +989,22 @@ _GrGetNextEventTimeout(GR_EVENT *ep, GR_
to.tv_usec = (timeout % 1000) * 1000;
}

- if((e = select(setsize+1, &rfds, NULL, NULL, timeout ? &to : NULL))>0) {
- int fd;
+
+ /* --> */
+ /* replaced: */
+ /*
+ if((e = select(setsize+1, &rfds, NULL, NULL, timeout ? &to :
NULL))>0) {
+ */
+ /* with: */
+ /* give Nano-X calls from other threads a chance to run by
releasing the lock */
+ UNLOCK(&nxGlobalLock);
+ e = select(setsize+1, &rfds, NULL, NULL, timeout ? &to : NULL);
+ /* we lock the GlobalLock again, which was unlocked a few
lines above */
+ LOCK(&nxGlobalLock);
+ if(e>0)
+ {
+ /* <-- */
+ int fd;

if(FD_ISSET(nxSocket, &rfds)) {
/*

<-------------------

-------------------->

Index: microwin/src/nanox/srvmain.c
===================================================================
--- microwin.orig/src/nanox/srvmain.c
+++ microwin/src/nanox/srvmain.c
@@ -596,6 +596,7 @@ GsSelect(GR_TIMEOUT timeout)
int setsize = 0;
struct timeval tout;
struct timeval *to;
+
#if NONETWORK
int fd;
#endif
@@ -708,7 +709,25 @@ GsSelect(GR_TIMEOUT timeout)
}

/* Wait for some input on any of the fds in the set or a timeout: */
- if((e = select(setsize+1, &rfds, NULL, NULL, to)) > 0) {
+
+ /* --> */
+ /* replaced: */
+ /*
+ if((e = select(setsize+1, &rfds, NULL, NULL, to)) > 0)
+ */
+ /* with: */
+#if NONETWORK
+ SERVER_UNLOCK();
+#endif
+ /* now we can call select, which will wait forever for an
event to arrive */
+ e = select(setsize+1, &rfds, NULL, NULL, to);
+ /* we lock the GlobalLock again, which was unlocked a few
lines above */
+#if NONETWORK
+ SERVER_LOCK();
+#endif
+ if(e>0)
+ /* <-- */
+ {
/* If data is present on the mouse fd, service it: */
if(mouse_fd >= 0 && FD_ISSET(mouse_fd, &rfds))
while(GsCheckMouseEvent())


<--------------------

With LINK_APP_INTO_SERVER, since all the Nano-X functions seem to
reentrant, everything should work fine from multiple threads, as far
as I can imagine.
It looks like the only communication, which is done through IPC is
between e.g. mouse-server and keyboard-server.

Also the client/server approach should work, but I guess this needs
some more thought about the protocol how the server replies to the
client.

Also I noted, that because MW_FEATURE_TIMERS is enabled in srvmain.c
select calls GdGetNextTimeout, which makes it timeout.

It looks like there is some CPU overhead by letting select time out
due to this frequent polling. I guess it would be fine to let select
only unblock in case there are new events coming in, or to increase
the timeout for e.g. a screen saver.
I don't use timers from my application (for now), so do I really need this?

Regards,

Robert

Loading...