From: shriram@ollie.cs.rice.edu (Shriram Krishnamurthi)
Newsgroups: comp.lang.scheme.scsh
I've been running the Scsh web server for a while now, and I find that
periodically, the server just quits on me. There doesn't seem to be
any clear period of time after which this happens; sometimes it's soon
after start-up, at others it's after a few days of operation.
In each case, it seems to be printing the following error message to
the error output port:
    Error: 5
    "Input/output error"
    #{Procedure 8590 flush-fdport*}
    #{fdport-data}
That's errno EIO in FreeBSD. Here's the relevant code:
    int flush_fdport(scheme_value data)
    {
        FILE *f = fstar_cache[EXTRACT_FIXNUM(*PortData_Fd(data))];
        return fflush(f) ? errno : 0;
    }
Now, I'd guess that the only syscall fflush(f) makes is write(). The write(2)
man page's entry for EIO says:
    [EIO] An I/O error occurred while reading from or writing to the
          file system.
which isn't real useful. But at least we know scsh isn't erroneously trying
to flush output to a closed file descriptor, because that would give EBADF
or EPIPE. But... what does "file system" mean? Does it mean just the disk,
or can this include network I/O? If the former, that is quite strange.
Why don't you run the server interactively, so that when it craps out, you'll
get thrown into a debugger breakpoint REPL instead of having the process
exit? Then get a backtrace, and find out what port/file-descriptor is messing
you up. This is critical.
It just might be the case that this error is a legitimate error, caused by
client behaviour -- perhaps the client dropped his end of the socket.
But as I recall, I carefully wrote the server so that if anything went wrong
when interacting with the client, it would just abort the transaction (by
catching error exceptions and jumping back to the connect loop) and keep
on keepin' on.
So maybe the I/O error is happening outside the client-handling code...
That would be interesting.
Get a stack trace and let me know what's up.
-Olin