Machinery for translating pipe stream into messages #251
Conversation
I think so, or preserve the current behavior of associating it with the most recently seen parent.
def new_stream_msg(self, text, parent_id):
    msg = {}
    header = msg_header(new_id(), 'stream', self.username, self.session_id)
    parent_header = msg_header(parent_id, 'execute_request',
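For context, here is a self-contained sketch of the message shape this snippet builds. The helpers `new_id` and `msg_header` are approximated here (their real definitions are not in this excerpt), and the parameter list is flattened so the sketch runs on its own:

```python
import uuid
from datetime import datetime, timezone

def new_id():
    # Stand-in for the real message-ID helper: a fresh UUID string.
    return str(uuid.uuid4())

def msg_header(msg_id, msg_type, username, session_id):
    # Approximation of the Jupyter message header fields.
    return {
        'msg_id': msg_id,
        'msg_type': msg_type,
        'username': username,
        'session': session_id,
        'date': datetime.now(timezone.utc).isoformat(),
    }

def new_stream_msg(stream_name, text, parent_id, username, session_id):
    # Build a 'stream' message whose parent_header is synthesized from
    # just the request's msg_id: the design point debated in this thread.
    header = msg_header(new_id(), 'stream', username, session_id)
    parent_header = msg_header(parent_id, 'execute_request',
                               username, session_id)
    return {
        'header': header,
        'msg_type': header['msg_type'],
        'parent_header': parent_header,
        'content': {'name': stream_name, 'text': text},
        'metadata': {},
    }
```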
I don't think we should be constructing parent_headers here. The actual request header should be arriving here somehow.
I deliberately did it this way so that this machinery doesn't need to see the messages going to the kernel - I think it's going to be simpler to integrate into applications that way, and as far as I can tell, frontends only rely on the parent message ID to know what a message is a response to.
If the application does need the full parent header, a separate piece further up the stack could keep a cache of message headers sent to the kernel, and apply them to the messages coming from this machinery by matching the IDs.
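That "separate piece further up the stack" could be as simple as a bounded cache keyed by message ID. A hypothetical sketch (class and method names are invented for illustration, not part of this PR):

```python
from collections import OrderedDict

class ParentHeaderCache:
    # Hypothetical helper: remember the full header of each request sent
    # to the kernel, keyed by msg_id, and patch it into the messages
    # produced by the pipe-translation machinery.

    def __init__(self, maxsize=1000):
        self._headers = OrderedDict()
        self._maxsize = maxsize

    def record(self, request_msg):
        # Call this for every request forwarded to the kernel.
        header = request_msg['header']
        self._headers[header['msg_id']] = header
        while len(self._headers) > self._maxsize:
            self._headers.popitem(last=False)  # evict the oldest entry

    def apply(self, msg):
        # Replace a synthesized parent_header with the cached original,
        # matching on msg_id; on a cache miss, leave the message alone.
        parent_id = msg.get('parent_header', {}).get('msg_id')
        if parent_id in self._headers:
            msg['parent_header'] = self._headers[parent_id]
        return msg
```

This keeps the pipe machinery itself unaware of kernel-bound traffic, as argued above, while still letting an application recover exact parent headers when it needs them.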
'execute_request' is often the wrong value for msg_type, though. The parent_header needs to match the msg_type and session of the original request (it really should match all fields, but those are especially important).
Maybe when a new request is seen, it gets forwarded on this pipe?
I thought with kernel nanny the Nanny was going to proxy all requests, so that it would necessarily see every request as it passes by, so the msg_id could be looked up in the recent requests that passed through.
I've lost track of what we actually wanted from the kernel nanny, so I thought I'd try to work on this bit separately, because at least I understand it. I thought we were heading in the direction of allowing a Python-based process such as the notebook server to integrate the nanny functionality without running a separate process, so I'd start by doing output capturing without the nanny inside one or more of our frontends. At the moment, I can't even think of an easy way to integrate it, though, because it requires changes in multiple repositories.
    msg['msg_type'] = header['msg_type']
    msg['parent_header'] = parent_header
    msg['content'] = {u'name': self.stream_name, u'text': text}
    msg['metadata'] = {}
I think this can be simplified with msg = session.msg(msg_type, content=content, parent_header=parent_header)
This is the initial piece for capturing output from the kernel at the OS level, by hooking stdout and stderr up to pipes.
Kernels wanting to make use of this machinery will need to delimit the stream output resulting from executing a cell with markers containing the execute message ID.
These markers are used to set the message ID in the parent header of the messages generated, so that a frontend can associate them with the relevant cell. The end marker is translated into an EndOfOutput object - once the frontend has seen one of these from each captured stream, plus the kernel-idle message on iopub, it knows it has all the output produced while that cell was executing. The cell may have started background threads, and output from those may be lost or misdirected, but that's already the case, and I think it's unavoidable for piped data. I believe this machinery should recover if e.g. one of the markers it looks for isn't there.

Question: at present, output outside the marked blocks (after one END, before the next BEGIN) is silently discarded. Do we want to expose it to the frontend in some way, e.g. as stream output messages without a parent header?
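To make the translation loop concrete, here is a minimal sketch of turning marked pipe output into events. The marker syntax is an assumption (the real format isn't shown in this excerpt); BEGIN/END lines carrying the execute msg_id are invented purely for illustration:

```python
import re

# Assumed marker format: one BEGIN/END line per executed cell, carrying
# the execute request's message ID.
BEGIN = re.compile(r'<output-begin (?P<msg_id>\S+)>')
END = re.compile(r'<output-end (?P<msg_id>\S+)>')

class EndOfOutput:
    # Sentinel emitted when an end marker is seen on a stream.
    def __init__(self, parent_id):
        self.parent_id = parent_id

def translate(lines):
    # Turn raw pipe output into (parent_id, text) events plus EndOfOutput
    # sentinels. Text outside a marked block is silently discarded, which
    # is exactly the open question raised above.
    parent_id = None
    for line in lines:
        m = BEGIN.match(line)
        if m:
            parent_id = m.group('msg_id')
            continue
        m = END.match(line)
        if m:
            yield EndOfOutput(parent_id)
            parent_id = None
            continue
        if parent_id is not None:
            yield (parent_id, line)
        # else: output between END and the next BEGIN is dropped
```

Exposing the dropped text would amount to adding an `else` branch that yields events with no parent ID, which the frontend could render as stream output without a parent header.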