Ensure crucial warnings and messages are logged to files #41

@DM-Berger

While a competent user can of course ensure this by redirecting stdout and stderr, it would be nice to have all terminal output written to files automatically, in case one forgets to do the redirection manually. This might be tricky for warnings and errors raised by different models (which, during multiprocessing and depending on the implementation, seem to write directly to file descriptors in a way that is impossible to capture), but we can at least ensure that anything printed by df-analyze itself is always also logged to a file.
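For the "at least df-analyze's own messages" part, a minimal sketch using the standard `logging` module might look like the following. The helper name `make_logger` and the logger name are hypothetical, not existing df-analyze code, and this only covers messages routed through the logger; it does not capture anything that native code writes directly to file descriptors.

```python
import logging
import sys
from pathlib import Path


def make_logger(log_path: Path, name: str = "df-analyze") -> logging.Logger:
    """Return a logger that sends every message to stderr AND to `log_path`.

    `make_logger` and the default logger name are illustrative assumptions.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # avoid double printing via the root logger
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        handlers = (
            logging.StreamHandler(sys.stderr),  # what the user sees
            logging.FileHandler(log_path, encoding="utf-8"),  # persistent copy
        )
        for handler in handlers:
            handler.setFormatter(fmt)
            logger.addHandler(handler)
    return logger
```

Usage would then be something like `log = make_logger(Path("run.log"))` followed by `log.warning("...")` wherever df-analyze currently prints.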

EDIT: This is quite non-trivial, as it needs to work cross-platform, must not require threading (in the event of strange spam errors from some C / Fortran fitting routine, we don't want thousands of writes handled by a thread that has to contend for the GIL and slows down tuning), and must actually be possible to test. I.e. if we implement this and just assume it works without careful testing, we will end up examining only the logfile and missing any messages that never reached it.
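On the testability point, one possible shape for such a test is sketched below. It redirects the OS-level descriptor for stdout to a file and then writes with `os.write`, which bypasses `sys.stdout` in the same way output from native code does. The names `redirect_fd` and `test_fd_level_writes_reach_logfile` are hypothetical, and this only checks the single-process case, not multiprocessing children.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def redirect_fd(fd: int, path: str):
    """Temporarily point OS-level descriptor `fd` at the file `path`."""
    saved = os.dup(fd)  # keep a handle on the original target
    target = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.dup2(target, fd)
        yield
    finally:
        os.dup2(saved, fd)  # restore the original descriptor
        os.close(saved)
        os.close(target)


def test_fd_level_writes_reach_logfile() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "run.log")
        sys.stdout.flush()  # don't let earlier buffered text leak into the log
        with redirect_fd(1, log_path):
            # os.write goes straight to the descriptor, skipping sys.stdout
            os.write(1, b"warning from native code\n")
        with open(log_path, "rb") as f:
            assert b"warning from native code" in f.read()
```

A fuller test suite would presumably repeat this with a worker process doing the writing, since that is the case the issue is most worried about.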

It seems we need to use pipes, os.dup, and os.set_inheritable to work with the file descriptors properly. Below is AI-generated code that I am posting just as a note for further reference as I read up on this and look into other potential ways to do this. Of course, we also want to make sure CLI args get handled correctly with this approach.

import multiprocessing as mp
import os
import sys

_CHUNK = 65536

def _child_entrypoint(w_fd: int, merge_stderr: bool, target, args, kwargs):
    # Route all stdout/stderr to the inherited pipe write end
    os.set_inheritable(w_fd, True)
    os.dup2(w_fd, 1)
    if merge_stderr:
        os.dup2(w_fd, 2)
    os.set_inheritable(1, True)
    os.set_inheritable(2, True)
    if w_fd not in (1, 2):
        os.close(w_fd)

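    # Execute target; propagate exit code. os._exit skips interpreter cleanup,
    # so flush Python-level buffers first or buffered output is lost.
    rc = 1
    try:
        result = target(*args, **kwargs)
        rc = int(result) if isinstance(result, int) else 0
    except SystemExit as e:
        # sys.exit() with no argument (code None) means success
        rc = 0 if e.code is None else (e.code if isinstance(e.code, int) else 1)
    except BaseException:
        import traceback
        traceback.print_exc()
    finally:
        sys.stdout.flush()
        sys.stderr.flush()
        os._exit(rc)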

def tee(target, *args, log_path="run.log", merge_stderr=True, **kwargs) -> int:
    """
    Run `target(*args, **kwargs)` in a child process.
    Mirror everything written to its stdout/stderr (and all its children)
    to both the parent's terminal and `log_path`. 
    Returns the child's exit code.
    """
    # Pipe for child's stdout/stderr
    r_fd, w_fd = os.pipe()
    os.set_inheritable(w_fd, True)

    # Save parent's real terminal stdout FD
    tty_fd = os.dup(1)

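    # Start child. NOTE: passing a raw descriptor number relies on the child
    # inheriting the parent's descriptor table, which holds for the "fork"
    # start method; under "spawn" (the default on Windows/macOS) w_fd is most
    # likely not a valid descriptor in the child, so this needs checking.
    ctx = mp.get_context()  # platform default start method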
    p = ctx.Process(target=_child_entrypoint,
                    args=(w_fd, merge_stderr, target, args, kwargs))
    p.start()

    # Parent: close write end; pump read -> tty + log
    os.close(w_fd)
    with open(log_path, "ab", buffering=0) as log:
        try:
            while True:
                chunk = os.read(r_fd, _CHUNK)
                if not chunk:
                    break
                os.write(tty_fd, chunk)
                log.write(chunk)
        finally:
            # Best-effort cleanup: a failure closing one descriptor must not
            # prevent closing the other.
            for fd in (r_fd, tty_fd):
                try:
                    os.close(fd)
                except OSError:
                    pass

    p.join()
    return p.exitcode
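
As a point of comparison with the `tee` function above, here is a sketch of a different technique: re-running the work as a child command via `subprocess` rather than via `multiprocessing`. `subprocess` sets up the pipe inheritance itself, so this variant should not depend on the start method, and anything the child (or its own children) writes to descriptors 1 and 2 goes through the pipe. The name `tee_subprocess` is hypothetical, and whether a launcher like this fits df-analyze is an open question.

```python
import os
import subprocess
import sys


def _write_all(fd: int, data: bytes) -> None:
    """os.write may write only part of `data`; loop until all of it is out."""
    while data:
        written = os.write(fd, data)
        data = data[written:]


def tee_subprocess(cmd: list[str], log_path: str = "run.log") -> int:
    """Run `cmd`, mirroring its stdout+stderr to fd 1 and to `log_path`.

    Returns the child's exit code. Hypothetical helper, not df-analyze code.
    """
    sys.stdout.flush()  # keep our own buffered output ahead of the child's
    with open(log_path, "ab") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the same pipe
            bufsize=0,  # unbuffered pipe: read() returns as soon as data arrives
        )
        assert proc.stdout is not None
        with proc.stdout:
            while True:
                chunk = proc.stdout.read(65536)
                if not chunk:  # EOF: every write end of the pipe is closed
                    break
                _write_all(1, chunk)
                log.write(chunk)
                log.flush()  # keep the logfile current if we crash later
        return proc.wait()
```

For the CLI-args concern, a launcher could in principle forward its arguments unchanged, e.g. `tee_subprocess([sys.executable, "df-analyze.py", *sys.argv[1:]])`, where the script name is only a placeholder. The pump loop runs in the main thread of the parent, so no extra thread is involved.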
