Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New methods to access processes. #25

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions src/Deployment.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -140,6 +140,11 @@ const std::vector< std::string >& Deployment::getTaskNames() const
return tasks;
}

const std::map<std::string, std::string>& Deployment::getRenameMap() const
{
return renameMap;
}

const std::vector< std::string >& Deployment::getNeededTypekits() const
{
return typekits;
Expand Down
2 changes: 2 additions & 0 deletions src/Deployment.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,8 @@ class Deployment : public boost::noncopyable
* */
const std::vector<std::string> &getTaskNames() const;

const std::map<std::string, std::string> &getRenameMap() const;

const std::vector<std::string> &getNeededTypekits() const;

/**
Expand Down
132 changes: 89 additions & 43 deletions src/Spawner.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
#include <sys/types.h>
#include <sys/wait.h>
#include <iostream>
#include <sstream>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
Expand Down Expand Up @@ -92,7 +93,7 @@ Spawner& Spawner::getInstace()
}


Spawner::ProcessHandle::ProcessHandle(Deployment *deploment, bool redirectOutputv, const std::string &logDir) : isRunning(true), deployment(deploment)
Spawner::ProcessHandle::ProcessHandle(Deployment *deploment, bool redirectOutputv, const std::string &logDir) : deployment(deploment)
{
std::string cmd;
std::vector< std::string > args;
Expand Down Expand Up @@ -178,55 +179,31 @@ Spawner::ProcessHandle::ProcessHandle(Deployment *deploment, bool redirectOutput

bool Spawner::ProcessHandle::alive() const
{
//if it was already determined before that the process is already dead,
//we can stop here. Otherwise waitpid would fail!
if(!isRunning){
return isRunning;
}

int status = 0;
pid_t ret = waitpid(pid, &status, WNOHANG);

if(ret < 0 )
{
throw std::runtime_error(std::string("WaitPid failed ") + strerror(errno));
}

if(!status)
{
return isRunning;
}

if(WIFEXITED(status))
if(ret == 0)
return true;
else if(ret == pid)
return false;
else if(ret == -1 )
throw std::runtime_error(std::string("waitpid failed: ") + strerror(errno));
else
throw std::runtime_error("waitpid returned undocumented value");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function has a bug: the name of the function suggests that it's okay to call it on a non-running process with the expected return value false. This behavior was ensured with the previously present isRunning flag.

Whats the behavior now, if we call a process, kill it and then call alive after the collection of the termination signal again. I think the function would not show the expected behavior.

See also comment regarding killAll and handles.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be an issue. When the process just finished, waitpid will return the pid of the exited process, hence alive will just return false. When waitpid returns -1, it means that it could not check the status of the process.


bool Spawner::ProcessHandle::wait(int cycles, int usecs_between_cycles) const
{
while(alive())
{
int exitStatus = WEXITSTATUS(status);
std::cout << "Process " << pid << " terminated normaly, return code " << exitStatus << std::endl;
isRunning = false;
usleep(usecs_between_cycles);
cycles --;
if(cycles <= 0)
return false;
}

if(WIFSIGNALED(status))
{
isRunning = false;

int sigNum = WTERMSIG(status);

if(sigNum == SIGSEGV)
{

std::cout << "Process " << processName << " segfaulted " << std::endl;
}
else
{
std::cout << "Process " << processName << " was terminated by SIG " << sigNum << std::endl;
}

}

return isRunning;
return true;
}



const Deployment& Spawner::ProcessHandle::getDeployment() const
{
return *deployment;
Expand Down Expand Up @@ -256,7 +233,41 @@ void Spawner::ProcessHandle::sendSigTerm() const
}
}

bool Spawner::ProcessHandle::end()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for having this function as well as killDeployment + wait(). Looks like is a re-implementaion of the same thing (also re-implementing the sendSig* functions), and it looks like end() is called nowhere.. so I wonder what's its purpose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely true, I forgot to remove this one. It is now gone.

{
if(kill(pid, SIGINT))
{
std::stringstream msg;
msg << "Error sending of SIGINT to pid " << pid << " failed:" << strerror(errno);
throw std::runtime_error(msg.str());
}

int cycles = 500;
bool finished = false;
while(!finished && cycles > 0)
{
usleep(500);
cycles--;
finished = !alive();
}

if(finished)
return true;

if(kill(pid, SIGKILL))
{
std::cout << "Error sending of SIGKILL to pid " << pid << " failed:" << strerror(errno) << std::endl;
}

while(!finished && cycles > 0)
{
usleep(500);
cycles--;
finished = !alive();
}

return finished;
}

Spawner::ProcessHandle &Spawner::spawnTask(const std::string& cmp1, const std::string& as, bool redirectOutput)
{
Expand All @@ -272,6 +283,13 @@ Spawner::ProcessHandle& Spawner::spawnDeployment(Deployment* deployment, bool re
logDir = Bundle::getInstance().getLogDirectory();
}

// rename the logger of default deployments
// this guarantees that every task has it's own logger
if(deployment->getName().find("orogen_default_") == 0)
{
deployment->renameTask(deployment->getLoggerName(), deployment->getName() + "_Logger");
}

ProcessHandle *handle = new ProcessHandle(deployment, redirectOutput, logDir);

handles.push_back(handle);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think handles never get removed from the handles-variable. Thus there's no real cleanup happening. This problem apparently was present also before the MR, but with the issue described at alive this might lead to unexpected behavior...

e.g. A single deployment was killed, afterwards killAll is called. In this case alive will be called on the already-terminated process again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the case, I need to look into this a bit further. But I think this issue existed before already.

Expand All @@ -284,6 +302,16 @@ Spawner::ProcessHandle& Spawner::spawnDeployment(Deployment* deployment, bool re
return *handle;
}

Spawner::ProcessHandle& Spawner::getDeployment(const std::string& dplName)
{
for(Spawner::ProcessHandle* h : handles)
{
if(h->getDeployment().getName() == dplName)
return *h;
}
throw std::runtime_error(std::string("Deployment does not exist: ") + dplName);
}

Spawner::ProcessHandle& Spawner::spawnDeployment(const std::string& dplName, bool redirectOutput)
{
Deployment *deploment = new Deployment(dplName);
Expand Down Expand Up @@ -344,6 +372,24 @@ void Spawner::waitUntilAllReady(const base::Time& timeout)
}
}

bool Spawner::killDeployment(const std::string &dplName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

look at comment above regarding end()

{
Spawner::ProcessHandle &handle = getDeployment(dplName);

handle.sendSigInt();
if(!handle.wait())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the parameters for wait should be configurable here. I'd suggest to add them also to killDeployment

{
std::cout << "Failed to terminate deployment '" << dplName << "', trying to kill..." << std::endl;
handle.sendSigKill();
if(!handle.wait())
{
std::cout << "Failed to kill deployment '" << dplName << "'." << std::endl;
return false;
}
}
return true;
}

void Spawner::killAll()
{
//first we try to stop and cleanup the processes
Expand Down
15 changes: 14 additions & 1 deletion src/Spawner.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,6 @@ class Spawner : public boost::noncopyable
public:
class ProcessHandle
{
mutable bool isRunning;
pid_t pid;
void redirectOutput(const std::string &filename);
std::string processName;
Expand All @@ -41,6 +40,8 @@ class Spawner : public boost::noncopyable

const Deployment &getDeployment() const;
bool alive() const;
bool wait(int cycles=500, int usecs_between_cycles=500) const;
bool end();
void sendSigInt() const;
void sendSigTerm() const;
void sendSigKill() const;
Expand Down Expand Up @@ -85,6 +86,12 @@ class Spawner : public boost::noncopyable
* */
ProcessHandle &spawnDeployment(Deployment *deployment, bool redirectOutput = true);

/**
* Get a deployment by its name
* @arg dplName
*/
ProcessHandle &getDeployment(const std::string &dplName);

/**
* This method checks if all spawened processes are still alive
* @return false if any process died
Expand All @@ -105,6 +112,12 @@ class Spawner : public boost::noncopyable
* */
void waitUntilAllReady(const base::Time &timeout);

/**
* Kill deployment with the given name
* @arg dplName
*/
bool killDeployment(const std::string &dplName);

/**
* This method first sends a sigterm to all processes
* and waits for the processes to terminate. If this
Expand Down