-
-
Notifications
You must be signed in to change notification settings - Fork 162
Commit
…lower memory usage
- Loading branch information
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,6 +4,7 @@ | |
# Licensed under MIT (https://github.com/authentic-theme/authentic-theme/blob/master/LICENSE) | ||
# | ||
use strict; | ||
use Fcntl ':flock'; | ||
|
||
our $json; | ||
eval "use JSON::XS"; | ||
|
@@ -37,6 +38,68 @@ | |
virt => acl_system_status('virt'), | ||
}; | ||
|
||
# Read file contents | ||
sub stats_read_file_contents { | ||
my ($filename) = @_; | ||
# Check if file is readable | ||
return undef unless -r $filename; | ||
# Read file contents | ||
my $fh; | ||
if (!open($fh, '<', $filename)) { | ||
error_stderr("Failed to open file '$filename' for reading: $!"); | ||
return undef; | ||
} | ||
# Acquire lock | ||
if (!flock($fh, LOCK_SH)) { | ||
error_stderr("Failed to acquire shared lock on file '$filename': $!"); | ||
close($fh); | ||
return undef; | ||
} | ||
# Read file contents | ||
local $/; | ||
my $contents = <$fh>; | ||
# Release lock | ||
flock($fh, LOCK_UN); | ||
# Close file | ||
if (!close($fh)) { | ||
error_stderr("Failed to close file '$filename': $!"); | ||
return undef; | ||
} | ||
# Return file contents | ||
return $contents; | ||
} | ||
|
||
# Write file contents | ||
sub stats_write_file_contents { | ||
my ($filename, $contents) = @_; | ||
# Open file for writing | ||
my $fh; | ||
if (!open($fh, '>', $filename)) { | ||
error_stderr("Failed to open file '$filename' for writing: $!"); | ||
return 0; | ||
} | ||
# Acquire lock the file for writing | ||
if (!flock($fh, LOCK_EX)) { | ||
This comment has been minimized.
Sorry, something went wrong.
This comment has been minimized.
Sorry, something went wrong.
iliajie
Author
Collaborator
|
||
error_stderr("Failed to acquire exclusive lock on file '$filename': $!"); | ||
close($fh); | ||
return 0; | ||
} | ||
# Write to file | ||
if (!print($fh $contents)) { | ||
error_stderr("Failed to write to file '$filename': $!"); | ||
close($fh); | ||
return 0; | ||
} | ||
# Unlock the file | ||
flock($fh, LOCK_UN); | ||
# Close file | ||
if (!close($fh)) { | ||
error_stderr("Failed to close file '$filename': $!"); | ||
return 0; | ||
} | ||
return 1; | ||
} | ||
|
||
# Check if feature is enabled | ||
sub get_stats_option | ||
{ | ||
|
@@ -181,7 +244,7 @@ sub get_stats_history | |
my ($noempty) = @_; | ||
my $file = "$var_directory/modules/$current_theme". | ||
"/real-time-monitoring.json"; | ||
my $graphs = jsonify(read_file_contents($file)); | ||
my $graphs = jsonify(stats_read_file_contents($file)); | ||
# No data yet | ||
if (!keys %{$graphs}) { | ||
unlink($file); | ||
|
@@ -267,9 +330,7 @@ sub save_stats_history | |
# Save data | ||
my $file = "$var_directory/modules/$current_theme". | ||
"/real-time-monitoring.json"; | ||
lock_file($file); | ||
write_file_contents($file, $json->encode($graphs)); | ||
unlock_file($file); | ||
stats_write_file_contents($file, $json->encode($graphs)); | ||
# Release memory | ||
undef($graphs_chunk); | ||
undef($all_stats_histoy); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -94,12 +94,14 @@ my $statsserver_cmd = "$config_directory/$current_theme/$server_name"; | |
create_wrapper($statsserver_cmd, $current_theme, $server_name) | ||
if (!-r $statsserver_cmd); | ||
|
||
# Launch the server | ||
# Launch the server in a sub-process (no fork) | ||
my $logfile = $get_logfile->($port); | ||
my $rs = system_logged( | ||
"ulimit -v 384000; ". # Set sanity limit for memory usage to 384 MB | ||
This comment has been minimized.
Sorry, something went wrong.
jcameron
Collaborator
|
||
"SESSION_ID=$main::session_id ". | ||
"$statsserver_cmd @{[quotemeta($port)]} ". | ||
">$logfile 2>&1 </dev/null &"); | ||
|
||
# Return the result | ||
print_json({ success => !$rs, port => $port, | ||
socket => $get_socket->($port), | ||
|
11 comments
on commit c3098fe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do these locking and file reading functions fix the memory issue exactly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do these locking and file reading functions fix the memory issue exactly?
I didn't dig deeply, so I cannot tell you exactly how. However, my assumption is that there is something in the lock_file
, unlock_file
, read_file_contents
, and write_file_contents
subs and everything they call subsequently that created this issue.
Nevertheless, it's understandable because there has never been such an application for the Webmin API before — it was always meant to interrupt the process in a short period of time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly I'm not comfortable with a fix that we don't understand how it works! Also, reading/writing and locking/unlocking are very different operations, so it would be useful to know which one of these is causing the memory leak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't understand how it works!
We clearly understand how it work! Check this comment please.
What we don't understand (yet) is why the current implementation of lock_file
and unlock_file
doesn't release resources. I have an off-hand guess that it has everything to do with garbage collection and how things are meant to be cleared when terminating the whole process. In stats.pl
, we don't have any of that since this is a long-running process.
Also, reading/writing and locking/unlocking are very different operations, so it would be useful to know which one of these is causing the memory leak.
Sure, I can dig into that, but it shouldn't change anything regarding the current local implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What we don't understand (yet) is why the current implementation of lock_file and unlock_file doesn't release resources.
That's pretty important!!! I would really rather we don't add new features if they have memory leaks when using the standard APIs in Webmin. Implementing a work-around to fix it isn't solving the root problem, and could mean that we're not truly fixing the issue properly. I know you're excited about this new real-time stats feature, but it has to be done correctly, even if that means fixing bugs in the Webmin API.
Also, why is the save_stats_history function needed at all? I though that the long-running stats.pl process was just serving data from memory..
Also, is there only one stats.pl process running ever? If so, maybe locking isn't even needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementing a work-around to fix it isn't solving the root problem, and could mean that we're not truly fixing the issue properly.
This is correct, and I agree! However, the Webmin API has never been used like this before, i.e. in infinitely running process.
I created this simple and effective solution to perhaps help you rethink how we manage file locking and unlocking. I wouldn't risk changing these core subroutines right before the new Webmin release! I needed a solution that worked locally, and I made it happen.
I know you're excited about this new real-time stats feature, but it has to be done correctly, even if that means fixing bugs in the Webmin API.
I absolutely agree. However, the Webmin API doesn't need an immediate fix for this. As I've mentioned before, stats.pl
is a never-ending, long-running process. It's entirely new, and make take months to address properly. We need a new Webmin release now!
Also, why is the save_stats_history function needed at all? I though that the long-running stats.pl process was just serving data from memory..
Yes, it does, but we don't want to show empty graphs when the user returns to the Dashboard or logs back in. We want the user to see the last 20 minutes of server activity. So, we save stats to a file on the disk every 30 seconds and trim the file to keep only 20 minutes of data (this duration is fixed and can't be changed in the UI, though can be changed using browser's console but no more than to 60 minutes).
Also, is there only one stats.pl process running ever? If so, maybe locking isn't even needed?
Yes, as I mentioned earlier, it's not strictly necessary. However, since I've already implemented it, I'd prefer to keep it for a potentially easier integration with the Webmin API in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just because a solution "works" doesn't mean it is correct. I'd actually prefer to disable this new stats.pl feature by default until we can get it implemented properly, than to push out a new release that contains a work-around solution that we don't understand. We really have to focus on doing things right, not on new features or rushing out new releases. Because that's where bugs and security holes come from.
I understand why we need to save the stats history though, so we shouldn't remove that.
The real question is where the memory leak comes from.. if you were to revert all this code and just use read_file_contents
and write_file_contents
with no locking, does the leak still happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just because a solution "works" doesn't mean it is correct.
Jamie, I completely agree and understand! Though, what I don't understand yet, is what exactly am I doing wrong in my currently checked-in code? Does my current code introduce a bug? If so, how exactly? If there’s anything wrong with my code, I’d be happy to fix it right here and now.
I'd actually prefer to disable this new stats.pl feature by default
This won’t be possible to disable it without overriding each and every user preference — both when disabling it now and later when re-enabling it.
Until we can get it implemented properly
What is implemented wrong exactly? I would love to fix it, right here and right now. If you told me what's wrong.
Than to push out a new release that contains a work-around solution
Which workaround? I'm just reading and writing a file to the disk.
That we don't understand
Are you referring to a memory leak problem that occurs when I don’t use locally implemented functions to lock, read, and write a file? Well, in that case, I don’t see how this relates to stats.pl
in any way .. But I agree that we don’t yet understand why it happens when the other subs are used, though this is not related to the currently checked-in code.
We really have to focus on doing things right, not on new features or rushing out new releases. Because that's where bugs and security holes come from.
I cannot agree more!
I understand why we need to save the stats history though, so we shouldn't remove that.
Excellent, thanks!
The real question is where the memory leak comes from
To my knowledge, with the currently checked-in code, it isn’t coming from anywhere in stats.pl
or any related stats libraries. Or is it?
If you were to revert all this code and just use read_file_contents and write_file_contents with no locking, does the leak still happen?
This is a good question! I’m not sure, as I replaced those all at once.
Since you asked, I’ll run deeper tests today and let you know! Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you were to revert all this code and just use read_file_contents and write_file_contents with no locking, does the leak still happen?
This is a good question! I’m not sure, as I replaced those all at once.
Since you asked, I’ll run deeper tests today and let you know!
Alright, from what I can observe, I don't see any clear indication of a memory leak when using the stock read_file_contents
and write_file_contents
subs.
However, after adding lock_file
and unlock_file
, I clearly noticed memory usage for the stats.pl
process climbing up.
Now, I'm trying lock_file($file, undef, undef, 1);
instead to see how it goes to narrow down the problem ..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, using no-diff upon locking a file with Webmin API seems to
help
Yes!! That would explain it ... when logging is enabled, the lock_file / unlock_file functions store diffs in RAM so they can be included if and when webmin_log is eventually called. And since stats.pl runs forever, they would just accumulate and consume unlimited memory! Glad you were able to find this.
Although lock_file has a param to turn off logging, there's also a global variable $no_log_file_changes
that you should set instead at the top of stats.pl
so that all lock_file functions called indirectly don't do any logging of diffs into memory.
To answer your original question, THIS is the fix I was looking for because now we truly understand what is going wrong, and it only requires a single line of code to fix it. That's clearly the better option than writing a custom locking function for this case.
flock is great, but doesn't this mean that the file has already been opened for writing and thus truncated before the lock is taken? This is why Webmin's
lock_file
function uses a.lock
file to prevent concurrent access before opening the file.