Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce CxxAstVisitorProfiler for CxxAstScanner #1507

Closed
wants to merge 1 commit into from

Conversation

ivangalkin
Copy link
Contributor

@ivangalkin ivangalkin commented Jun 25, 2018

Hi @guwirth, @Bertk, @jmecosta and All,

I am still suffering from the plugin slowdown, which is most probably caused by the CxxSquidSensor.
I tried hprof as profiling agent, but it is too fine-grained and works slow with the real SonarQube projects. Since we already plan to make the CxxSquidSensor parallel (#1493), my idea was to introduce some high-level "flight data recorder": for each check (or rather for each AstVisitor) collect its duration. See details below.

My hand-made profiler has several disadvantages, e.g. the analysis has to be implemented in some external tool (e.g. in python). Do you know some better standard profiling tools? Or maybe there are standard tools, which could import the fligh-data and visualize/analyze it? What do you think of this patch in particular? Any ideas are welcome. Thank you!

Details of the current patch:

  • introduce the hidden property sonar.cxx.squidprofiler.out=<path>

  • if property set, decorate each AstVisitor (all squid checks, CPD, highlighting etc), so that it tracks the elapsed time per source file

  • dump this duration as well as other invocation details to <path> in JSON format

  • currently only duration is logged

  • for parallel execution we can log start/stop time also, in order to check the CPU utilization level (see Speed-up analysis time with thread pool #1493)

  • output: semi-JSON format for python analysis (e.g. in http://jupyter.org/)
    { "visitor": "%s", "source_file": "%s", "duration_ms": "%d" }


This change is Reviewable

@ivangalkin ivangalkin requested review from Bertk, guwirth and jmecosta June 25, 2018 22:07
@guwirth
Copy link
Collaborator

guwirth commented Jun 26, 2018

@ivangalkin

  • First of all the question is if a check is really the root cause? Have you turned all checks off to see if it is still slow?
  • Think we should not introduce new files / settings, should be sonar.showProfiling
    or sonar.log.level=DEBUG/TRACE. By the way: have you already tried showProfiling?
    • Maybe you can also look at the core what they do in case of showProfiling=true
  • I don't like to add this additional code. But it's just a feeling...

* introduce the hidden property sonar.cxx.squidprofiler.out=<path>
* if property set, decorate each AstVisitor (all squid checks, CPD, highlighting etc), so that it tracks the elapsed time per source file
* dump this duration as well as other invocation details to <path> in JSON format
* currently only duration is logged
* for parallel execution we can log start/stop time also, in order to check the CPU utilization level (see SonarOpenCommunity#1493)
* output: semi-JSON format for python analysis (e.g. in http://jupyter.org/)
  `{ "visitor": "%s", "source_file": "%s", "duration_ms": "%d" }`
@ivangalkin ivangalkin changed the title introduce CxxSquidProfiler for CxxSquidSensor introduce CxxAstVisitorProfiler for CxxAstScanner Jun 27, 2018
@ivangalkin
Copy link
Contributor Author

Hi @guwirth,

thank you for your comments. I've overseen the existence of sonar.showProfiling property. I will give it a try, although I have not seen, that it is used in SSLR SquidBridge. This component "orchestrates" the whole analysis in sonar-cxx and this part is of the highest interest to me.

In a meantime I updated the patch in order to increase the scope of visitors. I've identified a couple of hotspots already and will come back with conclusions later.

There is no need to merge this PR. It's for internal usage only and probably of no interest to other developer AFAIS. Actually, I would like to propose it to sslr-squid-bridge, but this project looks almost frozen to me.

@m-g-sonar are you interested in profiling functionality for sslr-squid-bridge?

@m-g-sonar
Copy link

Hey @ivangalkin,

Thanks for the ping. We are indeed slowly but surely moving away from sslr-squid-bridge at SonarSource, dropping features and progressively stopping maintenance of the project. Today, we are only keeping it "more or less" up-to-date for a very limited set of projects at SonarSource. The idea being to completely remove all dependencies on it eventually. While help/simplify profiling is a quite interesting topic, sslr-squid-bridge is probably not the best place to introduce that! @henryju something to integrate/promote to the scanner API maybe?

Cheers,
Michael

@ivangalkin
Copy link
Contributor Author

Hi @m-g-sonar! We suspected the sunsetting of sslr-squid-bridge. It is understandable, that there is no demand in features like profiling. Nevertheless thank you for the quick and very informative response! Best regards, Ivan

ivangalkin added a commit to ivangalkin/sonar-cxx that referenced this pull request Jun 28, 2018
* `PUBLIC_API`, `PUBLIC_UNDOCUMENTED_API` and `PUBLIC_DOCUMENTED_API_DENSITY`
  were depricated since SQ 6.2 (https://jira.sonarsource.com/browse/SONAR-8328)
  but existed as custom sonar-cxx metrics

* the minimal supported SQ version now is 6.7

* moreover, it looks like there is a general problem in displaying
  custom metrics (see SonarOpenCommunity#1509)

* that means that nobody a) expects that the SQ plugin implements the
  deprectad metrics and b) nobody misses them (because they are just not
  visualized)

* BTW public API measurements belong to the obligatory AST Visitors
  and introduce the time overhead of ~6% (measured with SonarOpenCommunity#1507,
  6% means, that if there is no sensors activated at all
  the importing of C++ project will still cause a calculation
  of a bunch of metrics; summary overhead of this calculation
  considered as 100%)
ivangalkin added a commit to ivangalkin/sonar-cxx that referenced this pull request Jun 28, 2018
* `PUBLIC_API`, `PUBLIC_UNDOCUMENTED_API` and `PUBLIC_DOCUMENTED_API_DENSITY`
  were depricated since SQ 6.2 (https://jira.sonarsource.com/browse/SONAR-8328)
  but existed as custom sonar-cxx metrics

* the minimal supported SQ version now is 6.7

* moreover, it looks like there is a general problem in displaying
  custom metrics (see SonarOpenCommunity#1509)

* that means that nobody a) expects that the SQ plugin implements the
  deprectad metrics and b) nobody misses them (because they are just not
  visualized)

* the squid check `UndocumentedApiCheck` is not affected
  (it doesn't rely on the stored metric, but visits the AST
  by itself)

* BTW public API measurements belong to the obligatory AST Visitors
  and introduce the time overhead of ~6% (measured with SonarOpenCommunity#1507,
  6% means, that if there is no sensors activated at all
  the importing of C++ project will still cause a calculation
  of a bunch of metrics; summary overhead of this calculation
  considered as 100%)
ivangalkin added a commit to ivangalkin/sonar-cxx that referenced this pull request Jun 28, 2018
* `PUBLIC_API`, `PUBLIC_UNDOCUMENTED_API` and `PUBLIC_DOCUMENTED_API_DENSITY`
  were depricated since SQ 6.2 (https://jira.sonarsource.com/browse/SONAR-8328)
  but existed as custom sonar-cxx metrics

* the minimal supported SQ version now is 6.7

* moreover, it looks like there is a general problem in displaying
  custom metrics (see SonarOpenCommunity#1509)

* that means that nobody a) expects that the SQ plugin implements the
  deprectad metrics and b) nobody misses them (because they are just not
  visualized)

* the squid check `UndocumentedApiCheck` is not affected
  (it doesn't rely on the stored metric, but visits the AST
  by itself) but `CxxPublicApiVisitorTest` had to be rewritten

* BTW public API measurements belong to the obligatory AST Visitors
  and introduce the time overhead of ~6% (measured with SonarOpenCommunity#1507,
  6% means, that if there is no sensors activated at all
  the importing of C++ project will still cause a calculation
  of a bunch of metrics; summary overhead of this calculation
  considered as 100%)
@ivangalkin
Copy link
Contributor Author

  1. Fix #1509 - enable custom metrics introduced by sonar-cxx #1514 improved performance and reduced the memory overhead: aggregation of metrics is not done in Hashtables anymore, but rather performed server-side in MeasureComputer. So there is no big need in profiling anymore.

  2. there is no interest for merging this functionality neither from sonar-cxx, nor from sslr-squid-bridge side

  3. I can imagine, that we could benefit from profiling, when somebody starts to implement Speed-up analysis time with thread pool #1493. But for now I will close this PR.

Thank you for the comments.

@ivangalkin ivangalkin closed this Aug 3, 2018
@jmecosta
Copy link
Member

jmecosta commented Aug 4, 2018 via email

@ivangalkin
Copy link
Contributor Author

In case someone uses/will use this code for profiling: I've done a wrong assumption about the order of visitX()/leaveX() invocations. It's not enough to measure the elapsed time between visitFile() and leaveFile(). One has to measure time spent inside of each visitX()/leaveX() method. See implementation of org.sonar.squidbridge.AstScanner::scanFiles(Collection<File> files).

The following patch solves the problem:

diff --git a/cxx-squid/src/main/java/org/sonar/cxx/CxxAstVisitorProfiler.java b/cxx-squid/src/main/java/org/sonar/cxx/CxxAstVisitorProfiler.java
index fde2add..d2a44d9 100644
--- a/cxx-squid/src/main/java/org/sonar/cxx/CxxAstVisitorProfiler.java
+++ b/cxx-squid/src/main/java/org/sonar/cxx/CxxAstVisitorProfiler.java
@@ -48,7 +48,7 @@ public class CxxAstVisitorProfiler {
 
   public static class CxxProfilerDecorator_AstVisitor<G extends Grammar> extends SquidAstVisitor<G> {
     protected SquidAstVisitor<G> v;
-    private long startTimestamp = 0;
+    private long nanoPerFile = 0;
     private File outFile;
 
     public CxxProfilerDecorator_AstVisitor(SquidAstVisitor<G> visitor, File outF) {
@@ -99,25 +99,31 @@ public class CxxAstVisitorProfiler {
 
     @Override
     public void visitFile(@Nullable AstNode ast) {
-      startTimestamp = System.nanoTime();
+      final long startTimestamp = System.nanoTime();
       v.visitFile(ast);
+      nanoPerFile = System.nanoTime() - startTimestamp;
     }
 
     @Override
     public void leaveFile(@Nullable AstNode ast) {
+      final long startTimestamp = System.nanoTime();
       v.leaveFile(ast);
-      long nanoDuration = System.nanoTime() - startTimestamp;
-      writeEvent(nanoDuration);
+      nanoPerFile += System.nanoTime() - startTimestamp;
+      writeEvent(nanoPerFile);
     }
 
     @Override
     public void visitNode(AstNode ast) {
+      final long startTimestamp = System.nanoTime();
       v.visitNode(ast);
+      nanoPerFile += System.nanoTime() - startTimestamp;
     }
 
     @Override
     public void leaveNode(AstNode ast) {
+      final long startTimestamp = System.nanoTime();
       v.leaveNode(ast);
+      nanoPerFile += System.nanoTime() - startTimestamp;
     }
   }

ivangalkin added a commit to ivangalkin/sonar-cxx that referenced this pull request Oct 2, 2018
* I profiled our visitors (SonarOpenCommunity#1507) and came to the conclusion,
  that CxxParseErrorLoggerVisitor might require a terrible amount
  of time in order to analyze and to print out all the parsing errors

* The ordinary user should not be interested in the parsing process.
  He/she has no influence on that. Such information is important
  if parsing affects some internal checks or metrics. If user decides
  to submit a bug report, the full debug trace will include the
  parsing errors.

* In general I believe, that our trace is too verbose. So this
  change address this problem too.
ivangalkin added a commit to ivangalkin/sonar-cxx that referenced this pull request Oct 4, 2018
* I profiled our visitors (SonarOpenCommunity#1507) and came to the conclusion,
  that CxxParseErrorLoggerVisitor might require a terrible amount
  of time in order to analyze and to print out all the parsing errors

* The ordinary user should not be interested in the parsing process.
  He/she has no influence on that. Such information is important
  if parsing affects some internal checks or metrics. If user decides
  to submit a bug report, the full debug trace will include the
  parsing errors.

* In general I believe, that our trace is too verbose. So this
  change address this problem too.

UPDATE:
* review comments were applied: once per module there will be a
  WARN trace about parsing errors (if relevant). Errors details
  will be printed only in DEBUG
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

4 participants