Software quality assurance days 25th International Conference on software quality issues sqadays.com St. Petersburg, May 31 – June 1, 2019 Expanding the idea of static analysis from code check to other development processes
About the speaker • Maxim Stefanov (stefanov@viva64.com) • C++/Java developer at PVS-Studio • Activity: • Participation in the development of the C++ analyzer core • Participation in the development of the Java analyzer
Content • General concepts of static analysis • Classic look • Alternative look • Conclusion
Static analysis is...
Static analysis inputs • Program code • Data in JSON, YAML, XML format • Documentation / article • Schematic, 3D model (in design) • …
Classic look: code base analysis
Code analysis is necessary for… • Searching for bugs, vulnerabilities, bottlenecks • Code formatting • Various metrics calculation • Code compliance verification • …
Abstract syntax tree while(b != 0) { if (a > b) { a = a - b; } else { b = b - a; } } return a;
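A minimal sketch of how the loop above could be held as a tree in memory. The `Node` class, its labels and the `block` wrapper node are hypothetical choices for illustration, not the representation used by any particular analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AstDemo {
    // Hypothetical minimal AST node: a label plus ordered children.
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = new ArrayList<>(Arrays.asList(children));
        }

        // Renders the subtree as an indented outline, one node per line.
        String dump(int depth) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < depth; i++) sb.append("  ");
            sb.append(label).append('\n');
            for (Node child : children) sb.append(child.dump(depth + 1));
            return sb.toString();
        }

        // Counts the nodes in the subtree rooted at this node.
        int size() {
            int n = 1;
            for (Node child : children) n += child.size();
            return n;
        }
    }

    // Builds the subtree for the statement: x = x - y;
    static Node assignDiff(String x, String y) {
        return new Node("=", new Node(x),
                new Node("-", new Node(x), new Node(y)));
    }

    // Builds the tree for:
    // while (b != 0) { if (a > b) { a = a - b; } else { b = b - a; } } return a;
    static Node euclid() {
        Node loopCond = new Node("!=", new Node("b"), new Node("0"));
        Node branch = new Node("if",
                new Node(">", new Node("a"), new Node("b")),
                assignDiff("a", "b"),
                assignDiff("b", "a"));
        Node loop = new Node("while", loopCond, branch);
        return new Node("block", loop, new Node("return", new Node("a")));
    }

    public static void main(String[] args) {
        System.out.print(euclid().dump(0));
    }
}
```

Braces, parentheses and semicolons do not appear in the tree: nesting alone carries that information, which is what makes the tree "abstract".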
Control flow graph (1) while (x < 50) { (2) if (a / b > 5) { (3) a = a - b; } else { (4) b = b - a; (5) } (6) x++; } (7) doSomething();
And… • Parse tree • Abstract semantic graph (semantic model) • …
Search for bugs, vulnerabilities and bottlenecks
Search for bugs, vulnerabilities and bottlenecks • AST-based defect pattern search • Defect search based on the semantic model • Search based on data flow analysis • … In detail:
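As an illustration of AST-based pattern search, here is a sketch of one simple pattern: a binary operator whose two operands are textually identical, such as `x == x`. The `Expr` class and the warning text are hypothetical; real analyzers work on far richer trees.

```java
import java.util.ArrayList;
import java.util.List;

class IdenticalOperandsCheck {
    // Hypothetical expression node: a leaf (identifier or literal) or a binary operator.
    // Binary nodes are assumed to always have both operands.
    static final class Expr {
        final String text;      // operator symbol, identifier, or literal
        final Expr left, right; // both null for leaves

        Expr(String text) { this(text, null, null); }

        Expr(String text, Expr left, Expr right) {
            this.text = text;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() { return left == null && right == null; }

        @Override public String toString() {
            return isLeaf() ? text : "(" + left + " " + text + " " + right + ")";
        }
    }

    // Walks the tree and reports every binary node whose operands print identically.
    static List<String> findIdenticalOperands(Expr root) {
        List<String> warnings = new ArrayList<>();
        visit(root, warnings);
        return warnings;
    }

    private static void visit(Expr e, List<String> out) {
        if (e == null || e.isLeaf()) return;
        if (e.left.toString().equals(e.right.toString())) {
            out.add("identical operands of '" + e.text + "': " + e);
        }
        visit(e.left, out);
        visit(e.right, out);
    }

    public static void main(String[] args) {
        // (a - b) == (a - b) && x > y
        Expr tree = new Expr("&&",
                new Expr("==",
                        new Expr("-", new Expr("a"), new Expr("b")),
                        new Expr("-", new Expr("a"), new Expr("b"))),
                new Expr(">", new Expr("x"), new Expr("y")));
        for (String warning : findIdenticalOperands(tree)) {
            System.out.println(warning);
        }
    }
}
```

The check needs no knowledge of types or values, only the shape of the tree. Checks based on the semantic model or on data flow need more context than this.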
Code formatting Why care about code formatting? • Compliance with the company's coding rules • Easy to read • Easy to maintain and debug • Increased probability of defect detection • …
Before formatting Action doIfSkilledSpeaker(double rating, int experience) { int index = 0; while(index < speakers.length) { Speaker sp = speakers[index]; if (sp.getRating()>=rating || sp.getExperience() > experience) { if(sp.isGirl()) { return LISTEN; } else if (doThink()) { return LISTEN; } } } return GO_HOME; }
Action getNextInterestingSpeaker(double rating, int experience) { int index = 0; while(index < speakers.length) { Speaker sp = speakers[index++]; if (sp.getRating() >= rating || sp.getExperience() > experience) { if (sp.isGirl()) { return LISTEN; } else if (doThink()) { return LISTEN; } } } return GO_HOME; } After formatting
Before formatting if (A) if (B) doSomething(); else doSomething(someObject);
if (A) { if (B) { doSomething(); } else { doSomething(someObj); } } After formatting if (A) if (B) doSomething(); else doSomething(someObj);
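The nested `if` without braces is the classic "dangling else" case. The sketch below shows in Java that the `else` binds to the nearest `if` regardless of indentation. The method names and result strings are made up for the demonstration.

```java
class DanglingElseDemo {
    // No braces: the else belongs to the nearest if, i.e. to `if (b)`,
    // even though the indentation suggests it belongs to `if (a)`.
    static String withoutBraces(boolean a, boolean b) {
        String result = "nothing";
        if (a)
            if (b)
                result = "inner-then";
        else
            result = "else";
        return result;
    }

    // Braces added to force the other reading: the else belongs to `if (a)`.
    static String elseOnOuterIf(boolean a, boolean b) {
        String result = "nothing";
        if (a) {
            if (b) {
                result = "inner-then";
            }
        } else {
            result = "else";
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withoutBraces(true, false));  // else
        System.out.println(withoutBraces(false, false)); // nothing
        System.out.println(elseOnOuterIf(true, false));  // nothing
        System.out.println(elseOnOuterIf(false, false)); // else
    }
}
```

A formatter that adds braces and re-indents according to the real structure makes the actual binding visible, so a mismatch between intent and behavior is easier to spot in review.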
Metrics calculation. Why measure software? • Determining the quality of an existing product or process • Predicting product/process quality • Improving product/process quality
Metrics calculation • Quantitative metrics • Program complexity metrics • Program size metrics • Program control flow complexity metrics • Data flow complexity metrics • Object-oriented metrics ...
Quantitative metrics • SLOC – lines of code (physical, logical) • Number and percentage of comments • Average number of lines per function (class, file) • Code duplication percentage • ...
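A rough sketch of how physical line counts and the comment percentage could be computed. It rests on simplifying assumptions: only whole-line `//` comments are counted, while block comments and trailing comments are ignored. The `LineMetrics` class is hypothetical.

```java
class LineMetrics {
    final int physical; // every line, including blank ones
    final int blank;    // lines containing only whitespace
    final int comment;  // lines containing only a // comment
    final int code;     // everything else

    private LineMetrics(int physical, int blank, int comment) {
        this.physical = physical;
        this.blank = blank;
        this.comment = comment;
        this.code = physical - blank - comment;
    }

    // Classifies each line of the source text as blank, comment, or code.
    static LineMetrics of(String source) {
        String[] lines = source.split("\r?\n"); // trailing empty lines are dropped
        int blank = 0;
        int comment = 0;
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                blank++;
            } else if (trimmed.startsWith("//")) {
                comment++;
            }
        }
        return new LineMetrics(lines.length, blank, comment);
    }

    // Share of comment lines among all physical lines, in percent.
    double commentPercent() {
        return physical == 0 ? 0.0 : 100.0 * comment / physical;
    }

    public static void main(String[] args) {
        LineMetrics m = LineMetrics.of("int a = 1;\n\n// note\nint b = 2;\n");
        System.out.println("physical=" + m.physical + " blank=" + m.blank
                + " comment=" + m.comment + " code=" + m.code
                + " comment%=" + m.commentPercent());
    }
}
```

Counting logical lines (statements) rather than physical ones would require parsing the code instead of scanning text.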
Quantitative metrics: defects in code metrics • Defect density = «Number of defects in a separate module» / «Total number of defects in software» • Regression coefficient = «Number of defects in old functionality» / «Total number of defects, including new functionality»
McCabe Metrics: Cyclomatic Complexity M = E − N + 2P where: M – cyclomatic complexity, E – number of edges in the graph, N – number of nodes in the graph, P – number of connected components.
Cyclomatic complexity Some language constructs in a graph representation
Cyclomatic complexity: example int someMethod(...){ int bot = 0; int top = n - 1; int mid, cmp; while (bot <= top) { mid = (bot + top)/2; if (table[mid] == item) { return mid; } else if (compare) { bot = mid + 1; } else { top = mid -1; } } return -1; }
Cyclomatic complexity: example E = 14, N = 12, P = 1; M = 14 − 12 + 2·1 = 4
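The calculation can be sketched in code. The graph below is a hypothetical, coarser control flow graph of the binary search example (9 nodes, 11 edges) and is not the 12-node graph from the slide. Merging straight-line nodes removes one node and one edge at a time, so M stays the same.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CyclomaticComplexity {
    // M = E - N + 2P
    static int fromCounts(int edges, int nodes, int components) {
        return edges - nodes + 2 * components;
    }

    // Computes M for a graph given as "node -> successors",
    // assuming a single connected component (P = 1).
    static int fromGraph(Map<Integer, List<Integer>> successors) {
        Set<Integer> nodes = new HashSet<>(successors.keySet());
        int edges = 0;
        for (List<Integer> targets : successors.values()) {
            edges += targets.size();
            nodes.addAll(targets);
        }
        return fromCounts(edges, nodes.size(), 1);
    }

    // Hypothetical compact CFG of the binary search method.
    static Map<Integer, List<Integer>> binarySearchCfg() {
        Map<Integer, List<Integer>> g = new LinkedHashMap<>();
        g.put(1, Arrays.asList(2));    // initialisation -> loop condition
        g.put(2, Arrays.asList(3, 8)); // while: loop body or "return -1"
        g.put(3, Arrays.asList(4, 5)); // if (table[mid] == item)
        g.put(4, Arrays.asList(9));    // return mid -> exit
        g.put(5, Arrays.asList(6, 7)); // else if (compare)
        g.put(6, Arrays.asList(2));    // bot = mid + 1 -> loop condition
        g.put(7, Arrays.asList(2));    // top = mid - 1 -> loop condition
        g.put(8, Arrays.asList(9));    // return -1 -> exit
        return g;
    }

    public static void main(String[] args) {
        System.out.println(fromCounts(14, 12, 1));        // counts from the slide
        System.out.println(fromGraph(binarySearchCfg())); // compact graph
    }
}
```

Both calls give 4, which also matches the shortcut "number of decision points plus one": the method has three decisions (`while`, `if`, `else if`).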
Cyclomatic complexity: what for An overly high cyclomatic complexity makes the code harder to: • understand, support and debug, • test.
Summing up the metrics. Note • Using metrics as punishment is dangerous • Using metrics for information and support is useful • Metrics are more useful when tracked over time
Source code obfuscation/deobfuscation Needed to protect code against analysis by both humans and machines, and for software with heightened requirements for crack resistance, for example: • DRM key protection • Game protection against extraneous code (cheats and bots) • Source protection during transfer/sale
Source code obfuscation/deobfuscation Before: int COUNT = 100; float TAX_RATE = 0.2; for (int i=0; i < COUNT; i++) { tax[i] = orig_price[i] * TAX_RATE; price[i] = orig_price[i] + tax[i]; } After: for(int a=0;a<100;a++){b[a]=c[a]*0.2;d[a]=c[a]+b[a];}
Source code obfuscation/deobfuscation
Validation check Known formats used for transferring data: JSON, XML, YAML Matching tools: JSONLint, XMLLint, YAMLLint Perform the analysis and reveal: • Syntax errors • Extra commas, brackets, spaces, … • Key duplication • Key definition order • …
Validation check { "warnings": [{ "title": "Some Title", "code": "V6050", "cwe": 0, "level": 1, "title": "Some message.", // Duplicate key 'title' "falseAlarm": false }] }
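A deliberately naive sketch of the key duplication check. It assumes a single flat JSON object (no nested objects) whose string values do not themselves look like `"key":` fragments; a real linter would use a proper parser. The `DuplicateKeyCheck` class is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DuplicateKeyCheck {
    // A double-quoted string (escapes allowed) immediately followed by a colon, i.e. a key.
    private static final Pattern KEY =
            Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"\\s*:");

    // Returns each key that occurs more than once, in order of first repetition.
    static List<String> duplicateKeys(String flatObject) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        Matcher m = KEY.matcher(flatObject);
        while (m.find()) {
            String key = m.group(1);
            if (!seen.add(key) && !duplicates.contains(key)) {
                duplicates.add(key);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // The inner object from the example above.
        String warning = "{ \"title\": \"Some Title\", \"code\": \"V6050\", \"cwe\": 0, "
                + "\"level\": 1, \"title\": \"Some message.\", \"falseAlarm\": false }";
        System.out.println(duplicateKeys(warning));
    }
}
```

Many JSON parsers accept duplicate keys silently and keep only one of the values, which is why a separate check is worth running.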
Publisher: internal development Task: checking articles and documentation for correctness Input data: articles and documentation. Output data: List of errors: • Matching links • Image verification • Code fragment correctness • Checking the date and authors of the documentation, articles • ...
Example: checking pictures for alpha channels Expectation Reality
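A sketch of how such an image check might look in Java, using the standard `BufferedImage` and `ColorModel.hasAlpha()` APIs. The images are created in memory to keep the example self-contained. Whether an alpha channel counts as an error depends on the publishing rules, which the slide does not state.

```java
import java.awt.image.BufferedImage;

class AlphaChannelCheck {
    // True if the image's color model carries an alpha (transparency) channel.
    static boolean hasAlphaChannel(BufferedImage image) {
        return image.getColorModel().hasAlpha();
    }

    public static void main(String[] args) {
        BufferedImage withAlpha = new BufferedImage(2, 2, BufferedImage.TYPE_INT_ARGB);
        BufferedImage opaque = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
        System.out.println(hasAlphaChannel(withAlpha)); // true
        System.out.println(hasAlphaChannel(opaque));    // false
    }
}
```

For files on disk, the image would first be loaded, for example with `javax.imageio.ImageIO.read(File)`, which returns `null` when no reader understands the file.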
Example: link validation The error: the Russian-language documentation links to an English-language source
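A sketch of a link language check under one loudly stated assumption: the site encodes the language as the first path segment (`/ru/...`, `/en/...`). That URL convention, the `example.com` addresses and the `LinkLanguageCheck` class are all hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LinkLanguageCheck {
    // Returns the links whose path does not start with "/<docLang>/".
    // Assumes absolute, syntactically valid URLs.
    static List<String> wrongLanguageLinks(String docLang, List<String> links) {
        String expectedPrefix = "/" + docLang + "/";
        List<String> wrong = new ArrayList<>();
        for (String link : links) {
            String path = URI.create(link).getPath();
            if (path == null || !path.startsWith(expectedPrefix)) {
                wrong.add(link);
            }
        }
        return wrong;
    }

    public static void main(String[] args) {
        List<String> links = Arrays.asList(
                "https://example.com/ru/docs/intro",
                "https://example.com/en/docs/setup");
        // Links found in a Russian-language document.
        System.out.println(wrongLanguageLinks("ru", links));
    }
}
```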
KOMPAS - Expert: static analysis inspiration • Input data: figure and 3D model • Output data: List of errors: • Design standards compliance • Enterprise restriction lists compliance • KOMPAS-3D work rules compliance
KOMPAS - Expert: static analysis inspiration
Conclusion: • Static analysis has been gaining momentum in recent years • In the near future its application will only grow
