NSA, Ghidra and unicorns

This time, the PVS-Studio team was attracted by Ghidra, a large and evil framework for reverse engineering with which you can analyze various binary files and do all kinds of scary things with them. The most interesting thing about it is not that it is free to use or extends well with plugins, but that they wrote it in the NSA and posted the source code on GitHub for everyone. On the one hand, the NSA seems to have enough resources to keep the code base clean. And on the other hand, new contributors who were not very familiar with it could have accidentally added undetected bugs lately. Therefore, armed with a static analysis, we decided to look for weaknesses in this project.


In total, the PVS-Studio static analyzer issued 651 high, 904 medium, and 909 low warnings in the Java part of the Ghidra project ( release 9.1.2, commit 687ce7f ). Among them, about half of the high and medium responses were triggered by the diagnostics " V6022 Parameter is not used inside method's body", which usually appear after refactoring, when some parameter was no longer needed or some functionality was temporarily disabled by comments. A quick look at these warnings (there are too many of them to view each as an outside observer) in this project did not reveal anything clearly suspicious. It is probably possible for this project to temporarily disable this diagnostics in the analyzer settings so as not to be distracted by it. In practice, typos can often be found in the name of a setter or constructor parameter and, generally speaking, it should not be neglected. I am sure that most readers have at least once encountered such an unpleasant pattern:

public class A { private String value; public A(String val) {//V6022 this.value=value; } public int hashCode() { return value.hashCode();//NullPointerException } } 

More than half of the low warnings were made by the diagnosis " V6008 Potential null dereference of 'variable'" - for example, the value is often used File.getParentFile () without checking for null . If the file object on which this method was called was constructed without an absolute path, null will be returned and the absence of verification may drop the application.

By tradition, we will analyze only warnings of the high and medium levels, since most of the real errors are contained in them. When working with analyzer reports, we always recommend analyzing warnings in descending order of their reliability.

Next, we consider a few fragments indicated by the analyzer that seemed suspicious or interesting to me. The code base of the project turned out to be of impressive size, and it is almost impossible to find such places manually.

Fragment 1: broken validation

private boolean parseDataTypeTextEntry() throws InvalidDataTypeException { ... try { newDataType=parser.parse(selectionField.getText(), getDataTypeRootForCurrentText()); } catch (CancelledException e) { return false; } if (newDataType != null) { if (maxSize >= 0 && newDataType.getLength() > newDataType.getLength()) {//<= throw new InvalidDataTypeException("data-type larger than " + maxSize + " bytes"); } selectionField.setSelectedValue(newDataType); return true; } return false; } 

PVS-Studio warning: V6001 There are identical sub-expressions 'newDataType.getLength ()' to the left and to the right of the '>' operator. DataTypeSelectionEditor.javahaps66

This class provides a graphical component for selecting a data type that supports autocompletion. The developer using this component can set the maximum allowable size of the selected data type (via the maxSize field) or make it unlimited by setting a negative value. It was assumed that when validating the entered data, exceeding the limit throws an exception, which is then caught higher in the call stack and the user is shown the corresponding message.

It seems that the author of the component was distracted right at the time of writing this test, or maybe he thought about the meaning of life, but in the end, validation is simply not performed, since the number can never be larger than itself and, accordingly, we get ignoring this condition. This means that this component can provide invalid data.

Another similar error was found in two more classes: GuidUtil and NewGuid .

public class GuidUtil { ... public static GuidInfo parseLine(...) { ... long[] data=new long[4]; ... if (isOK(data)) { if (!hasVersion) { return new GuidInfo(guidString, name, guidType); } return new VersionedGuidInfo(guidString, version, name, guidType); } return null; } ... private static boolean isOK(long[] data) { for (int i=0; i < data.length; i++) { if ((data[i] != 0) || (data[i] != 0xFFFFFFFFL)) {//<= return true; } } return false; } ... } 

PVS-Studio warning: V6007 Expression 'data [i]!=0xFFFFFFFFL' is always true. GuidUtil.java: 200

The for loop of the isOK method checks that the same value is not equal to two different numbers at the same time. If so, then the GUID is immediately recognized as valid. That is, the GUID will be declared invalid only when the data array is empty, and this never happens, since the value of the corresponding variable is assigned only once - at the beginning of the parseLine method.

The body of the isOK method in both classes completely coincides, which suggests the idea of ​​another copy-paste of incorrect code. What exactly the author wanted to check, I'm not sure, but I can assume that this method should be fixed as follows:

private static boolean isOK(long[] data) { for (int i=0; i < data.length; i++) { if ((data[i] == 0) || (data[i] == 0xFFFFFFFFL)) { return false; } } return true; } 

Fragment 2: hiding exceptions

public void putByte(long offsetInMemBlock, byte b) throws MemoryAccessException, IOException { long offsetInSubBlock=offsetInMemBlock - subBlockOffset; try { if (ioPending) { new MemoryAccessException("Cyclic Access");//<= } ioPending=true; doPutByte(mappedAddress.addNoWrap(offsetInSubBlock/8), (int) (offsetInSubBlock % 8), b); } catch (AddressOverflowException e) { new MemoryAccessException("No memory at address");//<= } finally { ioPending=false; } } 

PVS-Studio warning: V6006 The object was created but it is not being used. The 'throw' keyword could be missing: 'new MemoryAccessException ("Cyclic Access")'. BitMappedSubMemoryBlock.java:99

Exception objects themselves are known to do nothing (or, at least, should not do anything). Almost always, their new instances are thrown through throw , in some rare cases - they are transferred somewhere, or maybe placed in a collection.

The class containing this method is a wrapper over a memory block that allows reading and writing data. Here, due to the fact that exceptions are not thrown, the imposed restriction of access to the current memory block using the ioPending flag may be violated and, in addition, AddressOverflowException is ignored. Thus, the data can be silently corrupted, and instead of explicitly indicating an error at a particular place, the developer will receive strange artifacts that have to be analyzed by the debugger.

There were eight of these lost exceptions:

  • BitMappedSubMemoryBlock.java: lines 77, 99, 106, 122
  • ByteMappedSubMemoryBlock.java: lines 52, 73, 92, 114

It is characteristic that in the same files there are extremely similar methods in which throw is present. Most likely, one method was originally written similarly to the above fragment, after which it was copied several times, somehow found an error and corrected it in those places that they could remember.

Fragment 3: minefield

private void processSelection(OptionsTreeNode selectedNode) { if (selectedNode == null) { setViewPanel(defaultPanel, selectedNode);//<= return; } ... } private void setViewPanel(JComponent component, OptionsTreeNode selectedNode) { ... setHelpLocation(component, selectedNode); ... } private void setHelpLocation(JComponent component, OptionsTreeNode node) { Options options=node.getOptions(); ... } 

PVS-Studio warning: V6008 Null dereference of 'selectedNode' in function 'setViewPanel'. OptionsPanel.java:266

The analyzer lied a little - at the moment, calling the processSelection method does not lead to NullPointerException , since this method is called only two times, and before it is called selectedNode explicitly checked for null . This should not be done, since another developer can see that the method explicitly handles the case of selectedNode == null , and decide that this is a valid value, which then will result in the application crashing. Especially dangerous are such surprises just in open projects, since people who do not know the code base thoroughly participate in them.

In general, I have to say that the whole processSelection method looks rather strange. It is likely that this is a copy-paste error, since in the same method an if block with the same body is encountered twice more, although with different conditions. However, at this point, selectedNode will no longer be null , and the call chain setViewPanel-setHelpLocation will not result in a NullPointerException .

Fragment 4: Autocomplete into evil

public static final int[] UNSUPPORTED_OPCODES_LIST={... }; public static final Set<Integer> UNSUPPORTED_OPCODES=new HashSet<>(); static { for (int opcode : UNSUPPORTED_OPCODES) { UNSUPPORTED_OPCODES.add(opcode); } } 

PVS-Studio warning: V6053 The collection is modified while the iteration is in progress. ConcurrentModificationException may occur. DWARFExpressionOpCodes.java:205

In this case, the analyzer again lied a little - the exception will not be thrown, since the UNSUPPORTED_OPCODES collection is always empty and the loop simply will not execute. In addition, it so happened that the collection is many, and adding an already present element will not change it. Most likely, the author entered the name of the collection in the for-each through auto-completion and did not notice that the wrong field was proposed.Modification of the collection during iteration on it is impossible, but in good circumstances, as in this case, the application may not fall. Here, this typo has an indirect effect: the machine that parses DWARF files relies on this collection to stop the analysis when finding unsupported opcodes.

Starting with Java 9, it is worth using the factory methods of the standard library for constant collections: for example, Set.of (T... elements) is not only much more convenient, but also immediately makes the created collection immutable, which increases the reliability of the code.

Fragment 5: there is everything

public void setValueAt(Object aValue, int row, int column) { ... int index=indexOf(newName); if (index >= 0) {//<= Window window=tool.getActiveWindow(); Msg.showInfo(getClass(), window, "Duplicate Name", "Name already exists: " + newName); return; } ExternalPath path=paths.get(row);//<= ... } private int indexOf(String name) { for (int i=0; i < paths.size(); i++) { ExternalPath path=paths.get(i); if (path.getName().equals(name)) { return i; } } return 0; } 

PVS-Studio Warnings:

  • V6007 Expression 'index >=0' is always true. ExternalNamesTableModel.java:105
  • V6019 Unreachable code detected. It is possible that an error is present. ExternalNamesTableModel.java:109

The author thought about it and, in the indexOf method, instead of "index" -1 for an unknown value, returns 0 - the index of the first element of the paths collection. Even if the collection is empty. Or maybe the method was generated, but forgot to change the default return value. As a result, the setValueAt method will discard any value passed to it and display a “Name already exists” error even if there is no existing name.

By the way, indexOf is not used anywhere else, and its value is needed only in order to determine whether the element you are looking for exists. Perhaps, instead of a separate method, you should write for-each directly in setValueAt and make return on the matching element instead of games with indexes.

Note: I could not reproduce the alleged error. The setValueAt method is probably no longer used or only called under certain conditions.

Fragment 6: keep silence

final static Map<Character, String> DELIMITER_NAME_MAP=new HashMap<>(20);//Any non-alphanumeric char can be used as a delimiter. static { DELIMITER_NAME_MAP.put(' ', "Space"); DELIMITER_NAME_MAP.put('~', "Tilde"); DELIMITER_NAME_MAP.put('`', "Back quote"); DELIMITER_NAME_MAP.put('@', "Exclamation point"); DELIMITER_NAME_MAP.put('@', "At sign"); DELIMITER_NAME_MAP.put('#', "Pound sign"); DELIMITER_NAME_MAP.put('$', "Dollar sign"); DELIMITER_NAME_MAP.put('%', "Percent sign"); ... } 

PVS-Studio warning: V6033 An item with the same key '@' has already been added. FilterOptions.java:45

Ghidra supports filtering data in a different context: for example, you can filter the list of project files by name. In addition, filtering by several keywords at once is implemented: '.java,.c' when the 'OR' mode is on, displays all files whose name contains either '.java' or '.c'. It is understood that any special character can be used as a word separator (a specific separator is selected in the filter settings), but in reality the exclamation mark was not available.

In such initialization sheets, it is very easy to seal up, as they are often written using copy-paste, and when you look at such code, your eyes quickly blur. And if the typo was not on two adjacent lines, then by hand it will almost certainly no one see.

Fragment 7: remainder of division is always 0

void setFactorys(FieldFactory[] fieldFactorys, DataFormatModel dataModel, int margin) { factorys=new FieldFactory[fieldFactorys.length]; int x=margin; int defaultGroupSizeSpace=1; for (int i=0; i < factorys.length; i++) { factorys[i]=fieldFactorys[i]; factorys[i].setStartX(x); x += factorys[i].getWidth();//add in space between groups if (((i + 1) % defaultGroupSizeSpace) == 0) {//<= x += margin * dataModel.getUnitDelimiterSize(); } } width=x - margin * dataModel.getUnitDelimiterSize() + margin; layoutChanged(); } 

PVS-Studio Warnings:

  • V6007 Expression '((i + 1)% defaultGroupSizeSpace) == 0' is always true. ByteViewerLayoutModel.java:66
  • V6048 This expression can be simplified. Operand 'defaultGroupSizeSpace' in the operation equals 1. ByteViewerLayoutModel.java:66

The hex byte viewer supports the choice of the size of the displayed groups: for example, you can configure the output in the format 'ffff ffff' or 'ff ff ff ff'. The setFactorys method is responsible for the location of these groups in the user interface. Despite the fact that customization and display work correctly, the cycle in this method looks extremely suspicious: the remainder of the division by one is always zero, which means that the coordinate x will increase at each iteration. The presence of the groupSize property of the dataModel parameter also adds suspicion.

Remaining garbage after refactoring? Or maybe the calculation of the variable defaultGroupSizeSpace is lost? In any case, an attempt to replace its value with dataModel.getGroupSize () broke the layout, and perhaps only the author of this code can give an unambiguous answer.

Fragment 8: Broken Validation, Part 2

private String parseArrayDimensions(String datatype, List<Integer> arrayDimensions) { String dataTypeName=datatype; boolean zeroLengthArray=false; while (dataTypeName.endsWith("]")) { if (zeroLengthArray) {//<= return null;//only last dimension may be 0 } int rBracketPos=dataTypeName.lastIndexOf(']'); int lBracketPos=dataTypeName.lastIndexOf('['); if (lBracketPos < 0) { return null; } int dimension; try { dimension=Integer.parseInt(dataTypeName.substring(lBracketPos + 1, rBracketPos)); if (dimension < 0) { return null;//invalid dimension } } catch (NumberFormatException e) { return null; } dataTypeName=dataTypeName.substring(0, lBracketPos).trim(); arrayDimensions.add(dimension); } return dataTypeName; } 

PVS-Studio warning: V6007 Expression 'zeroLengthArray' is always false. PdbDataTypeParser.java:278

This method parses the dimension of a multidimensional array and returns either the text remaining after parsing or null for invalid data. A comment next to one of the validation checks states that only the last read size can be zero. The analysis goes from right to left, so it is understood that '[0] [1] [2]' is a valid input text, and '[2] [1] [0]' is not.

But, the trouble: no one added the check that the next size is zero, and the parser eats invalid data without unnecessary questions. You should probably fix the try block as follows:

try { dimension=Integer.parseInt(dataTypeName.substring(lBracketPos + 1, rBracketPos)); if (dimension < 0) { return null;//invalid dimension } else if (dimension == 0) { zeroLengthArray=true; } } catch (NumberFormatException e) { return null; } 

Naturally, there is a possibility that this criterion of validity turned out to be unnecessary over time, or the author’s comment has a different meaning and you need to check the first dimension read. In any case, data validation is a critical part of any application, which must be taken with full responsibility. Errors in it can lead to fairly harmless application crashes, as well as to security holes, data leakage, data corruption or loss (for example, if you skip SQL injection during query validation).

A bit about the rest of the warnings

The reader may notice that a lot of warnings were issued, and few were considered. Not very neatly tuned cloc counted in the project about 1.25 million lines of Java code (not empty and not comments). The fact is that almost all warnings are extremely similar: they forgot to check for null here, they did not delete the unused legacy code there. I don’t really want to bore the reader by listing the same thing, and I mentioned part of such cases at the beginning of the article.

Another example is the fifty warnings " V6009 Function receives an odd argument" in the context of inaccurate use of the substring (CParserUtils.java:280, ComplexName.java:48 and others) to get the rest of the line after any separator. Developers often hope that this separator will be present in the string and forget that otherwise indexOf will return -1, which is an incorrect value for substring . Naturally, if the data was validated or received not from outside, then the probability of the application crashing is significantly reduced. However, in general, these are potentially dangerous places that we want to help get rid of.


In general, Ghidra pleased with the quality of the code - no special nightmares are evident. The code is well-formatted and has a fairly consistent style: in most cases, variables, methods, and everything else are given clear names, explanatory comments are found, and a huge number of tests are present.

Naturally, there were no problems, among which:

  • Dead code, which, most likely, remained after numerous refactoring;
  • Many javadocs are hopelessly outdated and, for example, indicate non-existent parameters;
  • There is no convenient development option when using IntelliJ IDEA ;
  • A modular system built around reflection makes navigating a project and finding dependencies between components much more difficult.

Please do not neglect developer tools. Static analysis, like seat belts, is not a panacea, but it will help prevent some disasters before release. And nobody likes to use the logged-in software.

You can read about other proven projects in our blog . And we also have trial license and various options to use the analyzer without having to pay for it.

ITKarma picture

If you want to share this article with an English-speaking audience, then please use the link to the translation: Nikita Lazeba. NSA, Ghidra, and Unicorns .