Deserialization of Untrusted Data
The application deserializes untrusted data without sufficiently verifying that the resulting data will be valid.
It is often convenient to serialize objects for communication or to save them for later use. However, deserialized data or code can often be modified without using the provided accessor functions if it does not use cryptography to protect itself. Furthermore, any cryptography would still be client-side security -- which is a dangerous security assumption.
Data that is untrusted can not be trusted to be well-formed.
When developers place no restrictions on "gadget chains," or series of instances and method invocations that can self-execute during the deserialization process (i.e., before the object is returned to the caller), it is sometimes possible for attackers to leverage them to perform unauthorized actions, like generating a shell.
Serialization and deserialization refer to the process of taking program-internal object-related data, packaging it in a way that allows the data to be externally stored or transferred ("serialization"), then extracting the serialized data to reconstruct the original object ("deserialization").
The following examples help to illustrate the nature of this weakness and describe methods or techniques which can be used to mitigate the risk.
Note that the examples here are by no means exhaustive and any given weakness may have many subtle varieties, each of which may require different detection methods or runtime controls.
This code snippet deserializes an object from a file and uses it as a UI button:
This code does not attempt to verify the source or contents of the file before deserializing it. An attacker may be able to replace the intended file with a file that contains arbitrary malicious code which will be executed when the button is pressed.
To mitigate this, explicitly define final readObject() to prevent deserialization. An example of this is:
In Python, the Pickle library handles the serialization and deserialization processes. In this example derived from [R.502.7], the code receives and parses data, and afterwards tries to authenticate a user based on validating a token.
Unfortunately, the code does not verify that the incoming data is legitimate. An attacker can construct a illegitimate, serialized object "AuthToken" that instantiates one of Python's subprocesses to execute arbitrary commands. For instance,the attacker could construct a pickle that leverages Python's subprocess module, which spawns new processes and includes a number of arguments for various uses. Since Pickle allows objects to define the process for how they should be unpickled, the attacker can direct the unpickle process to call Popen in the subprocess module and execute /bin/sh.
Weaknesses in this category are related to the CISQ Quality Measures for Security. Presence of these weaknesses could reduce the security of the software.
Weaknesses in this category are related to the rules and recommendations in the Serialization (SER) section of the SEI CERT Oracle Secure Coding Standard for Java.
Weaknesses in this category are related to the A8 category in the OWASP Top Ten 2017.
This view (slice) covers all the elements in CWE.
CWE entries in this view are listed in the 2020 CWE Top 25 Most Dangerous Software Weaknesses.
This view outlines the SMM representation of the Automated Source Code Data Protection Measurement specifications, as identified by the Consortium for Information & So...