#Grammar Rules defining the stream format of Java object serialization


vernal rampart
#

Could you explain the grammar rules that define the stream format of Java object serialization?

visual sparrowBOT
#

<@&987246399047479336> please have a look, thanks.

#

Here is an AI assisted attempt to answer your question 🤖. Maybe it helps! In any case, a human is on the way 👍. To continue talking to the AI, you can use </chatgpt:1108714622413963314>.

#
TJ-Bot

The Java object serialization stream format is defined by the grammar in the Java Object Serialization Specification (the "Object Serialization Stream Protocol" chapter).

The stream is a sequence of elements: objects, class descriptors, and block data holding primitive values. Each element starts with a one-byte type tag (the TC_* constants), and variable-length parts such as strings, arrays, and block data carry an explicit length or count.

The grammar rules define how objects are serialized into a byte stream, including handling cyclic references, class versioning, and externalizable objects.
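
As a rough illustration of the first rule (stream: magic version contents), here is a minimal sketch that serializes a string and checks the first bytes against the constants in java.io.ObjectStreamConstants. The class name StreamHeaderDemo and its helper methods are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the JDK.
public class StreamHeaderDemo {

    // Serializes one object with the JDK's default mechanism.
    public static byte[] serialize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads a big-endian unsigned 16-bit value at the given offset.
    public static int readUnsignedShort(byte[] bytes, int offset) {
        return ((bytes[offset] & 0xFF) << 8) | (bytes[offset + 1] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] bytes = serialize("hello");

        // magic: the first two bytes, AC ED
        boolean magicOk =
                (short) readUnsignedShort(bytes, 0) == ObjectStreamConstants.STREAM_MAGIC;
        // version: the next two bytes, 00 05
        boolean versionOk =
                (short) readUnsignedShort(bytes, 2) == ObjectStreamConstants.STREAM_VERSION;
        // contents: starts with a type tag; a String is written as TC_STRING (0x74)
        boolean tagOk = bytes[4] == ObjectStreamConstants.TC_STRING;

        System.out.println("magic ok: " + magicOk);
        System.out.println("version ok: " + versionOk);
        System.out.println("first tag is TC_STRING: " + tagOk);
    }
}
```

All three checks should print true on a standard JDK, since the header and the tag values are fixed by the specification.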


vernal rampart
#

What do the grammar rules for the Java object serialization stream format mean, and how should I interpret them?

#

stream:
  magic version contents

contents:
  content
  contents content

content:
  object
  blockdata

object:
  newObject
  newClass
  newArray
  newString
  newEnum
  newClassDesc
  prevObject
  nullReference
  exception
  TC_RESET

newClass:
  TC_CLASS classDesc newHandle

classDesc:
  newClassDesc
  nullReference
  (ClassDesc)prevObject      // an object required to be of type ClassDesc

superClassDesc:
  classDesc

#

newClassDesc:
  TC_CLASSDESC className serialVersionUID newHandle classDescInfo
  TC_PROXYCLASSDESC newHandle proxyClassDescInfo

classDescInfo:
  classDescFlags fields classAnnotation superClassDesc

className:
  (utf)

serialVersionUID:
  (long)

classDescFlags:
  (byte)                     // Defined in Terminal Symbols and Constants

proxyClassDescInfo:
  (int)<count> proxyInterfaceName[count] classAnnotation superClassDesc

proxyInterfaceName:
  (utf)

fields:
  (short)<count> fieldDesc[count]

fieldDesc:
  primitiveDesc
  objectDesc

primitiveDesc:
  prim_typecode fieldName

objectDesc:
  obj_typecode fieldName className1

fieldName:
  (utf)

className1:
  (String)object             // String containing the field's type,
                             // in field descriptor format

classAnnotation:
  endBlockData
  contents endBlockData      // contents written by annotateClass
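
One way to get a feel for the fields / fieldDesc rules without reading raw bytes is to ask the JDK for the class descriptor it would write. The sketch below uses java.io.ObjectStreamClass and ObjectStreamField; the Account class and the FieldDescDemo wrapper are hypothetical and exist only for this illustration.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
public class FieldDescDemo {

    // Made-up serializable class with a mix of primitive and object fields.
    public static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        int balance;
        boolean active;
        String owner;
        long[] history;
    }

    // Returns one line per serializable field: "typecode fieldName [className1]".
    public static List<String> describeFields(Class<?> type) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(type);
        List<String> lines = new ArrayList<>();
        for (ObjectStreamField field : desc.getFields()) {
            // primitiveDesc: prim_typecode fieldName
            String line = field.getTypeCode() + " " + field.getName();
            if (!field.isPrimitive()) {
                // objectDesc: obj_typecode fieldName className1
                line += " " + field.getTypeString();
            }
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeFields(Account.class)) {
            System.out.println(line);
        }
    }
}
```

The type code is the first character of the field's descriptor ('I', 'Z', 'L', '[' and so on), and only non-primitive fields carry the extra type string that the grammar calls className1.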

pseudo canyon
#

I doubt many people here know anything about the internal binary format used by Java's default serialization.

vernal rampart
#

prim_typecode:
  'B'       // byte
  'C'       // char
  'D'       // double
  'F'       // float
  'I'       // integer
  'J'       // long
  'S'       // short
  'Z'       // boolean

obj_typecode:
  '['       // array
  'L'       // object

newArray:
  TC_ARRAY classDesc newHandle (int)<size> values[size]

newObject:
  TC_OBJECT classDesc newHandle classdata[]  // data for each class

classdata:
  nowrclass                 // SC_SERIALIZABLE & classDescFlags &&
                            // !(SC_WRITE_METHOD & classDescFlags)
  wrclass objectAnnotation  // SC_SERIALIZABLE & classDescFlags &&
                            // SC_WRITE_METHOD & classDescFlags
  externalContents          // SC_EXTERNALIZABLE & classDescFlags &&
                            // !(SC_BLOCKDATA & classDescFlags)
  objectAnnotation          // SC_EXTERNALIZABLE & classDescFlags &&
                            // SC_BLOCKDATA & classDescFlags

nowrclass:
  values                    // fields in order of class descriptor

wrclass:
  nowrclass

objectAnnotation:
  endBlockData
  contents endBlockData     // contents written by writeObject
                            // or writeExternal PROTOCOL_VERSION_2.

blockdata:
  blockdatashort
  blockdatalong

blockdatashort:
  TC_BLOCKDATA (unsigned byte)<size> (byte)[size]
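
To make the blockdatashort rule concrete, here is a small sketch that writes a single primitive int through ObjectOutputStream and dumps the resulting bytes. BlockDataDemo and its helpers are hypothetical names chosen for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Locale;

// Hypothetical demo class, not part of the JDK.
public class BlockDataDemo {

    // Writes one primitive int; primitives written directly end up in block data.
    public static byte[] writeInt(int value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Formats bytes as space-separated upper-case hex pairs.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format(Locale.ROOT, "%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: AC ED 00 05 77 04 00 00 00 2A
        System.out.println(hex(writeInt(42)));
    }
}
```

Reading the output against the grammar: AC ED 00 05 is magic and version, 77 is TC_BLOCKDATA, 04 is the one-byte size, and 00 00 00 2A is the four-byte payload for 42.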

#

blockdatalong:
  TC_BLOCKDATALONG (int)<size> (byte)[size]

endBlockData:
  TC_ENDBLOCKDATA

externalContent:            // Only parseable by readExternal
  (bytes)                   // primitive data
  object

externalContents:           // externalContent written by
  externalContent           // writeExternal in PROTOCOL_VERSION_1.
  externalContents externalContent

newString:
  TC_STRING newHandle (utf)
  TC_LONGSTRING newHandle (long-utf)

newEnum:
  TC_ENUM classDesc newHandle enumConstantName

enumConstantName:
  (String)object

prevObject:
  TC_REFERENCE (int)handle

nullReference:
  TC_NULL

exception:
  TC_EXCEPTION reset (Throwable)object reset

magic:
  STREAM_MAGIC

version:
  STREAM_VERSION

values:          // The size and types are described by the
                 // classDesc for the current object

newHandle:       // The next number in sequence is assigned
                 // to the object being serialized or deserialized

reset:           // The set of known objects is discarded
                 // so the objects of the exception do not
                 // overlap with the previously sent objects
                 // or with objects that may be sent after
                 // the exception
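
The newHandle and prevObject rules are easier to see in action. In this sketch the same String instance is written twice: the first write should appear as newString, the second as a back-reference to the handle assigned the first time. HandleDemo and its methods are hypothetical names for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the JDK.
public class HandleDemo {

    // Writes the same object instance twice into one stream.
    public static byte[] writeTwice(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads a big-endian 32-bit value at the given offset.
    public static int readInt(byte[] bytes, int offset) {
        return ((bytes[offset] & 0xFF) << 24)
                | ((bytes[offset + 1] & 0xFF) << 16)
                | ((bytes[offset + 2] & 0xFF) << 8)
                | (bytes[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] bytes = writeTwice("hi");
        // Expected layout for this input:
        //   AC ED 00 05      magic version
        //   74 00 02 68 69   TC_STRING newHandle (utf) "hi"
        //   71 00 7E 00 00   TC_REFERENCE (int)handle
        boolean isReference = bytes[9] == ObjectStreamConstants.TC_REFERENCE;
        boolean isFirstHandle = readInt(bytes, 10) == ObjectStreamConstants.baseWireHandle;
        System.out.println("second write is TC_REFERENCE: " + isReference);
        System.out.println("handle is baseWireHandle: " + isFirstHandle);
    }
}
```

Note that newHandle itself occupies no bytes in the stream. Both sides simply count objects in the same order, starting at baseWireHandle (0x7E0000), which is how the second write can refer back to the first.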

#

I'm not quite sure what these represent.

vernal rampart
#

I'll look into it when I have time.

vernal rampart
#

This code converts an object into a serialized byte array—does this mean that the stream internally looks like this?

#

The grammar rules of the Java serialization stream are like a fixed template or a blueprint.

#

"This grammar is like a map for analyzing byte streams. For example, when looking at a byte sequence such as AC ED 00 05 73 72 ...:

AC ED 00 05 → header: STREAM_MAGIC (AC ED) followed by STREAM_VERSION (00 05)

73 → TC_OBJECT (a new object)

72 → TC_CLASSDESC (class description)

70 → TC_NULL (null reference)

This grammar allows us to interpret what each byte means."
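
The same reading can be done in code. The sketch below follows the grammar by hand for the start of a serialized object: magic, version, TC_OBJECT, TC_CLASSDESC, className, serialVersionUID, classDescFlags, and the field count. It assumes the stream begins with a new object of an ordinary (non-proxy) class seen for the first time; the Point class and ClassDescWalk are hypothetical names for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Locale;

// Hypothetical demo class, not part of the JDK.
public class ClassDescWalk {

    // Made-up serializable class used as the input object.
    public static class Point implements Serializable {
        private static final long serialVersionUID = 42L;
        int x = 3;
        int y = 4;
    }

    public static byte[] serialize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Walks the first grammar rules by hand. Only valid under the assumption
    // that the stream starts with TC_OBJECT followed by TC_CLASSDESC.
    public static String describe(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int magic = in.readUnsignedShort();      // magic
            int version = in.readUnsignedShort();    // version
            int objectTag = in.readUnsignedByte();   // TC_OBJECT
            int classDescTag = in.readUnsignedByte(); // TC_CLASSDESC
            String className = in.readUTF();         // className: (utf)
            long uid = in.readLong();                // serialVersionUID: (long)
            int flags = in.readUnsignedByte();       // classDescFlags: (byte)
            int fieldCount = in.readUnsignedShort(); // fields: (short)<count>
            return String.format(Locale.ROOT,
                    "magic=%04X version=%d object=%02X classDesc=%02X class=%s uid=%d flags=%02X fields=%d",
                    magic, version, objectTag, classDescTag, className, uid, flags, fieldCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(serialize(new Point())));
    }
}
```

DataInputStream.readUTF works here because (utf) in the grammar is the same length-prefixed modified UTF-8 encoding that DataOutputStream.writeUTF produces. For a real parser you would switch on each tag byte instead of assuming a fixed order.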

boreal depot
#

There's essentially zero value to understanding the internals of this stream format.

  • You'll only use JDK classes to produce and consume these streams
  • The serialisation feature is actively being replaced (by a mechanism that doesn't constrain the stream format directly)