Friday, February 25, 2011

Java Serialization : Introduction

Introduction

Serialization is the process of saving an object in a storage medium (such as a file, or a memory buffer) or to transmit it over a network connection in binary form. The serialized objects are JVM independent and can be re-serialized by any JVM. In this case the "in memory" java objects state are converted into a byte stream. This type of the file can not be understood by the user. It is a special types of object i.e. reused by the JVM (Java Virtual Machine). This process of serializing an object is also called deflating or marshalling an object.

Note : Note that its ‘primary purpose’ is to write an object into a stream, so that it can be transported through a network and that object can be rebuilt again, but sometimes people use java serialization as a replacement for database, which is not our main motive. Just a placeholder where you can persist an object across sessions. This is not the primary purpose of java serialization. Sometimes, when I interview candidates for Java I hear them saying java serialization is used for storing (to preserve the state) an object and retrieving it. They use it synonymously with database. This is a wrong perception for serialization. Other ways you can leverage the feature of serialization is, you can use it to perform a deep copy.

You can see this tutorial for quick tutorial.

Overview

Serialization is the process of saving an object in a storage medium (such as a file, or a memory buffer) or to transmit it over a network connection in binary form. The serialized objects are JVM independent and can be re-serialized by any JVM. In this case the "in memory" java objects state are converted into a byte stream. This type of the file can not be understood by the user. It is a special types of object i.e. reused by the JVM (Java Virtual Machine). This process of serializing an object is also called deflating or marshalling an object. But when deserializing the object, the deserializer must know by some protocol how to deserialize. Here protocol means, understanding between serializing person and de-serializing person.

Implementation

For an object to be serialized, it must be an instance of a class that implements either the Serializable or Externalizable interface in java.io. Serializable is type of marker interface. Both interfaces only permit the saving of data associated with an object's variables. They depend on the class definition being available to the Java Virtual Machine at reconstruction time in order to construct the object.


The Serializable interface relies on the Java runtime default mechanism to save an object's state. Writing an object is done via the writeObject() method in the ObjectOutputStream class (or the ObjectOutput interface). Writing a primitive value may be done through the appropriate write<datatype>() method.

Reading the serialized object is accomplished using the readObject() method of the ObjectInputStream class, and primitives may be read using the various read<datatype>() methods.
What about other objects that may be referred to by the object we are serializing? For instance, what if our object is a Frame containing a set of (AWT) Panel and TextArea instance variables? Using the Serializable interface, these references (and their associated data) also are converted and written to the stream. All state information necessary to reconstruct our Frame object and any objects that it references gets stored together.
If those other objects or their formats weren't stored, our reconstructed Frame would contain null object references, and the content of those Panels and TextAreas would be gone. Plus, any methods that rely on the existence of the Panels or TextAreas would throw exceptions.
The Externalizable interface specifies that the implementing class will handle the serialization on its own, instead of relying on the default runtime mechanism. This includes which fields get written (and read), and in what order. The class must define a writeExternal() method to write out the stream, and a corresponding readExternal() method to read the stream. Inside of these methods the class calls ObjectOutputStream writeObject(), ObjectInputStream readObject(), and any necessary write<datatype>() and read<datatype>() methods, for the desired fields.
 

Hiding Data From Serialization

Sometimes you may wish to prevent certain fields from being stored in the serialized object. The Serializable interface allows the implementing class to specify that some of its fields do not get saved or restored. This is accomplished by placing the keyword transient before the data type in the variable declaration. For example, you may have some data which is confidential and can be re-read from a master file later (as opposed to saving it with the serialized object). Or you decide (wisely) to preserve the privacy of file references by declaring any such variables as transient. Otherwise, all fields automatically get written without any additional effort by the class.
In addition to those fields declared as transient, static fields are not serialized (written out), and so cannot be deserialized (read back in).
Another way to use Serializable, and control which fields get written, is to override the writeObject() method of the Serializable interface. Inside of this method, you are responsible for writing out the appropriate fields. If you take this approach, you will want to override readObject() as well, to control the restoration process. This is similar to using Externalizable, except that interface requires writeExternal() and readExternal().
For the Externalizable interface, since both writeExternal() and readExternal() must be declared public, this increases the risk that a rogue object could use them to determine the format of the serialized object. For this reason, you should be careful when saving object data with this interface.
It is worth considering the amount of security you need for any objects that you serialize. When reading them back in, all of the normal Java security checks (such as the bytecode verifier) are in effect. You can define certain values within the class that should remain intact in serialized objects. Perhaps they should contain a specific value, or a value within a particular range. You can easily check the value of any numeric variable read in from a serialized object, especially if you know that only a portion of the available range for that data type is used by your variable.
You can also encrypt the outgoing data stream. The implementation is up to you, and don't forget to decrypt the object format when reading it back in.

Versioning

Example of serialization:

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
 
class SerializationBox implements Serializable {
 
  private byte serializableProp = 10;
 
  public byte getSerializableProp() {
    return serializableProp;
  }
}
 
public class SerializationSample {
 
  public static void main(String args[]) throws IOException,
      FileNotFoundException, ClassNotFoundException {
 
    SerializationBox serialB = new SerializationBox();
    serialize("serial.out", serialB);
    SerializationBox sb = (SerializationBox) deSerialize("serial.out");
    System.out.println(sb.getSerializableProp());
  }
 
  public static void serialize(String outFile, Object serializableObject)
      throws IOException {
    FileOutputStream fos = new FileOutputStream(outFile);
    ObjectOutputStream oos = new ObjectOutputStream(fos);
    oos.writeObject(serializableObject);
  }
 
  public static Object deSerialize(String serilizedObject)
      throws FileNotFoundException, IOException, ClassNotFoundException {
    FileInputStream fis = new FileInputStream(serilizedObject);
    ObjectInputStream ois = new ObjectInputStream(fis);
    return ois.readObject();
  }
}

Format of a Serialized Object 
Conclusion
Adding object persistence to Java applications using serialization is easy. Serialization allows you to save the current state of an object to a container, typically a file. At some later time, you can retrieve the saved data values and create an equivalent object. Depending on which interface you implement, you can choose to have the object and all its referenced objects saved and restored automatically, or you can specify which fields should be saved and restored. Java also provides several ways of protecting sensitive data in a serialized object, so objects loaded from a serialized representation should prove no less secure than those classes loaded at application startup. Versioning provides a measure of the backward compatibility of class versions. The code needed to add serialization to your application is simple and flexible.

No comments:

Post a Comment

Chitika