Wednesday, March 21. 2007
Some remarks on serialization without pity
Terry Chay made some remarks on my last blog entry about a solution for lazy class loading without using __autoload(). Some of his statements suggest that I did not explain my implementation well enough, which led to wrong interpretations. In this blog entry I'll use some of his statements to take a deeper look at my implementation and show that he has drawn some conclusions which I want to disprove.
He writes: "Frank makes his classes that need to be serialized implement an interface (technically, it is subclassed from his base object). This object has a method (not __sleep) that serializes itself into a special object containing this data along with a string containing the class path. This object is included before session_start and reads the full path name to include the class definition just in time."
That is not correct. To show this I will dig into some technical details of my implementation:
class stubSerializedObject implements stubObject
{
    /**
     * fully qualified class name of the serialized class
     *
     * @var string
     */
    protected $className;
    [...]
    /**
     * the serialized class data
     *
     * @var string
     */
    protected $data;
    /**
     * constructor
     *
     * @param stubObject $object the stubObject instance to serialize
     */
    public function __construct(stubObject $object)
    {
        $this->className = $object->getClassName();
        $this->hashCode  = $object->hashCode();
        $this->data      = serialize($object);
    }
    [... rest of class ...]
The important line is $this->data = serialize($object); in the constructor. It shows that the object itself is serialized, which means that one can still use __sleep() and __wakeup() within the class to control serialization. Obviously, there is no parallel architecture for serialization and deserialization, it's just wrapped.
He continues: "You can’t use two frameworks, because in this case they would have conflicting ways of deserializing themselves. If you wanted to use a library from PEAR, you’d be forced to put an adapter pattern in front of it just to get the fucker to serialize."
As should be clear from what I explained above, you can still put any other classes you use (maybe those from PEAR) into the session without worrying about how they should be serialized. The session just uses the default serialization mechanism for them.
$session->putValue('stubblesClass', $stubblesClass);
$session->putValue('PEAR_Example_Class', $pearExampleClass);
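To illustrate the claim that wrapping leaves __sleep() and __wakeup() intact, here is a minimal sketch. It is not the Stubbles code: SerializedWrapper, Connection and restore() are hypothetical names, and the wrapper accepts any object purely for the sake of the demonstration.

```php
<?php
// Hypothetical third-party class that controls its own serialization.
class Connection
{
    public $dsn;
    public $handle;

    public function __construct(string $dsn)
    {
        $this->dsn    = $dsn;
        $this->handle = 'open';
    }

    // Only the DSN is worth storing; the handle cannot survive a request.
    public function __sleep(): array
    {
        return ['dsn'];
    }

    public function __wakeup(): void
    {
        $this->handle = 'reopened';
    }
}

// Hypothetical wrapper: keeps the class name next to the serialized data.
final class SerializedWrapper
{
    private $className;
    private $data;

    public function __construct(object $object)
    {
        $this->className = get_class($object);
        $this->data      = serialize($object); // default mechanism, so __sleep() runs here
    }

    public function getClassName(): string
    {
        return $this->className;
    }

    public function restore(): object
    {
        // A real loader would include the file for $this->className first.
        return unserialize($this->data); // __wakeup() runs here
    }
}

$stored  = serialize(new SerializedWrapper(new Connection('mysql:host=db')));
$wrapper = unserialize($stored);
$conn    = $wrapper->restore();
```

After the round trip, $conn->dsn holds the original DSN and $conn->handle holds the value set by __wakeup(), so the wrapped class never needed to know about the wrapper.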
No need to have an adapter.
He also writes: "More to the point, not storing classpaths with the serialized object is a Good Thing™. Since the session is most-likely stored across servers (via the memcache best practice), storing class paths with the sessions means the directory architecture has to be shared across servers."
Well, not the whole class path is stored. Our class loader knows where the classes reside. The "class path" is just the path from there to the subdirectory containing the file with the class. Therefore the directory structure can differ on another server: only the layout inside your source directory has to be the same, while the path to the source directory itself can be different on each server. So there is no hard-coded path anywhere; the stored class path is a relative one.
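The idea of a stored relative class path can be sketched as follows. This is an illustration under assumptions, not the actual class loader: the function name classFileFor(), the '::' separator and the '.php' suffix are all made up for the example.

```php
<?php
/**
 * Resolves a stored, server-independent class name against the
 * source root configured on the current server.
 *
 * The '::' separator and the '.php' suffix are assumptions.
 */
function classFileFor(string $sourceRoot, string $className): string
{
    $relative = str_replace('::', DIRECTORY_SEPARATOR, $className) . '.php';

    return rtrim($sourceRoot, '/\\') . DIRECTORY_SEPARATOR . $relative;
}

// The session only ever contains this relative name.
$stored = 'example::shop::Cart';

// Two servers with different source roots resolve it independently.
$onServerA = classFileFor('/var/www/app/src', $stored);
$onServerB = classFileFor('/srv/deploy/current/src/', $stored);
```

Both results end in the same relative part (example, shop, Cart.php); only the prefix differs, and the prefix is never written to the session.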
Comments
Actually it sounds like I need to explain what I'm saying:
"This object has a method (not __sleep) that serializes itself into a special object containing this data along with a string containing the class path."
In other words, I claimed stubSerializedObject is an adapter that contains the object to be serialized and the path of the object to serialize. I assumed its serialize function would serialize the object, and its unserialize function would load the class and unserialize the object it wraps. I didn't take a hash code into account. These functions could themselves be __sleep and __wakeup. In fact I assumed as much, because if I generate a stubSerializedObject in your case, then manipulate the object, and then shut down my session, I have serialized the wrong state into the session variable. Of course, you probably have framework code or a convention to prevent this. Which sort of begs the question. But since you have an extra level of indirection anyway (between the registering and unregistering of session variables), it cannot be used the way $_SESSION is used. This means the client logic is not transparent to the existence of the adapter, which is my point. At some point, the framework forces the person using a library to build an adapter.
On the "not the entire classpath is stored" issue: this covers one edge case in one specific example. My point is: where there is one, even if solved, there be others. Here is another. Let's say I want to serialize two things into the session. I'm going to serialize an object A and also another object B that happens to contain A as a property. In your example, when I start up my session, B does not contain A, but a copy of A. Sure, you can work around that by storing a unique ID in a reference table and then checking that reference table, etc. But you see, there are probably a lot more edge cases I didn't consider. You might argue, "Who the hell would do that?" And I'd say, "Well, sure, it is a poor programmer that does that, but you can certainly imagine this happening."
Frameworks, like Design Patterns themselves, have consequences. I personally advocate avoiding a framework for web development unless you use it as a library, i.e. using Wordpress, which is its own framework, because you want a blog. I have a large resistance to them that comes from experience. I feel that in PHP it is almost hopeless to make a generalized web development framework unless the framework concentrates solely on UI (where there is a compelling return for the cost of using a framework). Odds are that in that case the supposed framework is written in JavaScript instead of PHP. I'm always open to being wrong, so I encourage people to write frameworks if it strikes their fancy. I just think you'll get more mileage (downloads, users, $$, babes, whatever) from writing a compelling application and making a framework that is tailored to the needs of this application.
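The two edge cases raised in this comment (a snapshot taken before the object changes, and B ending up with a copy of A) can be reproduced with plain PHP. The sketch below assumes a wrapper that serializes eagerly in its constructor; Snapshot, Item and Order are hypothetical names and say nothing about how Stubbles actually schedules the wrapping.

```php
<?php
class Item
{
    public $qty = 1;
}

class Order
{
    public $item;

    public function __construct(Item $item)
    {
        $this->item = $item;
    }
}

// Hypothetical wrapper that captures the state at construction time.
final class Snapshot
{
    private $data;

    public function __construct(object $object)
    {
        $this->data = serialize($object);
    }

    public function restore(): object
    {
        return unserialize($this->data);
    }
}

// Edge case 1: the object changes after it was wrapped.
$item        = new Item();
$snapshot    = new Snapshot($item);
$item->qty   = 5;
$restoredQty = $snapshot->restore()->qty; // still the value from wrap time

// Edge case 2: two objects wrapped separately lose their shared identity.
$order             = new Order($item);
$itemCopy          = (new Snapshot($item))->restore();
$orderCopy         = (new Snapshot($order))->restore();
$sameAfterSeparate = ($itemCopy === $orderCopy->item);

// For comparison: one serialize() call over the whole graph keeps it.
$graph          = unserialize(serialize(['item' => $item, 'order' => $order]));
$sameAfterJoint = ($graph['item'] === $graph['order']->item);
```

Here $restoredQty is 1 although the live object holds 5, $sameAfterSeparate is false and $sameAfterJoint is true. Whether a given framework hits these cases depends on when it wraps and on whether it serializes each value on its own.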
This series of posts from you guys has been fairly interesting. Last year, I implemented a system for our server in which I basically do the stubbles style. I have a Foo_LazyObject which actually gets stored in the session, and wraps the real object (which is passed into LazyObject's constructor). It implements Serializable, and it returns as its "serialization" the serialized form of the stored object. On unserialize, it just stores that exact data into itself as a string, delaying the actual unserialization until a call is made to $lazy->get(). This way, all I need loaded before calling session_start() is Foo_LazyObject. There's also a Foo_Session which hides all this detail from the user with the ArrayAccess, Countable, and Iterator interfaces.
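A minimal sketch of such a lazy wrapper might look like this. The comment describes an implementation of the Serializable interface; the sketch swaps in the __serialize() and __unserialize() magic methods (PHP 7.4 or later) because implementing only Serializable is deprecated since PHP 8.1. LazyObject and Profile are hypothetical names, not the commenter's code.

```php
<?php
final class LazyObject
{
    /** @var object|null the real object, once it has been materialized */
    private $object = null;

    /** @var string|null the still-serialized form of the real object */
    private $payload = null;

    public function __construct(object $object)
    {
        $this->object = $object;
    }

    public function __serialize(): array
    {
        // Pass an untouched payload straight through; otherwise serialize now.
        return ['payload' => $this->payload ?? serialize($this->object)];
    }

    public function __unserialize(array $data): void
    {
        // Keep the string as is; the wrapped class is not needed yet.
        $this->payload = $data['payload'];
    }

    public function get(): object
    {
        if ($this->object === null) {
            $this->object  = unserialize($this->payload);
            $this->payload = null;
        }

        return $this->object;
    }
}

// Hypothetical session value that counts how often it is woken up.
class Profile
{
    public static $wakeups = 0;
    public $name = 'anon';

    public function __wakeup(): void
    {
        self::$wakeups++;
    }
}

$stored  = serialize(new LazyObject(new Profile()));
$lazy    = unserialize($stored);   // only LazyObject is touched here
$before  = Profile::$wakeups;
$profile = $lazy->get();           // the real unserialization happens now
$after   = Profile::$wakeups;
```

The counter stays at 0 until get() is called, which is the property that lets only the wrapper class be loaded before session_start().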
And if someone is wandering several of our vhosts at once, interacting with several apps, each app stores a unique key (+vhost if it's a multi-host app) so that none of the keys get crossed up. This was all before I was introduced to The Rails Way: store an object ID in the session, load it from DB/memcache, and be done with it. If we need to, though, I can re-wire the Foo_Session to do just that without disturbing any callers. Such is the power (and the attractiveness) of abstraction.
"each app stores a unique key so that none of the keys get crossed up."
Sorry, that wasn't too clear--Foo_Session's constructor takes an app-specific key, and stores all session data read/written through that object underneath that key in $_SESSION. So '$x = new Foo_Session("app1"); $x["user"] = 3;' is equivalent to '$_SESSION["app1"]["user"] = 3;'
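The namespacing described here can be sketched with ArrayAccess. This is an assumption-laden illustration for PHP 8.0 or later, not the commenter's Foo_Session: the class name NamespacedSession is hypothetical, and the storage array is passed in by reference so the example runs without an active session (in a web request one would pass $_SESSION after session_start()).

```php
<?php
final class NamespacedSession implements ArrayAccess, Countable, IteratorAggregate
{
    /** @var array reference to the app-specific slice of the storage */
    private $store;

    public function __construct(string $appKey, array &$storage)
    {
        if (!isset($storage[$appKey]) || !is_array($storage[$appKey])) {
            $storage[$appKey] = [];
        }
        $this->store = &$storage[$appKey];
    }

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->store[$offset]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        return $this->store[$offset] ?? null;
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {
            $this->store[] = $value;       // supports $x[] = ...
        } else {
            $this->store[$offset] = $value;
        }
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->store[$offset]);
    }

    public function count(): int
    {
        return count($this->store);
    }

    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->store);
    }
}

$storage = [];                              // stands in for $_SESSION
$app1    = new NamespacedSession('app1', $storage);
$app2    = new NamespacedSession('app2', $storage);

$app1['user'] = 3;                          // lands in $storage['app1']['user']
$app2['user'] = 7;                          // separate key space, no collision
```

Because callers only see the array-like object, the storage behind it could later be swapped for something else without changing them, which is the abstraction argument made in the comment.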
The Rails approach deals with the large serialize/unserialize problem, but it does this by breaking things into many smaller serialize/unserialize steps, which is, depending on the application, worse. Unfortunately, Rails has made that choice for you.
IMO, the vast majority of applications should store very little data in the session, and should not store much session data by proxy either (as in the Rails instance). Being aware of, and not abstracted from, the cost associated with using session data is something that can only be good. Also, I have a lot of trepidation about storing an application-wide registry that requires an application-wide unique ID. Methinks it is thinking like that that causes Rails to perform like dogshit.




