Copy-on-Write in the PHP Language
Akihiko Tozawa Michiaki Tatsubori
Tamiya Onodera
IBM Research, Tokyo Research Laboratory
atozawa@jp.ibm.com,
mich@acm.org,tonodera@jp.ibm.com
Yasuhiko Minamide
Department of Computer Science
University of Tsukuba
minamide@cs.tsukuba.ac.jp
Abstract
PHP is a popular language for server-side applications. In PHP, as-
signment to variables copies the assigned values, according to its
so-called copy-on-assignment semantics. In contrast, a typical PHP
implementation uses a copy-on-write scheme to reduce the copy
overhead by delaying copies as much as possible. This leads us to
ask if the semantics and implementation of PHP coincide, and ac-
tually this is not the case in the presence of sharings within values.
In this paper, we describe the copy-on-assignment semantics with
three possible strategies to copy values containing sharings. The
current PHP implementation is inconsistent with these semantics
because of its naïve use of copy-on-write. We fix this problem
with a novel mostly copy-on-write scheme, which makes copy-on-
write implementations faithful to the semantics. We prove that our
copy-on-write implementations are correct, using bisimulation with
the copy-on-assignment semantics.
Categories and Subject Descriptors D.3.0 [Programming Lan-
guages]: General
General Terms Design, Languages
1. Introduction
Assume that we want to maintain some data locally. This local data
is mutable, but any change to it should not affect the global, master
data. So, we may want to create and maintain a copy of the master
data. However, such copying is often costly. In addition, the copied
data may never be modified, in which case the cost of the copy
is wasted. This kind of situation leads us to consider the copy-on-
write technique.
Copy-on-write is a classic optimization technique, based on the
idea of delaying the copy until there is a write to the data. The
name of the technique stems from the fact that the copy of the original
data is forced at the time of the write. One example of copy-on-
write is found in the UNIX fork, where the process-local memory
corresponds to the local data, which should be copied from the
address space of the original process to the space of the new
process by the fork operation. In modern UNIX systems, this copy
is usually delayed by copy-on-write.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. To copy otherwise, to republish, to post on servers or to redistribute
to lists, requires prior specific permission and/or a fee.
POPL'09, January 18–24, 2009, Savannah, Georgia, USA.
Copyright © 2009 ACM 978-1-60558-379-2/09/01...$5.00
Another example is found in the PHP language, a popular scripting
language for server-side Web applications. Here is an example
with PHP's associative arrays.
$r["box"] = "gizmo";
$l = $r; // assignment from $r to $l
$l["box"] = "gremlin";
echo $r["box"]; // prints out gizmo
The change to $l at Line 3, following the assignment $l = $r, has
only a local effect on $l, which cannot be seen from $r. This behavior,
or semantics, of PHP is called copy-on-assignment, since the value
of $r appears to be copied before it is passed to $l. We can use
the copy-on-write technique to implement this behavior. Indeed,
by far the dominant PHP runtime, called the Zend runtime¹, employs
copy-on-write and delays the above copy until the write at Line 3.
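The delayed copy described above can be observed indirectly from user code. The following is a minimal sketch, not taken from the paper: the helper name measureCopyOnWrite and the array size are hypothetical, and it assumes a Zend-based runtime in which memory_get_usage() reflects the allocation of the array's storage. It measures heap growth once after the assignment and once after the first write.

```php
<?php
// Hypothetical helper (not from the paper): reports how much the heap grows
// at the assignment and at the first write to the assigned array.
function measureCopyOnWrite(int $size): array
{
    $r = range(1, $size);               // the "master" array
    $before = memory_get_usage();

    $l = $r;                            // copy-on-assignment, semantically
    $afterAssign = memory_get_usage();

    $l[0] = -1;                         // first write to $l
    $afterWrite = memory_get_usage();

    return [
        'assign_growth' => $afterAssign - $before,      // growth caused by $l = $r
        'write_growth'  => $afterWrite - $afterAssign,  // growth caused by the write
        'r_first'       => $r[0],                       // unaffected by the write to $l
        'l_first'       => $l[0],
    ];
}

// Assumed size: large enough that a real copy is clearly visible.
$result = measureCopyOnWrite(100000);
```

Under these assumptions, write_growth should be much larger than assign_growth, which suggests that the array storage is duplicated at the write and not at the assignment, while $r[0] keeps its original value as the copy-on-assignment semantics requires. The exact byte counts depend on the PHP version and build.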
For readers in the functional or declarative languages community,
the semantics of PHP arrays may at first sound familiar:
PHP arrays resemble functional arrays. However,
the similarity becomes less clear once we learn how values can be
shared in PHP. PHP has a reference assignment statement,
=&, with which we can declare a sharing between two variables.
Such a sharing breaks the