Copy-on-Write in the PHP Language

Akihiko Tozawa, Michiaki Tatsubori, Tamiya Onodera
IBM Research, Tokyo Research Laboratory
atozawa@jp.ibm.com, mich@acm.org, tonodera@jp.ibm.com

Yasuhiko Minamide
Department of Computer Science, University of Tsukuba
minamide@cs.tsukuba.ac.jp

Abstract

PHP is a popular language for server-side applications. In PHP, assignment to variables copies the assigned values, according to its so-called copy-on-assignment semantics. In contrast, a typical PHP implementation uses a copy-on-write scheme to reduce the copy overhead by delaying copies as much as possible. This leads us to ask whether the semantics and the implementation of PHP coincide. In fact, they do not in the presence of sharings within values. In this paper, we describe the copy-on-assignment semantics with three possible strategies to copy values containing sharings. The current PHP implementation has inconsistencies with these semantics, caused by its naïve use of copy-on-write. We fix this problem with a novel mostly copy-on-write scheme, making the copy-on-write implementations faithful to the semantics. We prove that our copy-on-write implementations are correct, using bisimulation with the copy-on-assignment semantics.

Categories and Subject Descriptors D.3.0 [Programming Languages]: General

General Terms Design, Languages

1. Introduction

Assume that we want to maintain some data locally. This local data is mutable, but any change to it should not affect the global, master data. So, we may want to create and maintain a copy of the master data. However, such copying is often costly. In addition, the copied data may not be modified after all, in which case the cost of the copy is wasted. This kind of situation leads us to consider the copy-on-write technique.

Copy-on-write is a classic optimization technique, based on the idea of delaying the copy until there is a write to the data. The name of the technique stems from the copy of the original data being forced at the time of the write. One example of copy-on-write is found in the UNIX fork, where the process-local memory corresponds to the local data, which should be copied from the address space of the original process to the space of the new process by the fork operation. In modern UNIX systems, this copy is usually delayed by copy-on-write.
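The bookkeeping behind this idea can be sketched in a few lines of PHP. This is a minimal illustration under stated assumptions, not the mechanism of any particular runtime: the class names Storage and CowList and the owners counter are hypothetical and appear neither in the paper nor in PHP itself. A logical copy only shares the storage, and the first write through a shared handle performs the real copy.

```php
<?php
// Hypothetical illustration: Storage and CowList are made-up names for this sketch.

final class Storage
{
    public $items;       // the actual data
    public $owners = 1;  // number of CowList handles sharing this storage

    public function __construct(array $items)
    {
        $this->items = $items;
    }
}

final class CowList
{
    private $storage;

    public function __construct(array $items)
    {
        $this->storage = new Storage($items);
    }

    // Logical copy: the new handle shares the storage, so nothing is copied yet.
    public function share()
    {
        $copy = clone $this;       // shallow clone, both handles point at one Storage
        $this->storage->owners++;
        return $copy;
    }

    public function get($index)
    {
        return $this->storage->items[$index];
    }

    // The write is the point at which a shared storage is finally copied.
    public function set($index, $value)
    {
        if ($this->storage->owners > 1) {
            $this->storage->owners--;
            $this->storage = new Storage($this->storage->items);
        }
        $this->storage->items[$index] = $value;
    }

    public function sharesStorageWith(CowList $other)
    {
        return $this->storage === $other->storage;
    }
}

$master = new CowList(array("gizmo", "widget"));
$local  = $master->share();
var_dump($master->sharesStorageWith($local)); // bool(true): no copy so far
$local->set(0, "gremlin");
var_dump($master->sharesStorageWith($local)); // bool(false): the write forced the copy
echo $master->get(0), "\n";                   // gizmo
```

If $local were never written to, the second Storage would never be created, which is exactly the saving the technique aims for. The sketch only mirrors the bookkeeping: PHP arrays themselves are handled by the runtime, so the physical cost of the copy inside set is outside the control of this code.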

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

POPL'09, January 18–24, 2009, Savannah, Georgia, USA.

Copyright © 2009 ACM 978-1-60558-379-2/09/01...$5.00

Another example is found in the PHP language, a popular scripting language for server-side Web applications. Here is an example with PHP's associative arrays.

$r["box"] = "gizmo";
$l = $r; // assignment from $r to $l
$l["box"] = "gremlin";
echo $r["box"]; // prints out gizmo

The change of $l at Line 3, following the assignment $l = $r, only has local effects on $l, which cannot be seen from $r. This behavior, or semantics, of PHP is called copy-on-assignment, since the value of $r appears to be copied before it is passed to $l. We can consider the copy-on-write technique to implement this behavior. Indeed, by far the dominant PHP runtime, called the Zend runtime¹, employs copy-on-write and delays the above copy until the write at Line 3.
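The delayed copy can be observed indirectly by watching memory usage around the assignment and around the first write. The following is a rough sketch under stated assumptions: the array size and the variable names $before, $afterAssign, $afterWrite, $assignCost and $writeCost are illustrative, and the exact byte counts depend on the PHP version and configuration, so none are shown here.

```php
<?php
// Assumption: 200000 elements is an arbitrary size, large enough to make a real copy visible.
$r = range(1, 200000);

$before = memory_get_usage();
$l = $r;                          // copy-on-assignment in the semantics; a copy-on-write runtime only shares here
$afterAssign = memory_get_usage();
$l[0] = -1;                       // first write to $l: the delayed copy happens now
$afterWrite = memory_get_usage();

$assignCost = $afterAssign - $before;
$writeCost  = $afterWrite - $afterAssign;

printf("assignment: %d bytes, first write: %d bytes\n", $assignCost, $writeCost);
var_dump($r[0] === 1);            // $r is unaffected by the write to $l
```

On a copy-on-write runtime the assignment should cost next to nothing while the first write pays for the whole array. Under a literal copy-on-assignment implementation the cost would move to the assignment instead, with the same visible result for $r.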

For readers in the functional or declarative languages community, the semantics of PHP arrays may at first sound familiar, e.g., PHP arrays are similar to functional arrays. However, the similarity becomes less clear as we learn how values can be shared in PHP. In PHP, we have the reference assignment statement, =&, with which we can declare a sharing between two variables. Such a sharing breaks the
