10966

1 
(* Title: HOL/Unix/Unix.thy


2 
ID: $Id$


3 
Author: Markus Wenzel, TU Muenchen


4 
License: GPL (GNU GENERAL PUBLIC LICENSE)


5 
*)


6 


7 
header {* Unix filesystems \label{sec:unixfilesystem} *}


8 


9 
theory Unix = Nested_Environment + List_Prefix:


10 


11 
text {*


12 
We give a simple mathematical model of the basic structures


13 
underlying the Unix filesystem, together with a few fundamental


14 
operations that could be imagined to be performed internally by the


15 
Unix kernel. This forms the basis for the set of Unix systemcalls


16 
to be introduced later (see \secref{sec:unixtrans}), which are the


17 
actual interface offered to processes running in userspace.


18 


19 
\medskip Basically, any Unix file is either a \emph{plain file} or a


20 
\emph{directory}, consisting of some \emph{content} plus


21 
\emph{attributes}. The content of a plain file is plain text. The


22 
content of a directory is a mapping from names to further


23 
files.\footnote{In fact, this is the only way that names get


24 
associated with files. In Unix files do \emph{not} have a name in


25 
itself. Even more, any number of names may be associated with the


26 
very same file due to \emph{hard links} (although this is excluded


27 
from our model).} Attributes include information to control various


28 
ways to access the file (read, write etc.).


29 


30 
Our model will be quite liberal in omitting excessive detail that is


31 
easily seen to be ``irrelevant'' for the aspects of Unix


32 
filesystems to be discussed here. First of all, we ignore


33 
character and block special files, pipes, sockets, hard links,


34 
symbolic links, and mount points.


35 
*}


36 


37 


38 
subsection {* Names *}


39 


40 
text {*


41 
User ids and file name components shall be represented by natural


42 
numbers (without loss of generality). We do not bother about


43 
encoding of actual names (e.g.\ strings), nor a mapping between user


44 
names and user ids as would be present in a reality.


45 
*}


46 


47 
types


48 
uid = nat


49 
name = nat


50 
path = "name list"


51 


52 


53 
subsection {* Attributes *}


54 


55 
text {*


56 
Unix file attributes mainly consist of \emph{owner} information and


57 
a number of \emph{permission} bits which control access for


58 
``user'', ``group'', and ``others'' (see the Unix man pages @{text


59 
"chmod(2)"} and @{text "stat(2)"} for more details).


60 


61 
\medskip Our model of file permissions only considers the ``others''


62 
part. The ``user'' field may be omitted without loss of overall


63 
generality, since the owner is usually able to change it anyway by


64 
performing @{text chmod}.\footnote{The inclined Unix expert may try


65 
to figure out some exotic arrangements of a realworld Unix


66 
filesystem such that the owner of a file is unable to apply the


67 
@{text chmod} system call.} We omit ``group'' permissions as a


68 
genuine simplification as we just do not intend to discuss a model


69 
of multiple groups and group membership, but pretend that everyone


70 
is member of a single global group.\footnote{A general HOL model of


71 
user group structures and related issues is given in


72 
\cite{Naraschewski:2001}.}


73 
*}


74 


75 
datatype perm =


76 
Readable


77 
 Writable


78 
 Executable  "(ignored)"


79 


80 
types perms = "perm set"


81 


82 
record att =


83 
owner :: uid


84 
others :: perms


85 


86 
text {*


87 
For plain files @{term Readable} and @{term Writable} specify read


88 
and write access to the actual content, i.e.\ the string of text


89 
stored here. For directories @{term Readable} determines if the set


90 
of entry names may be accessed, and @{term Writable} controls the


91 
ability to create or delete any entries (both plain files or


92 
subdirectories).


93 


94 
As another simplification, we ignore the @{term Executable}


95 
permission altogether. In reality it would indicate executable


96 
plain files (also known as ``binaries''), or control actual lookup


97 
of directory entries (recall that mere directory browsing is


98 
controlled via @{term Readable}). Note that the latter means that


99 
in order to perform any filesystem operation whatsoever, all


100 
directories encountered on the path would have to grant @{term


101 
Executable}. We ignore this detail and pretend that all directories


102 
give @{term Executable} permission to anybody.


103 
*}


104 


105 


106 
subsection {* Files *}


107 


108 
text {*


109 
In order to model the general tree structure of a Unix filesystem


110 
we use the arbitrarily branching datatype @{typ "('a, 'b, 'c) env"}


111 
from the standard library of Isabelle/HOL


112 
\cite{Baueretal:2001:HOLLibrary}. This type provides


113 
constructors @{term Val} and @{term Env} as follows:


114 


115 
\medskip


116 
{\def\isastyleminor{\isastyle}


117 
\begin{tabular}{l}


118 
@{term [source] "Val :: 'a \<Rightarrow> ('a, 'b, 'c) env"} \\


119 
@{term [source] "Env :: 'b \<Rightarrow> ('c \<Rightarrow> ('a, 'b, 'c) env option) \<Rightarrow> ('a, 'b, 'c) env"} \\


120 
\end{tabular}}


121 
\medskip


122 


123 
Here the parameter @{typ 'a} refers to plain values occurring at


124 
leaf positions, parameter @{typ 'b} to information kept with inner


125 
branch nodes, and parameter @{typ 'c} to the branching type of the


126 
tree structure. For our purpose we use the type instance with @{typ


127 
"att \<times> string"} (representing plain files), @{typ att} (for


128 
attributes of directory nodes), and @{typ name} (for the index type


129 
of directory nodes).


130 
*}


131 


132 
types


133 
file = "(att \<times> string, att, name) env"


134 


135 
text {*


136 
\medskip The HOL library also provides @{term lookup} and @{term


137 
update} operations for general tree structures with the subsequent


138 
primitive recursive characterizations.


139 


140 
\medskip


141 
{\def\isastyleminor{\isastyle}


142 
\begin{tabular}{l}


143 
@{term [source] "lookup :: ('a, 'b, 'c) env \<Rightarrow> 'c list \<Rightarrow> ('a, 'b, 'c) env option"} \\


144 
@{term [source]


145 
"update :: 'c list \<Rightarrow> ('a, 'b, 'c) env option \<Rightarrow> ('a, 'b, 'c) env \<Rightarrow> ('a, 'b, 'c) env"} \\


146 
\end{tabular}}


147 


148 
@{thm [display, indent = 2, eta_contract = false] lookup_eq [no_vars]}


149 
@{thm [display, indent = 2, eta_contract = false] update_eq [no_vars]}


150 


151 
Several further properties of these operations are proven in


152 
\cite{Baueretal:2001:HOLLibrary}. These will be routinely used


153 
later on without further notice.


154 


155 
\bigskip Apparently, the elements of type @{typ file} contain an


156 
@{typ att} component in either case. We now define a few auxiliary


157 
operations to manipulate this field uniformly, following the


158 
conventions for record types in Isabelle/HOL


159 
\cite{Nipkowetal:2000:HOL}.


160 
*}


161 


162 
constdefs


163 
attributes :: "file \<Rightarrow> att"


164 
"attributes file \<equiv>


165 
(case file of


166 
Val (att, text) \<Rightarrow> att


167 
 Env att dir \<Rightarrow> att)"


168 


169 
attributes_update :: "att \<Rightarrow> file \<Rightarrow> file"


170 
"attributes_update att file \<equiv>


171 
(case file of


172 
Val (att', text) \<Rightarrow> Val (att, text)


173 
 Env att' dir \<Rightarrow> Env att dir)"


174 


175 
lemma [simp]: "attributes (Val (att, text)) = att"


176 
by (simp add: attributes_def)


177 


178 
lemma [simp]: "attributes (Env att dir) = att"


179 
by (simp add: attributes_def)


180 


181 
lemma [simp]: "attributes (file \<lparr>attributes := att\<rparr>) = att"


182 
by (cases file) (simp_all add: attributes_def attributes_update_def


183 
split_tupled_all)


184 


185 
lemma [simp]: "(Val (att, text)) \<lparr>attributes := att'\<rparr> = Val (att', text)"


186 
by (simp add: attributes_update_def)


187 


188 
lemma [simp]: "(Env att dir) \<lparr>attributes := att'\<rparr> = Env att' dir"


189 
by (simp add: attributes_update_def)


190 


191 


192 
subsection {* Initial filesystems *}


193 


194 
text {*


195 
Given a set of \emph{known users} a filesystem shall be initialized


196 
by providing an empty home directory for each user, with readonly


197 
access for everyone else. (Note that we may directly use the user


198 
id as home directory name, since both types have been identified.)


199 
Certainly, the very root directory is owned by the super user (who


200 
has user id 0).


201 
*}


202 


203 
constdefs


204 
init :: "uid set \<Rightarrow> file"


205 
"init users \<equiv>


206 
Env \<lparr>owner = 0, others = {Readable}\<rparr>


207 
(\<lambda>u. if u \<in> users then Some (Env \<lparr>owner = u, others = {Readable}\<rparr> empty)


208 
else None)"


209 


210 


211 
subsection {* Accessing filesystems *}


212 


213 
text {*


214 
The main internal filesystem operation is access of a file by a


215 
user, requesting a certain set of permissions. The resulting @{typ


216 
"file option"} indicates if a file had been present at the


217 
corresponding @{typ path} and if access was granted according to the


218 
permissions recorded within the filesystem.


219 


220 
Note that by the rules of Unix filesystem security (e.g.\


221 
\cite{Tanenbaum:1992}) both the superuser and owner may always


222 
access a file unconditionally (in our simplified model).


223 
*}


224 


225 
constdefs


226 
access :: "file \<Rightarrow> path \<Rightarrow> uid \<Rightarrow> perms \<Rightarrow> file option"


227 
"access root path uid perms \<equiv>


228 
(case lookup root path of


229 
None \<Rightarrow> None


230 
 Some file \<Rightarrow>


231 
if uid = 0


232 
\<or> uid = owner (attributes file)


233 
\<or> perms \<subseteq> others (attributes file)


234 
then Some file


235 
else None)"


236 


237 
text {*


238 
\medskip Successful access to a certain file is the main


239 
prerequisite for systemcalls to be applicable (cf.\


240 
\secref{sec:unixtrans}). Any modification of the filesystem is


241 
then performed using the basic @{term update} operation.


242 


243 
\medskip We see that @{term access} is just a wrapper for the basic


244 
@{term lookup} function, with additional checking of


245 
attributes. Subsequently we establish a few auxiliary facts that


246 
stem from the primitive @{term lookup} used within @{term access}.


247 
*}


248 


249 
lemma access_empty_lookup: "access root path uid {} = lookup root path"


250 
by (simp add: access_def split: option.splits)


251 


252 
lemma access_some_lookup:


253 
"access root path uid perms = Some file \<Longrightarrow>


254 
lookup root path = Some file"


255 
by (simp add: access_def split: option.splits if_splits)


256 


257 
lemma access_update_other: "path' \<parallel> path \<Longrightarrow>


258 
access (update path' opt root) path uid perms = access root path uid perms"


259 
proof 


260 
assume "path' \<parallel> path"


261 
then obtain y z xs ys zs where


262 
"y \<noteq> z" and "path' = xs @ y # ys" and "path = xs @ z # zs"


263 
by (blast dest: parallel_decomp)


264 
hence "lookup (update path' opt root) path = lookup root path"


265 
by (blast intro: lookup_update_other)


266 
thus ?thesis by (simp only: access_def)


267 
qed


268 


269 


270 
section {* Filesystem transitions \label{sec:unixtrans} *}


271 


272 
subsection {* Unix system calls \label{sec:unixsyscall} *}


273 


274 
text {*


275 
According to established operating system design (cf.\


276 
\cite{Tanenbaum:1992}) user space processes may only initiate system


277 
operations by a fixed set of \emph{systemcalls}. This enables the


278 
kernel to enforce certain security policies in the first


279 
place.\footnote{Incidently, this is the very same principle employed


280 
by any ``LCFstyle'' theorem proving system according to Milner's


281 
principle of ``correctness by construction'', such as Isabelle/HOL


282 
itself.}


283 


284 
\medskip In our model of Unix we give a fixed datatype @{text


285 
operation} for the syntax of systemcalls, together with an


286 
inductive definition of filesystem state transitions of the form


287 
@{text "root \<midarrow>x\<rightarrow> root'"} for the operational semantics.


288 
*}


289 


290 
datatype operation =


291 
Read uid string path


292 
 Write uid string path


293 
 Chmod uid perms path


294 
 Creat uid perms path


295 
 Unlink uid path


296 
 Mkdir uid perms path


297 
 Rmdir uid path


298 
 Readdir uid "name set" path


299 


300 
text {*


301 
The @{typ uid} field of an operation corresponds to the


302 
\emph{effective user id} of the underlying process, although our


303 
model never mentions processes explicitly. The other parameters are


304 
provided as arguments by the caller; the @{term path} one is common


305 
to all kinds of systemcalls.


306 
*}


307 


308 
consts


309 
uid_of :: "operation \<Rightarrow> uid"


310 
primrec


311 
"uid_of (Read uid text path) = uid"


312 
"uid_of (Write uid text path) = uid"


313 
"uid_of (Chmod uid perms path) = uid"


314 
"uid_of (Creat uid perms path) = uid"


315 
"uid_of (Unlink uid path) = uid"


316 
"uid_of (Mkdir uid path perms) = uid"


317 
"uid_of (Rmdir uid path) = uid"


318 
"uid_of (Readdir uid names path) = uid"


319 


320 
consts


321 
path_of :: "operation \<Rightarrow> path"


322 
primrec


323 
"path_of (Read uid text path) = path"


324 
"path_of (Write uid text path) = path"


325 
"path_of (Chmod uid perms path) = path"


326 
"path_of (Creat uid perms path) = path"


327 
"path_of (Unlink uid path) = path"


328 
"path_of (Mkdir uid perms path) = path"


329 
"path_of (Rmdir uid path) = path"


330 
"path_of (Readdir uid names path) = path"


331 


332 
text {*


333 
\medskip Note that we have omitted explicit @{text Open} and @{text


334 
Close} operations, pretending that @{term Read} and @{term Write}


335 
would already take care of this behind the scenes. Thus we have


336 
basically treated actual sequences of real systemcalls @{text


337 
open}@{text read}/@{text write}@{text close} as atomic.


338 


339 
In principle, this could make big a difference in a model with


340 
explicit concurrent processes. On the other hand, even on a real


341 
Unix system the exact scheduling of concurrent @{text open} and


342 
@{text close} operations does \emph{not} directly affect the success


343 
of corresponding @{text read} or @{text write}. Unix allows several


344 
processes to have files opened at the same time, even for writing!


345 
Certainly, the result from reading the contents later may be hard to


346 
predict, but the systemcalls involved here will succeed in any


347 
case.


348 


349 
\bigskip The operational semantics of system calls is now specified


350 
via transitions of the filesystem configuration. This is expressed


351 
as an inductive relation (although there is no actual recursion


352 
involved here).


353 
*}


354 


355 
consts


356 
transition :: "(file \<times> operation \<times> file) set"


357 


358 
syntax


359 
"_transition" :: "file \<Rightarrow> operation \<Rightarrow> file \<Rightarrow> bool"


360 
("_ \<midarrow>_\<rightarrow> _" [90, 1000, 90] 100)


361 
translations


362 
"root \<midarrow>x\<rightarrow> root'" \<rightleftharpoons> "(root, x, root') \<in> transition"


363 


364 
inductive transition


365 
intros


366 


367 
read:


368 
"access root path uid {Readable} = Some (Val (att, text)) \<Longrightarrow>


369 
root \<midarrow>(Read uid text path)\<rightarrow> root"


370 
write:


371 
"access root path uid {Writable} = Some (Val (att, text')) \<Longrightarrow>


372 
root \<midarrow>(Write uid text path)\<rightarrow> update path (Some (Val (att, text))) root"


373 


374 
chmod:


375 
"access root path uid {} = Some file \<Longrightarrow>


376 
uid = 0 \<or> uid = owner (attributes file) \<Longrightarrow>


377 
root \<midarrow>(Chmod uid perms path)\<rightarrow> update path


378 
(Some (file \<lparr>attributes := attributes file \<lparr>others := perms\<rparr>\<rparr>)) root"


379 


380 
creat:


381 
"path = parent_path @ [name] \<Longrightarrow>


382 
access root parent_path uid {Writable} = Some (Env att parent) \<Longrightarrow>


383 
access root path uid {} = None \<Longrightarrow>


384 
root \<midarrow>(Creat uid perms path)\<rightarrow> update path


385 
(Some (Val (\<lparr>owner = uid, others = perms\<rparr>, []))) root"


386 
unlink:


387 
"path = parent_path @ [name] \<Longrightarrow>


388 
access root parent_path uid {Writable} = Some (Env att parent) \<Longrightarrow>


389 
access root path uid {} = Some (Val plain) \<Longrightarrow>


390 
root \<midarrow>(Unlink uid path)\<rightarrow> update path None root"


391 


392 
mkdir:


393 
"path = parent_path @ [name] \<Longrightarrow>


394 
access root parent_path uid {Writable} = Some (Env att parent) \<Longrightarrow>


395 
access root path uid {} = None \<Longrightarrow>


396 
root \<midarrow>(Mkdir uid perms path)\<rightarrow> update path


397 
(Some (Env \<lparr>owner = uid, others = perms\<rparr> empty)) root"


398 
rmdir:


399 
"path = parent_path @ [name] \<Longrightarrow>


400 
access root parent_path uid {Writable} = Some (Env att parent) \<Longrightarrow>


401 
access root path uid {} = Some (Env att' empty) \<Longrightarrow>


402 
root \<midarrow>(Rmdir uid path)\<rightarrow> update path None root"


403 


404 
readdir:


405 
"access root path uid {Readable} = Some (Env att dir) \<Longrightarrow>


406 
names = dom dir \<Longrightarrow>


407 
root \<midarrow>(Readdir uid names path)\<rightarrow> root"


408 


409 
text {*


410 
\medskip Certainly, the above specification is central to the whole


411 
formal development. Any of the results to be established later on


412 
are only meaningful to the outside world if this transition system


413 
provides an adequate model of real Unix systems. This kind of


414 
``realitycheck'' of a formal model is the wellknown problem of


415 
\emph{validation}.


416 


417 
If in doubt, one may consider to compare our definition with the


418 
informal specifications given the corresponding Unix man pages, or


419 
even peek at an actual implementation such as


420 
\cite{Torvaldsetal:Linux}. Another common way to gain confidence


421 
into the formal model is to run simple simulations (see


422 
\secref{sec:unixexamples}), and check the results with that of


423 
experiments performed on a real Unix system.


424 
*}


425 


426 


427 
subsection {* Basic properties of single transitions \label{sec:unixsingletrans} *}


428 


429 
text {*


430 
The transition system @{text "root \<midarrow>x\<rightarrow> root'"} defined above


431 
determines a unique result @{term root'} from given @{term root} and


432 
@{term x} (this is holds rather trivially, since there is even only


433 
one clause for each operation). This uniqueness statement will


434 
simplify our subsequent development to some extent, since we only


435 
have to reason about a partial function rather than a general


436 
relation.


437 
*}


438 


439 
theorem transition_uniq: "root \<midarrow>x\<rightarrow> root' \<Longrightarrow> root \<midarrow>x\<rightarrow> root'' \<Longrightarrow> root' = root''"


440 
proof 


441 
assume root: "root \<midarrow>x\<rightarrow> root'"


442 
assume "root \<midarrow>x\<rightarrow> root''"


443 
thus "root' = root''"


444 
proof cases


445 
case read


446 
with root show ?thesis by cases auto


447 
next


448 
case write


449 
with root show ?thesis by cases auto


450 
next


451 
case chmod


452 
with root show ?thesis by cases auto


453 
next


454 
case creat


455 
with root show ?thesis by cases auto


456 
next


457 
case unlink


458 
with root show ?thesis by cases auto


459 
next


460 
case mkdir


461 
with root show ?thesis by cases auto


462 
next


463 
case rmdir


464 
with root show ?thesis by cases auto


465 
next


466 
case readdir


467 
with root show ?thesis by cases auto


468 
qed


469 
qed


470 


471 
text {*


472 
\medskip Apparently, filesystem transitions are \emph{typesafe} in


473 
the sense that the result of transforming an actual directory yields


474 
again a directory.


475 
*}


476 


477 
theorem transition_type_safe:


478 
"root \<midarrow>x\<rightarrow> root' \<Longrightarrow> \<exists>att dir. root = Env att dir


479 
\<Longrightarrow> \<exists>att dir. root' = Env att dir"


480 
proof 


481 
assume tr: "root \<midarrow>x\<rightarrow> root'"


482 
assume inv: "\<exists>att dir. root = Env att dir"


483 
show ?thesis


484 
proof (cases "path_of x")


485 
case Nil


486 
with tr inv show ?thesis


487 
by cases (auto simp add: access_def split: if_splits)


488 
next


489 
case Cons


490 
from tr obtain opt where


491 
"root' = root \<or> root' = update (path_of x) opt root"


492 
by cases auto


493 
with inv Cons show ?thesis


494 
by (auto simp add: update_eq split: list.splits)


495 
qed


496 
qed


497 


498 
text {*


499 
The previous result may be seen as the most basic invariant on the


500 
filesystem state that is enforced by any proper kernel


501 
implementation. So user processes  being bound to the


502 
systemcall interface  may never mess up a filesystem such that


503 
the root becomes a plain file instead of a directory, which would be


504 
a strange situation indeed.


505 
*}


506 


507 


508 
subsection {* Iterated transitions *}


509 


510 
text {*


511 
Iterated system transitions via finite sequences of system


512 
operations are modeled inductively as follows. In a sense, this


513 
relation describes the cumulative effect of the sequence of


514 
systemcalls issued by a number of running processes over a finite


515 
amount of time.


516 
*}


517 


518 
consts


519 
transitions :: "(file \<times> operation list \<times> file) set"


520 


521 
syntax


522 
"_transitions" :: "file \<Rightarrow> operation list \<Rightarrow> file \<Rightarrow> bool"


523 
("_ =_\<Rightarrow> _" [90, 1000, 90] 100)


524 
translations


525 
"root =xs\<Rightarrow> root'" \<rightleftharpoons> "(root, xs, root') \<in> transitions"


526 


527 
inductive transitions


528 
intros


529 
nil: "root =[]\<Rightarrow> root"


530 
cons: "root \<midarrow>x\<rightarrow> root' \<Longrightarrow> root' =xs\<Rightarrow> root'' \<Longrightarrow> root =(x # xs)\<Rightarrow> root''"


531 


532 
text {*


533 
\medskip We establish a few basic facts relating iterated


534 
transitions with single ones, according to the recursive structure


535 
of lists.


536 
*}


537 


538 
lemma transitions_nil_eq: "root =[]\<Rightarrow> root' = (root = root')"


539 
proof


540 
assume "root =[]\<Rightarrow> root'"


541 
thus "root = root'" by cases simp_all


542 
next


543 
assume "root = root'"


544 
thus "root =[]\<Rightarrow> root'" by (simp only: transitions.nil)


545 
qed


546 


547 
lemma transitions_cons_eq:


548 
"root =(x # xs)\<Rightarrow> root'' = (\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' =xs\<Rightarrow> root'')"


549 
proof


550 
assume "root =(x # xs)\<Rightarrow> root''"


551 
thus "\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' =xs\<Rightarrow> root''"


552 
by cases auto


553 
next


554 
assume "\<exists>root'. root \<midarrow>x\<rightarrow> root' \<and> root' =xs\<Rightarrow> root''"


555 
thus "root =(x # xs)\<Rightarrow> root''"


556 
by (blast intro: transitions.cons)


557 
qed


558 


559 
text {*


560 
The next two rules show how to ``destruct'' known transition


561 
sequences. Note that the second one actually relies on the


562 
uniqueness property of the basic transition system (see


563 
\secref{sec:unixsingletrans}).


564 
*}


565 


566 
lemma transitions_nilD: "root =[]\<Rightarrow> root' \<Longrightarrow> root' = root"


567 
by (simp add: transitions_nil_eq)


568 


569 
lemma transitions_consD:


570 
"root =(x # xs)\<Rightarrow> root'' \<Longrightarrow> root \<midarrow>x\<rightarrow> root' \<Longrightarrow> root' =xs\<Rightarrow> root''"


571 
proof 


572 
assume "root =(x # xs)\<Rightarrow> root''"


573 
then obtain r' where r': "root \<midarrow>x\<rightarrow> r'" and root'': "r' =xs\<Rightarrow> root''"


574 
by cases simp_all


575 
assume "root \<midarrow>x\<rightarrow> root'"


576 
with r' have "r' = root'" by (rule transition_uniq)


577 
with root'' show "root' =xs\<Rightarrow> root''" by simp


578 
qed


579 


580 
text {*


581 
\medskip The following fact shows how an invariant @{term Q} of


582 
single transitions with property @{term P} may be transferred to


583 
iterated transitions. The proof is rather obvious by rule induction


584 
over the definition of @{term "root =xs\<Rightarrow> root'"}.


585 
*}


586 


587 
lemma transitions_invariant:


588 
"(\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r') \<Longrightarrow>


589 
root =xs\<Rightarrow> root' \<Longrightarrow> Q root \<Longrightarrow> \<forall>x \<in> set xs. P x \<Longrightarrow> Q root'"


590 
proof 


591 
assume r: "\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r'"


592 
assume "root =xs\<Rightarrow> root'"


593 
thus "Q root \<Longrightarrow> (\<forall>x \<in> set xs. P x) \<Longrightarrow> Q root'" (is "PROP ?P root xs root'")


594 
proof (induct ?P root xs root')


595 
fix root assume "Q root"


596 
thus "Q root" .


597 
next


598 
fix root root' root'' and x xs


599 
assume root': "root \<midarrow>x\<rightarrow> root'"


600 
assume hyp: "PROP ?P root' xs root''"


601 
assume Q: "Q root"


602 
assume P: "\<forall>x \<in> set (x # xs). P x"


603 
hence "P x" by simp


604 
with root' Q have Q': "Q root'" by (rule r)


605 
from P have "\<forall>x \<in> set xs. P x" by simp


606 
with Q' show "Q root''" by (rule hyp)


607 
qed


608 
qed


609 


610 
text {*


611 
As an example of applying the previous result, we transfer the basic


612 
typesafety property (see \secref{sec:unixsingletrans}) from


613 
single transitions to iterated ones, which is a rather obvious


614 
result indeed.


615 
*}


616 


617 
theorem transitions_type_safe:


618 
"root =xs\<Rightarrow> root' \<Longrightarrow> \<exists>att dir. root = Env att dir


619 
\<Longrightarrow> \<exists>att dir. root' = Env att dir"


620 
proof 


621 
case antecedent


622 
with transition_type_safe show ?thesis


623 
proof (rule transitions_invariant)


624 
show "\<forall>x \<in> set xs. True" by blast


625 
qed


626 
qed


627 


628 


629 
section {* Executable sequences *}


630 


631 
text {*


632 
An inductively defined relation such as the one of


633 
@{text "root \<midarrow>x\<rightarrow> root'"} (see \secref{sec:unixsyscall}) has two


634 
main aspects. First of all, the resulting system admits a certain


635 
set of transition rules (introductions) as given in the


636 
specification. Furthermore, there is an explicit leastfixedpoint


637 
construction involved, which results in induction (and


638 
caseanalysis) rules to eliminate known transitions in an exhaustive


639 
manner.


640 


641 
\medskip Subsequently, we explore our transition system in an


642 
experimental style, mainly using the introduction rules with basic


643 
algebraic properties of the underlying structures. The technique


644 
closely resembles that of Prolog combined with functional evaluation


645 
in a very simple manner.


646 


647 
Just as the ``closedworld assumption'' is left implicit in Prolog,


648 
we do not refer to induction over the whole transition system here.


649 
So this is still purely positive reasoning about possible


650 
executions; exhaustive reasoning will be employed only later on (see


651 
\secref{sec:unixbogosity}), when we shall demonstrate that certain


652 
behavior is \emph{not} possible.


653 
*}


654 


655 


656 
subsection {* Possible transitions *}


657 


658 
text {*


659 
Rather obviously, a list of system operations can be executed within


660 
a certain state if there is a result state reached by an iterated


661 
transition.


662 
*}


663 


664 
constdefs


665 
can_exec :: "file \<Rightarrow> operation list \<Rightarrow> bool"


666 
"can_exec root xs \<equiv> \<exists>root'. root =xs\<Rightarrow> root'"


667 


668 
lemma can_exec_nil: "can_exec root []"


669 
by (unfold can_exec_def) (blast intro: transitions.intros)


670 


671 
lemma can_exec_cons:


672 
"root \<midarrow>x\<rightarrow> root' \<Longrightarrow> can_exec root' xs \<Longrightarrow> can_exec root (x # xs)"


673 
by (unfold can_exec_def) (blast intro: transitions.intros)


674 


675 
text {*


676 
\medskip In case that we already know that a certain sequence can be


677 
executed we may destruct it backwardly into individual transitions.


678 
*}


679 


680 
lemma can_exec_snocD: "\<And>root. can_exec root (xs @ [y])


681 
\<Longrightarrow> \<exists>root' root''. root =xs\<Rightarrow> root' \<and> root' \<midarrow>y\<rightarrow> root''"


682 
(is "PROP ?P xs" is "\<And>root. ?A root xs \<Longrightarrow> ?C root xs")


683 
proof (induct xs)


684 
fix root


685 
{


686 
assume "?A root []"


687 
thus "?C root []"


688 
by (simp add: can_exec_def transitions_nil_eq transitions_cons_eq)


689 
next


690 
fix x xs


691 
assume hyp: "PROP ?P xs"


692 
assume asm: "?A root (x # xs)"


693 
show "?C root (x # xs)"


694 
proof 


695 
from asm obtain r root'' where x: "root \<midarrow>x\<rightarrow> r" and


696 
xs_y: "r =(xs @ [y])\<Rightarrow> root''"


697 
by (auto simp add: can_exec_def transitions_nil_eq transitions_cons_eq)


698 
from xs_y hyp obtain root' r' where xs: "r =xs\<Rightarrow> root'" and y: "root' \<midarrow>y\<rightarrow> r'"


699 
by (auto simp add: can_exec_def)


700 
from x xs have "root =(x # xs)\<Rightarrow> root'"


701 
by (rule transitions.cons)


702 
with y show ?thesis by blast


703 
qed


704 
}


705 
qed


706 


707 


708 
subsection {* Example executions \label{sec:unixexamples} *}


709 


710 
text {*


711 
We are now ready to perform a few experiments within our formal


712 
model of Unix systemcalls. The common technique is to alternate


713 
introduction rules of the transition system (see


714 
\secref{sec:unixtrans}), and steps to solve any emerging side


715 
conditions using algebraic properties of the underlying filesystem


716 
structures (see \secref{sec:unixfilesystem}).


717 
*}


718 


719 
lemmas eval = access_def init_def


720 


721 
theorem "u \<in> users \<Longrightarrow> can_exec (init users)


722 
[Mkdir u perms [u, name]]"


723 
apply (rule can_exec_cons)


724 
 {* backchain @{term can_exec} (of @{term [source] Cons}) *}


725 
apply (rule mkdir)


726 
 {* backchain @{term Mkdir} *}


727 
apply (force simp add: eval)+


728 
 {* solve preconditions of @{term Mkdir} *}


729 
apply (simp add: eval)


730 
 {* peek at resulting dir (optional) *}


731 
txt {* @{subgoals [display]} *}


732 
apply (rule can_exec_nil)


733 
 {* backchain @{term can_exec} (of @{term [source] Nil}) *}


734 
done


735 


736 
text {*


737 
By inspecting the result shown just before the final step above, we


738 
may gain confidence that our specification of Unix systemcalls


739 
actually makes sense. Further common errors are usually exhibited


740 
when preconditions of transition rules fail unexpectedly.


741 


742 
\medskip Here are a few further experiments, using the same


743 
techniques as before.


744 
*}


745 


746 
theorem "u \<in> users \<Longrightarrow> can_exec (init users)


747 
[Creat u perms [u, name],


748 
Unlink u [u, name]]"


749 
apply (rule can_exec_cons)


750 
apply (rule creat)


751 
apply (force simp add: eval)+


752 
apply (simp add: eval)


753 
apply (rule can_exec_cons)


754 
apply (rule unlink)


755 
apply (force simp add: eval)+


756 
apply (simp add: eval)


757 
txt {* peek at result: @{subgoals [display]} *}


758 
apply (rule can_exec_nil)


759 
done


760 


761 
theorem "u \<in> users \<Longrightarrow> Writable \<in> perms1 \<Longrightarrow>


762 
Readable \<in> perms2 \<Longrightarrow> name3 \<noteq> name4 \<Longrightarrow>


763 
can_exec (init users)


764 
[Mkdir u perms1 [u, name1],


765 
Mkdir u' perms2 [u, name1, name2],


766 
Creat u' perms3 [u, name1, name2, name3],


767 
Creat u' perms3 [u, name1, name2, name4],


768 
Readdir u {name3, name4} [u, name1, name2]]"


769 
apply (rule can_exec_cons, rule transition.intros,


770 
(force simp add: eval)+, (simp add: eval)?)+


771 
txt {* peek at result: @{subgoals [display]} *}


772 
apply (rule can_exec_nil)


773 
done


774 


775 
theorem "u \<in> users \<Longrightarrow> Writable \<in> perms1 \<Longrightarrow> Readable \<in> perms3 \<Longrightarrow>


776 
can_exec (init users)


777 
[Mkdir u perms1 [u, name1],


778 
Mkdir u' perms2 [u, name1, name2],


779 
Creat u' perms3 [u, name1, name2, name3],


780 
Write u' ''foo'' [u, name1, name2, name3],


781 
Read u ''foo'' [u, name1, name2, name3]]"


782 
apply (rule can_exec_cons, rule transition.intros,


783 
(force simp add: eval)+, (simp add: eval)?)+


784 
txt {* peek at result: @{subgoals [display]} *}


785 
apply (rule can_exec_nil)


786 
done


787 


788 


789 
section {* Odd effects  treated formally \label{sec:unixbogosity} *}


790 


791 
text {*


792 
We are now ready to give a completely formal treatment of the


793 
slightly odd situation discussed in the introduction (see


794 
\secref{sec:unixintro}): the filesystem can easily reach a state


795 
where a user is unable to remove his very own directory, because it


796 
is still populated by items placed there by another user in an


797 
uncouth manner.


798 
*}


799 


800 
subsection {* The general procedure \label{sec:unixinvprocedure} *}


801 


802 
text {*


803 
The following theorem expresses the general procedure we are


804 
following to achieve the main result.


805 
*}


806 


807 
theorem general_procedure:


808 
"(\<And>r r'. Q r \<Longrightarrow> r \<midarrow>y\<rightarrow> r' \<Longrightarrow> False) \<Longrightarrow>


809 
(\<And>root. init users =bs\<Rightarrow> root \<Longrightarrow> Q root) \<Longrightarrow>


810 
(\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r') \<Longrightarrow>


811 
init users =bs\<Rightarrow> root \<Longrightarrow>


812 
\<not> (\<exists>xs. (\<forall>x \<in> set xs. P x) \<and> can_exec root (xs @ [y]))"


813 
proof 


814 
assume cannot_y: "\<And>r r'. Q r \<Longrightarrow> r \<midarrow>y\<rightarrow> r' \<Longrightarrow> False"


815 
assume init_inv: "\<And>root. init users =bs\<Rightarrow> root \<Longrightarrow> Q root"


816 
assume preserve_inv: "\<And>r x r'. r \<midarrow>x\<rightarrow> r' \<Longrightarrow> Q r \<Longrightarrow> P x \<Longrightarrow> Q r'"


817 
assume init_result: "init users =bs\<Rightarrow> root"


818 
{


819 
fix xs


820 
assume Ps: "\<forall>x \<in> set xs. P x"


821 
assume can_exec: "can_exec root (xs @ [y])"


822 
then obtain root' root'' where


823 
xs: "root =xs\<Rightarrow> root'" and y: "root' \<midarrow>y\<rightarrow> root''"


824 
by (blast dest: can_exec_snocD)


825 
from init_result have "Q root" by (rule init_inv)


826 
from preserve_inv xs this Ps have "Q root'"


827 
by (rule transitions_invariant)


828 
from this y have False by (rule cannot_y)


829 
}


830 
thus ?thesis by blast


831 
qed


832 


833 
text {*


834 
Here @{prop "P x"} refers to the restriction on filesystem


835 
operations that are admitted after having reached the critical


836 
configuration; according to the problem specification this will


837 
become @{prop "uid_of x = user1"} later on. Furthermore, @{term y}


838 
refers to the operations we claim to be impossible to perform


839 
afterwards, we will take @{term Rmdir} later. Moreover @{term Q} is


840 
a suitable (auxiliary) invariant over the filesystem; choosing


841 
@{term Q} adequately is very important to make the proof work (see


842 
\secref{sec:unixinvlemmas}).


843 
*}


844 


845 


846 
subsection {* The particular setup *}


847 


848 
text {*


849 
We introduce a few global declarations and axioms to describe our


850 
particular setup considered here. Thus we avoid excessive use of


851 
local parameters in the subsequent development.


852 
*}


853 


854 
consts


855 
users :: "uid set"


856 
user1 :: uid


857 
user2 :: uid


858 
name1 :: name


859 
name2 :: name


860 
name3 :: name


861 
perms1 :: perms


862 
perms2 :: perms


863 


864 
axioms


865 
user1_known: "user1 \<in> users"


866 
user1_not_root: "user1 \<noteq> 0"


867 
users_neq: "user1 \<noteq> user2"


868 
perms1_writable: "Writable \<in> perms1"


869 
perms2_not_writable: "Writable \<notin> perms2"


870 


871 
lemmas "setup" =


872 
user1_known user1_not_root users_neq


873 
perms1_writable perms2_not_writable


874 


875 
text {*


876 
\medskip The @{term bogus} operations are the ones that lead into


877 
the uncouth situation; @{term bogus_path} is the key position within


878 
the filesystem where things go awry.


879 
*}


880 


881 
constdefs


882 
bogus :: "operation list"


883 
"bogus \<equiv>


884 
[Mkdir user1 perms1 [user1, name1],


885 
Mkdir user2 perms2 [user1, name1, name2],


886 
Creat user2 perms2 [user1, name1, name2, name3]]"


887 


888 
bogus_path :: path


889 
"bogus_path \<equiv> [user1, name1, name2]"


890 


891 


892 
subsection {* Invariance lemmas \label{sec:unixinvlemmas} *}


893 


894 
text {*


895 
The following invariant over the root filesystem describes the


896 
bogus situation in an abstract manner: located at a certain @{term


897 
path} within the filesystem is a nonempty directory that is


898 
neither owned and nor writable by @{term user1}.


899 
*}


900 


901 
constdefs


902 
invariant :: "file \<Rightarrow> path \<Rightarrow> bool"


903 
"invariant root path \<equiv>


904 
(\<exists>att dir.


905 
access root path user1 {} = Some (Env att dir) \<and> dir \<noteq> empty \<and>


906 
user1 \<noteq> owner att \<and>


907 
access root path user1 {Writable} = None)"


908 


909 
text {*


910 
Following the general procedure (see


911 
\secref{sec:unixinvprocedure}) we


912 
will now establish the three key lemmas required to yield the final


913 
result.


914 


915 
\begin{enumerate}


916 


917 
\item The invariant is sufficiently strong to entail the


918 
pathological case that @{term user1} is unable to remove the (owned)


919 
directory at @{term "[user1, name1]"}.


920 


921 
\item The invariant does hold after having executed the @{term


922 
bogus} list of operations (starting with an initial filesystem


923 
configuration).


924 


925 
\item The invariant is preserved by any filesystem operation


926 
performed by @{term user1} alone, without any help by other users.


927 


928 
\end{enumerate}


929 


930 
As the invariant appears both as assumptions and conclusions in the


931 
course of proof, its formulation is rather critical for the whole


932 
development to work out properly. In particular, the third step is


933 
very sensitive to the invariant being either too strong or too weak.


934 
Moreover the invariant has to be sufficiently abstract, lest the


935 
proof become cluttered by confusing detail.


936 


937 
\medskip The first two lemmas are technically straight forward 


938 
we just have to inspect rather special cases.


939 
*}


940 


941 
lemma cannot_rmdir: "invariant root bogus_path \<Longrightarrow>


942 
root \<midarrow>(Rmdir user1 [user1, name1])\<rightarrow> root' \<Longrightarrow> False"


943 
proof 


944 
assume "invariant root bogus_path"


945 
then obtain file where "access root bogus_path user1 {} = Some file"


946 
by (unfold invariant_def) blast


947 
moreover


948 
assume "root \<midarrow>(Rmdir user1 [user1, name1])\<rightarrow> root'"


949 
then obtain att where


950 
"access root [user1, name1] user1 {} = Some (Env att empty)"


951 
by cases auto


952 
hence "access root ([user1, name1] @ [name2]) user1 {} = empty name2"


953 
by (simp only: access_empty_lookup lookup_append_some) simp


954 
ultimately show False by (simp add: bogus_path_def)


955 
qed


956 


957 
lemma init_invariant: "init users =bogus\<Rightarrow> root \<Longrightarrow> invariant root bogus_path"


958 
proof 


959 
note eval = "setup" access_def init_def


960 
case antecedent thus ?thesis


961 
apply (unfold bogus_def bogus_path_def)


962 
apply (drule transitions_consD, rule transition.intros,


963 
(force simp add: eval)+, (simp add: eval)?)+


964 
 "evaluate all operations"


965 
apply (drule transitions_nilD)


966 
 "reach final result"


967 
apply (simp add: invariant_def eval)


968 
 "check the invariant"


969 
done


970 
qed


971 


972 
text {*


973 
\medskip At last we are left with the main effort to show that the


974 
``bogosity'' invariant is preserved by any filesystem operation


975 
performed by @{term user1} alone. Note that this holds for any


976 
@{term path} given, the particular @{term bogus_path} is not


977 
required here.


978 
*} (* FIXME The overall structure of the proof is as follows \dots *)


979 


980 
lemma preserve_invariant: "root \<midarrow>x\<rightarrow> root' \<Longrightarrow>


981 
invariant root path \<Longrightarrow> uid_of x = user1 \<Longrightarrow> invariant root' path"


982 
proof 


983 
assume tr: "root \<midarrow>x\<rightarrow> root'"


984 
assume inv: "invariant root path"


985 
assume uid: "uid_of x = user1"


986 


987 
from inv obtain att dir where


988 
inv1: "access root path user1 {} = Some (Env att dir)" and


989 
inv2: "dir \<noteq> empty" and


990 
inv3: "user1 \<noteq> owner att" and


991 
inv4: "access root path user1 {Writable} = None"


992 
by (auto simp add: invariant_def)


993 


994 
from inv1 have lookup: "lookup root path = Some (Env att dir)"


995 
by (simp only: access_empty_lookup)


996 


997 
from inv1 inv3 inv4 and user1_not_root


998 
have not_writable: "Writable \<notin> others att"


999 
by (auto simp add: access_def split: option.splits if_splits)


1000 


1001 
show ?thesis


1002 
proof cases


1003 
assume "root' = root"


1004 
with inv show "invariant root' path" by (simp only:)


1005 
next


1006 
assume changed: "root' \<noteq> root"


1007 
with tr obtain opt where root': "root' = update (path_of x) opt root"


1008 
by cases auto


1009 
show ?thesis


1010 
proof (rule prefix_cases)


1011 
assume "path_of x \<parallel> path"


1012 
with inv root'


1013 
have "\<And>perms. access root' path user1 perms = access root path user1 perms"


1014 
by (simp only: access_update_other)


1015 
with inv show "invariant root' path"


1016 
by (simp only: invariant_def)


1017 
next


1018 
assume "path_of x \<le> path"


1019 
then obtain ys where path: "path = path_of x @ ys" ..


1020 


1021 
show ?thesis


1022 
proof (cases ys)


1023 
assume "ys = []"


1024 
with tr uid inv2 inv3 lookup changed path and user1_not_root


1025 
have False


1026 
by cases (auto simp add: access_empty_lookup dest: access_some_lookup)


1027 
thus ?thesis ..


1028 
next


1029 
fix z zs assume ys: "ys = z # zs"


1030 
have "lookup root' path = lookup root path"


1031 
proof 


1032 
from inv2 lookup path ys


1033 
have look: "lookup root (path_of x @ z # zs) = Some (Env att dir)"


1034 
by (simp only:)


1035 
then obtain att' dir' file' where


1036 
look': "lookup root (path_of x) = Some (Env att' dir')" and


1037 
dir': "dir' z = Some file'" and


1038 
file': "lookup file' zs = Some (Env att dir)"


1039 
by (blast dest: lookup_some_upper)


1040 


1041 
from tr uid changed look' dir' obtain att'' where


1042 
look'': "lookup root' (path_of x) = Some (Env att'' dir')"


1043 
by cases (auto simp add: access_empty_lookup lookup_update_some


1044 
dest: access_some_lookup)


1045 
with dir' file' have "lookup root' (path_of x @ z # zs) =


1046 
Some (Env att dir)"


1047 
by (simp add: lookup_append_some)


1048 
with look path ys show ?thesis


1049 
by simp


1050 
qed


1051 
with inv show "invariant root' path"


1052 
by (simp only: invariant_def access_def)


1053 
qed


1054 
next


1055 
assume "path < path_of x"


1056 
then obtain y ys where path: "path_of x = path @ y # ys" ..


1057 


1058 
obtain dir' where


1059 
lookup': "lookup root' path = Some (Env att dir')" and


1060 
inv2': "dir' \<noteq> empty"


1061 
proof (cases ys)


1062 
assume "ys = []"


1063 
with path have parent: "path_of x = path @ [y]" by simp


1064 
with tr uid inv4 changed obtain file where


1065 
"root' = update (path_of x) (Some file) root"


1066 
by cases auto


1067 
with parent lookup have "lookup root' path = Some (Env att (dir(y\<mapsto>file)))"


1068 
by (simp only: update_append_some update_cons_nil_env)


1069 
moreover have "dir(y\<mapsto>file) \<noteq> empty" by simp


1070 
ultimately show ?thesis ..


1071 
next


1072 
fix z zs assume ys: "ys = z # zs"


1073 
with lookup root' path


1074 
have "lookup root' path = Some (update (y # ys) opt (Env att dir))"


1075 
by (simp only: update_append_some)


1076 
also obtain file' where


1077 
"update (y # ys) opt (Env att dir) = Env att (dir(y\<mapsto>file'))"


1078 
proof 


1079 
have "dir y \<noteq> None"


1080 
proof 


1081 
have "dir y = lookup (Env att dir) [y]"


1082 
by (simp split: option.splits)


1083 
also from lookup have "\<dots> = lookup root (path @ [y])"


1084 
by (simp only: lookup_append_some)


1085 
also have "\<dots> \<noteq> None"


1086 
proof 


1087 
from ys obtain us u where rev_ys: "ys = us @ [u]"


1088 
by (cases ys rule: rev_cases) auto


1089 
with tr path


1090 
have "lookup root ((path @ [y]) @ (us @ [u])) \<noteq> None \<or>


1091 
lookup root ((path @ [y]) @ us) \<noteq> None"


1092 
by cases (auto dest: access_some_lookup)


1093 
thus ?thesis by (blast dest!: lookup_some_append)


1094 
qed


1095 
finally show ?thesis .


1096 
qed


1097 
with ys show ?thesis


1098 
by (insert that, auto simp add: update_cons_cons_env)


1099 
qed


1100 
also have "dir(y\<mapsto>file') \<noteq> empty" by simp


1101 
ultimately show ?thesis ..


1102 
qed


1103 


1104 
from lookup' have inv1': "access root' path user1 {} = Some (Env att dir')"


1105 
by (simp only: access_empty_lookup)


1106 


1107 
from inv3 lookup' and not_writable user1_not_root


1108 
have "access root' path user1 {Writable} = None"


1109 
by (simp add: access_def)


1110 
with inv1' inv2' inv3 show ?thesis by (unfold invariant_def) blast


1111 
qed


1112 
qed


1113 
qed


1114 


1115 


1116 
subsection {* Putting it all together \label{sec:unixmainresult} *}


1117 


1118 
text {*


1119 
The main result is now imminent, just by composing the three


1120 
invariance lemmas (see \secref{sec:unixinvlemmas}) according the the


1121 
overall procedure (see \secref{sec:unixinvprocedure}).


1122 
*}


1123 


1124 
theorem main:


1125 
"init users =bogus\<Rightarrow> root \<Longrightarrow>


1126 
\<not> (\<exists>xs. (\<forall>x \<in> set xs. uid_of x = user1) \<and>


1127 
can_exec root (xs @ [Rmdir user1 [user1, name1]]))"


1128 
proof 


1129 
case antecedent


1130 
with cannot_rmdir init_invariant preserve_invariant


1131 
show ?thesis by (rule general_procedure)


1132 
qed


1133 


1134 
end
