These are extensions to the Basis library that are specific to SML/NJ. Reference documentation for them is on the "Special features of SML/NJ" page, reachable from the SML/NJ home page[SML].
The Unsafe API is a collection of functions that bypass the normal safety checks of the language and the Basis library. They live in the Unsafe structure, which provides:
Access to the elements of arrays and vectors, including strings, without the usual subscript range checks.
Access to information about the memory representation of values.
An interface to C functions in the runtime.
Miscellaneous operations used internally by the compiler and associated subsystems.
Unchecked subscripting is used internally by the array and vector functions in the Basis library. Wherever possible you should design your code to make use of the Basis functions. Using the unchecked operations directly puts your program at risk of crashing.
The following monomorphic vectors and arrays are available.
Unsafe.CharVector: This operates on strings, which are vectors of characters.
Unsafe.Word8Vector: This operates on vectors of bytes.
Unsafe.CharArray, Unsafe.Word8Array: These operate on arrays of characters or bytes.
Unsafe.Real64Array: This operates on arrays of double precision reals. The C equivalent would be the array type double[].[1]
These structures conform to one of these two signatures.
signature UNSAFE_MONO_VECTOR =
sig
    type vector
    type elem
    val sub    : (vector * int) -> elem
    val update : (vector * int * elem) -> unit
    val create : int -> vector
end

signature UNSAFE_MONO_ARRAY =
sig
    type array
    type elem
    val sub    : (array * int) -> elem
    val update : (array * int * elem) -> unit
    val create : int -> array
end
Note that you can update the elements of vectors in place, just as you can with arrays. The create functions return a vector or array of the given length with uninitialised elements.
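As a minimal sketch of the create-then-update pattern, the function below builds a string by filling an uninitialised one in place. It assumes that the string structure is named Unsafe.CharVector and matches UNSAFE_MONO_VECTOR with vector = string and elem = char. The name fillString is made up for this illustration, and the guard for n <= 0 is a precaution of mine rather than something the text requires.

```sml
(* Build a string of n copies of c by filling an uninitialised
   string in place.  Assumes Unsafe.CharVector matches the
   UNSAFE_MONO_VECTOR signature (vector = string, elem = char).
   fillString is a hypothetical name used only here. *)
fun fillString (n, c) =
    if n <= 0
    then ""     (* precaution: never ask for a zero-length object *)
    else
        let
            val s = Unsafe.CharVector.create n

            (* Write every index in 0..n-1 exactly once, so no
               uninitialised element is ever read. *)
            fun fill i =
                if i < n
                then (Unsafe.CharVector.update (s, i, c); fill (i + 1))
                else ()
        in
            fill 0;
            s
        end
```

The safe equivalent is CharVector.tabulate (n, fn _ => c), which is what you should normally write. The unsafe version only makes sense when the string is fully initialised before anything else can see it.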
For arrays and vectors of other kinds of elements there are the structures Unsafe.Vector and Unsafe.Array which conform to the following signatures.
signature UNSAFE_VECTOR =
sig
    val sub    : ('a vector * int) -> 'a
    val create : (int * 'a list) -> 'a vector
end

signature UNSAFE_ARRAY =
sig
    val sub    : ('a array * int) -> 'a
    val update : ('a array * int * 'a) -> unit
    val create : (int * 'a) -> 'a array
end
The vector create function creates a vector from a list. You have to supply the length of the list as the first argument. The array create function creates an array given a length and an initial value for each element.
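The following sketch shows how the polymorphic operations might be used while keeping the indices provably in range. The names sumArray and vectorFromList are made up for this illustration. The special case for the empty list is my own precaution, on the assumption that asking for a zero-length vector is best left to the safe Basis function.

```sml
(* Sum an int array.  The bound is read once from Array.length,
   so every index passed to Unsafe.Array.sub lies in 0..n-1. *)
fun sumArray (arr : int array) =
    let
        val n = Array.length arr
        fun loop (i, acc) =
            if i < n
            then loop (i + 1, acc + Unsafe.Array.sub (arr, i))
            else acc
    in
        loop (0, 0)
    end

(* The vector create function wants the list length up front.
   The empty list goes to the safe function as a precaution. *)
fun vectorFromList [] = Vector.fromList []
  | vectorFromList l  = Unsafe.Vector.create (length l, l)
```

In ordinary code Array.foldl and Vector.fromList do the same jobs with the checks left in, and they are the better choice unless profiling shows the checks matter.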
The Unsafe.Object structure provides some functions for getting information about the memory representation of values. For details, read the source code in the boot/Unsafe/object* files of the compiler. You won't find much use for this in your programs. The most useful functions appear to be toWord32 and toInt32, which can convert a byte array to a 32-bit integer, but there isn't enough functionality here to be useful for serialising values into a wire protocol. (See the section called Integers in Chapter 3 for serialising integers.)
You could use this structure to estimate the size of objects in memory. Here is my version of a function to estimate the size of a value, including pointed-to values. I've used O as an alias for Unsafe.Object.
(* The alias mentioned in the text. *)
structure O = Unsafe.Object

(*  Estimate the size of v in 32-bit words.
    Boxed objects have an extra descriptor word
    which also contains the length for vectors
    and arrays.
*)
fun sizeof v =
let
    fun obj_size obj =
    (
        case O.rep obj of
          O.Unboxed   => 1    (* inline 31 bits *)
        | O.Real      => 1+2
        | O.Pair      => tup_size obj
        | O.Record    => tup_size obj
        | O.RealArray => tup_size obj
        | O.PolyArray => arr_size obj

        (*  includes Word8Vector.vector
            and CharVector.vector
        *)
        | O.ByteVector => 1 +
            ((size(O.toString obj)+3) div 4)

        (*  includes Word8Array.array
            and CharArray.array
        *)
        | O.ByteArray => 1 +
            ((Array.length(O.toArray obj)+3) div 4)

        | _ => 2    (* punt for other objects *)
    )

    (*  Count the record plus the size of
        pointed-to objects in the heap.
    *)
    and tup_size obj =
    let
        fun sz obj =
            if O.boxed obj
            then
                1 + (obj_size obj)
            else
                1
    in
        Vector.foldl
            (fn (obj, s) => s + (sz obj))
            1
            (O.toTuple obj)
    end

    and arr_size obj =
    let
        fun sz obj =
            if O.boxed obj
            then
                1 + (obj_size obj)
            else
                1
    in
        Array.foldl
            (fn (obj, s) => s + (sz obj))
            1
            (O.toArray obj)
    end
in
    obj_size(O.toObject v)
end
This is a main function to try it out.
fun main(arg0, argv) =
let
    fun show name v = print(concat[
        "Size of ", name,
        " = ", Int.toString(sizeof v),
        " 32-bit words\n"])
in
    show "integer" 3;
    show "real"    3.3;
    show "string"  "abc";
    show "pair"    ("abc", 42);
    show "record"  {a = 1, b = 4.5, c = "fred"};
    OS.Process.success
end
See the section called Heap Object Layout in Chapter 7 for more information on object layout in the heap.
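The rounding that sizeof applies to byte vectors and byte arrays can be checked in isolation. This sketch only restates that arithmetic: one descriptor word plus the bytes rounded up to whole 4-byte words. The name byteObjectWords is made up for this illustration, and the result is the estimate used by sizeof, not a measurement of what the runtime actually allocates.

```sml
(* Estimated words for a byte vector or byte array of n bytes:
   one descriptor word plus n bytes rounded up to 4-byte words. *)
fun byteObjectWords n = 1 + (n + 3) div 4

val w3 = byteObjectWords 3   (* "abc": 1 + 6 div 4 = 2 *)
val w4 = byteObjectWords 4   (*        1 + 7 div 4 = 2 *)
val w5 = byteObjectWords 5   (*        1 + 8 div 4 = 3 *)
```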
The runtime includes a collection of C functions that implement the low-level Basis operations, such as those in the Posix structure. The SML code calls these C functions through the Unsafe.CInterface structure. The C functions must be specially written to take their arguments in the form of SML values, so this is not a general-purpose interface to C. I mention it only so that you don't mistake it for one.
Later versions of SML/NJ will include a general purpose interface for calling any C function in a shared library which is loaded at run-time.
The Unsafe.blastRead and Unsafe.blastWrite functions are used to serialise/deserialise entire data structures for writing to files. The blastWrite function is expensive to run since it uses the garbage collector to traverse the data structure to locate all values reachable from the root value. You shouldn't call it often to serialise small data structures. Instead it is intended that you build up an entire data structure and then dump it into a file at exit time.
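Here is a minimal sketch of the dump-at-exit pattern. It assumes the signatures Unsafe.blastWrite : 'a -> Word8Vector.vector and Unsafe.blastRead : Word8Vector.vector -> 'a, which the text above does not state, so check them against your release. The names saveValue and loadValue and the file name table.bin are made up for this illustration.

```sml
(* Assumed signatures, not stated in the text:
     Unsafe.blastWrite : 'a -> Word8Vector.vector
     Unsafe.blastRead  : Word8Vector.vector -> 'a          *)

(* Dump a whole value to a file in one go. *)
fun saveValue (path, v) =
    let
        val out = BinIO.openOut path
    in
        BinIO.output (out, Unsafe.blastWrite v);
        BinIO.closeOut out
    end

(* Read a value back.  Nothing checks that the bytes really
   encode the requested type, so the caller must annotate the
   result with the type that was written. *)
fun loadValue path =
    let
        val ins   = BinIO.openIn path
        val bytes = BinIO.inputAll ins
    in
        BinIO.closeIn ins;
        Unsafe.blastRead bytes
    end

(* Usage with a hypothetical file name. *)
val () = saveValue ("table.bin", [("fred", 1), ("wilma", 2)])
val table : (string * int) list = loadValue "table.bin"
```

The type annotation on table is the only thing that ties the bytes back to a type. Reading the file at a different type would compile but is as dangerous as Unsafe.cast.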
The Unsafe.cast function can be used to cast a value to any other type. This is, of course, very dangerous unless you know the underlying memory representation. Most cases where you might want to do this are already provided for. For example, conversion between bytes and characters is provided by the Byte structure.
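For the byte and character case, the checked conversions in the Basis Byte structure look like this. The value names are made up for this illustration; the functions are the standard Basis ones.

```sml
(* Single elements. *)
val b : Word8.word = Byte.charToByte #"A"
val c : char       = Byte.byteToChar 0wx42

(* Whole strings and byte vectors. *)
val bytes : Word8Vector.vector = Byte.stringToBytes "abc"
val s     : string             = Byte.bytesToString bytes
```

None of these needs Unsafe.cast, and the types document the conversion at each step.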
The other functions in Unsafe should not be used. Some are used by separate systems such as the Concurrent ML library which we will be using later.
The Unsafe.Poll structure is not normally accessible and isn't interesting to us.
[1] There should be an Unsafe.Real64Vector but it isn't implemented yet.