JSON, null and NA
Duncan Temple Lang
University of California at Davis
Department of Statistics
Limitations of JSON regarding the meaning of null
JavaScript Object Notation (JSON) is a convenient format for
representing data and facilitates transferring data between
applications. It is widely used in different applications, Web
services and other contexts. As such, it is useful for R to be able
to import and export data in this format. Unfortunately, JSON is a
little too simple and cannot faithfully represent all of the types and
values in R. Most specifically, there is no way to represent NA,
Inf or NaN. Typically, these
values are represented in JSON as "null". However, that is also used
to represent a null object. So there is ambiguity in how we interpret
null in JSON.
We don't know whether it is NA, NaN, Inf or NULL in R.
This many-to-one mapping results in a loss of information.
In spite of the shortcomings of the format, we can still work with
JSON. However, the conversion of null values to R, and of NA
values from R, is neither automatic nor uniquely defined.
For that reason, the caller must control how these are mapped.
We provide some mechanisms to do this in the
fromJSON and toJSON functions.
When converting R objects to JSON via toJSON,
one can specify how to map NA values to JSON
by providing a value for the .na parameter.
For example, suppose we want to transform the R list
x = list(1, 2, NA, c(TRUE, NA, FALSE))
to JSON and want NA values to map to null.
We can achieve this with
toJSON(x, .na = "null")
In some applications, we represent a missing value with a fixed number that is unlikely
to occur in actual data, e.g. -99999.
We can map NA values to such a number with
toJSON(list(1, 2, list(NA)), .na = -99999)
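A sentinel only helps if the reader of the JSON undoes it. The following is a minimal sketch in base R of that reverse step, assuming the content has already been parsed into a (possibly nested) R list. The helper name sentinel_to_na and its sentinel argument are hypothetical and are not part of the package.

```r
# Hypothetical helper (not part of any package): walk a possibly nested
# list and turn every occurrence of the sentinel value back into NA.
sentinel_to_na <- function(x, sentinel = -99999) {
  if (length(x) == 0) {
    return(x)                      # leave NULL and empty vectors untouched
  }
  if (is.list(x)) {
    return(lapply(x, sentinel_to_na, sentinel = sentinel))
  }
  # Guard with !is.na() so existing NA entries do not break the comparison.
  x[!is.na(x) & x == sentinel] <- NA
  x
}

# Stand-in for parsed JSON content that used -99999 for missing values.
decoded <- sentinel_to_na(list(1, 2, list(-99999), c(3, -99999)))
str(decoded)
```

The sender and the receiver have to agree on the sentinel, and it must never occur as a genuine data value; otherwise real observations would be silently turned into NA.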
Now consider round-tripping NA, e.g.
o = toJSON(NA)
o
[1] "[ null ]"
fromJSON(o)
By default, fromJSON does not turn this null back into NA,
so we have lost information.
We can correct this loss of information by
specifying how to map null values in JSON
to R values. We use the nullValue parameter of fromJSON:
fromJSON(toJSON(NA), nullValue = NA)
Again, as the caller of fromJSON
(and also toJSON), we are providing
information about how to transfer the null value from JSON to R.
Only we know what it means in this case.
If we knew that the null corresponded to Inf,
we could specify that:
fromJSON( "[null]", nullValue = Inf)
Where this mechanism breaks down is when we have multiple
null values in our JSON content and they map to different
R values, e.g. NA and NaN.
The nullValue parameter is a global replacement for
null entries in the JSON. To process these null
entries adaptively, in a context-specific manner, we have to use a customized
parser. We can do this by providing an R function as the
callback handler for the JSON parser.
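The callback interface itself is not shown here. As a rough illustration of what context-specific handling means, the sketch below uses a different and simpler technique: post-processing an already parsed result in base R. It assumes the parsed content is a named list in which each JSON null appears as NULL; the helper fill_nulls, its null_map argument and the field names are all hypothetical.

```r
# Hypothetical helper (not the parser callback mechanism): replace NULL
# entries of a named list with a value chosen per field name.
fill_nulls <- function(parsed, null_map, default = NA) {
  for (nm in names(parsed)) {
    if (is.null(parsed[[nm]])) {
      value <- if (nm %in% names(null_map)) null_map[[nm]] else default
      # Assign via a one-element list so the entry is replaced, not dropped.
      parsed[nm] <- list(value)
    }
  }
  parsed
}

# Stand-in for parsed JSON such as
# {"temperature": null, "ratio": null, "limit": null, "count": 3}
parsed <- list(temperature = NULL, ratio = NULL, limit = NULL, count = 3)

# Only the caller knows that a null ratio means NaN and a null limit means Inf.
filled <- fill_nulls(parsed, null_map = list(ratio = NaN, limit = Inf))
str(filled)
```

This keeps the decision about each null with the caller, keyed by field name, which a single global nullValue cannot express. It only handles one level of a named list; nested content or unnamed arrays would need the customized parser described above.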