Homeless in New York City

Professor Jochen Albrecht
Department of Geography
Hunter College, CUNY

On generating spatio-temporal data

Professor Edzer Pebesma
Co-director of the Institute for Geoinformatics
University of Münster

Abstract: To gain knowledge, we (scientists) often generate new data products from primary observation data, and disseminate what we did by (i) text, e.g. a scientific publication, and (ii) publishing the input and output data and the computational procedures used, e.g. an R script along with software version information. Although this may be sufficient to understand, reproduce, and verify the work, it is still too difficult to assess whether two independently generated products can be compared, or whether method Y can be meaningfully applied to data set X. And how can we discover all data sets generated by procedure Z, or advertise the one we just generated?

We [1] introduce a generative algebra for spatio-temporal information. Using functions on the basic types Space, Time, Quality, and Entity, we construct data generation procedures (fields, lattices, point patterns, objects, events, trajectories) that can be executed to generate actual data. Data derivation operations (the algebra) are used for generating new data types, e.g. objects from fields, fields from objects, or lattices from trajectories. As opposed to data, which are always discrete, data generation procedures can be continuous. They also make the data support explicit, i.e. whether values refer to points and time instants or to areas and time intervals. In contrast to the when/where/what questions usually addressed by semantics, we believe that the algebra can be used to describe the why and how questions, and as such to describe data provenance.
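
To make the idea of "data generation procedures as functions on basic types" concrete, here is a minimal sketch in Python. It is an illustration only, not the algebra of [1]: the type aliases, the example field `temperature`, the example trajectory `walker`, and the helpers `sample_field`, `field_along`, and `region_above` are all hypothetical names chosen for this sketch. It shows a continuous procedure being executed at discrete points to yield actual data, and two derivation operations that build new procedures from existing ones.

```python
from typing import Callable, List, Tuple

# Basic types (illustrative stand-ins, not the definitions used in [1]).
Space = Tuple[float, float]   # a point location (x, y)
Time = float                  # a time instant
Quality = float               # an observed value

# Data generation procedures are functions over the basic types.
Field = Callable[[Space, Time], Quality]   # a value at every place and time
Trajectory = Callable[[Time], Space]       # a location for every time


def temperature(s: Space, t: Time) -> Quality:
    """Hypothetical continuous field: warmer to the east, warming over time."""
    x, _y = s
    return 10.0 + 0.5 * x + 0.1 * t


def walker(t: Time) -> Space:
    """Hypothetical trajectory: moves east at one unit per time step."""
    return (t, 0.0)


def sample_field(
    f: Field, locations: List[Space], times: List[Time]
) -> List[Tuple[Space, Time, Quality]]:
    """Execute a continuous procedure at discrete points to get actual data."""
    return [(s, t, f(s, t)) for t in times for s in locations]


def field_along(f: Field, tr: Trajectory) -> Callable[[Time], Quality]:
    """Derivation: field + trajectory -> values experienced along the path."""
    return lambda t: f(tr(t), t)


def region_above(f: Field, threshold: Quality, t: Time) -> Callable[[Space], bool]:
    """Derivation: field -> object, as the region exceeding a threshold at time t."""
    return lambda s: f(s, t) > threshold
```

Note that `temperature` and `walker` stay continuous: only `sample_field` produces discrete records, which mirrors the distinction the abstract draws between data and the procedures that generate them. All values here have point support; area or interval support would need a different signature, which this sketch does not attempt.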

[1] joint work with Simon Scheider, Benedikt Graeler and Christoph Stasch; see

