TEACHING GEOG 385.02/GTECH 785.02 
GIS APPLICATIONS IN SOCIAL GEOGRAPHY


Competency exercise 3/CLASS DEMO 2-1. GIS FILE STRUCTURES

This demo/lecture demonstrates the basics of Idrisi raster and vector file structures.

Download Demo2.exe for this demo. Downloading class demo data.

 

Complete the demo and answer all the questions. Some of the answers given are wrong: find them and hand in correct answers for those questions only. Questions are shown in purple.

 

GIS file structures 

VECTOR FILE STRUCTURES - OVERVIEW

TOPOLOGICAL VECTOR FILE STRUCTURES

Raster file structures:

IDRISI VECTOR FILE STRUCTURES:

 

Raster data files in Idrisi

Raster documentation files in Idrisi

 

Vector data files in Idrisi

Point files

Vector documentation files in Idrisi

Line files

Polygon files

INTRODUCTION TO ARCVIEW DEMO

 

VECTOR ATTRIBUTE FILE STRUCTURES

 


GIS FILE STRUCTURES

Data structures provide the information that the computer requires to reconstruct the spatial data model in digital form.

Two types of information are recorded in GIS data files: data and metadata.

  • Data consists of spatial data (coordinates, georeferencing, feature types, topology, etc.) and attribute values.
  • Metadata is information about the data (origin, quality, accuracy, etc.). Also called documentation of data files.

Data and metadata can be stored

  • in one file (satellite data - SPOT files, etc.)
  • in different files (as in Idrisi and, in some cases, in ArcView, ArcInfo, and ArcGIS). In this case, the data file and the documentation file have the same filename but different extensions.

When you copy or move a GIS file, make sure that you copy both data and documentation files! Some programs do this by default but Windows Explorer will not.
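The warning above can be illustrated with a short Python sketch that copies a data file together with every file sharing its base name. This is only an illustration: the function name `copy_with_companions` and the placeholder file names are assumptions made for the example, and the sketch is not part of Idrisi.

```python
import shutil
import tempfile
from pathlib import Path


def copy_with_companions(data_file: Path, dest_dir: Path) -> list:
    """Copy data_file and every file in its folder that shares its base name."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for candidate in sorted(data_file.parent.iterdir()):
        # Same base name, any extension (for example .rst and .rdc)
        if candidate.is_file() and candidate.stem.lower() == data_file.stem.lower():
            shutil.copy2(candidate, dest_dir / candidate.name)
            copied.append(candidate.name)
    return copied


# Demonstration with empty placeholder files in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    src.mkdir()
    for name in ("worcwest.rst", "worcwest.rdc", "other.rst"):
        (src / name).touch()
    copied_names = copy_with_companions(src / "worcwest.rst", Path(tmp) / "backup")
    print(copied_names)
```

Matching on the base name rather than on a fixed list of extensions means the same helper would also pick up vector pairs such as a data file and its documentation file.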



RASTER FILE STRUCTURES

Raster data can be stored as simple raster files (used by most programs), run-length encoded files, or quad-tree structures. See the textbook for details.

Image (Raster) data file structure (.rst)

Raster files have a very simple structure. They consist of one long list (a single column) of pixel values. The documentation file stores the number of columns and rows, which tells the software where to begin drawing the next row of pixels. Cell values are displayed starting with the upper left pixel and going left to right, top to bottom.
 

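Pixel values as stored in the data file (one long list):

23 25 28 31 27 25 26 31 32 24 30 26 24 23 25

Information from the documentation file tells the software when to draw the next row of pixels.

==========================>

The same values displayed as an image with 5 columns and 3 rows:

23 25 28 31 27
25 26 31 32 24
30 26 24 23 25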

In Idrisi, rows are counted from 0, top to bottom, and columns are counted from 0, left to right. The image above, with 3 rows and 5 columns, has row numbers from 0 to 2 and column numbers from 0 to 4. Pixel position is usually given as column number / row number.
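As a rough illustration of how a flat list of values becomes an image, the sketch below rebuilds the rows from the number of columns and converts a position in the list to a column/row pair. It uses the 15 pixel values of the 3-by-5 example; the function names `to_grid` and `position` are made up for this sketch and are not Idrisi commands.

```python
# Values in the order they are stored in the data file
values = [23, 25, 28, 31, 27, 25, 26, 31, 32, 24, 30, 26, 24, 23, 25]
columns, rows = 5, 3  # in Idrisi these numbers come from the documentation file


def to_grid(flat, n_cols):
    """Cut the flat list into rows of n_cols values each."""
    return [flat[i:i + n_cols] for i in range(0, len(flat), n_cols)]


def position(index, n_cols):
    """Return (column, row) for a list index, both counted from 0."""
    row, col = divmod(index, n_cols)
    return col, row


grid = to_grid(values, columns)
for line in grid:
    print(" ".join(str(v) for v in line))

# The value 32 is the 9th value in the list (index 8)
print(position(values.index(32), columns))  # (3, 1)
```

Without the number of columns the list cannot be turned back into an image, which is why the data file is useless without its documentation file.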

Column and row position of value 30 is (_____).

To see the contents of a raster data file:

  • Display Worcwest (\Idrisi Tutorial\Introductory GIS).
  • Activate the Cursor Query mode button and query raster cell values. They correspond to the landuse categories shown in the legend.
  • Using the Zoom Window button, zoom in to display a very small portion of the image so that the raster structure is clearly visible. Query cell values again.

The Idrisi File Explorer utility under the File menu is used for file management and for viewing the contents of data and documentation files.

  • File/Idrisi File Explorer. Display Raster files in \Idrisi Tutorial\Introductory GIS folder. Find and highlight Worcwest image.
  • Click View Structure while Worcwest is highlighted. You can see cell attribute values organized in a grid. Scroll up and down using arrows. Note column and row positions displayed for every attribute value.



Image (Raster) documentation file (*.rdc)

Stores the information about the data itself and how it should be displayed.

To see the contents of a raster documentation file:

  • From Idrisi File Explorer, click View Metadata while Worcwest is highlighted in the file list. The Metadata icon is also accessible from the File menu or from the toolbar at the top of the Idrisi application window.
  • The following information is recorded in the documentation file. My comments follow each documentation field. Also use Help/File structure to read about file structures in Idrisi. Make sure you understand the difference between min and max X and Y, min and max attribute values, and min and max display values.

file format : IDRISI Raster A.1

 Indicates whether this is a vector or raster file (here, raster)

file title : Worcester West - Land Use / Land Cover

 Brief content of the data file

data type : byte

 In Idrisi raster files, the data type can be byte, integer (simple integer), or real (floating point). See module CONVERT.

file type : binary

 Can be binary, packed binary (compressed) or ASCII. See module CONVERT. Only binary files can be used analytically.

columns : 480

 Number of columns

rows : 480

 Number of rows

ref. system : plane

 Name of the geographic reference system (e.g., plane, geographic (Lat/Long), or a specific referencing system defined by a Reference System Parameter file, such as UTM, SPC, and hundreds of others). See HELP for information on Reference System Parameter files.

ref. units : m

 Units in which spatial information is measured (e.g., cell resolution, distance between objects, etc.). Here it is meters. Other common ref. units are km, ft, miles, degrees, etc.

unit dist. : 1.0000000

 Scaling factor for reference units (almost always 1).

min. X : 0.0000000

 Min X coordinate

max. X : 14400.0000000

 Max X coordinate

min. Y : 0.0000000

 Min Y coordinate

max. Y : 14400.0000000

 Max Y coordinate

pos'n error : unknown

 RMS error of coordinate accuracy

resolution : 30.0000000

 Cell resolution (length of a side of the cell). Here 30 meters (see reference units above).

min. value : 1

This is the min ATTRIBUTE VALUE, not a coordinate value. In this case value units are landuse categories (classes). So, the minimum value used to designate a landuse class is 1.

max. value : 14

 Max attribute value or landuse class is 14. Note that because the data are qualitative, these codes only distinguish one landuse class from another.

display min : 1

 The minimum attribute value actually displayed

display max : 14

 The maximum attribute value actually displayed

value units : unspecified

 Units of measurement for attribute data. Here they are landuse categories or classes.

value error : unknown

 Attribute value error can be recorded here.

flag value : 0

 Value used to mark pixels for which a landuse value is unavailable. Can be any other agreed upon value.

flag def'n : none

Defines what actually is meant by a flag value in this case (unavailable data, excluded data values, etc.) 

legend cats : 14

 Number of legend categories (landuse classes in this case)

code 1 : Water

 Code 1 is the minimum attribute value as discussed above. Designates landuse type "Water".

code 2 : Deciduous 1

 

code 3 : Deciduous 2

 

code 4 : Deciduous 3

 

code 5 : Conifer 1

 

code 6 : Conifer 2

 

code 7 : Grass/Suburb

 

code 8 : Agriculture

 

code 9 : Urban Resid.

 

code 10 : Urban Comm.

 

code 11 : Pavement 1

 

code 12 : Pavement 2

 

code 13 : Gravel

 

code 14 : Barren

 Code 14 is the maximum attribute value as discussed above. Designates Landuse "Barren".
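The relationship between the coordinate fields, the number of columns and rows, and the resolution can be checked with a little arithmetic. The sketch below is a hedged illustration only: it parses the "field : value" lines as they are printed on this page (the layout of the real file on disk may differ), and the name `parse_fields` is an assumption made for the example.

```python
# A few fields copied from the Worcwest documentation listing above
rdc_text = """\
columns : 480
rows : 480
unit dist. : 1.0000000
min. X : 0.0000000
max. X : 14400.0000000
min. Y : 0.0000000
max. Y : 14400.0000000
resolution : 30.0000000
"""


def parse_fields(text):
    """Split each 'name : value' line into a dictionary entry."""
    fields = {}
    for line in text.splitlines():
        name, _, value = line.partition(" : ")
        fields[name.strip()] = value.strip()
    return fields


doc = parse_fields(rdc_text)

# Cell size = extent of the image divided by the number of cells
cell_width = (float(doc["max. X"]) - float(doc["min. X"])) / int(doc["columns"])
cell_height = (float(doc["max. Y"]) - float(doc["min. Y"])) / int(doc["rows"])
print(cell_width, cell_height)  # 30.0 30.0
```

Both results agree with the resolution field: 14400 m divided by 480 cells is 30 m per cell. Note that min/max X and Y are coordinates, while min/max value are attribute values and play no part in this calculation.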



VECTOR FILE STRUCTURES

There are three major types of vector files: "spaghetti", feature-encoded, and topological files. They differ by the degree to which the computer recognizes and can analyze vector objects as spatial features.

See notes on Vector file structures here.

IDRISI VECTOR FILE STRUCTURES

Vector Data File Structure (.vct)

Idrisi uses feature-encoded vector files. In contrast to "spaghetti" vector structures, Idrisi recognizes vector features or objects (points, lines, and polygons), but, in contrast to Arc/Info or Cartalinx, it does not recognize topological information (connectivity or adjacency of vector features). In other words, features are recognized, but spatial analysis is limited.

  • The feature-encoded vector format allows us to display vector features properly, to convert data between raster and vector formats, and to link vector features to an attribute table and query the database. No complex spatial analysis is possible, however, because there is no topology.
  • Points, lines, and polygons must be kept in separate files (as in many GIS).
  • Attributes may be embedded (assigned to points and stored in the vector file itself) OR may be stored externally in a table and linked through IDs stored in the vector file.

For analytical purposes, vector files must be coded in binary format, but the View Structure feature in Idrisi File Explorer displays vector file structures in ASCII format so that you can examine the content of the file as it is. Feature-encoded vector files are very simple.

  • For point files, the number of points in the file is given, followed by the point IDs and their coordinates.
  • For line files, the total number of lines is specified, and then each line is listed individually, including its ID, its bounding rectangle, and the coordinates of each point that makes up the line.
  • For polygon files, the number of polygons is given, and then each polygon is listed with its ID, the coordinates of its bounding rectangle, and the coordinates of each point that forms the polygon. Note that the coordinates of the first and last point in each polygon are the same.
  • The coordinates of the bounding rectangle for the entire vector file (the boundary of the area displayed) are specified in its documentation file.

Examples of each type of vector file are included below. Also see HELP/FILE STRUCTURES.



Point files

  • Display a point file Points (\compex).
  • Query point values.

There are four points with attribute values of 1, 10, 10, and 5.

  • Using Idrisi File Explorer, display ASCII structure of this file. It is also presented below.

A point file with 4 points (Points) looks as follows. In fact, you can create a vector file like this one in ASCII format using a text editor and then import it into Idrisi.
 

Vector Layer Name : POINTS

 Name of the vector file

Vector Layer Type : Point

 Feature type (or Object type). Tells the computer that these are point features.

Reference System : plane

 Reference system name

Reference Units : m

 Reference units

Unit Distance : 30

 Unit distance (usually 1).

ID/Value Type : Integer

 Data type of the ID or attribute value

Number of Features : 4

 Number of points in the file. Here it is 4

 Then, each point is described:

 

Feature Number : 1

 Sequential (internal) number of each feature, usually starting with 1.

ID or Value : 1

 User-assigned unique ID or attribute value (see data type above)

Coordinates (X,Y) : 189.000000 282.000000

  Coordinates of the location where point 1 is plotted. 

 

 

Feature Number : 2

 Sequential number of point 2

ID or Value : 10

 Its user ID or attribute value could be 10 (any integer number)

Coordinates (X,Y) : 335.771971 309.548694

 Coordinates of the point 2.

 

 

Feature Number : 3

 Sequential number of point 3

ID or Value : 10

 It has the same user ID or attribute value as the second point

Coordinates (X,Y) : 140.807601 113.444181

 Coordinates of the point location.

 

 

Feature Number : 4

 Last point is number 4

ID or Value : 5

 Its attribute value is 5

Coordinates (X,Y) : 306.128266 188.693587

 Coordinates of its location.



Vector documentation file (*.vdc)

  • From Idrisi File Explorer display documentation file for Points.

Vector documentation files have additional information about the spatial and attribute (ID) data.

file format : IDRISI Vector A.1

 Idrisi Vector file 

file title : Four points

 Title briefly describes the content of the file

id type : integer

 Data type of ID or attribute value (byte, integer, real)

file type : binary

 File type (binary or ASCII)

object type : point

 Feature (object) type (point, line, polygon)

ref. system : plane

 reference system name

ref. units : m

 reference units (meters here)

unit dist. : 30

 usually 1

min. X : 0

 Min X of the bounding rectangle

max. X : 480

 Max X of the bounding rectangle

min. Y : 0

 Min Y of the bounding rectangle

max. Y : 480

 Max Y of the bounding rectangle

pos'n error : unknown

 RMS error of the coordinate accuracy

resolution : unknown

 Not calculated for vector files

min. value : 1

 Min attribute value

max. value : 10

 Max attribute value

display min : 1

 Min value displayed

display max : 10

 Max value displayed

value units : 

 units of measurement of attribute data

value error : unknown

 error of the attribute data

flag value : none

 value used as a flag value

flag def'n : none

definition of the flag value

legend cats : 0

 number and description of legend categories (if they exist)
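To see how the documentation file summarizes the data file, the four points from the Points listing can be compared with the min/max values and the bounding rectangle recorded above. This is a minimal sketch: the `Point` class is a hypothetical container invented for the example, not an Idrisi structure.

```python
from typing import NamedTuple


class Point(NamedTuple):
    feature_number: int  # sequential (internal) number
    value: int           # user ID or attribute value
    x: float
    y: float


# The four features listed in the Points data file
points = [
    Point(1, 1, 189.000000, 282.000000),
    Point(2, 10, 335.771971, 309.548694),
    Point(3, 10, 140.807601, 113.444181),
    Point(4, 5, 306.128266, 188.693587),
]

# These should match "min. value" and "max. value" in the documentation file
min_value = min(p.value for p in points)
max_value = max(p.value for p in points)

# Every point should fall inside the bounding rectangle (X: 0 to 480, Y: 0 to 480)
inside = all(0 <= p.x <= 480 and 0 <= p.y <= 480 for p in points)

print(len(points), min_value, max_value, inside)  # 4 1 10 True
```

The bounding rectangle in the documentation file is larger than the area actually covered by the points; it describes the area displayed, not the tightest box around the features.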

What if you had a file that consisted of only one point? What would it look like?

  • Display Newplant from \Idrisi Tutorial\Introductory GIS
  • View structure of this file.
  • Then examine its documentation file (Metadata button)

Vector Layer Name : NEWPLANT

 

Vector Layer Type : Point

 

Reference System : plane

 

Reference Units : m

 

Unit Distance : 30

 

ID/Value Type : Integer

 

Number of Features : 1

This file consists of one point only.

 Then, this point is described:

 

Feature Number : 1

 This point is feature 1 (and the only one)

ID or Value : 1

 

Coordinates (X,Y) : 189.000000 282.000000

 



Line files

  • Using Idrisi File Explorer, display Westroad (Idrisi Tutorial\Using Idrisi). Query values.
  • Examine the structure of this file (View structure) and its documentation (Metadata).

Using the documentation file, answer the following questions:

      • What are minimum and maximum attribute values? (1 and 10)
      • What are value units? What do they represent? (Classes, road types)
      • What attribute values represent which road classes? (1 - primary, 2-secondary, and 10-ramp access roads)
      • What are the limits of the bounding rectangle? (X: 0, 12300; Y: 0,11160)
      • What are distances measured in? (reference units are meters)
  • Display the structure of the file. My comments follow each entry.

The structure of the line file is as follows (See Idrisi HELP/FILE STRUCTURES):
 

Vector Layer Name : WESTROAD

 File name

Vector Layer Type : Line

 Feature (object) type

Reference System : plane

 Reference system

Reference Units : m

 Reference units are meters

Unit Distance : 1

 

ID/Value Type : Integer

 Data type of feature ID or attribute value (here attribute values represent road classes)

Number of Features : 2444

 Number of lines in this file

 Then, each line is described in turn:

 

Feature Number : 1

Sequential number of the first line 

ID or Value : 2

 Attribute value - Road class 2 (see legend for detail)

Minimum X : 1257.22190856934

 Next 4 entries show the limits of the "bounding rectangle" of the line itself, 

Maximum X : 1500.23391723633

 its min and max X and Y coordinates

Minimum Y : 7890.67016601563

 

Maximum Y : 8000.11291503906

 

Number of Vertices : 4

 Number of points in the first line

Coordinates (X,Y) : 1500.233917 7890.670166

 Coordinates of the first point

: 1500.233917 7890.670166

 Coordinates of the second point

: 1349.243164 7982.913818

 Coordinates of the third point

: 1257.221909 8000.112915

 Coordinates of the fourth point

***                   ***                          ***

 Lines 2-4 follow....

Feature Number : 5

 Sequential Id of line 5

ID or Value : 1

 Its value is 1

Minimum X : 3489.49249267578

 

Maximum X : 3548.27667236328

 

Minimum Y : 11776.1022949219

 

Maximum Y : 11835.1928710938

 

Number of Vertices : 4

 This line also has 4 points

Coordinates (X,Y) : 3489.492493 11776.102295

 

: 3489.492493 11776.102295

 

: 3509.815979 11790.983887

 

: 3548.276672 11835.192871

 

***                       ***                    ***

 Lines 6-2444 follow
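The bounding rectangle stored with each line is simply the smallest and largest X and Y among its vertices. The sketch below recomputes it for line 1 from the four vertices in the listing; the function name `bounding_rectangle` is an assumption made for this example.

```python
def bounding_rectangle(vertices):
    """Return (min X, max X, min Y, max Y) for a list of (x, y) pairs."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), max(xs), min(ys), max(ys)


# Vertices of feature 1 as printed in the Westroad listing
line_1 = [
    (1500.233917, 7890.670166),
    (1500.233917, 7890.670166),
    (1349.243164, 7982.913818),
    (1257.221909, 8000.112915),
]

min_x, max_x, min_y, max_y = bounding_rectangle(line_1)
print(min_x, max_x, min_y, max_y)
```

The results agree with the Minimum/Maximum X and Y entries for feature 1, apart from the extra decimal places shown in those entries.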

 

Answer the following questions while looking at the Westroad line file structure:

      • What is the sequential number of the last feature in the file? (2444)
      • What is the attribute value of the feature 2440? (2)
      • How many points does this line consist of? (8)



Polygon files

  • Using Idrisi File Explorer, display Netherlands (Idrisi Tutorial\Using Idrisi). Query values.
  • Examine the structure of this file (View structure) and its documentation (Metadata).

Using the documentation file, answer the following questions:

        • What are minimum and maximum attribute values? (1 and 794)
        • What are value units? What do they represent? (Zip codes for Netherlands)
        • What are the limits of the bounding rectangle? (X: 525685.375693, 782578.032324; Y: 5625837.055968, 5939526.026695)
        • What are distances measured in? (reference units are meters)
        • What reference system is used? (UTM-30N)

Display the structure of the file. My comments follow each entry.

 

Vector Layer Name : NETHERLANDS

 File name

Vector Layer Type : Polygon

 Feature (object) type

Reference System : utm-30n

 reference system UTM, zone 30 North

Reference Units : m

 ref.units (meters)

Unit Distance : 1

 

ID/Value Type : Integer

 data type

Number of Features : 921

 total number of polygons

 Then, each polygon is described in turn:

 

Feature Number : 1

 First polygon's sequential ID

ID or Value : 1

 User ID or value (ZIP code number)

Minimum X : 625497.5

 Coordinates of the bounding rectangle for this polygon

Maximum X : 633108.9375

 

Minimum Y : 5802387

 

Maximum Y : 5808616

 

Number of Parts : 1

 Some polygons may consist of more than one part (one ZIP code made of several separate areas). Here the ZIP code consists of one polygon (part).

 

 

Part Number : 1

 Number of this part within the polygon

Number of Vertices : 67

 Number of points in the polygon

Coordinates (X,Y) : 625497.500000 5805307.500000

 Coordinates of the first point

: 625537.312500 5806776.000000

 second point

: 625712.250000 5807116.000000

 third point

***

points 4-65

: 626605.000000 5805393.500000

 point 66

: 625497.500000 5805307.500000

 point 67 = point 1

****

Polygons 2-13

Feature Number : 14

 Polygon 14

ID or Value : 13

 Zip code ID 13

Minimum X : 637517.8125

 

Maximum X : 641648.5

 

Minimum Y : 5816766.5

 

Maximum Y : 5822814

 

Number of Parts : 1

 

 

 

Part Number : 1

 

Number of Vertices : 38

 

Coordinates (X,Y) : 640178.937500 5817456.500000

 

: 640003.937500 5817661.000000

 

: 639467.875000 5817452.000000

 

***

Vertices 4-36

: 640514.062500 5817413.500000

 

: 640178.937500 5817456.500000

 

***

Polygons 15-921
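In the listing, the last vertex of each polygon repeats the first one, which is what closes the outline. A minimal check could look like the sketch below. The function name `is_closed` and the rule that a closed outline needs at least four listed points (a triangle plus the repeated first point) are assumptions made for this example.

```python
def is_closed(ring):
    """True if the first and last vertices are the same point."""
    return len(ring) >= 4 and ring[0] == ring[-1]


# Only the vertices of polygon 1 that are printed above;
# points 4-65 are not shown in the listing and are left out here
polygon_1_shown = [
    (625497.5, 5805307.5),    # point 1
    (625537.3125, 5806776.0), # point 2
    (625712.25, 5807116.0),   # point 3
    (626605.0, 5805393.5),    # point 66
    (625497.5, 5805307.5),    # point 67 = point 1
]

print(is_closed(polygon_1_shown))       # True
print(is_closed(polygon_1_shown[:-1]))  # False: the closing point is missing
```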

 

Answer the following questions while looking at the Netherlands polygon file structure:

      • What is the sequential number of the last feature in the file? (921)
      • Which ZIP code is represented by polygon 726? (Zip code 635)
      • How many points make up this polygon? (48 vertices)



ATTRIBUTE FILE STRUCTURES

Attribute files contain characteristics of spatial features. They, too, may consist of a data file and a documentation file. In Idrisi there are two types of attribute files: a simple attribute values file and a database table in Microsoft Access format.

Simple Attribute Data File Structure (.avl + .adc)

Attribute values files are simple ASCII files that consist of two columns of numbers separated by a space.

  • The first column is the current value or identifier, while the second is the new or derived value.

1 1
2 1
3 2
4 1
5 1
6 2
7 3
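Because a values file is plain text, it is easy to read with a few lines of code. The sketch below loads the seven pairs shown above into a lookup table and uses it to assign new values. It is an illustration only: `parse_values` is a made-up name, the `cells` list is hypothetical data, and real values files may also hold real (decimal) numbers, which this sketch does not handle.

```python
# The example attribute values file shown above
avl_text = """\
1 1
2 1
3 2
4 1
5 1
6 2
7 3
"""


def parse_values(text):
    """Read 'identifier new_value' pairs into a dictionary."""
    table = {}
    for line in text.splitlines():
        if line.strip():
            identifier, new_value = line.split()
            table[int(identifier)] = int(new_value)
    return table


lookup = parse_values(avl_text)

cells = [3, 7, 1, 6]  # hypothetical current values
print([lookup[c] for c in cells])  # [2, 3, 1, 2]
```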

An attribute documentation file (*.adc) contains the description of the data and the file format. It accompanies the attribute data file.

  • Using Idrisi File Explorer, View structure and documentation for attribute value files Westfor and Westres. They are very simple and consist of only one row of numbers each.

Relational databases ( .mdb+.adc, .dbf attribute tables in ArcView)

Access database files (*.mdb) and dbase files (*.dbf) are binary and cannot be viewed directly. They store data in the form of a relational database (as opposed to spreadsheet tables such as Excel tables).

  • Rows are called records and columns are called fields.
  • Records represent units for which attribute information is collected (in a GIS database these are spatial units - points, lines, or polygons).
  • Each field contains one attribute (or variable) and specific attribute values are distributed among records (spatial units).
  • In a relational database each record is usually marked by a unique identifier and its integrity is always maintained. This means that all information pertaining to one record is always kept together across all database fields. You can add/remove records and fields but the attribute values for each existing record are always kept together in a row.

The database files are linked to spatial vector files via unique IDs that are located in the first field of the database. ArcView data files (*.dbf) have a similar structure. In Idrisi, you can manipulate relational databases within Database Workshop.
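The link between a vector file and its attribute table can be pictured as a lookup by ID. The sketch below uses entirely hypothetical records and field names ("name", "population") to show the idea; it does not read an actual .mdb or .dbf file.

```python
# Hypothetical attribute table: each dictionary is one record (row),
# each key is one field (column); "id" plays the role of the ID field
records = [
    {"id": 1, "name": "Site A", "population": 1200},
    {"id": 2, "name": "Site B", "population": 450},
    {"id": 3, "name": "Site C", "population": 980},
]

# Index the records by their unique ID
by_id = {rec["id"]: rec for rec in records}

# IDs as they might be stored with features in a vector file (hypothetical)
feature_ids = [3, 1]

linked = [(fid, by_id[fid]["name"], by_id[fid]["population"]) for fid in feature_ids]
print(linked)  # [(3, 'Site C', 980), (1, 'Site A', 1200)]
```

Because every value of a record travels with its ID, the order of the records in the table does not matter for the link.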

  • Open Database workshop under Data Entry Menu (or click on the fifth button from the right in the tool bar). Then open (File/Open in Database workshop menu) any *.mdb file in \Using Idrisi folder.
  • Find the ID field and the attribute fields. Note the data type indication and the total number of records at the bottom of the window.
  • Using Metadata button, view documentation for database files.



TOPOLOGICAL VECTOR FILE STRUCTURES

Download Demo8 from scratch or from BB.

Exploring topological data structures in Cartalinx

As a topological vector data editor and compiler, Cartalinx shows clearly the topological structure of vector layers (or coverages).

  • Run Cartalinx and choose File/Open University.lnx layer.
  • You should see the polygon layer in a map window and three tables to the left of the map
  • If you see only one table, go to File/Preferences, click tables tab, and check on all three tables (nodes, arcs, polygons) to make them visible.
  • Also make sure that options: Select database record... and Highlight coverage feature... are on.

On the map of a university campus (Clark University in Worcester, MA):

  • vertices (points of which arcs consist) are shown as brown points,
  • nodes (beginning and end of an arc) are shown as green squares, and
  • polygon locators (points that establish identity of a polygon) are shown with salmon color.

Topological information is stored in three tables:

  • First table displays nodes (unlinked nodes - arcs) and polygon locators. Each node has a unique ID and all locators are associated with polygons.
  • Second table contains a dictionary for arcs. Each arc has a unique ID, start and end node IDs (matching the nodes table) which give arc a direction, and left and right polygon IDs (matching the polygon table).
  • Third table describes polygons. Each polygon has a unique ID matching that of a locator node (from the nodes table) and various types of attribute information stored in different fields. Attribute data is stored in relational database that preserves integrity of rows (records) and columns (fields) and keeps records associated with vector features in the coverage.
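To show why left and right polygon IDs on each arc make neighbor queries possible, here is a small sketch with a hypothetical arc table. The arc, node, and polygon IDs are invented for the example, and the use of 0 for the area outside all polygons is an assumption; this is not the actual Cartalinx table layout.

```python
# Hypothetical arc table:
# arc ID -> (start node, end node, left polygon, right polygon)
# Polygon ID 0 stands for the area outside all polygons (an assumption)
arcs = {
    1: (1, 2, 0, 10),
    2: (2, 3, 0, 20),
    3: (2, 4, 10, 20),  # this arc is the shared boundary of polygons 10 and 20
}


def neighbors(polygon_id):
    """Polygons that share at least one arc with polygon_id."""
    result = set()
    for start, end, left, right in arcs.values():
        if left == polygon_id and right != 0:
            result.add(right)
        elif right == polygon_id and left != 0:
            result.add(left)
    return result


def shared_arcs(a, b):
    """IDs of the arcs that separate polygons a and b."""
    return [arc_id for arc_id, (_, _, left, right) in arcs.items()
            if {left, right} == {a, b}]


print(neighbors(10))        # {20}
print(shared_arcs(10, 20))  # [3]
```

In a feature-encoded file such as an Idrisi polygon file, the same question would require comparing the coordinates of every polygon outline, because the boundary between two polygons is stored twice and nothing records which polygons lie on either side.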

Boundaries between polygons are made of shared arcs, so there are no double boundaries and no sliver polygons, and the topology supports strong vector analysis.
