Research background and data
Establishing a working directory
Part 1: Calculating Nearest Neighbour Index using ArcGIS
Part 2: Calculating Nearest Neighbour Index using CrimeStat III
Part 3: Calculating the Kth Order Nearest Neighbour Index using CrimeStat III
A spatial analyst (e.g. a business manager, biogeographer, demographer, forester or marketer) often wants to evaluate the spatial pattern of a phenomenon (e.g. plant species, retail outlets, incidence of disease, settlement of migrants) in a particular area. Many techniques can be applied to this sort of "point pattern analysis" problem. This lesson introduces you to nearest neighbour analysis, which is often used to describe a spatial pattern as "clumped", "random", or "uniform". The terms clumped and clustered are used interchangeably throughout this lesson.
The Nearest Neighbour Analysis techniques used in this lesson are based upon a method described by Clark & Evans (1954). Clark and Evans were botanists who developed the Nearest Neighbour Index (NNI) primarily for analysing botanical field data, but it has been used in a variety of disciplines as a quantitative tool. Often when visually interpreting the geography of a point data set, the personal knowledge and experience of the analyst can lead to bias. This visual approach is often referred to as qualitative. A statistical approach, by contrast, can be repeated and its results compared, which is certainly one of the principles of good science!
The NNI compares the observed distances between nearest points with the distances that would be expected on the basis of chance. It is an index that is the ratio of two summary measures. First, there is the observed average nearest neighbour distance, Robs. For each point (or incident location, i) in turn, the distance to the closest other point (nearest neighbour, j) is calculated, and these distances are averaged over all points. The mathematical expression for this is as follows:
$$R_{obs} = \frac{\sum_{i=1}^{N} \min_{j \neq i} d_{ij}}{N}$$
where d_ij is the distance between points i and j, and N is the number of points.
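Second, the expected average nearest neighbour distance, Rexp, for a random pattern is derived from the density of points within the study area, defined mathematically as follows:
$$R_{exp} = \frac{1}{2}\sqrt{\frac{A}{N}}$$
where A is the total area and N is the number of points.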
Therefore, the NNI is the ratio of the above two separate statistics as follows:
$$R = \frac{R_{obs}}{R_{exp}}$$
The total area is the area of the study site. When R = 1, the points are randomly located. When R < 1 clustering is suggested and when R > 1 there is a tendency towards dispersion (Rogerson, 2001).
The above figures show a regular (R > 1), random (R = 1) and clustered (R < 1) distribution of points (graphics generated using ArcGIS, Hawth's Tools and IrfanView).
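To make the ratio concrete, the short Python sketch below computes Robs, Rexp and R for a list of (x, y) coordinates. It is an illustration only, under stated assumptions: the function name nearest_neighbour_index is hypothetical, distances are Euclidean in projected units, no edge correction is applied, and Rexp is taken as 0.5 * sqrt(A / N), the usual Clark and Evans form.

```python
import math


def nearest_neighbour_index(points, area):
    """Return (r_obs, r_exp, nni) for a list of (x, y) tuples.

    Hypothetical helper for illustration; brute-force search, so it
    suits small data sets only.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are needed")

    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point j
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        total += nearest

    r_obs = total / n                    # observed mean NN distance
    r_exp = 0.5 * math.sqrt(area / n)    # expected mean NN distance (random)
    return r_obs, r_exp, r_obs / r_exp


# Four evenly spaced points inside a 2 x 2 study area (made-up data)
grid = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
r_obs, r_exp, nni = nearest_neighbour_index(grid, area=4.0)
print(r_obs, r_exp, nni)
```

For this evenly spaced toy pattern R comes out at 2.0, on the dispersed side of 1. Note how directly the result depends on the area supplied, a point worth returning to in your discussion.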
Some differences from 1.0 in the nearest neighbour index would be expected by chance. Clark and Evans (1954) proposed a Z-test to indicate whether the observed average nearest neighbour distance was significantly different from the mean random distance. The test is between the observed nearest neighbour distance and that expected from a random distribution and is given by:
$$Z = \frac{R_{obs} - R_{exp}}{SE}$$
where the standard error of the mean random distance, SE, is approximately given by:
$$SE \approx \frac{0.26136}{\sqrt{N^{2} / A}}$$
with A being the area of a region and N the number of points within that region. There have been other suggested tests for the nearest neighbour distance as well as corrections for edge effects.
This lesson is largely based around the PhD research of Dr. Heather Builth. While undertaking this research through Flinders University, Heather also lectured in geography for the School and even took some survey students away to collect some of the data used in this lesson.
Please read the following articles to find out more about Heather's research:
- Life was not a walkabout for Victoria's Aborigines
- Aborigines may have farmed eels, built huts
For this exercise you will need four shapefiles (courtesy of Dr. Heather Builth):
Download these as a compressed zip file: exercise_data.zip
The shapefile house.shp defines the location of the centres of stone circles over the whole study area. The shapefile called bdy.shp defines the actual study area survey boundary. The shapefile called water.shp defines the extent of the floodplain. The shapefile rocks.shp defines individual rocks that were surveyed across a small area within the main site. An image of the rocks can be seen above-right.
Further pattern recognition analysis has lent support to a hypothesis that the stone clusters are culturally altered (unnatural) and circular in nature. The archaeological interpretation is that these rocks are possibly Aboriginal stone shelters or eel storage areas.
You will need several different software applications for this exercise, including:
- ESRI (2008) ArcGIS version 9.x. Environmental Systems Research Institute, Redlands, CA.
- available on campus and provided on DVD
- Beyer, H. (2007) Hawth's Analysis Tools for ArcGIS.
- download directly from http://www.spatialecology.com/htools/download.php
- Levine, N. (2004) CrimeStat: A Spatial Statistics Program for the Analysis of Crime Incident Locations (v 3.0). Ned Levine & Associates, Houston, TX, and the National Institute of Justice, Washington, DC. May.
- Sawada, M. (2001) Nearest Neighbor Program (VBA Macro) for ArcGIS.
Please ensure all these programs have been downloaded and installed on the PC you are using for this exercise.
For this exercise you will need to establish a working directory. Create a folder on your computer named NNA (e.g. C:\NNA). Create two subfolders, one for Data and another for Software. Ensure all software has been downloaded into the Software folder and all data downloaded into the Data folder. If any downloaded files are zipped, then unzip them directly into the relevant folders.
Note that you will always get better performance if you use a local disk such as the C: drive rather than an external hard drive or USB flash drive. It is preferable not to use a flash drive for this lesson. If you wish, you can always copy the working directory to your USB flash drive for transfer to another computer or for backup.
As you work through this exercise, you must ensure that all files created and results obtained are saved into your working directory.
You are required to record all results. Produce a report in scientific journal format. This includes an Introduction, Methods, Results and Discussion. The focus of your report should be on your results and discussion.
The introduction should include a brief paragraph on NNA, with aims of this exercise clearly summarised. Methods should be kept brief, making reference to the software applications used and options chosen. Your results section should just include figures and statistical results, and include simple figures of the two study sites from ArcMap.
Your discussion needs to be approximately 3/4 of a page, and include a critical reflection of the techniques used. In your discussion consider the following:
- boundary effects
- size and shape of the study area
- projections, datums
- measurement error of the phenomena being analysed
- application and usefulness of NNA and Kth Order NNA
Launch ArcMap and remove all toolbars except Main menu, Standard and Hawth's Tools. Add all four shapefiles. Within the Table of Contents (TOC) ensure that rocks is above house, which is above water, which is above bdy. Save your ArcMap session to your working directory (e.g. C:\NNA) and name it NNA.mxd.
For each shapefile, change the symbology as follows:
- rocks.shp - Symbol = Circle 1; Fill Colour = Cherrywood Brown; Size = 5.00
- house.shp - Symbol = Circle 1; Fill Colour = Leaf Green; Size = 5.00
- water.shp - Fill Colour = Yogo Blue; Outline Width = 0.00; Outline Colour = No Colour
- bdy.shp - Fill Colour = No Colour; Outline Width = 2.00; Outline Colour = Tuscan Red
Hit the Save button again (or Ctrl + S).
Your ArcMap window should look like the following:
To perform NNA in ArcGIS you will need to install a VBA macro into the normal.mxt document template. This is described in detail in the PDF that accompanies the VBA macro. Read this carefully and install the routine.
NOTE: There is an omission on Pages 6-7 where you are required to load the *.FRM file. The file is named frmNN (not frmLayerInfo as indicated). Also, on Page 11 there is a code reference to frmLayerInfo. Change this to frmNN also.
Use the VBA macro to calculate the NNI for house.shp (the "events") within the bdy.shp study area. Shown below are the input parameters. You should also see that there is an option to use an automatic polygon extent, with or without applying a buffer. We will come back to this later.
You will see in the above figure that the option to Add NN distances & OIDs to feature table is checked. This will add the calculated distance from each event (house point) to its nearest neighbour to the attribute table, and give each event an Object ID (OID) so that the user can reference the points.
Record your results. Tip: you can opt to save the results to a text file, which you can then copy and paste from into your report.
Open the Attribute table for house.shp and view the results of the NN distances. Close the Attribute table.
Repeat the Nearest Neighbour statistic for rocks.shp, but first you will need to create a Minimum Convex Polygon (MCP) around the sample data. An effective tool for calculating an MCP around a point data set is found in Hawth's Tools.
Zoom to the full extent of rocks.shp (in the TOC, right-click and select Zoom to Layer). From the HawthsTools drop down menu, select Animal Movements, then Create Minimum Convex Polygons (shown below):
The Point locations layer should be set to rocks, and the output will need to go into your working directory, with the output shapefile named rocks_mcp.shp
Click OK when done.
A Minimum Convex Polygon will then be created and added to your map window, which should look very similar to that shown below:
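A Minimum Convex Polygon is simply the convex hull of the points. The sketch below is not how Hawth's Tools implements it; it is a minimal illustration, with hypothetical function names, of how a hull (Andrew's monotone chain algorithm) and its area (shoelace formula) could be derived from raw coordinates. The area is what feeds into the expected nearest neighbour distance.

```python
def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b turns counter-clockwise
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Made-up points: four corners of a unit square plus one interior point
sample = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]
hull = convex_hull(sample)
print(hull, polygon_area(hull))
```

Because the hull hugs the outermost points, it gives the smallest convex study area that contains the data, which tends to push the NNI upwards compared with a more generous boundary.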
Save your ArcMap session.
Run the NNA macro for rocks.shp, using the MCP as the boundary layer.
Record your results.
Run the NNA macro again, but this time use the Automatic Polygon Extent option with a Convex Hull and a buffer of 1.5 metres. The parameters and results are shown below:
Record your results.
In this part of the exercise you will repeat the NNA calculations for the two examples in Part 1 above. The purpose of this is two-fold. Firstly, CrimeStat III is an excellent freeware spatial statistics package, and it is important to be familiar with alternative applications. Secondly, it is often assumed that software is always correct. Repeating the exercise in CrimeStat III should give the same results, and it is good practice to check results in an alternative application where possible.
If CrimeStat has been properly unzipped into your software directory, you should see a list of files as shown below:
Double-click on crimestat.exe to launch CrimeStat III. Accept the license. A "splash screen" appears. You can click on this to close it and see the CrimeStat application. The CrimeStat application is shown below:
Click on the Select Files button and set the Type: to shapefiles. For the Name: you will need to navigate to your working directory and load house.shp.
Next, you need to specify the fields containing the X-coordinates and Y-coordinates, as shown below.
Also, in the above figure, the Type of coordinate system must be set to Projected (Euclidean) and the Data units set to Meters.
Next, hit the Spatial description tab (the green one!) and the Distance Analysis I tab. Check the box next to Nearest neighbour analysis (Nna). Here you leave the number of neighbours to be computed as 1, and leave the border correction as None.
When ready, hit the Compute button. Record all results.
Your results should look something like:
Sample size........: 100
Measurement type...: Direct
Start time.........: 11:44:37 PM, 11/17/2008
Mean Nearest Neighbor Distance ..: 10.73 m
Standard Dev of Nearest
Neighbor Distance ...............: 17.95 m
Minimum Distance ................: 0.00 m
Maximum Distance ................: 1304.53 m
Based on Bounding Rectangle:
Area ............................: 865252.00 sq m
Mean Random Distance ............: 46.51 m
Mean Dispersed Distance .........: 99.95 m
Nearest Neighbor Index ..........: 0.2306
Standard Error ..................: 2.43 m
Test Statistic (Z) ..............: -14.7183
p-value (one tail) ..............: 0.0001
p-value (two tail) ..............: 0.0001
Order   Mean Nearest Neighbor Distance (m)   Expected Nearest Neighbor Distance (m)   Nearest Neighbor Index
***** ********************* ********************* **************
1 10.7271 46.5095 0.23064 <-----
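The figures in this output can be checked by hand. The sketch below takes only the sample size, the bounding rectangle area and the mean nearest neighbour distance from the listing above, and recomputes the remaining statistics. It assumes the Clark and Evans forms: mean random distance = 0.5 * sqrt(A / N) and standard error of approximately 0.26136 / sqrt(N^2 / A). The variable names are illustrative.

```python
import math

# Values copied from the CrimeStat output above
n = 100              # Sample size
area = 865252.00     # Bounding rectangle area, sq m
mean_nn = 10.7271    # Mean nearest neighbour distance, m

mean_random = 0.5 * math.sqrt(area / n)          # expected distance if random
std_error = 0.26136 / math.sqrt(n ** 2 / area)   # SE of the mean random distance
nni = mean_nn / mean_random                      # nearest neighbour index
z = (mean_nn - mean_random) / std_error          # test statistic

print(f"Mean random distance : {mean_random:.2f} m")
print(f"Standard error       : {std_error:.2f} m")
print(f"Nearest neighbour idx: {nni:.4f}")
print(f"Test statistic (Z)   : {z:.4f}")
```

The recomputed values should agree with the reported 46.51 m, 2.43 m, 0.2306 and -14.7183 to within rounding. Note that the area used here is the bounding rectangle, not the bdy.shp polygon, which is one reason results can differ between applications.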
Now repeat this process for rocks.shp. Again, record all results.
An extension of the NNI is the Kth-Order Nearest Neighbour Index, which compares not only the average distance to the nearest neighbour with an expected random distance, but also the distance between any event and its second, third, ..., Kth nearest neighbour. The NNI(K) is the ratio of the observed Kth nearest neighbour distance d(KNN) to the Kth mean random distance d(Kran), which is calculated as follows:
$$d(K_{ran}) = \frac{K\,(2K)!}{\left(2^{K} K!\right)^{2} \sqrt{N / A}}$$
and the index is:
$$NNI(K) = \frac{d(K_{NN})}{d(K_{ran})}$$
This index can be useful for understanding overall spatial distributions and for comparing different distributions. For example, by comparing the values of the index up to, say, the 50th order for two distributions A and B and plotting them on a graph, we might observe that both distributions are more concentrated than a random distribution (whose index is always 1) at each of the 50 orders, and that the events of distribution B are more concentrated than those of distribution A.
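As a rough illustration of the Kth mean random distance, the sketch below evaluates d(Kran) = K(2K)! / ((2^K K!)^2 sqrt(N / A)) for a given order. The function names are hypothetical, and the N and A values are taken from the house.shp CrimeStat output in Part 2 (100 points, bounding rectangle of 865252 sq m).

```python
import math


def expected_kth_distance(k, n, area):
    """Mean random distance to the Kth nearest neighbour.

    Hypothetical helper: n is the number of points, area the study area.
    """
    # Python integers are exact, so the factorials do not overflow
    coefficient = k * math.factorial(2 * k) / (2 ** k * math.factorial(k)) ** 2
    return coefficient / math.sqrt(n / area)


def kth_order_index(observed, k, n, area):
    """Ratio of the observed Kth NN distance to the Kth mean random distance."""
    return observed / expected_kth_distance(k, n, area)


n, area = 100, 865252.0
for k in (1, 2, 3):
    print(k, round(expected_kth_distance(k, n, area), 4))
```

For K = 1 the coefficient reduces to 0.5, so the first-order case is the ordinary NNI. With the values above, the first three orders should come out close to the 46.5095, 69.7642 and 87.2052 m that CrimeStat reports for house.shp.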
Now repeat the computation for Nearest Neighbour Analysis for house.shp, however this time we will compute the NN Index to the Nth order, that is, for all orders. If you return to ArcMap and open the Attribute table for house.shp you will see 100 records, which correspond to 100 points in your map window. So for the number of nearest neighbours to be computed, enter 100.
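Hit the Compute button. Your results will include the information produced above; however, you now have the NNI for all orders. Note that the final row (order 100) cannot be computed, because each of the 100 points has only 99 neighbours, so CrimeStat reports 1.#INF for it.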
Order   Mean Nearest Neighbor Distance (m)   Expected Nearest Neighbor Distance (m)   Nearest Neighbor Index
***** ********************* ********************* **************
1 10.7271 46.5095 0.23064
2 17.9109 69.7642 0.25673
3 25.7180 87.2052 0.29491
4 34.0131 101.7394 0.33432
5 43.3604 114.4569 0.37884
6 53.8606 125.9026 0.42780
7 60.5486 136.3944 0.44392
8 74.0338 146.1369 0.50661
9 80.1829 155.2705 0.51641
10 91.0217 163.8966 0.55536
11 100.5076 172.0914 0.58404
12 104.5308 179.9138 0.58101
13 108.4723 187.4102 0.57880
14 122.0274 194.6183 0.62701
15 125.7757 201.5689 0.62398
16 130.1702 208.2879 0.62495
17 138.1151 214.7969 0.64300
18 140.8884 221.1144 0.63717
19 145.2627 227.2565 0.63920
20 148.9154 233.2369 0.63847
21 152.3298 239.0678 0.63718
22 159.1417 244.7599 0.65019
23 163.9953 250.3227 0.65514
24 174.7052 255.7645 0.68307
25 182.8329 261.0929 0.70026
26 185.9819 266.3147 0.69835
27 205.2012 271.4362 0.75598
28 213.2770 276.4628 0.77145
29 223.3628 281.3996 0.79376
30 228.6277 286.2513 0.79870
31 231.0114 291.0222 0.79379
32 239.3908 295.7161 0.80953
33 244.2553 300.3367 0.81327
34 266.8836 304.8872 0.87535
35 282.1031 309.3709 0.91186
36 287.1416 313.7904 0.91507
37 291.8088 318.1486 0.91721
38 302.0245 322.4479 0.93666
39 316.4744 326.6907 0.96873
40 324.2783 330.8790 0.98005
41 388.2331 335.0150 1.15885
42 395.9717 339.1006 1.16771
43 402.3153 343.1375 1.17246
44 408.7465 347.1274 1.17751
45 411.3775 351.0721 1.17178
46 417.0142 354.9729 1.17478
47 419.7722 358.8313 1.16983
48 424.4844 362.6486 1.17051
49 463.7489 366.4262 1.26560
50 475.1212 370.1653 1.28354
51 525.3238 373.8669 1.40511
52 537.1146 377.5323 1.42270
53 539.3531 381.1624 1.41502
54 542.6058 384.7583 1.41025
55 544.1922 388.3208 1.40140
56 545.8260 391.8510 1.39294
57 549.4378 395.3497 1.38975
58 554.2567 398.8177 1.38975
59 597.8323 402.2558 1.48620
60 604.8910 405.6647 1.49111
61 626.7347 409.0452 1.53219
62 634.8238 412.3981 1.53935
63 642.4786 415.7239 1.54545
64 645.7652 419.0233 1.54112
65 716.9444 422.2969 1.69773
66 723.9547 425.5453 1.70124
67 727.8668 428.7691 1.69757
68 738.0152 431.9689 1.70849
69 744.3959 435.1452 1.71068
70 747.9839 438.2984 1.70656
71 759.6643 441.4291 1.72092
72 763.6157 444.5377 1.71777
73 815.1231 447.6248 1.82100
74 818.8826 450.6907 1.81695
75 821.5413 453.7359 1.81062
76 822.2031 456.7608 1.80007
77 824.5873 459.7659 1.79349
78 836.9893 462.7513 1.80872
79 840.0427 465.7177 1.80376
80 844.4218 468.6653 1.80176
81 847.2045 471.5944 1.79647
82 850.1759 474.5055 1.79171
83 861.1361 477.3988 1.80381
84 862.4848 480.2747 1.79582
85 863.6894 483.1335 1.78768
86 865.2495 485.9755 1.78044
87 871.9266 488.8009 1.78381
88 875.7699 491.6101 1.78143
89 890.6080 494.4034 1.80138
90 895.9462 497.1809 1.80205
91 898.2094 499.9430 1.79662
92 900.1293 502.6900 1.79063
93 936.5883 505.4220 1.85308
94 943.8823 508.1393 1.85753
95 957.1726 510.8422 1.87372
96 963.9898 513.5308 1.87718
97 969.2317 516.2054 1.87761
98 983.4022 518.8663 1.89529
99 1028.2139 521.5136 1.97160
100 1.#INF -1.#IND 1.#INF0
Copy and paste your results into Notepad. Delete the column headings (you will add them again in Excel). Launch Microsoft Excel, import and format the data, and plot the NNI (refer to the figure below). Label your chart axes and add a title for inclusion in your report.
Repeat for rocks.shp.
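If you would rather not tidy the pasted text by hand, the rows can also be extracted with a few lines of Python. The sketch below is one possible approach, not part of the required workflow: the function name parse_knn_table is hypothetical, and it assumes each data row consists of the order followed by three numeric columns, as in the output above. Heading lines and the uncomputable final row (1.#INF) are skipped because they do not parse as numbers.

```python
def parse_knn_table(text):
    """Return a list of (order, observed, expected, index) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue
        try:
            order = int(parts[0])
            observed, expected, index = (float(p) for p in parts[1:4])
        except ValueError:
            # Column headings, separator lines and 1.#INF rows land here
            continue
        rows.append((order, observed, expected, index))
    return rows


# A few lines copied from the house.shp output above
pasted = """Order Neighbor Distance (m) Neighbor Distance (m) Neighbor Index
***** ********************* ********************* **************
1 10.7271 46.5095 0.23064
2 17.9109 69.7642 0.25673
100 1.#INF -1.#IND 1.#INF0
"""
rows = parse_knn_table(pasted)
print(rows)
```

The resulting list can be written to a CSV file for Excel or plotted directly, with the order on the horizontal axis and the index on the vertical axis.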