
GIS Analyses

Data Preparation:
     Climate data processing
     Projecting raster data
     Projecting vector data
     Resampling
     Creating a slope layer
     Creating attribute tables

Data Analysis:
     Generate a random sample
     Retrieve X,Y coordinates
     Extract data from random sample
     Create the population density model
     Run population model with future climate data

 

 
Data Preparation
 
Climate data processing
 

Accessing Data:

Data are available at the WorldClim website: http://www.worldclim.org/download (Hijmans et al. 2005). Data are provided for three categories: past, current, and future.  We want to first explore the available future scenarios.  If you click on "future" you will find three models:  CCCMA, HADCM3, and CSIRO.  For each model, two emission scenarios (a2a, b2a) and three years (2020, 2050, 2080) are available.  The resolution varies from 30 arc-seconds (finest scale) to 10 arc-minutes (very coarse scale).  The three models differ in structure but produce similar climate predictions.  Researchers frequently use all three, but we will be using the Hadley model (HADCM3).

 

Future Climate Data:*

*Data files for future scenarios are large and include coverage for the entire globe.  Make sure adequate space exists to work with climate data.

 

1.) On the WorldClim website, navigate to the download page and click on the link for “Future conditions.”

2.) Download the zip file for the desired projected year, variable (tmin, tmax, precip), and emission scenario of interest (e.g. 2020, tmin, a2a).

 

 
3.) Extract zipped files to appropriate folder to work with later.  Note: extracted files will have .bil extensions at this point.
4.) WorldClim data is in geographic (lat/long) coordinates, WGS 1984.  You must define this projection for your newly downloaded layer.
          a.) Open ArcToolbox
          b.) Data Management Tools --> Projections and Transformations --> Define Projection. Navigate to the file of interest by clicking on the folder button to the right of the “Input Data Set or Feature Class” input.
          c.) Select the data file of interest and click ‘Add.’
          d.) Select the coordinate system. Choose “Select a predefined coordinate system.” Select Geographic Coordinate Systems --> World --> WGS1984.prj --> Apply --> OK
 

Unfortunately, files with BIL extensions are not very compatible with ESRI products, so they must be converted to ASCII files before we can work with them directly.  There are several ways to do this; the easiest is DIVA-GIS, a freeware program available for download at http://www.diva-gis.org/.  The following instructions demonstrate how to convert BIL files to ASCII using DIVA-GIS.  It is important that the previous steps occur before conversion with DIVA-GIS, because header files tend to get lost in the conversion, and as a result the latitude/longitude data will also be lost for all converted raster files.

 
5.) Open DIVA-GIS.  Go to Data --> Import Gridfile --> Multiple files. Click on “BIL, BIP, BSQ” and Apply.
 
6.) Next, export the data. Go to Data --> Export Gridfile --> Multiple files. Click on “Arc ASCII” and Apply.
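
For reference, an Arc ASCII grid is a plain-text file: six header lines (ncols, nrows, xllcorner, yllcorner, cellsize, NODATA_value) followed by the cell values, one row per line from north to south. The Python sketch below reads such a file from text to show where the georeferencing lives. The sample grid and the function name parse_ascii_grid are made up for illustration and are not part of DIVA-GIS or WorldClim.

```python
def parse_ascii_grid(text):
    """Parse Arc ASCII grid text into (header dict, list of rows)."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = {}
    # The first six lines are "keyword value" pairs.
    for ln in lines[:6]:
        key, value = ln.split()
        header[key.lower()] = float(value)
    rows = [[float(v) for v in ln.split()] for ln in lines[6:]]
    # Sanity check: the data block must match the declared dimensions.
    if len(rows) != int(header["nrows"]):
        raise ValueError("row count does not match header")
    if any(len(r) != int(header["ncols"]) for r in rows):
        raise ValueError("column count does not match header")
    return header, rows


# Hypothetical 3 x 2 grid in geographic coordinates (degrees).
SAMPLE = """ncols 3
nrows 2
xllcorner 33.0
yllcorner 3.0
cellsize 0.5
NODATA_value -9999
101 102 -9999
98 97 95
"""

header, rows = parse_ascii_grid(SAMPLE)
# The header stores the lower-left corner; the top edge is derived from it.
top_edge = header["yllcorner"] + header["nrows"] * header["cellsize"]
```

If the corner coordinates and cell size in the header are missing or wrong, the values themselves are still readable but can no longer be placed on the map, which is why the projection is defined before conversion.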
 

Now the data are ready to be projected to UTM coordinates.

 
7.) Open ArcCatalog.  Go to Data Management Tools --> Projections and Transformations --> Raster --> Project Raster --> Select a predefined coordinate system --> Projected Coordinate Systems --> WGS 1984 --> UTM Zone 37N.prj.
8.) Repeat step 7 for all converted raster files. (This process will also later be repeated for the population and elevation rasters.)
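
As background on the choice of UTM Zone 37N, the standard UTM grid divides the globe into 60 zones, each 6 degrees of longitude wide. The short Python sketch below computes the zone for a longitude using that rule (it ignores the special zones around Norway and Svalbard). The longitude used for Addis Ababa is an approximate value chosen for illustration.

```python
import math


def utm_zone(longitude_deg):
    """Return the UTM zone number (1-60) for a longitude in degrees east."""
    return int(math.floor((longitude_deg + 180.0) / 6.0)) % 60 + 1


def central_meridian(zone):
    """Return the central meridian (degrees east) of a UTM zone."""
    return zone * 6 - 183


ADDIS_ABABA_LON = 38.74  # approximate longitude, for illustration only
zone = utm_zone(ADDIS_ABABA_LON)
meridian = central_meridian(zone)
```

A country wider than 6 degrees of longitude extends past the edges of a single zone, so using one zone for the whole study area is a practical compromise that keeps all layers in one coordinate system.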
 

The next step involves clipping climate layers to the boundaries of Ethiopia. Political boundary layers are readily available on the web. Before clipping to a political boundary, however, the boundary layer must have its coordinate system defined and be projected to the same coordinate system as the climate layers above.

 
9.) Open ArcCatalog.  Go to Data Management Tools --> Raster --> Clip.  Navigate to the raster layer to be clipped under “Input Raster.”  Navigate to the layer the raster file is being clipped to under “Output Extent.”  Click on “Use Input Features for Clipping Geometry.”  Click OK.
 
10.) Repeat step 9 for all data files converted to ASCII.
 

Depending on the questions you are asking in your research, you may decide that the 19 BIOCLIM variables are appropriate explanatory variables for your study.  These variables can be derived directly from the minimum temperature, maximum temperature, and precipitation variables using an AML script that can be run in ArcInfo.  The script and directions are available at http://www.worldclim.org/bioclim-aml.
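
To illustrate how BIOCLIM variables follow from the monthly layers, the Python sketch below derives five of the 19 variables for a single cell from 12 monthly values of tmin, tmax, and precipitation. It is not the AML script mentioned above; the monthly numbers are invented, and the function name bioclim_subset is hypothetical.

```python
def bioclim_subset(tmin, tmax, prec):
    """Derive a few BIOCLIM variables from 12 monthly values for one cell."""
    tmean = [(lo + hi) / 2.0 for lo, hi in zip(tmin, tmax)]
    bio5 = max(tmax)
    bio6 = min(tmin)
    return {
        "bio1": sum(tmean) / 12.0,  # annual mean temperature
        "bio5": bio5,               # max temperature of warmest month
        "bio6": bio6,               # min temperature of coldest month
        "bio7": bio5 - bio6,        # temperature annual range
        "bio12": sum(prec),         # annual precipitation
    }


# Invented monthly values for one cell (January to December).
tmin = [8, 9, 10, 11, 11, 10, 10, 10, 10, 9, 8, 7]
tmax = [23, 24, 25, 25, 25, 23, 21, 21, 22, 23, 23, 23]
prec = [15, 35, 65, 85, 80, 120, 260, 270, 170, 40, 10, 10]

bio = bioclim_subset(tmin, tmax, prec)
```

In a GIS the same arithmetic is applied cell by cell across the 12 monthly rasters of each variable.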

 
 
 

Project raster files to WGS 1984 UTM Zone 37N

     (Population and Elevation rasters)

 

1.) Open Arc Toolbox.
2.) Data Management Tools --> Projections and Transformations --> Raster --> Project Raster
3.) For "Input Raster," select the raster to be transformed. Name the output raster and navigate to where it will be saved. Open the "Output Coordinate System" menu.
 
4.) The "Spatial Reference Properties" menu will open. Click: Select --> Projected Coordinate Systems --> UTM --> WGS 1984 UTM Zone 37N. Repeat for each raster.
 
Project shape file to WGS 1984 UTM Zone 37N
     (Woreda shape file)
 

1.) Projecting vector data follows essentially the same steps as projecting raster data. Open Arc Toolbox.

2.) Data Management Tools --> Projections and Transformations --> Feature --> Project Feature
3.) For "Input Dataset or Feature Class," select the shape file to be transformed. Name the output file and navigate to where it will be saved. Open the "Output Coordinate System" menu.
4.) Select --> Projected Coordinate Systems --> UTM --> WGS 1984 UTM Zone 37N
 
 
 
Resample to create identical cell sizes for all rasters
 
For later analyses, all rasters must have the same cell size. To achieve equal cell sizes, resample the layers with larger cells to match the cell size of the higher-resolution layer. This creates smaller cells with redundant values, but it prevents data from being lost, as would be the case if smaller cells were averaged into larger cells. For these data, resample the population and climate rasters so that they have the same cell size as the elevation raster. The population and climate resolution will then be higher, but the accuracy of the data will remain the same.
 
1.) Identify the cell sizes for all rasters. For these data, the population and climate layers have cells that are 932.13 x 932.13 m. The elevation layer has cells that are 93.21 x 93.21 m. This means that each cell in the population and climate layers covers approximately 1 km2. Since the population densities are expressed in people per km2, cells of roughly 1 km2 make the original population data easy to interpret.

          a.) In ArcMap, right click on each layer. Go to "Properties" --> "Source," and scroll to cell size.

 
2.) Arc Toolbox --> Data Management Tools --> Raster --> Raster Processing --> Resample
 
 
3.) Identify input raster (population). Name the output raster "pop_nearest" and navigate to where it will be saved.
 
4.) Set output cell size to be the same as the Elevation layer.
 
5.) Select "NEAREST" as the resampling technique.

          a.) Nearest neighbor resampling is appropriate for both discrete and continuous values. For this situation, it is preferable to the other resampling options because the input values are maintained in the output. Because we are going from coarser to finer resolution, we aren't interested in interpolating between points; we want to preserve the values in the original data.

 
6.) Repeat these steps for all climate rasters.
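
The effect of nearest neighbor resampling from coarse to fine cells can be sketched in a few lines of Python. The sketch assumes the two grids are aligned and that the cell-size ratio is a whole number (here 10, which is close to 932.13 / 93.21); under those assumptions each coarse cell is simply split into 10 x 10 fine cells carrying the same value. The function name and the sample values are illustrative only, and this is not the ArcGIS Resample tool itself.

```python
def resample_nearest(grid, factor):
    """Split every cell into factor x factor smaller cells with the same value."""
    out = []
    for row in grid:
        # Repeat each value across the new, narrower columns.
        fine_row = [value for value in row for _ in range(factor)]
        # Repeat the widened row for each of the new, shorter rows.
        out.extend(list(fine_row) for _ in range(factor))
    return out


# Made-up 2 x 2 population density grid (people per km2).
coarse = [[12.0, 30.5],
          [7.25, 0.0]]

fine = resample_nearest(coarse, 10)
coarse_values = {v for row in coarse for v in row}
fine_values = {v for row in fine for v in row}
```

No new values appear in the output, which is the property that makes this method preferable here to interpolating methods.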
 
 
 
Create a slope layer from the Digital Elevation Model
     (DEM raster)
 
Topographic slope may be an important variable contributing to distribution of human settlements in Ethiopia. In order to be able to add slope as a parameter in the population density model, a slope raster must first be derived from the digital elevation model.
 
1.) In ArcMap, enable Spatial Analyst

          a.) Tools --> Extensions --> Enable Spatial Analyst

          b.) View --> Toolbars --> Spatial Analyst

          c.) From the Spatial Analyst dropdown menu, select "Options" and select your working directory
2.) Calculate slope: Spatial Analyst --> Surface Analysis --> Slope
 
          a.) The input surface should be the projected DEM ("dem_utm37n"). Name the output raster "dem_slope" and navigate to where it will be saved.
          b.) Accept defaults. Slope may be measured in either degrees or percent, as long as the same unit is used in every later step of the analysis.
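
To make the slope calculation concrete, the Python sketch below computes slope at the centre of a 3 x 3 elevation window using the finite-difference method usually attributed to Horn, which GIS slope tools commonly use. Whether it reproduces the Spatial Analyst output exactly is an assumption. The window values are invented: a plane that rises by one cell width per cell toward the east.

```python
import math


def horn_slope(window, cell_size):
    """Slope at the centre of a 3x3 elevation window (rows listed north to south).

    Returns a (degrees, percent) tuple.
    """
    (a, b, c), (d, _e, f), (g, h, i) = window
    # Weighted elevation differences in the east-west and north-south directions.
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8.0 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8.0 * cell_size)
    rise_over_run = math.hypot(dz_dx, dz_dy)
    return math.degrees(math.atan(rise_over_run)), rise_over_run * 100.0


CELL = 93.21  # elevation cell size in metres
window = [[0.0, CELL, 2 * CELL],
          [0.0, CELL, 2 * CELL],
          [0.0, CELL, 2 * CELL]]

slope_degrees, slope_percent = horn_slope(window, CELL)
```

The same surface is described by about 45 in degrees and about 100 in percent, so the two units are not interchangeable once a model has been fitted with one of them.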
 
Percent Slope of Ethiopian Land Surface
Created by: Kelly Hopping and Greg Wann, December 5, 2009
Projection: WGS 1984 UTM Zone 37N
Data Source: NR 505, Warner College of Natural Resources, Colorado State University (Q:\nr505\nr505_08\EthiopiaData)
 
 
 
Create an Attribute Table
     (Population raster)
 
The population raster currently has floating point data, but it must have integer data in order to be able to create an attribute table for it.
 
1.) As a preliminary step, the raster settings will need to be adjusted so that decimal places can be retained in long integers.
          a.) In ArcMap --> Tools --> Options --> Raster --> Raster Attribute Table
          b.) Enter 1,000,000,000 where it says "Do not build raster attribute tables when the number of unique values is greater than:"
 

If you are unable to modify your computer’s settings and the original population data is already at the maximum number of digits, go straight to the data analysis steps in the following section.   You will then have to use the “pop_nearest” layer for sampling random points.  This will not affect your results, but it will prevent you from being able to create an attribute table for the population data.

 
2.) In order to avoid losing data when converting to integer format, multiply population data by 10,000 to preserve four decimal places. (Data will be converted back to reasonable population density values at a later stage in the analysis.)

          a.) On the Spatial Analyst toolbar in ArcMap, select "Raster Calculator" from the Spatial Analyst dropdown menu.

                    i.) Enter: [pop_10000] = [pop_nearest] * 10000

                    ii.) Evaluate

 
3.) To convert to integer data: Arc Toolbox --> Spatial Analyst Tools --> Math --> Int
          a.) Select the new "pop_10000" raster as the input raster. Name the output raster "pop_integer" and navigate to where it will be saved.
 
4.) To create an attribute table: Arc Toolbox --> Data Management Tools --> Raster --> Raster Properties --> Build Raster Attribute Table

          a.) Select "pop_integer" as the input raster. An attribute table may now be opened for this layer.

 
5.) At this point it is possible, but not necessary, to convert population data back to reasonable population density values in the attribute table.

          a.) In ArcMap Table of Contents, right click on "pop_integer" and open attribute table.

                    i.) Options --> Add Field: "people_km2"

          b.) Right click on new column heading --> Field Calculator

                    i.) Choose type = "double" (for long numbers); precision = "8" to preserve integer places; scale = "4" to preserve decimal places

                    ii.) * Enter: [Value] / 1000000

 
* Note: This number (1,000,000) comes from multiplying 10,000 by 100 because the original population data were 2 orders of magnitude higher than actual Ethiopian population densities, and then we multiplied them by 10,000 to preserve decimal places. Actual Ethiopian population density numbers were obtained from the Atlas of the Rural Economy, allowing us to deduce that the GIS data had already been multiplied by 100 in the original raster available to us (Atlas 2006). Aside from differences in decimal placement, we have no reason to believe that the population raster data is inaccurate.
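
The scaling arithmetic in this section can be checked with a short Python sketch. The raw cell value is hypothetical; the two scale factors are the ones described above (10,000 applied in step 2, and the factor of 100 already present in the original raster). Truncation stands in for the integer conversion of step 3.

```python
import math

DECIMAL_SCALE = 10_000  # applied in step 2 to keep four decimal places
SOURCE_SCALE = 100      # factor already present in the original raster


def to_integer_cell(raw_value):
    """Mimic steps 2-3: scale up, then truncate to an integer."""
    return math.trunc(raw_value * DECIMAL_SCALE)


def to_people_per_km2(integer_value):
    """Undo both scale factors to recover people per km2."""
    return integer_value / (DECIMAL_SCALE * SOURCE_SCALE)


raw = 1234.5  # hypothetical cell value from the population raster
stored = to_integer_cell(raw)
density = to_people_per_km2(stored)
```

Dividing by 1,000,000 in one step is the same as dividing by 10,000 and then by 100; if the multiplication by 10,000 was skipped, only the division by 100 applies.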
 
 

 

 
Data Analysis
 
Generate Random Sample Points
 
Random sample points will be generated using Hawth's Tools so that data can be obtained for a more manageable number of cells in each raster layer than if every cell were used. To ensure that both high and low population density areas are represented by the random sample, points are stratified by Woreda (Ethiopian political region). The population data seems to have been gathered by Woreda, so the population values do not vary much within each political region. However, climate and elevation will vary within Woredas, so we will select 10 random points per Woreda to capture climatic and topographic variation.
 
1.) Download and install Hawth's Tools, a free ArcGIS extension available online. (Hawth's Tools; http://www.spatialecology.com/htools/download.php)
2.) In ArcMap: Tools --> Extensions --> Enable Hawth's Tools
3.) View --> Toolbars --> Hawth's Tools
4.) On the Hawth's Tools toolbar, click: Sampling Tools --> Generate Random Points
 
          a.) Select "Polygon Layers" and choose projected Woreda layer

          b.) Stratify points by Woreda. Select "ID" for "Polygon unique ID field." Generate 10 points per polygon.

          c.) Name the output shapefile of random points and navigate to where it will be saved.
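
The idea of stratified random sampling (a fixed number of random points inside every polygon) can be sketched in Python as follows. This is an illustration of the concept using rejection sampling and a ray-casting point-in-polygon test; it is not how Hawth's Tools is implemented, and the two "Woreda" polygons are made-up shapes in projected metres.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_points_in_polygon(polygon, count, rng):
    """Draw points in the bounding box and keep those that fall inside."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    points = []
    while len(points) < count:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            points.append((x, y))
    return points


# Two hypothetical polygons keyed by their unique ID.
woredas = {
    1: [(0, 0), (4000, 0), (4000, 3000), (0, 3000)],
    2: [(4000, 0), (9000, 0), (6500, 5000)],
}

rng = random.Random(505)  # fixed seed so the sample is repeatable
sample = {wid: random_points_in_polygon(poly, 10, rng)
          for wid, poly in woredas.items()}
```

Because every polygon receives the same number of points, small and large Woredas are represented equally in the sample regardless of area.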
 
Randomly sampled points, stratified by Woreda
Created by: Kelly Hopping and Greg Wann, December 3, 2009
Projection: WGS 1984 UTM Zone 37N
Data Source: NR 505, Warner College of Natural Resources, Colorado State University (Q:\nr505\nr505_08\EthiopiaData)
 
 
 
 
Retrieve X,Y coordinates for random points
 
In order to locate where the randomly generated points fall on each raster layer, X,Y coordinates must be known for each point.
 
1.) In ArcMap, right click on the Random Point shape file and open its attribute table.

          a.) Options --> Add field --> "Point_X"

          b.) Options --> Add field --> "Point_Y"

2.) Arc Toolbox --> Data Management Tools --> Features --> Add XY Coordinates

          a.) Coordinates are automatically added to new fields in attribute table

3.) In the attribute table, right click on "ID" column --> Field Calculator --> [FID] + 1

          a.) This step formats the table for future data analysis in Excel

 
 
 
Extract data from rasters for all random points
 
In order to do statistics on the data from the random sample points, the data must be entered into a worksheet. Before the data can be opened in Excel, it must be extracted from the raster layers.
 
1.) Toolbox --> Spatial Analyst Tools --> Extraction --> Sample

          a.) Add all Current Climate rasters (for variables 1-19), Elevation raster, Slope raster, and Population raster.

          b.) For "input location raster or point features," input the random points shape file

          c.) Select your working directory as the output location for the table of extracted data points.

          d.) Leave all defaults (e.g., "nearest," etc.)
 
 
2.) In ArcCatalog, right click on the table that was created by step 1

          a.) Export --> to dBase (single) --> select a folder as the output location

3.) Name the new database file in ArcCatalog, as it may have lost its file name when it was converted from a table to a .dbf file
4.) In Windows Explorer, open the new database file with Microsoft Excel
5.) Rename the column headings to follow the original raster layer titles, in exactly the order that the rasters were input into the sample extraction in Step 1
6.) If the population data have not been converted back to people/km2, divide the population numbers by 1,000,000 in Excel. (See explanation for this number under "Create an Attribute Table.") If you were unable to multiply the original population data by 10,000, only divide by 100 in this step.
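
Conceptually, extracting a raster value at a point means converting the point's X,Y coordinates into a row and column of the grid. The Python sketch below shows that lookup for a north-up raster with square cells. The grid values, origin, and point coordinates are invented, and the function is an illustration rather than the Sample tool itself.

```python
import math


def sample_raster(grid, x_min, y_max, cell_size, x, y):
    """Return the value of the cell containing (x, y), or None outside the grid."""
    col = math.floor((x - x_min) / cell_size)
    row = math.floor((y_max - y) / cell_size)  # rows count down from the top edge
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


# Hypothetical 2 x 3 raster whose upper-left corner is at (X_MIN, Y_MAX).
grid = [[10, 20, 30],
        [40, 50, 60]]
X_MIN, Y_MAX, CELL = 500000.0, 1000000.0, 93.21

# Hypothetical random points keyed by ID: (Point_X, Point_Y).
points = {1: (500010.0, 999990.0), 2: (500200.0, 999900.0)}

table = {pid: sample_raster(grid, X_MIN, Y_MAX, CELL, x, y)
         for pid, (x, y) in points.items()}
```

Repeating the lookup for every raster gives one row per point and one column per layer, which is the layout of the table exported to Excel.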
 
 
 
Create the population density model
 

A multiple regression approach was chosen to build a population density model for Ethiopia, using a digital elevation layer and climate data (BIOCLIM) as our explanatory variables.  Population density was our response variable.  We chose SAS 9.2 as the statistical package for our analysis, but free statistical software (for example, R) can readily implement our methods.  All steps outlined below assume the use of SAS.

 
1.) Using the MS Excel file created in the above steps (which includes all of the stratified random sample points), eliminate rows where all explanatory variables are zero (these are a result of errors in the BIOCLIM data).  The ‘Sort’ option under ‘Data’ on the toolbar menu will make this step easy, but make sure the entire range of columns is highlighted before any sorting is done.
2.) Place the data into a template appropriate for the statistical software you have selected. For example, SAS and R both consist of editor windows where the code underlying the analysis is entered.  Using a text editing program (such as MS Notepad) to write code makes this process easier.  Examples of the steps that may be taken are outlined below (using SAS as the program example).
3.) The code from the text editor can be copied and pasted directly into the SAS editor, or an infile statement can be used directing SAS to read the data directly from the MS Excel file.
4a.) First run all explanatory variables in a single model (the ‘global model’ = 19 BIOCLIM variables + DEM + slope).  Check for multicollinearity in the explanatory variables using collinearity diagnostics (variance inflation factor).  This can be done in Proc Reg using the ‘/vif collinoint’ option at the end of the model statement.
4b.) Typically a variance inflation factor (VIF) value greater than 10 for a variable is of concern. Consider removing these variables from the model.  Which correlated variables to remove is the modeler's choice and depends on the goals of the analysis.
4c.) Rerun the new global model (which excludes the variables removed in the previous step), this time using ‘/selection=adjrsq best=10 aic’, which provides the best 10 subsets of the global model based on the model selection criteria of AIC and adjusted R squared.
4d.) Consider running the model from step 4c again using a different selection method, such as backward selection (‘/selection = backward sls = 0.05’) or forward selection (‘/selection = forward sle=0.05’).  Which model you settle on is a personal choice, but different model selection methods frequently pick the same model as best.
5.) Examine model output.  Select the model with the lowest AIC value and run this model just as the global model was run in step 4a.
6.) Examine the output from step 5 above.  The coefficients of this model (including slope) can now be applied to data layers in raster calculator (under Spatial Analyst).
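
For readers who want to see what the two diagnostics measure, the Python sketch below computes a variance inflation factor and an AIC value by hand. It is a simplified illustration, not SAS output: the VIF function covers only the two-predictor case, where the R squared of one predictor regressed on the other equals their squared correlation, and the AIC uses the common least-squares form n ln(SSE/n) + 2p, which we assume matches what the software reports. The elevation and temperature values are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2); with two predictors R^2 is the squared correlation."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


def aic_least_squares(sse, n, p):
    """AIC for a least-squares fit with p estimated parameters."""
    return n * math.log(sse / n) + 2 * p


# Invented sample: temperature falls almost linearly with elevation.
elevation = [500, 1000, 1500, 2000, 2500, 3000]
mean_temp = [27.0, 24.2, 21.1, 17.8, 15.3, 12.0]

vif = vif_two_predictors(elevation, mean_temp)
```

With more than two predictors, the R squared in the VIF formula comes from regressing each predictor on all of the others, which is what the software computes for the global model.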
 
 
 
Map future population density in response to climate change
 
Use raster calculator to run the model with future climate data.
 
1.) In ArcMap, insert 2 new data frames: "2020" and "2050." Add the projected DEM, as well as the climate variables used in the final model (from 2020 and 2050, respectively) to each frame. Add the "current" population layer to the 2020 data frame as well.
2.) Activate the 2020 data frame, and select Raster Calculator from the Spatial Analyst Toolbar. Name the new layer "pop_2020" and enter your model equation. Add the current population as a term so that the model will build on the population density that already exists in each cell. This will allow the model to take account of where population density is already high for reasons other than climate, such as in the capital.
 
3.) Repeat step 2 for the 2050 climate variables, but add the new "pop_2020" term instead of current population so that the final population density will take account of the climate-driven population changes that occurred at the intermediate time step.
4.) To interpret how population changes in response to climate over time, use raster calculator to subtract current population from the future population layers. Each cell will then represent the amount by which population density increased or declined.
5.) Step 4 may be repeated for the climate variables to see how they change through time as well.
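
The per-cell arithmetic that Raster Calculator performs in the steps above can be sketched in Python for a single cell. The coefficients, variable names, and cell values below are all hypothetical placeholders; substitute the estimates and variables from your own selected model.

```python
# Hypothetical coefficients; replace with the estimates from the selected model.
COEFFICIENTS = {"intercept": 5.0, "bio1": -0.25, "bio12": 0.002, "slope": -0.125}


def predict_cell(base_population, predictors):
    """Add the model output for one cell to the population already present."""
    model_output = COEFFICIENTS["intercept"] + sum(
        COEFFICIENTS[name] * value for name, value in predictors.items()
    )
    return base_population + model_output


current_pop = 40.0  # hypothetical current density, people per km2

# Hypothetical predictor values for the same cell at each time step.
cell_2020 = {"bio1": 20.0, "bio12": 1000.0, "slope": 8.0}
cell_2050 = {"bio1": 22.0, "bio12": 900.0, "slope": 8.0}

pop_2020 = predict_cell(current_pop, cell_2020)  # builds on current population
pop_2050 = predict_cell(pop_2020, cell_2050)     # builds on the 2020 result
change_2050 = pop_2050 - current_pop             # step 4: future minus current
```

Chaining the time steps this way means that a change predicted for 2020 carries forward into the 2050 layer, as described in step 3.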
 



Updated: December 8, 2009 © 2009 All Rights Reserved.
Colorado State University, Fort Collins, CO 80522 USA