Wednesday, March 20, 2013

GIS II Project ~ Phase 2 Geocoding

Introduction

     If the ultimate goal of this semester-long flirtation with sand mining is to analyze silica sand mining and its effects in the State of Wisconsin, a smart place to begin is to find out where the silica sand mines of Wisconsin are located.  To this end, the goal of this project is to geocode the addresses of as many mines in the state as we can for use in future projects.  Having only one goal may sound like small potatoes, but geocoding can be time-intensive enough that this exercise is no joke.  In order to compile a set of addresses that can be accurately placed on a road network, the class has split into pods of four, and each pod divides up the task of geocoding whatever data can be found on Wisconsin sand mines.  Finally, each individual's geocoding results will be compiled and projected in a single geodatabase, which will act as the base data for this project as it progresses.

fig 1: Silica sand deposits and mines of 2011 in Wisconsin.


Methodology

     The first task in this project, described in the previous exercise, was to explore online data servers that might contain information relevant to the class.  Eventually, the class was given data to download from http://www.wisconsinwatch.org/viz/fracmap/, which provided a list of sand mines in the state along with their locations, owners, and some other relevant information.  In an ideal world, this table could be run through ArcMap's address locator tool and every location would come out with its approximate real-world coordinates attached.  Unfortunately, the table was not in a form the address locator could read, and some mines had no address at all, only a PLSS description or, in some cases, directions.  So a couple of extra steps were needed to geocode the mines.

 
fig 2: the original table. note the address field.

fig 3: the updated table. note how the address data is split into separate fields.
 

     The first step was to have the software geocode as many addresses as possible.  However, this would not work right away because the address field was not stored in a form that the address locator tool could read (fig 2).  To fix this, each group member went through their share of roughly 25 mines and separated the address field into street address, state, and zip code fields (fig 3).  Once this information was properly parsed, the table was fed to the address locator tool in ArcMap, which generated a list of the locations it could find.  On average, this left each group member with slightly more than half of their mines still to be found.
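     The splitting in our group was done by hand, but for a bigger table the same step could be scripted.  Below is a minimal sketch in Python, assuming the raw address field looks like "street, city, ST zip".  That layout, the city field, and the names ADDRESS_RE and split_address are assumptions made up for illustration; the real table was messier than this.

```python
import re

# Hypothetical layout: "<street>, <city>, <ST> <zip>" (an assumption, not the
# actual layout of the class table).
ADDRESS_RE = re.compile(
    r"^\s*(?P<street>.+?),\s*(?P<city>[^,]+?),\s*"
    r"(?P<state>[A-Za-z]{2})\s+(?P<zip>\d{5})(?:-\d{4})?\s*$"
)


def split_address(raw):
    """Split one raw address string into separate fields.

    Returns a dict with street, city, state, and zip, or None when the
    string does not fit the assumed layout (for example a PLSS description
    or written directions), so that record can be handled by hand.
    """
    if not raw:
        return None
    match = ADDRESS_RE.match(raw)
    if match is None:
        return None
    parts = {key: value.strip() for key, value in match.groupdict().items()}
    parts["state"] = parts["state"].upper()
    return parts
```

     Anything that comes back as None is a record the address locator would likely have choked on anyway, so it goes into the pile to be located by other means.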
     Having located several mines, the remainder needed to be geocoded by more creative methods.  Because so many mines were described by PLSS, the next step was to overlay a PLSS dataset on an aerial basemap of the state in ArcMap and visually hunt for these mines one by one.  The remaining mines, if they came with any description of location at all, were found by reading the description given and looking around the aerials provided through ESRI.  A typical description of this caliber would read "between Knapp and Menominee, along highway xx and between yyy road and the rail line," and with such general descriptions many of these mines were not found.
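     To know which section to zoom to in the PLSS layer, a description first has to be broken into its township, range, and section.  We did this by eye, but here is a rough sketch of how it could be pulled out of text in Python.  The "T28N R13W Sec 12" style of writing it, and the names PLSS_RE and parse_plss, are assumptions for illustration; descriptions written in another order (section first, for instance) would not be picked up.

```python
import re

# Hypothetical description style: "T28N R13W Sec 12" (township, range, section).
PLSS_RE = re.compile(
    r"T\.?\s*(?P<township>\d+)\s*(?P<tdir>[NS])\.?,?\s*"
    r"R\.?\s*(?P<range>\d+)\s*(?P<rdir>[EW])\.?,?\s*"
    r"S(?:EC(?:TION)?)?\.?\s*(?P<section>\d+)",
    re.IGNORECASE,
)


def parse_plss(text):
    """Pull township, range, and section out of a PLSS description.

    Returns None when nothing matching the assumed style is found, or when
    the section number falls outside 1-36 (a township has 36 sections).
    """
    match = PLSS_RE.search(text or "")
    if match is None:
        return None
    section = int(match.group("section"))
    if not 1 <= section <= 36:
        return None
    return {
        "township": int(match.group("township")),
        "township_dir": match.group("tdir").upper(),
        "range": int(match.group("range")),
        "range_dir": match.group("rdir").upper(),
        "section": section,
    }
```

     The parsed values only narrow the search to a one-mile-square section; the mine itself still has to be spotted on the aerial.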
     At about this point in the process, a Google Fusion Table of this data was discovered and used to inform the geocoding process.  The owner field of the data was also used to look up the websites of the companies running the silica mines we were attempting to geocode.  However, the fusion table was incomplete and the websites did not provide directions to the mines, so even with collaboration among classmates and the online fusion table many addresses were still not found.
   

Results
     Unfortunately, the group was not able to geocode all ~125 mines in one fell swoop.  However, we are proud to say that we did successfully geocode 75 mines across the state, or roughly 60 percent of the total.

Conclusions

     The first and most noticeable conclusion drawn from this exercise is that geocoding can take a lot of time.  Even simply running the tools was time-consuming, and I was grateful to have other useful tasks to complete while the computer worked in the background.  The geocoder was also somewhat glitchy, as several students had to run the address locator tool multiple times in order to arrive at a useful dataset.
     The feature class developed is incomplete because several mines that were missing addresses could not be found on the orthophotos, and while this dataset will be adequate for our purposes, I would not accept this work in a professional setting.  The data provided and collected simply needs to be of a higher caliber if detailed analysis is going to be completed on a project such as this.  On the other hand, having a messy data set allowed for practice with a number of different methods for locating mines, which could prove useful in the future.
     However, for the purposes of this project it will be adequate to have a
