Introduction
For a state department of tourism, it’s important to understand how tourism potential varies across space. This knowledge allows public incentives to be directed to different areas according to their specific potential. In Wisconsin, a region known as Up North is assumed to have high tourism potential. This project analyzes whether the northern counties of Wisconsin genuinely stand out from the southern counties.
A large dataset was provided, from which a few variables were chosen for analysis. As a preliminary examination, the study focused on the capacity to house tourists, so the variables selected were the number of campsites, hotel beds and seasonal homes.
To distinguish the counties, Highway 29 was used as the boundary between north and south. With that division, a Chi-Square test on these variables across the state can indicate whether their distribution is related to a county being in the north or in the south.
Methodology
The first step was to use ArcGIS to divide Wisconsin into north and south. For that, county and highway data were imported from the ESRI database. To work only with Wisconsin counties, a query was run on the State field, and the selection was used to create a feature class containing only Wisconsin counties. The feature class was re-projected to the Wisconsin Transverse Mercator (NAD1983) to minimize projection distortion.
The U.S. highways were then added over this feature class. Another query selected only Highway 29, which was used to create a simple layer for visualizing which counties lie north or south of it. It was not necessary to create a feature class for this highway because it was only needed temporarily.
After creating a field to hold the position of each county (north or south), an editing session was needed to fill in the new attribute. By selecting the counties located north of the highway, it was possible to give them the code “1” all at once; the counties in the south were coded in the same way. With that, Wisconsin was divided into North and South (Figure 1).
Figure 1 – Northern and Southern Wisconsin
A large database was provided with extensive tourism information about each county, and its look-up table was essential for identifying which fields were most relevant to the matter. As said before, the decision to deal with housing focused the approach on seasonal homes, hotel beds and campsites. To map these variables, the stand-alone table containing this information had to be joined to the counties feature class.
First, since the table contained a lot of unnecessary data, all the fields were hidden in the table’s field properties, keeping visible only the name of the county and the selected variables. The simplified table helped keep the procedure organized. Subsequently, three new fields were created to hold the category each variable would fall into; for simplicity, the codes 1, 2, 3 and 4 were used (Figure 2).
Figure 2 – Simplified table to maintain organization
These categories are necessary for the statistical analysis. Different classification methods could be used to define them: natural breaks (the default), standard deviation, equal interval and quantile. In this specific case, the equal interval method was used to symbolize the features on a map. With these intervals mapped, it was possible to select the counties in each category and fill in the new fields, following a procedure similar to the north-south classification mentioned before.
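As a rough illustration of how an equal interval classification turns raw counts into the codes 1 to 4, the sketch below splits the range between the minimum and maximum into four classes of equal width. The function names and the hotel-bed counts are hypothetical, and the handling of values that fall exactly on a class limit is an assumption that may differ from what ArcGIS does.

```python
def equal_interval_breaks(values, k=4):
    """Return the k upper class limits of an equal-interval classification."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k + 1)]


def classify(value, breaks):
    """Return the 1-based category code of a value, given upper class limits."""
    for code, upper in enumerate(breaks, start=1):
        if value <= upper:
            return code
    return len(breaks)  # guard against floating-point rounding at the maximum


# Hypothetical hotel-bed counts per county, for illustration only.
beds = [0, 120, 300, 450, 900, 1500, 4000]
breaks = equal_interval_breaks(beds)
codes = [classify(v, breaks) for v in beds]
print(breaks)
print(codes)
```

With skewed counts like these, most counties land in category 1, which is the behavior of the equal interval method that matters when reading the maps.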
Once the table was organized and classified, it was exported as a .dbf file so that it could be analyzed in SPSS. Then, it was possible to start the hypothesis testing using the Chi-Square method. This method compares the observed distribution of a sample with the distribution that would be expected. For that, two hypotheses are stated: the null hypothesis and the alternative hypothesis.
The null hypothesis is that the observed frequencies fit the expected distribution, so there’s no significant difference between them; any deviation is random and happens by chance. The alternative hypothesis is the opposite scenario: the observed frequencies don’t fit the expected distribution, so there’s a significant difference between them that is not random and does not happen by chance.
Therefore, the main procedure in this exercise is to test these hypotheses for each variable. SPSS makes the calculations, using the default confidence level of 95% (a significance level of 0.05).
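The calculation behind such a test can be sketched in a few lines. The example below is not the SPSS procedure itself: it is a minimal stand-in that computes the Pearson Chi-Square statistic for a north/south by category table and compares it with the standard 0.05 critical value, which leads to the same decision as comparing the p-value with 0.05. The function name and the county counts are hypothetical, and the sketch assumes every row and column total is greater than zero.

```python
def chi_square_independence(observed):
    """Pearson Chi-Square statistic, degrees of freedom and expected counts
    for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Expected count of a cell = row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof, expected


# Standard Chi-Square critical values at the 0.05 significance level.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}

# Hypothetical counts of counties per category (1 to 4), for illustration only.
north = [20, 8, 4, 4]
south = [30, 4, 1, 1]

stat, dof, expected = chi_square_independence([north, south])
reject_null = stat > CRITICAL_05[dof]
print(round(stat, 3), dof, reject_null)
```

A statistic above the critical value corresponds to a p-value below 0.05, in which case the null hypothesis is rejected.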
Results
The first variable analyzed was the number of hotel beds per county. A preliminary interpretation of a map symbolized by quantities shows that the north is not prominent in the number of hotel beds. On the contrary, the counties with the highest numbers are located in the southern portion of Wisconsin (Figure 3).
Figure 3 – Distribution of hotel beds in Wisconsin
It’s important to note that the visualization is affected by the classification method used. The equal interval method commonly places more elements in the lowest categories instead of producing a more even spread, as a quantile method would, where each category has the same number of occurrences. Hence, it’s important that the analysis not rest simply on the interpretation of one map in a specific classification method. The map in this case works as a preliminary view of the distribution, whose general pattern would be recognizable under any of the methods. Then, the application of the Chi-Square test (Table 1) can support the idea seen in the map or show a possible distortion.
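The effect of the classification method can be seen on a small, skewed set of values. The sketch below is a simplified illustration with hypothetical counts and hypothetical function names; the quantile version is a plain rank-based split that may separate tied values, which is an assumption and not necessarily how ArcGIS builds its quantile classes.

```python
import math
from collections import Counter


def equal_interval_codes(values, k=4):
    """Assign codes 1..k using classes of equal width between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [1 for _ in values]
    return [min(k, max(1, math.ceil((v - lo) / width))) for v in values]


def quantile_codes(values, k=4):
    """Assign codes 1..k so that each class holds (nearly) the same number of values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    codes = [0] * len(values)
    for rank, i in enumerate(order):
        codes[i] = rank * k // len(values) + 1
    return codes


# Hypothetical, strongly skewed counts, for illustration only.
values = [10, 20, 30, 40, 60, 80, 200, 1000]
print(Counter(equal_interval_codes(values)))
print(Counter(quantile_codes(values)))
```

On data like these, a single large value stretches the equal interval classes, so almost every element falls in category 1, while the quantile split fills all four categories evenly.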
Table 1 – Chi-Square Test for Hotel Beds
Analyzing the Chi-Square results, the p-value of 0.676 is far higher than the significance level of 0.05, so the test fails to reject the null hypothesis: the observed distribution of hotel beds doesn’t differ significantly from the expected distribution. The segregation between north and south is therefore not supported for this variable. The differences are consistent with chance and have no significant relation with the categories 1 and 2 (north and south). On that account, the statistical result confirms the idea interpreted from the map: the north is not prominent in this variable. However, the results show that the south isn’t prominent either; based on the Chi-Square result, the two regions are statistically similar in this respect.
A similar result is observed when analyzing the distribution of campsites (Figure 4). The southern counties appear in higher categories than the northern counties, and the same caveat about the classification method applies. In this case, however, more counties in both the north and the south fall at least in the second category, so the variation is greater than for hotel beds.
Figure 4 – Distribution of Campsites in Wisconsin
The apparent lack of difference between the distributions in north and south is confirmed by the results of the Chi-Square test (Table 2). The p-value, 0.637, is again far higher than the significance level of 0.05, so the test fails to reject the null hypothesis. Again, the differences are consistent with chance: there’s no significant difference between the observed and expected frequencies of campsites. Neither southern nor northern Wisconsin has a notable distribution of campsites when compared with the other.
Table 2 – Chi-Square Testing for Campsites in Wisconsin
Lastly, the number of seasonal homes finally presents some difference between north and south when mapped in the four categories (Figure 5). Almost all the southern counties are in the lowest category, while ten counties from the north appear in the upper categories. However, as said before, visualizing a variable in one specific classification method is not enough to guarantee a real difference.
Figure 5 – Distribution of Seasonal Homes in Wisconsin.
The results of the Chi-Square test are then analyzed to confirm the interpretation of the mapped distribution (Table 3). Unlike the other two variables analyzed in this project, the p-value in this case is very low: 0.002. Because it is well below the significance level of 0.05, the null hypothesis can be rejected.
Consequently, there’s a significant difference between the observed and expected distributions of seasonal homes, one that doesn’t occur by chance. However, this result only means that the north-south classification is related to the number of seasonal homes; it doesn’t say what that relation is. Hence, it’s necessary to compare the observed count and the expected count. For position 1 (North), the expected counts are higher in the lower categories (fewer seasonal homes) and lower in the higher categories (more seasonal homes). The observed counts show the opposite, which is why it can be determined that the northern region has a higher frequency of seasonal homes than expected in comparison with the south. For this last variable, the map illustrates well the results obtained by the statistical test: the concentration of seasonal homes in northern Wisconsin.
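Reading the direction of a significant result amounts to comparing each observed count with its expected count. The sketch below shows that comparison on hypothetical numbers (they are not the values from Table 3); the function name is also hypothetical. A negative difference means fewer counties than expected in a category, and a positive difference means more.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


# Hypothetical counts of counties per seasonal-home category
# (1 = fewest homes, 4 = most homes), for illustration only.
table = [
    [10, 6, 2, 2],  # position 1, north
    [50, 6, 2, 2],  # position 2, south
]

expected = expected_counts(table)
north_diff = [o - e for o, e in zip(table[0], expected[0])]
print(expected[0])
print(north_diff)
```

In this made-up table the north has fewer counties than expected in the lowest category and more than expected in the higher ones, which is the kind of pattern that points to a concentration of seasonal homes in the north.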
Conclusion
Gathering all the results obtained in this project, it’s possible to affirm that, in terms of housing, hotel and camping accommodations are not the strength of Up North tourism. For two of the three variables the test failed to reject the null hypothesis, but that doesn't mean that Up North lacks tourism potential; the choice of the variables needs to be considered. Since the last variable, seasonal homes, departed strongly from the expected frequencies, it’s possible to say that the accommodation resources of Up North are not simply ordinary and unremarkable in comparison with the rest of the state. In other words, within the accommodation resources, seasonal homes are much more prominent than other options such as hotels and camping.
The reasons for that may lie in a predominance of recurring tourism, considering that seasonal homes are more stable than campsites and hotels: it’s generally the same families who visit Up North on specific occasions, rather than random tourists from everywhere who don’t necessarily have a relation with the area. It can also be suggested that the lower temperatures of the northern area discourage camping, which would explain why it’s not prominent. However, nothing in this project demonstrates these relationships; they are only possible causes for the results found.
It’s also important to remember that this project only analyzed variables related to housing for tourists, so a deeper analysis with more diverse variables would be needed to better characterize the whole concept of Up North.









