Data-vis: a sixth sense
Given a rich data set, it is exciting to imagine what kinds of questions to ask.
Statistics and data visualizations are the bicycle for this!
Snakes everywhere!
Southeast Asia is home to many snake species. How many?
While Google Gemini (early 2025 edition) has served me quite well for coding ideas, it whiffed badly on this question, evasively settling on “over 300 species.”
On iNaturalist.org, 726 species are represented as of 2025-Mar-6, an impressive number for a largely citizen-science effort. These come from 89,598 observations by 13,790 observers.
Singapore, a city of nearly six million people on a land area of just 710 km² (8,387 people per km²), hosts 63 of these snake species, which flourish in its array of gutters, sungei, and nature parks.
When is a good time to happen upon a snake in Singapore?
This is a great question to ask, given the iNaturalist data set.
Using specific fields of data queried via Python from the iNaturalist.org API, I dusted off my d3.js to visualize Singapore snake observations according to the time of day, starting from an example that I adapted to a more flexible and dynamic Angular console (private for now):
Looking at these data, it appears that a lot of observations have occurred just before midday, as well as later in the evening.
Does this mean that it’s likely to happen upon a snake around these times? Perhaps, but let’s look further.
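For readers who want to try this at home, the hourly binning behind a chart like this can be sketched in a few lines of Python. This is a minimal sketch and not the exact pipeline used here: the endpoint URL, the `taxon_id` and `place_id` query parameters, and the `time_observed_at` field are assumptions about the iNaturalist API that should be checked against its documentation, the helper names are hypothetical, and the sample records are made up for illustration.

```python
from collections import Counter
from datetime import datetime
from urllib.parse import urlencode

# Assumed endpoint; verify against the iNaturalist API documentation.
API_URL = "https://api.inaturalist.org/v1/observations"


def build_query(taxon_id, place_id, page=1, per_page=200):
    """Build an observations URL for one taxon in one place.

    The caller supplies the IDs; the parameter names are assumptions.
    """
    params = {
        "taxon_id": taxon_id,
        "place_id": place_id,
        "page": page,
        "per_page": per_page,
    }
    return f"{API_URL}?{urlencode(params)}"


def hourly_counts(records):
    """Count observations per hour of day (0-23), skipping records with no time.

    Each record is assumed to carry an ISO 8601 timestamp with a UTC offset
    under "time_observed_at", so .hour is the observer's local hour.
    """
    counts = Counter()
    for rec in records:
        stamp = rec.get("time_observed_at")
        if not stamp:
            continue
        counts[datetime.fromisoformat(stamp).hour] += 1
    return [counts.get(h, 0) for h in range(24)]


# Made-up records, shaped like the assumed API response.
sample = [
    {"time_observed_at": "2025-03-01T10:15:00+08:00"},
    {"time_observed_at": "2025-03-01T10:48:00+08:00"},
    {"time_observed_at": "2025-03-02T21:05:00+08:00"},
    {"time_observed_at": None},
]
hours = hourly_counts(sample)
print(hours[10], hours[21])
```

The resulting 24-element list is the kind of array that can be handed to d3.js (or any plotting library) to draw one bar per hour.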
40,000 spider encounters
There are 484 spider species reported in Singapore on iNaturalist (40,110 observations by 1,991 observers).
ajustinfocus.com
The hourly distribution looks similar to that for snakes. Is it really that much more likely to see snakes and spiders at these times (10am to noon, or 9 to 10pm)?
Observation bias
iNaturalist observation data are dominated by the hours during which most people go on nature walks.
Even the elusive Sunda Colugo has been most observed around 10am to noon, due to the apparent tendency for us ‘citizen-naturalists’ to explore parks around this time on Saturdays and Sundays:
Indeed, people primarily report spiders found on their wildlife photo-walks on weekend mornings and evenings. The bursts of reports on Friday and Saturday nights show just how fun Singaporean nightlife can be!
(Be warned that many of the nature parks are officially closed from 7pm to 7am, but we still have spots.)
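The weekend-morning and Friday-night pattern can be checked by binning observations on a weekday-by-hour grid. The following is a minimal sketch, assuming each observation comes with an ISO 8601 local timestamp; the function name is hypothetical and the timestamps are made up.

```python
from datetime import datetime

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_hour_grid(timestamps):
    """Return a 7 x 24 grid of counts: rows are weekdays (Mon=0), columns are hours."""
    grid = [[0] * 24 for _ in range(7)]
    for stamp in timestamps:
        t = datetime.fromisoformat(stamp)
        grid[t.weekday()][t.hour] += 1
    return grid


# Made-up timestamps: a Friday night, a Saturday, and a Sunday morning.
sample = [
    "2025-02-28T22:10:00+08:00",
    "2025-03-01T10:15:00+08:00",
    "2025-03-01T21:40:00+08:00",
    "2025-03-02T10:30:00+08:00",
]
grid = weekday_hour_grid(sample)
for day, row in zip(DAYS, grid):
    print(day, sum(row))
```

A grid like this separates the weekly rhythm of observers from the daily one, which helps when judging how much of an hourly peak is about people and not about animals.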
Does this tell us anything about the actual behavior or ‘observability’ of this species, or others?
Or does it just tell us more about human behavior?
Can these data actually help us to decide when to look for a particular species?
Comparisons
Comparing hourly z-scores computed within each taxon reveals time-of-day differences in observation patterns between species.
Spider observations are skewed toward the late evening (inner blue bars), when compared to distributions of colugo observations:
This comparative, normalized statistic cancels out some of the ‘human factor’ and shows that spider observations are shifted toward the nighttime relative to colugo sightings.
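One plausible reading of a within-taxon hourly z-score is sketched below: each taxon's 24 hourly counts are standardized against that taxon's own mean and standard deviation, and the two z-score profiles are then subtracted hour by hour. The exact formula behind the chart is not spelled out here, so treat this as an assumption; the function names are hypothetical and the counts are made up.

```python
from statistics import mean, pstdev


def hourly_z(counts):
    """Standardize hourly counts against the taxon's own mean and spread."""
    mu = mean(counts)
    sd = pstdev(counts)
    if sd == 0:
        # A flat distribution has no hourly signal.
        return [0.0] * len(counts)
    return [(c - mu) / sd for c in counts]


def z_difference(counts_a, counts_b):
    """Hour-by-hour difference of z-scores: positive where taxon A is over-represented."""
    return [a - b for a, b in zip(hourly_z(counts_a), hourly_z(counts_b))]


# Made-up counts: "spiders" peak at 9pm, "colugos" peak at 10am.
spiders = [1] * 24
spiders[21] = 10
colugos = [1] * 24
colugos[10] = 10

diff = z_difference(spiders, colugos)
print(round(diff[21], 2), round(diff[10], 2))
```

Because both taxa are observed by roughly the same weekend walkers, whatever part of the hourly pattern they share tends to cancel in the difference, and what remains hints at the animals themselves. It only cancels the shared part, though, so it reduces the human factor without eliminating it.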
David Bowie
The photogenic David Bowie spider (Heteropoda davidbowie) is almost only ever seen at night. Does it completely hide during the daytime?
Comparison with the time distribution for all spiders makes this plausible. The distribution of David Bowie spider observations is skewed by about a standard deviation more toward 9-11pm than that of all spiders, and the species is much less likely to be seen during the peak daytime hours when thousands of other spiders are seen.
While this may be common knowledge for resident Singaporean and Malaysian spider buffs, it is pretty cool to see this borne out in crowdsourced observational data on iNaturalist.org for a rarer species.
Which other species can be identified as nocturnal (etc.) from these data?
Check back for part II of this series (coming soon).
What else is there to see in these data? Heaps!