Applying Data Science Tools to Understand Web Traffic Data


This talk shows how to combine Google Analytics data with data science tools, including natural language processing (NLP), spatial data mapping, and HTML scraping, to conduct in-depth web analytics.

Companies now rely heavily on their websites to share information and engage with their audiences. Web analytics tools let companies see whether the intended audience is actually accessing their web content. Google Analytics, in particular, is one of the most widely used of these tools. By providing data such as visitors' demographic information, their behavior on a website, and how they found the website, Google Analytics captures data that can help companies maximize access to information and services.

"To innovate is to combine." Combining Google Analytics data with data science tools yields more in-depth analytic results. Examples include the following.

Site search data (i.e., the text that users type into the search box) can be analyzed using NLP tools in Python. This set of analytics can help visualize searched text as a word cloud and discover topics using non-negative matrix factorization.

While the finest geographic granularity that Google Analytics provides for visitors' locations is the city, visitors' street addresses can be obtained when visitors enter their addresses to search for services within a certain radius of that location. These location data can be mapped using Google's Geocoding API and mapping tools in R. When visitors' locations and existing services' locations are shown in different colors on the same map, a clear picture of demand vs. supply appears.

Google Analytics data can be linked and compared to data scraped from webpages, which provides more valuable information about whether the content is offered in a format and language that is accessible and appropriate for visitors.

Example code (Python and R), tables, and figures in this talk will be drawn from analytic projects for websites of various companies and organizations, with identifiable information removed.
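
As a rough illustration of the site search analysis described above, the following minimal sketch counts term frequencies across search queries, which is the kind of input a word cloud is typically built from. It is not the speaker's code: the queries, the stop-word list, and the function name `term_frequencies` are all hypothetical, and only the Python standard library is used.

```python
import re
from collections import Counter

# Hypothetical site-search queries; real ones would come from a
# Google Analytics site search export.
SEARCH_QUERIES = [
    "child care near me",
    "Child care subsidy",
    "preschool enrollment",
    "child care subsidy application",
    "summer preschool programs",
]

# A tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "the", "near", "me", "for", "of", "in"}


def term_frequencies(queries, stop_words=STOP_WORDS):
    """Count how often each non-stop-word term appears across all queries."""
    counts = Counter()
    for query in queries:
        # Lowercase the query and keep alphabetic tokens only.
        tokens = re.findall(r"[a-z]+", query.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts


frequencies = term_frequencies(SEARCH_QUERIES)

# Print the three most frequent terms with their counts.
for term, count in frequencies.most_common(3):
    print(term, count)
```

The resulting counts could be passed to a word cloud renderer, or the same tokenized queries could be turned into a term-document matrix for topic modeling.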


Weilin is currently a research scientist at Child Trends, a nonprofit research organization that conducts high-quality research and shares the resulting knowledge with practitioners and policymakers. She has led many data analytics projects under federal and state contracts. She is an EMC certified data scientist and holds a PhD from UC-Irvine.
