
Modelling Through Networks

Using computational models and machine learning to empirically study the social sciences.


Philosophy and the social sciences are often at a disadvantage when it comes to objectively describing our world. This is not because they are inherently inferior to more empirical sciences such as biology and physics, but because their subject matter is hard to quantify. With the rise of machine learning and computational models, however, this may become a problem of the past. These technologies allow us to systematically build frameworks that are both reliable and valid.

Let us first think about the concept of dimensionality as applied to data. Each variable is analogous to a dimension along which a dataset can be measured. This is easiest to see in how our options for visualizing information depend on the number of dimensions we are working with. For example, a univariate dataset (assuming a ratio scale of measurement) is only adequately described once you add frequency as a second variable to form a distribution. Conversely, whenever we have too many variables, we rely on procedures that reduce the dimensionality of the data, such as principal component analysis, so that we can visualize it in a two-dimensional plot. In simple terms, we depend heavily on the interaction between variables in order to interpret them, and having too many or too few is a problem.
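As a concrete sketch of that reduction step, here is a minimal principal component analysis built from NumPy's singular value decomposition; the dataset is synthetic and purely illustrative:

```python
import numpy as np

# Synthetic high-dimensional dataset: 100 observations of 10 variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 10))

# Centre the data, then project it onto the top two principal components
# obtained from the singular value decomposition.
centered = data - data.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
projected = centered @ vt[:2].T  # shape (100, 2), ready for a 2-D plot
```

The two columns of `projected` are the coordinates you would feed to a scatter plot, collapsing ten dimensions into a picture we can actually read.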

This is why modelling through networks presents an incredible opportunity as a framework for analysis. We no longer have to worry about quantifying a variable, since all we care about is whether two points share a connection. There can obviously be complexity here, as we decide whether a connection should exist in the first place, or determine the strength of each connection. However, we can build a very simple model by making strong assumptions and focusing only on whether two items are connected. But wait, there is more: we can extract multivariate data from the structure itself, computing metrics such as path lengths, different kinds of centrality, and even community membership. In other words, we can turn a non-parametric dataset into a network and then study the structure of the graph to compute parameters.
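The structural metrics mentioned above can be computed from nothing more than an adjacency list. The sketch below uses a toy network of thinkers (the names and edges are invented for illustration) to derive degree centrality and shortest path lengths:

```python
from collections import deque

# A toy undirected network: nodes are thinkers, an edge marks a shared connection.
edges = [("Plato", "Aristotle"), ("Aristotle", "Kant"),
         ("Kant", "Hegel"), ("Plato", "Kant")]

# Build an adjacency list: structure is all we store, no quantified variables.
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

# Degree centrality: the fraction of other nodes each node touches.
n = len(adj)
centrality = {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

# Shortest path length via breadth-first search (a "walking path" metric).
def path_length(src, dst):
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None

print(centrality["Kant"])             # 1.0 — Kant touches all three others
print(path_length("Plato", "Hegel"))  # 2
```

Notice that every number here came out of pure structure: we never measured anything about the nodes themselves.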

This is where the planksip P.A.S.F. filter becomes useful. We determine whether two points share a link (and to some extent quantify it), using consumers' search habits from Google to form an ontological framework for studying semantic relationships between figures of authority, potentially allowing us to take "snapshots" of culture itself at different points in time. We just need to feed lists of people into our model, which we have done by extracting names from a book's index or by asking planksip writers for their favourite thinkers.
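We do not have access to the P.A.S.F. filter's internals, so the sketch below is only a hypothetical stand-in: assume a scoring function over pairs of names, and link any pair whose score clears a threshold. Both `search_affinity` and its heuristic are invented for illustration, not the real filter:

```python
# Hypothetical stand-in for the planksip P.A.S.F. filter. Assume a function
# that scores how strongly two names co-occur in search habits; link a pair
# when the score crosses a threshold.
def search_affinity(name_a, name_b):
    # Placeholder heuristic for illustration only: shared first initial.
    return 1.0 if name_a[0] == name_b[0] else 0.2

def build_links(names, threshold=0.5):
    links = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if search_affinity(a, b) >= threshold:
                links.append((a, b))
    return links

thinkers = ["Kant", "Kierkegaard", "Hume"]
print(build_links(thinkers))  # [('Kant', 'Kierkegaard')]
```

Swapping the placeholder scorer for a real search-habit signal is what turns a flat list of people into the network the article describes.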

Above is one of the networks we have been able to build. We extracted the indexes of the textbooks used in introductory psychology classes at the most notable universities on the West Coast of Canada and identified the most prevalent authors cited in them. We then fed this list of approximately 2,000 academics into the planksip filter and pruned the dataset down to the top 25 results. This information produced the graph shown above, which can be read as tracking the semantic relationships of the most cited authors in introductory psychology in terms of consumers' search habits.
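The pruning step can be sketched with a simple frequency count; the citation numbers below are made up, standing in for the real counts extracted from the textbook indexes:

```python
from collections import Counter

# Hypothetical citation counts from the textbook indexes (illustrative values
# only — the real dataset held roughly 2,000 academics).
citations = Counter({"Bandura": 48, "Piaget": 44, "Skinner": 41,
                     "Freud": 39, "Milgram": 12, "Obscure Author": 1})

# Keep only the most prevalent authors, as the article prunes to a top-25 list.
TOP_N = 25
top_authors = [name for name, _ in citations.most_common(TOP_N)]
print(top_authors[:3])  # ['Bandura', 'Piaget', 'Skinner']
```

`Counter.most_common` does the sorting for us, so the pruning is one line regardless of whether the input holds six names or two thousand.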

We can study the composition of our data even further by feeding the resulting graph to machine learning algorithms, which can sort through conditional probabilities to find patterns. We can also render the model in three-dimensional space to better conceptualize distance by defining a coordinate system. The possibilities are endless: we can do this for multiple subnetworks derived from a master graph, and even study their interrelationships by focusing on their similarities and differences through intersections and unions.
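Treating each subnetwork as a set of edges makes those intersection and union operations one-liners; the two edge sets here are invented examples:

```python
# Two subnetworks derived from a master graph, represented as edge sets.
psych = {("Freud", "Jung"), ("Piaget", "Vygotsky"), ("James", "Freud")}
philo = {("James", "Freud"), ("Kant", "Hume"), ("Freud", "Jung")}

# Their intersection captures shared structure; their union merges them.
shared = psych & philo
merged = psych | philo

print(len(shared))  # 2 — edges appearing in both subnetworks
print(len(merged))  # 4 — every edge appearing in either
```

Because Python's set operators map directly onto the graph-theoretic ones, comparing subnetworks needs no extra machinery at all.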

Developing frameworks that can objectively describe concepts from the social sciences is a great advantage for their practical use. In this article we discussed computationally describing relational datasets, which are usually unidimensional, but it is worth mentioning that there are also countless machine learning models for interpreting large multivariate datasets.

Jose Emilio Velazquez
