SuSPECT – Scaffolding Student PErspectives for Critical Thinking

What’s the problem?

Biased information is a societal issue that affects all of us. Examples include the recent European referendum vote in the UK and the US presidential election. There has been a surge in the number of online resources that are misleading or false. A recent white paper described the ability of learners to assess such information sources as “dismaying,” “bleak” and “[a] threat to democracy.” Teaching facilitators have the ability and responsibility to educate their learners, not only in their areas of expertise, but also in how to think in a balanced way about the information they consume. While it has long been the role of educational institutions to nurture such skills, the means to do so need to evolve in pace with the ecosystem that our students are learning in.

While critical thinking is often implicitly integrated, or assumed in education, SuSPECT helps make critical thinking in education explicit, which then enables a better understanding of its importance in other domains like assessing online information.

What are we doing about it?

In response, this project addresses how to help students critically evaluate and respond to online resources. SuSPECT is a short project funded by the Leiden-Delft-Erasmus Centre for Education and Learning, running from March 2017 to March 2018. It evaluates the efficacy of debate in the classroom, building on the existing Rbutr online argumentation system. The project aims to help learners not only assess the veracity of online resources, but also develop more nuanced and balanced thinking about the materials they find online.

No really, this is what we are DOING about it.

  • Improving the Rbutr system to suggest timely rebuttals, and to use crowdsourcing to annotate potential rebuttals. The use case is a person who finds a website and wants to find rebuttals for it. This part of the system mines Twitter and Reddit for occurrences of that specific URL and for responses that contain other URLs. The user then annotates these URLs as rebuttals/contrary/irrelevant. These annotations are then used to rank potential results for other users who look for the same URL later. We are in the process of hiring a developer to work on this task.
  • Working with lecturers to include debate in their curriculum. We have been discussing how to introduce the use of Rbutr and debate into two courses: IT & Values (Delft); Ethics, Culture, and Biotechnology (Leiden). We are also in deliberation with the Erasmus institute (Rotterdam) on how the intervention may be introduced to a MOOC called Deception Detox. Each course has a different curriculum and learning outcomes, which results in interesting differences between the courses.

For example, IT & Values is expected to have a large number (~100) of students, which means that several tutorial groups will discuss the same topic in parallel and may support each other with materials via Rbutr.

Ethics, Culture, and Biotechnology is, in contrast, more condensed, giving us the opportunity to experiment with a flipped classroom approach where students prepare their arguments ahead of time. Lecturers in each course prepare resources as a basis for debate (both for and against), but in the case of the flipped classroom this sets the bar higher. It also raises the bar for training the students in debate with validated resources before they do their own research (e.g., using Rbutr).

  • Doing experiments. Together with colleagues at EPFL we are investigating the impact of teaching students about critical thinking (a short presentation on the Baloney Kit) on their opinions on typical online debate topics (such as vaccines causing autism). Students voted for topics using a mobile-based voting tool (SpeakUp), and we already see some promising results for this very short and simple intervention. Asking students to debate appears to help more students think critically, although this does not seem to be the case for students with strong opinions. We are currently writing up the results, and look forward to sharing these with you. We also have loads of other ideas for experiments, so let us know if you want to collaborate (we have some small funds for running experiments).
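To make the rebuttal-ranking idea from the first item above a little more concrete, here is a minimal sketch in Python of how crowd annotations might be turned into a ranking. It is not the Rbutr implementation: the function name `rank_candidates`, the label strings, and the scoring rule (rebuttal votes minus irrelevant votes) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical label set, based on the three categories named in the post.
LABELS = ("rebuttal", "contrary", "irrelevant")


def rank_candidates(annotations):
    """Rank candidate URLs by net crowd support for being a rebuttal.

    `annotations` is an iterable of (candidate_url, label) pairs.
    The scoring rule (rebuttal votes minus irrelevant votes) is an
    illustrative assumption, not the project's actual method.
    """
    votes = defaultdict(Counter)
    for url, label in annotations:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
        votes[url][label] += 1

    def score(url):
        counts = votes[url]
        return counts["rebuttal"] - counts["irrelevant"]

    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(votes, key=lambda u: (-score(u), u))
```

In a real system the score would probably also need to account for how recent a response is and how reliable each annotator has been, but even this simple count shows how one user's annotations can improve the results shown to the next.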

Looking forward to sharing our progress on all of these aspects with you soon!

Nava and the team


Special Issue on Human Interaction with Artificial Advice Givers

Many interactive systems in today’s world can be viewed as providing advice to their users. Commercial examples include recommender systems, satellite navigation systems, intelligent personal assistants on smartphones, and automated checkout systems in supermarkets. We will call these systems that support people in making choices and decisions artificial advice givers (AAGs): they propose and evaluate options while involving their human users in the decision-making process. This special issue addresses the challenge of improving the interaction between artificial and human agents. It answers the question of how an agent of each type (human and artificial) can influence and understand the reasoning, working models, and conclusions of the other agent by means of novel forms of interaction. To address this challenge, the articles in the special issue are organized around three themes: (a) human factors to consider when designing interactions with AAGs (e.g., over- and under-reliance, overestimation of the system’s capabilities), (b) methods for supporting interaction with AAGs (e.g., natural language, visualization, and argumentation), and (c) considerations for evaluating AAGs (both criteria and methodology for applying them).

The full special issue can be found here:

ENSURE – ExplaiNing SeqUences in REcommendations

I have been awarded a Technology Fellowship at TU Delft, and will be joining the Web Information Systems group, Faculty of Electrical Engineering, Mathematics and Computer Science (with Geert-Jan Houben, Claudia Hauff, Alessandro Bozzon et al.), as an Assistant Professor from the 13th of February 2017!

This fellowship is focused on explaining sequences of recommended items (rather than single items or even sets).  This enables me to fund a Research Fellow to work on this challenge with me for 2 years. This job will be advertised more formally shortly, but to give you a teaser….

The research agenda involves: 

  • Gaining an understanding of people’s concerns regarding personalization for sequences of recommended items.
  • Gaining an understanding of people’s views on the kinds of explanations that alleviate their concerns and help them to make good decisions.
  • Producing guidelines for algorithms for constructing explainable recommender sequences.
  • Developing algorithms for explaining sequences containing both novelty and trade-offs effectively, while considering privacy concerns. This includes investigating the role of context and personal characteristics.
  • Facilitating a dialogue between policy makers, researchers, and the general public regarding the findings above.

Job Requirements:

You hold a PhD in computer science or related disciplines. You have a track record of scientific excellence in the field(s) of recommender systems, user-modeling, multi-objective optimization, and/or human-computer interaction. You must demonstrate either an ability to design algorithms for sequences of items, or a deep experience designing interactions with recommender systems. You will be expected to lead or strongly contribute to academic publications, contribute to grant proposals, and to interact with stakeholders outside academia (e.g., end users, business, and public policy). Strong verbal and written communication skills are therefore also required.

If you (or someone you know) are interested in joining TU Delft and working with me on this challenge, I’d love to have an informal chat to see if we have a fit.  I can be reached at:

Toward Ethical Personalization

When we work with data analytics we often lose sight of the context in which our users and customers live their lives. As data scientists, we focus on collecting data, filtering it, and improving our predictive accuracy. After all, getting the prediction right is a challenge in and of itself. We do not want to make incorrect predictions, or miss out on correct predictions. However, there are much subtler ethical questions to consider when applying analytics to personal data.

At the end of 2016, I organized two events with the aim of addressing some of these questions. The first was aimed at a more general audience. Together with Dr. Paolo Palmieri we organized an information session titled What is the Internet Hiding From You…?, as part of the ESRC Festival of Social Science. At the end of the session participants contributed to focus groups where we discussed:

  • Which benefits would they like to get from personalized services?
  • Which information are they willing to share, and when does personalization happen?
  • How do they want a computer to communicate to them the information it has used?

While this was a small and self-selecting sample, I was struck by how distrustful the participants were of personalization services, and by the need for increased transparency and communication between personalization services and users.

The second event focused on industry, and key players in Big Data in Scotland. In a panel on “Data Analytics: Balancing Insight, Privacy & Trust” at the Big Data Conference, we discussed the following issues:

  • When do analytics become too intrusive? When can we make inferences across data sources, or inferences that users did not consent to being made when they initially provide the data? (Video)
  • How should we make algorithmic biases visible to users? How do we avoid filter bubbles like the one that happened during Brexit and the presidential vote in the US? How can explanations be used to improve transparency? (Video)
  • Is there going to be a swing in the balance of power towards individuals / consumers? How do we balance this with businesses’ need to be competitive? (Video)

The panel members represented key stakeholders in policy and industry:

  • Ken Macdonald, Head of ICO Regions, Scotland, NI & Wales, Information Commissioner’s Office
  • Martin Squires, Global Lead, Customer Intelligence and Data, Boots
  • Dr. Hannah Rudman, Director, Rudman Consulting Limited

The discussions in the panel highlighted a corporate interest in personalizing in a way that is beneficial to users. From the conversation it appeared that many industry players are less aware of more complex and delicate ethical challenges, such as data linking, using data for purposes other than those for which it was initially supplied, or the fact that an algorithmically correct personalization is not always the best from a user perspective (see, e.g., the Target story).

The panel also confirmed that these concerns are recognized on a policy level by the Information Commissioner’s Office (ICO) in the UK. The new EU General Data Protection Regulation (GDPR) coming into effect in 2018 recognises privacy as a legal right, and includes a “right to explanation” whereby a user can ask for an explanation of an algorithmic decision that was made about them. Despite the planned UK exit from the EU, the ICO confirms that comparable regulations will be put into effect in the UK, and that the ICO will have legal capacity to enforce compliance with these regulations. Privacy policies will need to be geared towards the customer and expressed in clear and plain language.

It is largely a welcome development that analytics platforms have a great deal of power by having access to the usage data of individuals. It is now time to start using that power wisely. The public is justifiably concerned, and it is our responsibility to think critically about what data we need to collect and store, and for which purposes. Computers can make and collect data and run algorithms, but humans working with big data are the ones that establish the analytical programmes, professional practices, and codes surrounding them. Overlooking the person-centred ethical issues may result in negative social impact.

Let there be a balance between the innovation and economic opportunity of big data, and respecting privacy and human rights within open, tolerant societies. To allow this to happen, we need to work together to establish best practices, and make a record of positive case studies where these have been observed. This is a conversation that is going to need all hands on deck: customers, policy makers, data analytics companies, as well as academic researchers. Let’s get cracking!

What is the internet hiding from you?

Our ESRC Festival of Social Science event proposal has been accepted! We will be running focus groups and an information session on the topic: “What is the internet hiding from you?” on November 8th, 2016. Event held at the Executive Business Centre. Afternoon session 2.30-5pm, OR Evening session 6-8:30pm (two slots of the same sort of session).

Most of us know that our personal data is being used to filter our Facebook ‘timeline’ or that Amazon personalises which items it shows to us. However, as users, we have not always agreed to that personalisation, and do not know how our personal data is being used. It’s not surprising that many of us are unsure whether we can trust the internet and how our information is shared. This workshop gives members of the public a chance to find out more about the issues and share their views, potentially shaping the future of big data research.

More details and registration here:

Big Data and Cloud Computing

This term I’ve been teaching Big Data and Cloud Computing as part of a Master’s degree in Applied Data Analytics at Bournemouth University.

The students on this course worked on exploratory data analysis, using large, real-world datasets. To support interactive visualization we used R both for analysis and to create a web application. This was a great way for the class to learn hands-on about issues with large datasets, including heterogeneity across data sources and the importance of being able to host and access the data (one of the groups reads JSON data from a live feed).

Each team covered a different interesting problem area including: climate change, crime rates in the Camden Borough of London, live earthquake updates, and live tweets of music listens (#nowplaying) across the globe. I am really proud of their work and thought you might want to have a look! Just click on the topic name, the images or the links to try out the systems.


This application is used for both visualization and exploratory analysis of earthquake data. The earthquake data are downloaded online, in real time, in GeoJSON format. The application then manipulates them to produce a final dataset that includes only the necessary variables. The names of the variables are Local.Time, magnitude, significance, place, longitude, latitude and depth.
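As a rough illustration of that manipulation step, here is a minimal sketch of flattening a GeoJSON feature collection into records that hold only the listed variables. The students’ application was written in R; this sketch uses Python purely for illustration. The property names `mag`, `sig`, `place` and `time` (in milliseconds since the epoch), and the `[longitude, latitude, depth]` coordinate order, are assumptions modeled on a typical earthquake feed rather than details taken from the students’ code. For reproducibility the time is kept in UTC rather than converted to local time.

```python
import json
from datetime import datetime, timezone


def flatten_quakes(geojson_text):
    """Turn a GeoJSON FeatureCollection into a list of flat records.

    Assumes each feature has properties `time` (epoch milliseconds),
    `mag`, `sig` and `place`, and a point geometry whose coordinates
    are [longitude, latitude, depth]. These names are assumptions.
    """
    collection = json.loads(geojson_text)
    records = []
    for feature in collection.get("features", []):
        props = feature["properties"]
        lon, lat, depth = feature["geometry"]["coordinates"][:3]
        records.append({
            # Epoch milliseconds converted to a timezone-aware UTC datetime.
            "time": datetime.fromtimestamp(props["time"] / 1000, tz=timezone.utc),
            "magnitude": props["mag"],
            "significance": props["sig"],
            "place": props["place"],
            "longitude": lon,
            "latitude": lat,
            "depth": depth,
        })
    return records
```

Keeping only these seven fields makes the dataset small enough to refresh and redraw interactively, which matters when the source is a live feed.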



The ‘Crime’ dataset is gathered from the Camden Police crime report for January to June 2015. The data is geocoded with longitude and latitude information and mapped using the Leaflet map widget. Camden Police data can be found here

The main packages used are Shiny, Leaflet, shinydashboard, and rpivotTable.



Climate change:


Music listening patterns across the globe

A smaller data set enables quicker navigation through the app; even if the statistics become less meaningful, it gives you a better overview.

The full data set has 356 845 observations; the page takes around 35 seconds to load.


While this is the first time these students have worked with R, I’m sure you’ll agree they’ve done a terrific job of visualizing complex real world data!