Message Distortion in Information Cascades

This repository contains the data and code for the paper "Message Distortion in Information Cascades":

@inproceedings{horta_ribeiro_message_2019,
author={Ribeiro, Manoel Horta and Gligori\'c, Kristina and West, Robert},
title={Message Distortion in Information Cascades},
booktitle={Proceedings of the 2019 World Wide Web Conference},
year={2019},
}

Check out the accompanying website which allows you to visualize the data.

Data

The data is in the ./data folder.

We make the data available in two formats: .csv and .graphml. The latter is the format used in the analyses (for convenience).

Field	Null in Root	Description
node_id	No	Unique identifier of the node.
level	No	Summarization level (0: original abstract, 1: ~1024 chars, 2: ~512 chars, 3: ~256 chars, 4: ~128 chars, 5: ~64 chars)
branch	No	Source note used in this information cascade.
question	No	`node_id` of the text used as reference for the summarization. For root nodes, it is the same as `node_id`.
Topic	No	Topic of the paper summarized (breast, cardio, immunization, diet)
Answer	No	Original abstract in the case of the root, summary otherwise.
Age	Yes	Age range of the worker who summarized the paper (18-24/24-39/40-60/60+)
Education	Yes	Education level of the worker who summarized the paper (Some High School, High School, Some College, College)
Gender	Yes	Gender of the worker who summarized the paper (male, female)
Qualification	Yes	Performance on qualification test (float, 0-1)
WorkerId	Yes	Unique worker identifier as provided by Amazon Mechanical Turk.
Doggos_crowd	No	Dictionary containing the values for facts in each category. `{"Coarse": {"Coarse_category1": ["Val1", "Val2", ...], ...}, "Fine": {"Fine_category1": ["Val1", ...], ...}}`
Doggos_text	No	Dictionary containing the text for facts of each subcategory. Null in non-root. `{"Fine_category1": "Text1", "Fine_category2": "Text2", ...}`
Tagging	No	In the `csv` files, this is a dictionary, similar to `Doggos_crowd`, containing the keyphrases associated with each subcategory. `{"Coarse": {"keyword1": {"Coarse_category1", ...}, ...}, "Fine": {"keyword1": {"Fine_category1", ...}, ...}}`. In the `graphml` files, this is a Python object wrapping this dictionary, with helper functions to calculate the difference in keywords across hops.
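
As a rough illustration of how these fields fit together, the sketch below follows `question` pointers from a summary back to its root and compares `Tagging` keywords between two hops. It is not the helper code shipped in the `graphml` objects. The sample rows, the function names (`load_rows`, `path_to_root`, `keyword_diff`), and the assumption that `Tagging` is serialized as a Python literal are all hypothetical; check the actual files in ./data before relying on any of them.

```python
import ast
import csv
import io

# Hypothetical sample mimicking the documented columns; the real files
# in ./data may differ in name, delimiter, and serialization.
SAMPLE_CSV = '''node_id,level,question,Tagging
a0,0,a0,"{'Fine': {'vaccine': {'intervention'}, 'children': {'population'}}}"
a1,1,a0,"{'Fine': {'vaccine': {'intervention'}}}"
a2,2,a1,"{'Fine': {'vaccine': {'intervention'}, 'safe': {'outcome'}}}"
'''


def load_rows(handle):
    """Index rows by node_id, parsing Tagging as a Python literal (assumed)."""
    rows = {}
    for row in csv.DictReader(handle):
        row["level"] = int(row["level"])
        row["Tagging"] = ast.literal_eval(row["Tagging"])
        rows[row["node_id"]] = row
    return rows


def path_to_root(rows, node_id):
    """Follow `question` pointers until a node references itself (the root)."""
    path = [node_id]
    while rows[node_id]["question"] != node_id:
        node_id = rows[node_id]["question"]
        path.append(node_id)
    return path


def keyword_diff(parent, child, granularity="Fine"):
    """Return keywords (lost, kept, added) between a parent and its summary."""
    before = set(parent["Tagging"][granularity])
    after = set(child["Tagging"][granularity])
    return before - after, before & after, after - before


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(path_to_root(rows, "a2"))
lost, kept, added = keyword_diff(rows["a0"], rows["a1"])
print(sorted(lost), sorted(kept), sorted(added))
```

To try this on the real data, pass an open file handle for one of the `csv` files to `load_rows` instead of the in-memory sample. If `Tagging` turns out to be stored as JSON rather than a Python literal, `json.loads` would replace `ast.literal_eval`.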

Code

All analyses can be found in the analyses.ipynb notebook.

To install all requirements, simply run:

pip install -r requirements.txt
