#IranElection: Quantifying Online Activism

Devin Gaffney

01 Jun 2010

For my final year of undergrad, I wanted to narrow my focus to an insanely specific question: what impact (if any) did Twitter have on the protests following the Iran election? I used Sandor Vegh’s categorization scheme to sort each individual piece of conversation (each tweet) into one of his three categories of messages in a larger instance of online activism (awareness/advocacy, organization/mobilization, action/reaction). By analyzing the breakdown of messages quantitatively, I was trying to see whether one of these categories predominated, which could in turn give a rough, data-based idea of Twitter's impact on the protests. While the paper I wrote eventually came to no major conclusions (as most papers tend to), through the process I realized that the job of collecting and analyzing the data, while difficult, could be repeated for any other type of study.
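To make the idea of a quantitative breakdown concrete, here is a minimal sketch of the tallying step. It is not the code from the thesis: it assumes each tweet has already been hand-labeled with one of Vegh's three categories, and the function name `category_breakdown` and the sample labels are hypothetical, invented only for illustration.

```python
from collections import Counter

# Vegh's three categories, as named in the post.
CATEGORIES = (
    "awareness/advocacy",
    "organization/mobilization",
    "action/reaction",
)


def category_breakdown(labels):
    """Return the share of labeled tweets that falls in each category."""
    counts = Counter(labels)
    # Reject labels outside the scheme instead of silently dropping them.
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return {category: 0.0 for category in CATEGORIES}
    return {category: counts[category] / total for category in CATEGORIES}


# Hypothetical hand-coded labels, invented purely for illustration.
sample_labels = [
    "awareness/advocacy",
    "awareness/advocacy",
    "organization/mobilization",
    "action/reaction",
]

breakdown = category_breakdown(sample_labels)
for category, share in breakdown.items():
    print(f"{category}: {share:.0%}")
```

The labeling itself is the hard part and is left out here; the sketch only shows how labeled messages turn into proportions that can be compared across categories.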

In a small way, by answering questions like this for a particular platform (in this case, the low-hanging Twitter-fruit), we can start to get at the role of the internet in all our lives (kinda-sorta), a hugely important topic that very often lacks hard numbers. For this reason, I split my time between working on the thesis and working on a piece of code that automated all the work I had done for it. This way, if anyone wanted to come back and try things their own way with their own information, there wouldn’t be any issue.

bennington_thesis.pdf (3 MB)

websci10_submission_6.pdf (1 MB)
