#2
I started looking at all kinds of databases. I'm still not sure whether I want to use existing databases, APIs, or gather my own data, but it's worth knowing what's out there. A few interesting ones caught my attention, including the AIAAIC datasets (AIAAIC is an independent public-interest initiative that examines AI and accountability). All their repositories are also linked in this neat spreadsheet.
I looked through Allison Parrish's list of interesting datasets and stumbled upon Conception, an attempt to link different linguistic concepts, which sounds fascinating. I've been doing some other work in my Shared Minds class using ML embedding models (these are models that map whatever input you give them to a point in a multi-dimensional space; it becomes interesting once you start comparing where different inputs land). Anyway, I'm continuing to explore this avenue too.
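To make the "comparing positions" idea a bit more concrete, here's a minimal sketch of one common way to do it: cosine similarity between two vectors. The three-dimensional vectors and the words attached to them are made up for illustration (real embedding models return hundreds or thousands of dimensions), and `cosine_similarity` and `most_similar` are just helper names I picked, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, embeddings):
    """Return the other word whose vector points in the most similar direction."""
    others = [w for w in embeddings if w != word]
    return max(others, key=lambda w: cosine_similarity(embeddings[word], embeddings[w]))


# Hypothetical toy "embeddings"; a real model would produce these from text.
toy_embeddings = {
    "river": [0.9, 0.1, 0.2],
    "stream": [0.8, 0.2, 0.3],
    "invoice": [0.1, 0.9, 0.4],
}

print(most_similar("river", toy_embeddings))
```

With these toy numbers, "river" sits much closer to "stream" than to "invoice". Swapping the hand-written vectors for the output of an actual embedding model would leave the comparison step unchanged.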
I've looked at all kinds of environmental data, as well as media-based data (from social platforms, news APIs, the city of New York, etc.). I'm trying to figure out if and how things would work if I use a "closed" dataset, in contrast to a real-time one. I think my interest lies in open-ended outcomes, which would probably work better with real-time data. On the other hand, I could maybe think about real-time generation that is based on a "closed" dataset (similar to what I did in Loanwords).
I've also been reading about data collection and "cleaning", and about how quantifying abstract things is necessary for analyzing the data (and arriving at conclusions...) but is, at the same time, inherently reductive.
I got to talk to my academic advisor, Aidan Nelson, who recommended the work of Data Through Design, a yearly festival showcasing data-driven projects that usually use some NYC Open Data datasets.
The people I should definitely talk to once I have something more specific to talk about are: