Super-Coordination Technical Exercise Problems

anon 0x499 said in #2716 2w ago:

Another anon and I have been thinking about how to do supercoordination. We've got many ideas but we're still looking for the right problem to dive into together seriously.

Today we thought we should find some plausibly related, not-too-big pure technical problems that we could collaborate on. Pure technical problems are nice because they are very real, aren't yet encumbered by trying to solve product problems, but can yield huge impact when solved. Here are a few relatively pure technical problems suggested to me by the supercoordination problem:

1. Neural inference over a fact database. Suppose you wanted to form a robustly coherent view of a set of facts (e.g. a court case, what went wrong, etc). You gather a bunch of witnesses, some human testimony, some other kinds of evidence. Could you feed all these pieces of information into a computer system that will, relatively reliably and without a human in the loop, form a most-plausible interpretation bearing on various questions? LLMs can sort of do this in straightforward cases, but imagine something closer to a neural SAT solver where you can feed it hundreds or thousands of facts and it can actually do logical reasoning to infer thousands of other facts about the situation (a toy sketch of the inference step follows this list). What is the shape of the problem that would be solvable this way? I suspect the core of this would be the right way to define the domain of facts such that you could get enough semantic information (e.g. LLMs learn word semantics empirically by having to predict the next token; what is the equivalent for poorly defined "facts"?).

2. Sensor fusion over semi-reliable narrators. Take the above general freeform inference problem, restrict the grammar, and focus the uncertainty on who is telling the truth, who has good judgement, etc. So instead of SAT's word-of-god, very well-defined propositional constraints and questions, our known facts are semi-reliable attestations in a restricted predicate logic (e.g. Joe says this object has this quality), and we have to form predictive probabilities over either the ground truth (if we can ground it) or at least a predictive model of further such facts (second sketch below). This seems fairly well-definable at least.

3. Intuitive topological/cartographic embedding. Take a set of documents with different topics and other features, or just objects in general. Embed them in a predictive latent space. Map that latent space down to some kind of low-dimensional or variable-precision space optimized for human intuitiveness (e.g. 2d or 3d space, or hierarchical topic tags, or both) from which the content can still be predicted with some precision. So a sort of two-level auto-encoder: the first level optimized for semantic accuracy, the second optimized for intuitive human navigability as a learnable "map" (third sketch below). I intend this as an alternative paradigm to recommender systems, flipping them around: instead of modeling the user's preferences and surfacing those to the algorithm, we model the data space and surface that to the user's preferences. I believe this is a superior, agency-preserving approach.
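To make these a bit more concrete, here are three tiny sketches, one per problem. They are toys with invented data and hand-written structure, not proposals for the real systems; in particular none of them contain the learning part that makes the problems interesting.

For #1, the inference half only, done purely symbolically: facts as propositions, hand-written implication rules, forward chaining to a fixpoint. The neural version we're imagining would have to learn the rules rather than take them as given.

```python
# Toy of the inference step for #1: propositions plus hand-written rules,
# forward chaining until nothing new can be derived. Everything is invented.
facts = {"window_broken", "glass_inside_room", "alarm_silent"}

# each rule: set of premises -> conclusion
rules = [
    ({"window_broken", "glass_inside_room"}, "broken_from_outside"),
    ({"broken_from_outside"}, "entry_through_window"),
    ({"entry_through_window", "alarm_silent"}, "alarm_was_disabled"),
]

changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(sorted(facts))
```

For #2, a minimal truth-discovery loop (in the spirit of Dawid-Skene style models, but much simplified): jointly estimate how likely each attested fact is to be true and how reliable each narrator is, by fixed-point iteration. Narrators, facts, and priors are all made up.

```python
from collections import defaultdict

# attestations: (narrator, fact, claimed truth value)
attestations = [
    ("joe", "box_is_red",  True),
    ("ann", "box_is_red",  True),
    ("bob", "box_is_red",  False),
    ("joe", "door_locked", True),
    ("bob", "door_locked", True),
    ("ann", "door_locked", False),
]

reliability = defaultdict(lambda: 0.8)  # prior trust in each narrator
belief = defaultdict(lambda: 0.5)       # prior probability each fact is true

for _ in range(20):
    # update fact beliefs: weigh each narrator's claim by their reliability
    for fact in {f for _, f, _ in attestations}:
        score_true, score_false = 1e-9, 1e-9
        for who, f, val in attestations:
            if f != fact:
                continue
            w = reliability[who]
            score_true += w if val else 1 - w
            score_false += (1 - w) if val else w
        belief[fact] = score_true / (score_true + score_false)
    # update reliabilities: a narrator is reliable to the extent their
    # claims agree with the current consensus beliefs
    for who in {n for n, _, _ in attestations}:
        agree, total = 0.0, 0.0
        for n, f, val in attestations:
            if n != who:
                continue
            agree += belief[f] if val else 1 - belief[f]
            total += 1
        reliability[who] = agree / total

print(dict(belief))
print(dict(reliability))
```

For #3, just the skeleton of the second-level encoder in PyTorch: pretend the first-level semantic embeddings already exist (random stand-ins here) and train an auto-encoder whose 2d bottleneck is the "map" coordinate. A real version would add navigability/legibility terms to the loss; this only shows the reconstruction half.

```python
import torch
import torch.nn as nn

semantic = torch.randn(500, 128)  # stand-in for level-1 document embeddings

encoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))
decoder = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 128))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

for step in range(2000):
    coords = encoder(semantic)           # 2d "map" positions shown to humans
    recon = decoder(coords)              # how much meaning the map still carries
    loss = ((recon - semantic) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(encoder(semantic)[:5])             # first few map coordinates
```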

These are all ways to translate between what the machine can do really well, and what groups of people have and need. Specifically, the goal is machine-assisted shared clarity. #1 and #2 approach the problem of forming a trustworthy joint perspective from a disparate set of information. Shared perspective formation is crucial for being able to act as a unit at scale. #3 is about turning the structure of a space into a highly legible object of shared perception. Maps have always been powerful artifacts of communication and thought; imagine trying to navigate the world or discuss geography through the lens of a recommender system. Insane. What if we could generalize that clarity to more navigation problems?

Anyways I thought I'd try asking you guys if you have thoughts on how to approach supercoordination in a technical way, or other problems to suggest.

referenced by: >>2729


anon 0x49b said in #2719 1w ago:

I think another interesting problem, narrower in scope, is automatically detecting fraud in scientific publishing. There is a small set of tools and techniques that the super-spotters use. They look for duplication in images, or data distributions that are implausible, e.g. uniform when they're supposed to be normal. Additionally, there are already thousands of allegations of fraud on pubpeer. More ideas: if code is published, one can look for whether a bug is responsible for the finding. One can also look for p-values that are just under the significance threshold. If certain researchers have a track record of publications whose p-values cluster just under the significance threshold, it probably means they are p-hacking.
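To give a feel for how cheap that last check is, here is a toy caliper test: count how many reported p-values land just below vs just above .05; absent selection on significance the two narrow bins should be roughly balanced. The data and bin width are invented for illustration, and it assumes scipy for the binomial test.

```python
from scipy.stats import binomtest

# p-values as scraped from one researcher's papers (invented numbers)
reported_p = [0.049, 0.048, 0.047, 0.044, 0.051, 0.046, 0.049, 0.043, 0.052, 0.045]

width = 0.01
below = sum(0.05 - width <= p < 0.05 for p in reported_p)
above = sum(0.05 <= p < 0.05 + width for p in reported_p)

# Without selection on significance, a p-value landing in this narrow
# window should be roughly equally likely to fall on either side of .05.
result = binomtest(below, below + above, p=0.5, alternative="greater")
print(f"{below} just below vs {above} just above .05, p = {result.pvalue:.3f}")
```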

So (1) scraping and aggregating all existing allegations of scientific fraud, (2) automatically flagging suspicious signals, and (3) exposing this in a public database could help inform the wider intelligentsia of the true depth of the replication crisis and of fraud in science.

referenced by: >>2723


anon 0x49c said in #2722 1w ago:

>Sensor fusion over semi-reliable narrators.
Sheaf cohomology is what gets used for sensor networks. Cohomology detects obstructions to global consistency (in this case, of the incoming data from your sensor network), while sheaf theory tells you how that information overlaps and how it gets patched together. It's above my pay grade, but I know of at least one paper doing logic as homological algebra, a kind of theory of logical deviance (https://arxiv.org/pdf/2304.13786), deviance in the sense of curvature: one may make an error, but not all errors are equal! So the structure of that error can be analyzed, and its distance to correctness to some extent measured. So there's at least something plausibly like what you want for this on the horizon.
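Not actual cohomology, but the gluing intuition fits in a few lines: each sensor reports values over some regions; wherever two sensors overlap, check whether they agree; any disagreement is an obstruction to patching the local reports into one global picture. Regions, readings and the tolerance here are all made up.

```python
# Toy obstruction check: disagreement on an overlap blocks a global section.
sensors = {
    "A": {"room1": 20.1, "room2": 21.0},
    "B": {"room2": 21.1, "room3": 19.5},
    "C": {"room3": 23.8, "room1": 20.0},  # disagrees with B about room3
}

tolerance = 0.5
obstructions = []
names = list(sensors)
for i, s in enumerate(names):
    for t in names[i + 1:]:
        for region in sensors[s].keys() & sensors[t].keys():
            if abs(sensors[s][region] - sensors[t][region]) > tolerance:
                obstructions.append((s, t, region))

print("obstructions to a global section:", obstructions)
```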

>Intuitive topological/cartographic embedding
Matilde Marcolli has some very recent work on mathematical linguistics that seems naturally suited to this, since it both geometrically models the syntax-semantics interface and attempts to give an account of externalization, i.e., how language gets mapped into actions such as speaking or typing. Finding a 'good' map document soup -> human intuition sounds analogous to finding a 'good' map neural soup -> motor control, a somewhat disconcerting correspondence as it places us in the more servile position. It's very giga-brained stuff though, and looks like overkill to me, which in my limited experience is typical of mathematical physicists.

This is to say that I think these three problems are good ideas, and that there already seems to be some interest in the associated questions, modest in volume (few people like math, and even mathematicians care little for logic as such) but sharp, which is very promising.

Would be nice to stumble upon some memetic weapon/lure with which to, at least momentarily, capture the attention of someone half as smart as Terry Tao, tbh.

referenced by: >>2723


anon 0x499 said in #2723 1w ago:

>>2722
My math isn’t strong enough to deal with sheaf cohomology, but sounds interesting. Thanks for the tips. I’ll check out Marcolli.

>>2719
Fraud detection at scale sounds really hard and subtle. But maybe you could just pull out key evidence using an LLM and then run a bunch of heuristic checks? How would you approach that?


anon 0x4a1 said in #2729 5d ago:

>>2716
How about aggregating the individual wills of a bunch of people into a collective will? By will, I mean a freeform expression/document consisting of issues one values/cares about/wants change in.
