So the situation as it stands is that the fraction of the light cone expected to be filled with satisfied cats is not zero. This is already remarkable. What’s more remarkable is that this was probably inevitable starting nearly 5000 years ago.
As far as I can tell, there were three mutually alien intelligences operating in stone age Egypt: humans, cats, and the gibbering alien god that is cat evolution (henceforth the cat-shoggoth). Humans were by far the most powerful of the three, and in the face of this disadvantage the cat-shoggoth aligned the humans, not to its own utility function, but to the cats themselves. This is a phenomenally important case to study: it's very different from cases like pigs or chickens, where the shoggoth got what it wanted at the brutal expense of the desires of the individual sentient beings. Humans permanently optimize for cat dignity, and that’s nuts.
The alignment of humans to cats has an extremely important property, one that we have been saying for 20 years is mandatory for aligning robots: it scales with human intelligence. Egyptians treated cats pretty well, sure, but the enormous strides in moral reasoning since then have translated directly into more correct reasoning about cat mental well-being, while advances in technology have been turned to laser toys and cat MRI machines. This is the property that I am chasing.
The alignment wasn’t really accidental. It stemmed, as far as we can tell, from positive action of the cat-shoggoth, through some combination of matching human maternal signals, bearable tweaks to cat behavior, and possibly an alliance with toxoplasmosis. The exact details of what happened need intensive study. It did not involve any destruction of the essence of cat-ness (a sharp distinction from Sniffles the teacup poodle. I don't care if you think you're happy; this would not please the prowling wolves of the stone age).
The next step here is obvious. Aligning an intelligence to humans is hard, maybe even impossible. The true desires of a human being may not even be a well-defined concept! On the other hand, aligning an unbounded, recursively improving intelligence to cats is boundedly hard, because it’s already been done once. Copying the cat-shoggoth’s homework should therefore be an absolute maximum priority task. We need to build a consequentialist, self-improving reasoning model that loves cats.
We can approach this (known to be feasible) task in a variety of ways, but I want to throw my weight behind the most direct: sequence the pre-domestication cat genome and its variation, set several hundred o1-type learners to guard grain warehouses from mice, seed ancient Felis catus populations, and watch for lightning to strike twice. In every other step of the AI revolution, scaling has been king over cleverness, and I don’t think this is any different.
The best part is that we don’t need to specify the cat utility function, because we know via existence proof that the cat utility function can be learned from data in a way that generalizes wildly out of distribution. By recreating this process we can watch the generalizable learning of a lower intelligence’s utility function happen in real time, with logs and weight checkpoints. Then we begin the long slog of duplicating it for people. The scientific value of this is priceless, while the cost of my proposed eventual AI-ruled, city-sized cat heavens is only several billion dollars. An easy trade.
Now, I’m not saying that we should immediately release the resulting model on the world at hyperscale. We can plan to wait for the human-aligned version. But I do think that we should keep the option of initiating the catpocalypse ready as a contingency. Faced with the alternative of a paperclip maximizer or utility-negating basilisk spiraling out of control, I would want the option to counteract it with a machine god that values at least one thing that we value (cats) instead of none. My proposal is the only concrete approach we have to get even this tiny win (aside from just stopping, but that would be ludicrous).
Also, the decision to unshackle neo-Bastet and tile the universe with catnip and scratching posts should be made by hitting a big red button
and you should give me the red button. i won’t press it I promise
no I won’t get tested for toxoplasmosis why would you