Austin Yubo He

Tentatively, this site is just going to be a wall of text and some links.

Can't really be bothered to make it look nice at the moment...

Recently spent like 0.5% of my life doing research on using reinforcement learning to discover quantum error correcting codes (arxiv.org/abs/2502.14372). Came up with a new perspective to attack the problem and outperformed the previous state of the art by quite a bit on a few metrics, so it seems I'm an "expert" now... fortunately (or unfortunately) it seems everyone else is quite clueless as well. Should see the paper published somewhere within a year; I never realized how slow publishing was before... the median time to acceptance at some of the top journals is around 200-300 days...

Currently writing up a proposal for a robot combat competition (will link before April 14th). The main focus will be co-optimization of morphology and control, and there's a $20k USD prize pool. Already got the domain roboarena.io; it'll be live before May 15th... the environment and a baseline are also working now, so it's mainly just writing up the proposal. I would have run some more experiments myself, but don't really feel like spending another 0.5% of my life on this. Anyways, evolutionary morphology is pretty fascinating, and if I had the opportunity to ask just a few questions of god or AGI or whatever you think some all-knowing entity would be, this would be on my list.
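For a rough idea of what co-optimizing morphology and control means, here's a toy evolutionary sketch. Everything in it is illustrative: the genome encoding (two flat parameter vectors) and the fitness function are made up for the demo and have nothing to do with the actual RoboArena environment.

```python
import random

def fitness(morphology, controller):
    # Stand-in objective: rewards a controller matched to its morphology.
    # A real setup would run a combat/locomotion episode in simulation.
    return -sum((m - c) ** 2 for m, c in zip(morphology, controller))

def mutate(genome, sigma=0.1):
    # Gaussian perturbation of every gene.
    return [g + random.gauss(0, sigma) for g in genome]

def evolve(pop_size=20, dims=4, generations=50):
    # Each individual is a (morphology, controller) pair, evolved jointly,
    # so selection acts on the pairing rather than on either part alone.
    pop = [([random.uniform(-1, 1) for _ in range(dims)],
            [random.uniform(-1, 1) for _ in range(dims)])
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(*ind), reverse=True)
        elite = pop[: pop_size // 2]
        # Keep the elite, refill with mutated copies of it.
        pop = elite + [(mutate(m), mutate(c)) for m, c in elite]
    return max(pop, key=lambda ind: fitness(*ind))
```

The point of evolving the pair jointly is that a morphology is only as good as a controller that can exploit it, which is exactly what makes the problem interesting (and hard).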

Also been trying this crazy idea of training a language model purely with RL. Trying it on toki pona, a language with just ~120 words and a few grammar rules, so it should be relatively easy to learn compared to everything else... currently using an LLM to give triples of a task, a scalar reward, and feedback as the observation, but reward engineering is hell and it's not really working though lol. I think part of the problem is that the local LLMs I'm using barely speak toki pona, so maybe I'll try Chinese pinyin, which is like 400 unique "words" in romanized form. Eventually will probably churn out a paper on the idea. I really don't like the idea of pretraining on unfathomably large amounts of text; it's nothing close to how a human would pick up language or any notion of "intelligence", not that the way humans learn is in any way optimal. (e.g. I took Spanish classes for two years in high school and didn't really learn shit)
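The loop above can be sketched in miniature. Caveats: the "teacher" here is a hard-coded stub standing in for the LLM judge, the six-word vocabulary and the target utterance are made-up examples, and the update rule is a crude bandit-style preference update rather than anything close to a real policy gradient.

```python
import math
import random

VOCAB = ["mi", "sina", "moku", "pona", "li", "toki"]  # tiny toki pona subset
TARGET = {"mi", "moku"}  # hypothetical task: say "mi moku" ("I eat")

def teacher(utterance):
    # Stub for the LLM judge: returns (scalar reward, feedback string).
    # Note it's a bag-of-words check, so it over-rewards e.g. "mi mi" --
    # exactly the kind of reward-engineering hole mentioned above.
    hits = sum(1 for w in utterance if w in TARGET)
    reward = hits / len(utterance)
    return reward, "pona" if reward == 1.0 else "ike"

def sample(prefs, k=2):
    # Sample words proportionally to exp(preference) (softmax weighting).
    weights = [math.exp(prefs[w]) for w in VOCAB]
    return random.choices(VOCAB, weights=weights, k=k)

def train(episodes=3000, lr=0.1, baseline=0.5):
    prefs = {w: 0.0 for w in VOCAB}
    for _ in range(episodes):
        utterance = sample(prefs)
        reward, _feedback = teacher(utterance)
        for w in utterance:
            # Crude update: push sampled words up or down by the
            # baseline-subtracted reward (REINFORCE-flavored, simplified).
            prefs[w] += lr * (reward - baseline)
    return prefs
```

After training, the preference table concentrates on the rewarded words; the interesting (and unsolved) part is replacing the stub with an LLM whose rewards are reliable enough to learn from.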

Contact & Social.

You can reach me at [email protected].

Google Scholar X GitHub

Don't really have much of a presence on these so far. Just got my first few citations on Google Scholar a few weeks ago; barely use GitHub or X either, but probably will more eventually.