About me
I am Wei Shi (施为), a Research Scientist at Meta working on LLM post-training, reinforcement learning, and deep causal learning. I study real-world signals at billion scale, where the signals are often confounded and noisy. Through a causal lens, I use RL to align LLMs with implicit human preferences and domain knowledge, so that they generate engaging, appropriate, and reliable creatives, with applications in monetization.
I obtained my Ph.D. at UT Austin under Prof. David Z. Pan (AI for chip design) and Prof. Nan Sun (high-performance chip design). Chip design demands both a deep understanding of the physical world and the capability to manage extreme engineering complexity — at nanometer geometries and nanosecond timescales, or tighter. AI must rise to the same ambition: capturing the deep intuition of human experts and executing designs at very high precision. The question I work on today first surfaced there: how to align learning systems with objectives that human experts hold mostly as tacit understanding.
As an ML enthusiast, chip designer, and anime fan, I’ve trained models, taped out silicon, and dabbled in drawing manga. My most recent excitement is agentic capability, especially the rapidly advancing STEM reasoning of frontier LLMs and the emerging promise of autonomous AI research agents.
