Bilingual Models for Parsing and Entity Recognition
I have received an Intelligence Community Postdoctoral Fellowship for a project on using bilingual data to improve parsing and entity recognition. The ultimate goal is an improved end-to-end translation pipeline. This project is joint work with David Burkett and Dan Klein.
Joint Cross-Lingual Ranking
At Microsoft Research Asia, I worked on applying machine learning to cross-lingual ranking. For many Chinese queries, we can find an accurate English (or other foreign-language) translation. Our system learns a better ranking function by simultaneously using information from both the Chinese and the foreign-language results.
CALO
The CALO project was a multi-institution effort to develop a Cognitive Agent that Learns and Organizes. As part of the UPenn team, I worked on building prototypes for email urgency prediction, summarization and clustering.
Google News
In 2004, I interned with the Google News team, where my mentor was Thorsten Brants. I worked on the clustering component, but I think the best way to get to know Google News is to visit the site!
PARC discussion list summarization
During the summer of 2003, I worked at PARC (formerly Xerox PARC), with the HDI group in the Information Sciences and Technologies Laboratory. My PARC supervisor, Paula Newman, and I built a clustering and summarization system for archived discussions, such as email lists and newsgroups.
Johns Hopkins CLSP workshop
In summer 2002, I participated in a Johns Hopkins Center for Language and Speech Processing workshop on automatic multidocument summarization.