Breakfast Savior

Language: Java
Topic: Android App development

We developed an Android app that serves as an ordering system. Users can order from home, on the bus, or while stuck in traffic, and get their breakfast right away! The breakfast supplier can receive orders from clients and can also broadcast advertisements to all of them.

Handcam Recognition

Language: Python
Framework: Caffe
Topic: Image classification

Our goal is to fine-tune a network pre-trained on ImageNet and to apply the technique to the IoT. We train a machine to recognize everyday objects such as a kettle, a phone, or a cup. Our data comes from a camera worn on the user’s hands, which is why we call it “Handcam Recognition”.


Video Title Generation

Language: Python
Framework: TensorFlow
Topic: Video captioning

Our goal is to teach a machine to automatically generate a natural-language title for a personal video of about 45 seconds. We first build a highlight detector to find the highlight clip, then use the features of the chosen clip to initialize the LSTM model. Finally, the title is generated from the output of the LSTM model. We also tried many methods to improve performance, including spatial attention, sentence augmentation, and C3D features. My main contributions are the spatial attention and sentence augmentation parts.
Combining Faster R-CNN with a tracker:
Our idea is to use object detection to locate the objects. We extract 5 boxes per frame, which together form a feature pool. We train Faster R-CNN on the 200 ImageNet categories and use the tracker to visualize the video object detection. We then use the weighted sum of the features as the input to the LSTM.
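The weighted sum over the per-frame feature pool can be sketched roughly as follows. This is only an illustration, not the project's code: the helper names (`softmax`, `weighted_box_feature`), the use of a softmax over per-box scores to obtain the weights, and the toy 3-dimensional features are all assumptions made for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_box_feature(box_features, box_scores):
    """Collapse per-box feature vectors into one vector via a weighted sum.

    Assumption: the weights come from a softmax over per-box scores.
    """
    if len(box_features) != len(box_scores):
        raise ValueError("need exactly one score per box")
    weights = softmax(box_scores)
    dim = len(box_features[0])
    return [
        sum(w * feat[d] for w, feat in zip(weights, box_features))
        for d in range(dim)
    ]

# Toy frame: 5 boxes with 3-dimensional features (real features are far larger).
frame_boxes = [
    [1.0, 0.0, 2.0],
    [3.0, 0.0, 2.0],
    [5.0, 1.0, 2.0],
    [7.0, 1.0, 2.0],
    [9.0, 3.0, 2.0],
]
uniform_scores = [0.0, 0.0, 0.0, 0.0, 0.0]  # equal scores reduce to a plain average
lstm_input = weighted_box_feature(frame_boxes, uniform_scores)
```

With equal scores the result is simply the mean of the five box features; a box with a much higher score dominates the vector that would be passed to the LSTM.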
(I have written a tutorial about training Faster R-CNN on GitHub; please refer to


Grounding via natural language

Language: Python
Framework: TensorFlow
Topic: Visual grounding

In real-world AI applications, you may refer to an object by its location, color, or other characteristics, and the AI robot should know where the corresponding object is. Inspired by Ronghang Hu’s work, I re-implemented it and further used a multi-task approach and a region proposal network to boost its performance. This work was done during my internship at Umbo.
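As a rough illustration of the grounding step, the sketch below picks the region proposal that best matches a query. It is not Hu’s model or my re-implementation: the cosine-similarity scoring, the helper names (`cosine`, `ground_query`), and the hand-written 3-dimensional embeddings are hypothetical stand-ins for learned text and image features.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def ground_query(query_embedding, proposals):
    """Return (box, score) for the proposal whose feature best matches the query."""
    best = max(proposals, key=lambda p: cosine(query_embedding, p["feature"]))
    return best["box"], cosine(query_embedding, best["feature"])

# Hypothetical proposals: each has a box (x1, y1, x2, y2) and a toy feature vector.
proposals = [
    {"box": (10, 20, 50, 80), "feature": [0.9, 0.1, 0.0]},
    {"box": (100, 40, 160, 120), "feature": [0.1, 0.8, 0.3]},
    {"box": (30, 150, 90, 200), "feature": [0.0, 0.2, 0.9]},
]
query = [0.0, 1.0, 0.2]  # stand-in for an encoded phrase such as "the red cup"
box, score = ground_query(query, proposals)
```

Here the second proposal is selected because its feature points in nearly the same direction as the query embedding.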


Deep Reinforcement Learning Survey

Topic: Deep reinforcement learning

I am currently conducting research on deep reinforcement learning at the Computational Intelligence Technology Center of the Industrial Technology Research Institute (ITRI). I enjoy studying reinforcement learning algorithms and have shared some of my survey here. Besides studying, I will also share my survey with the members of my department.