Publications

Automatic Movie Generation: Controllable Plot Generation and Video Retrieval

I-Tsun Cheng

HKUST Bachelor's Thesis 2021

Movies have taught us morals and inspired us in many ways. However, movies are usually expensive to make. We therefore attempt to alleviate the complications of movie production by introducing a novel framework for movie generation. Our framework takes a movie genre as input and generates a movie pertaining to that genre, along with a plot that the movie attempts to follow. It consists of two primary components: 1) Controllable Plot Generation, which generates a plot pertaining to the genre, and 2) Video Retrieval, which retrieves the most relevant video clips from a database given a plot and combines them into a movie. For Controllable Plot Generation, we found that by using language models and specific attribute classifiers, our model is able to generate fluent and consistent plots that are strongly relevant to the input genre. For Video Retrieval, we built a custom algorithm that matches the plot with the video clips in the dataset using semantic similarity and abstractive summarization techniques. Through several means of evaluation, we found that our movie generation framework is able to generate creative movies that are strongly relevant to the input genre and the generated plot.
[PAPER]
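
The Video Retrieval step described above can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes each clip comes with a short text description and substitutes plain bag-of-words cosine similarity for the semantic similarity and summarization models. The names `cosine`, `retrieve`, and the sample clips are hypothetical.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def retrieve(plot_sentences, clips):
    """For each plot sentence, pick the clip whose description is most similar.

    `clips` maps a clip id to a text description (an assumption of this sketch).
    """
    return [max(clips, key=lambda cid: cosine(s, clips[cid])) for s in plot_sentences]


# Hypothetical toy data, for illustration only.
clips = {
    "clip_a": "a detective chases a suspect through the city",
    "clip_b": "two friends laugh at a beach party",
}
plot = ["The detective chases the suspect", "Friends party at the beach"]
ordered_clips = retrieve(plot, clips)
```

Concatenating the selected clips in plot order would then give a rough cut of the movie; a real system would likely replace `cosine` with sentence embeddings.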

Myers-Briggs Personality Classification and Personality-Specific Language Generation Using Pre-trained Language Models

Sedrick Scott Keh, I-Tsun Cheng

arXiv 2019

The Myers-Briggs Type Indicator (MBTI) is a popular personality metric that uses four dichotomies as indicators of personality traits. This paper examines the use of pre-trained language models to predict MBTI personality types based on scraped labeled texts. The proposed model reaches an accuracy of 0.47 for correctly predicting all 4 types and 0.86 for correctly predicting at least 2 types. Furthermore, we investigate the possible uses of a fine-tuned BERT model for personality-specific language generation, a task essential both for modern psychology and for intelligent empathetic systems.
[PAPER]
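
The two accuracy figures above can be read as "all four letters correct" versus "at least two letters correct". The sketch below shows how such a metric could be computed under that reading, which is an assumption and not taken from the paper; the function names and the sample predictions are hypothetical.

```python
def dichotomies_correct(predicted: str, actual: str) -> int:
    """Count how many of the four MBTI letters match, position by position."""
    if len(predicted) != 4 or len(actual) != 4:
        raise ValueError("MBTI types must have exactly four letters")
    return sum(p == a for p, a in zip(predicted.upper(), actual.upper()))


def accuracy_at_least(pairs, k: int) -> float:
    """Fraction of (predicted, actual) pairs with at least k matching letters."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one prediction")
    hits = sum(dichotomies_correct(p, a) >= k for p, a in pairs)
    return hits / len(pairs)


# Hypothetical predictions, for illustration only (not data from the paper).
pairs = [("INTJ", "INTJ"), ("ENFP", "INTJ"), ("ESFP", "INTJ"), ("INTP", "INFP")]
all_four = accuracy_at_least(pairs, 4)      # 0.25 on this toy data
at_least_two = accuracy_at_least(pairs, 2)  # 0.5 on this toy data
```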

Learn Languages First and Then Convert: Towards Effective Simplified to Traditional Chinese Conversion

Pranav A, S.F. Hui, I-Tsun Cheng, Ishaan Batra, Chiu Yik Hei

NAACL-SRW 2019, non-archival track

Simplified Chinese to Traditional Chinese conversion is a common preprocessing step in Chinese NLP. However, a Simplified Chinese character can correspond to multiple Traditional Chinese characters, and unfortunately, there is no accurate toolkit to disambiguate such mappings. We propose a sub-word segmentation model that relies on Simplified Chinese and Traditional Chinese language models and a character mapping table. Using these two language models, we effectively segment a sentence and disambiguate between mappings. Our experiments show that we achieve a disambiguation accuracy of 98%.
[PAPER]
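
The one-to-many mapping problem above can be made concrete with a small sketch. It is a simplification and not the paper's model: it skips sub-word segmentation and the Simplified-side language model, and instead enumerates candidates from a tiny mapping table and scores them with toy Traditional-side bigram counts. `MAPPING`, `BIGRAM_COUNTS`, `score`, and `convert` are hypothetical names, and the counts are made up.

```python
import itertools
import math

# Hypothetical character mapping table: 发 is ambiguous (發 "emit" vs 髮 "hair").
MAPPING = {"发": ["發", "髮"], "头": ["頭"], "展": ["展"]}

# Made-up bigram counts standing in for a Traditional Chinese language model.
BIGRAM_COUNTS = {"頭髮": 5, "發展": 7}


def score(text: str) -> float:
    """Sum of log(1 + count) over adjacent character pairs."""
    return sum(
        math.log(1 + BIGRAM_COUNTS.get(text[i : i + 2], 0))
        for i in range(len(text) - 1)
    )


def convert(simplified: str) -> str:
    """Pick the highest-scoring Traditional candidate for a Simplified string."""
    options = [MAPPING.get(ch, [ch]) for ch in simplified]
    candidates = ("".join(c) for c in itertools.product(*options))
    return max(candidates, key=score)


hair = convert("头发")     # context selects 髮
develop = convert("发展")  # context selects 發
```

Exhaustive enumeration grows exponentially with the number of ambiguous characters, so anything beyond short strings would need a dynamic-programming or beam search instead.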