My name is Nhat Tran, and I also go by Jonny.
I’m a recent CS Ph.D. graduate from the University of Texas at Arlington with a background in machine learning and bioinformatics.
My research is in bioinformatics, with techniques centered on graph neural networks and multimodal data integration. More specifically, I develop new computational methods that combine sequence representations with heterogeneous network interactions among RNA sequences to aid in understanding non-coding RNA functions. My goal is to develop new models, techniques, and open-source tools (see OpenOmics) that will enable researchers to unleash the untapped potential of these rich, complex datasets.
I was previously at Genentech, working on large-scale data pipelines and machine learning models to establish QC standards for detecting low-quality sequencing parameters in NGS genomics. Before focusing on machine learning and data science, I trained as a full-stack software developer through various internships and hackathon teams to start a career in developing web and mobile applications in the DevOps space.
In my off time, I also research the science of espresso coffee. I use machine learning and software development extensively; feel free to check out some of my work!
- Graph neural networks (Heterogeneous graph, Graph representation learning)
- Natural language processing (Text classification, Text generation)
- Machine learning (Deep learning, Transfer learning, Multimodal learning)
- Bioinformatics (Non-coding RNA, RNA-protein interaction, RNA structure)
- Data science (Data engineering, Data visualization, Data integration)
- Software development (Full-stack web development, Mobile development, DevOps)
- Feb 2023: A new work titled “Protein function prediction by incorporating knowledge graph representation of heterogeneous interactions and gene ontology” has been submitted!
- Dec 2022: I successfully defended my Ph.D. dissertation!
- Aug 2021: I joined Genentech as a Data Scientist intern in the Oncology Bioinformatics group!
- May 2021: “OpenOmics: A bioinformatics API to integrate multi-omics datasets and interface with public databases” is now published in the Journal of Open Source Software!
- Sep 2020: A new preprint entitled “Layer-stacked Attention for Heterogeneous Network Embedding” is now on arXiv!
- Jan 2020: “Network Representation of Large-Scale Heterogeneous RNA Sequences with Integration of Diverse Multi-omics, Interactions, and Annotations Data” is now published in Pacific Symposium on Biocomputing (PSB) 2020!
- Dec 2018: “MicroRNA dysregulational synergistic network: discovering microRNA dysregulatory modules across subtypes in non-small cell lung cancers” is now published in the journal BMC Bioinformatics!
- Jul 2017: “Improved microRNA biomarkers for pathological stages in lung adenocarcinoma via clustering of dysregulated microRNA-target associations” is now published in IEEE EMBC’17!