- This event has passed.
Deep Learning & Text Mining in Bioinformatics
Explore unique deep learning approaches for genomics, from autoencoders to sequential models, and turn data into actionable insights.
When and How
Start Date
14th of March 2025
Duration
1.5 – 2 hours per week
Online
via Zoom
Overview
This course is designed to equip participants with the knowledge and practical skills to harness deep learning for complex biological data analysis. What sets this course apart is its focus on both tabular and sequence genomics data, exploring state-of-the-art neural network architectures such as autoencoders, LSTM, and Word Embedding. Participants will learn to develop, fine-tune, and evaluate DL models while tackling real-world challenges in genomics and bioinformatics. Through hands-on projects like clustering RNA-Seq data and classifying antibiotic-resistant bacteria, attendees will not only understand the theoretical concepts but also gain practical expertise that is directly applicable to their research or professional projects.
After completing this workshop, you will be able to:
- Build Deep Learning (DL) models from tabular and sequence genomics data.
- Compare the main neural network architectures and their strengths and weaknesses.
- Choose the correct DL architecture for tabular or sequence data.
- Cluster unlabeled RNA-Seq data using autoencoders.
- Tune DL layers and functions to avoid over-fitting.
- Define data requirements before applying deep learning.
- Classify sequence data while preserving position information.
- Cluster unlabeled sequence data using Word Embedding.
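To illustrate what "preserving position information" means in practice, here is a minimal sketch in plain Python. It is not taken from the course material: the integer codes, the padding scheme, and the helper names `encode_sequence` and `base_counts` are assumptions chosen for illustration. It contrasts an order-preserving encoding, the kind of input a sequential model such as an LSTM can consume, with a simple composition count that discards order.

```python
# Hypothetical vocabulary: 0 is reserved for padding, 5 for unknown bases.
BASE_TO_INT = {"A": 1, "C": 2, "G": 3, "T": 4}
PAD, UNKNOWN = 0, 5


def encode_sequence(seq, max_len):
    """Integer-encode a DNA string in order, then truncate or pad to max_len."""
    codes = [BASE_TO_INT.get(base, UNKNOWN) for base in seq.upper()[:max_len]]
    return codes + [PAD] * (max_len - len(codes))


def base_counts(seq):
    """Order-free summary for contrast: counts of A, C, G, and T."""
    seq = seq.upper()
    return [seq.count(base) for base in "ACGT"]


# Two toy sequences with the same composition but reversed order.
seq_a = "ACGT"
seq_b = "TGCA"

encoded_a = encode_sequence(seq_a, 6)
encoded_b = encode_sequence(seq_b, 6)

# The count vectors are identical, so a model fed only counts cannot tell
# the two sequences apart; the ordered encodings remain distinct.
same_counts = base_counts(seq_a) == base_counts(seq_b)
different_encodings = encoded_a != encoded_b
```

Fixed-length integer vectors like these are a common starting point for an embedding layer followed by a recurrent layer; the choice of `max_len` and of the padding side depends on the data set.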
Content
1. Deep Learning for Tabular Genomics Data (2 Sessions)
- Introduction to Neural Networks: Understand the foundational principles of deep learning and neural networks.
- Keras Functions and Layers: Explore key tools for building and managing DL models.
- DL Architectures: Learn the strengths and applications of multi-layer perceptron (MLP) and convolutional neural networks (CNN).
- Multi-Omics Integration: Discover how DL can integrate diverse biological data types.
- Hands-On Projects:
  - Classify microbiome data for phenotype predictions using MLP and CNN.
  - Cluster RNA-Seq data using autoencoders.
- Advanced Techniques: Dive into dimensionality reduction, fine-tuning DL models, and one-shot learning for limited datasets.
2. Deep Learning and Text Mining for Sequence Genomics Data (2 Sessions)
- Sequential DL Architectures: Grasp the mechanisms of GRU and LSTM models tailored for sequential data.
- Data Augmentation: Enhance the quality and diversity of sequence data.
- Basic Text Mining: Learn foundational text analysis techniques such as count-based vectors, TF-IDF, and Word Embedding.
- Hands-On Projects:
  - Classify antibiotic-resistant bacteria with LSTM models.
  - Use Word Embedding to cluster unlabeled bacterial sequences.
  - Combine LSTM and Word Embedding for improved classification accuracy.
- Utilizing Pre-Trained Models: Leverage existing Word Embedding models to accelerate DL workflows.
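As a rough illustration of how text mining techniques such as count vectors and TF-IDF can be applied to sequences, the sketch below treats overlapping k-mers as "words". This is an assumption made for illustration, not necessarily the tokenization used in the course; the toy sequences and the helper names `kmers` and `tfidf` are likewise hypothetical. It uses the plain textbook weighting tf × log(N / df); libraries such as scikit-learn default to smoothed variants, so their numbers will differ.

```python
import math
from collections import Counter


def kmers(seq, k=3):
    """Split a sequence into overlapping k-mers, which play the role of words."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf(sequences, k=3):
    """Return (vocabulary, rows) with unsmoothed TF-IDF weights per sequence."""
    docs = [Counter(kmers(s, k)) for s in sequences]
    n_docs = len(docs)
    # Document frequency: number of sequences that contain each k-mer.
    doc_freq = Counter(word for doc in docs for word in doc)
    vocab = sorted(doc_freq)
    rows = []
    for doc in docs:
        total = sum(doc.values())
        row = []
        for word in vocab:
            if total == 0:
                # Sequence shorter than k: no k-mers, so all weights are zero.
                row.append(0.0)
                continue
            tf = doc[word] / total
            idf = math.log(n_docs / doc_freq[word])
            row.append(tf * idf)
        rows.append(row)
    return vocab, rows


# Toy sequences, invented for this example.
sequences = ["ACGACG", "ACGTTT", "TTTTTT"]
vocab, rows = tfidf(sequences, k=3)
```

Each row is a fixed-length numeric vector, so it can be passed to a clustering or classification method. Unlike an LSTM input, this representation ignores where in the sequence each k-mer occurs.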
This workshop is ideal for:
- Students and data scientists interested in applying DL to multi-omics data.
- Bioinformaticians eager to understand and develop new DL-based tools such as autoencoders and sequential DL models.
Prerequisite:
Completion of our “Machine Learning in Bioinformatics” course or a strong understanding of machine learning concepts and tools (e.g., Pandas, Sklearn) is required to ensure readiness for the advanced topics in this workshop.
- Sessions are recorded, so you can revisit the content anytime.
- Interactive Q&A after each session ensures personalized support.
- Flexible timing accommodates participants from different time zones.