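Course Details

Course code:
BI6002
Course name:
高级生物信息学
Course name (English):
Advanced Bioinformatics
Course abbreviation:
Type:
First-level discipline
Semester offered:
Spring
Discipline/Department:
Biology
Credits:
3
Spans multiple semesters:
Total class hours:
48
Lab hours:
Discussion hours:
Weekly hours:
Course nature:
Specialized course
Course level:
Doctoral course
Course category:
Full-time course
Course type:
(0710) Biology
Examination method:
Mode of instruction:
Textbook language:
Language of instruction:
Grading scale:
Pass/Fail
Counted toward GPA:
Offering status:
Offered
Instructor: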
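Course description:
Bioinformatics is a course that uses mathematical models and computational methods as its technical tools, combined with the experimental methods of molecular biology, to study how biological sequence information is acquired and analyzed. Its ultimate goal is to understand the processes and mechanisms by which biological information is stored and flows from DNA to RNA to protein. The course focuses on the methods and principles of analyzing biological sequences such as genomes and proteomes. It emphasizes hands-on programming ability and uses problems from frontier research as classroom examples, so that after completing the course each student can feel reasonably confident writing their own programs to solve the genome and proteome sequence analysis problems they encounter in research. This is a specialized graduate course for majors related to the life sciences, and it can also serve as a general elective for senior undergraduates in life-science-related science and engineering majors. In terms of knowledge structure, the course helps students acquire the basic methods and knowledge needed to obtain and analyze genome and proteome sequence data; master the core algorithms and basic mathematical models of sequence analysis together with their mathematical, statistical, and molecular biology foundations; and become familiar with the latest methods and systems for acquiring and analyzing genome and proteome sequence data and with their applications in biology, medicine, energy, and other areas of industry and agriculture. In terms of skills, students will be able to apply the theory and software they have learned: collecting and analyzing data, selecting a suitable mathematical model, building a sequence analysis model, analyzing experimental data from sequence to function, and proposing molecular mechanisms that make biological sense. Along three main threads (models, algorithms, and applications), the course develops students' ability to identify, analyze, and solve problems, laying a solid foundation for further study and work in scientific research and engineering applications. In terms of general development, the course guides students to understand the sources and methods of genome and proteome sequence data acquisition, encouraging them to think broadly and reflect often; it also introduces the basic models, methods, and systems of genome and proteome sequence analysis, along with the history of how the corresponding research methods arose and developed, to cultivate students' ways of thinking and research methods.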
Course description (English):
Bioinformatics is a course on the mathematical models, computational methods, and systems used to acquire and analyze biological information. The ultimate goal is to understand the processes and mechanisms by which biological information flows from DNA to RNA to protein. This course focuses on the theory and methods of sequence analysis. It uses frontier research cases as classroom teaching examples and emphasizes programming and problem-solving skills, so that students will feel confident solving genomics- and proteomics-related problems after completing it. The course is aimed at graduate students in all majors related to the life sciences. It can also serve as a general elective for senior undergraduates in the life sciences. The content of this course includes fundamental methods and knowledge for acquiring and analyzing genomic and proteomic sequences. Students are required to master the core algorithms and mathematical models for sequence analysis, along with the underlying mathematical and statistical theory and their biological interpretation. Students will also be introduced to recent progress in genomic and proteomic sequence analysis and its applications. During the course, students are required to apply the theories and software they have learned to a real problem: acquiring data, analyzing sequences, selecting and training a model, and using the model to analyze data from sequence to function. Through training in three areas (modeling, algorithms, and their applications), students will strengthen the problem-analysis and problem-solving skills needed for their future careers.
Syllabus:
1. Course introduction (Chapter 1: Introduction to Bioinformatics)
   i. Introduction to bioinformatics database resources
2. Modeling and algorithm theory
   i. Foundations of probability and statistics
   ii. Algorithm complexity analysis
   iii. Review of classic algorithms (greedy algorithms, dynamic programming, etc.)
   iv. Lab: Introduction to Linux and Perl programming
3. Sequence alignment
   i. Dynamic programming, optimal alignments
   ii. Algorithm and theory of BLAST (probability calculations)
   iii. Comparison of existing sequence alignment systems
   iv. Project 1: A sequence alignment system (dynamic programming)
4. Sequence structure modeling
   i. Hidden Markov Model (HMM)
      1. HMM theory (statistical models and computation)
      2. Project 2: A simple HMM-based gene structure prediction system
   ii. PSSM or WMM: theory and applications
   iii. Applications
      1. Gene structure prediction and annotation
      2. Sequence motif finding, regulatory element motif finding
      3. Pfam: from sequence to function
5. Progress in genome sequencing technology
   i. Sanger sequencing method
   ii. Next-generation sequencing technologies: 454, Solexa, SOLiD
   iii. Single-cell sequencing technology and next-next-generation sequencing technologies
6. Genome sequence assembly
   i. Methods and systems for genome assembly
   ii. Impact of read length and paired-end reads on genome assembly
   iii. Project 3: A sequence assembly system
7. Metagenomic sequence analysis (Metagenomics)
   i. Phylogenetic tree construction
   ii. Sequence assembly
   iii. Binning
   iv. Microbial community composition analysis from high-throughput sequences
   v. Gene prediction and functional analysis for metagenomes
8. Bioinformatics frontier seminar
   i. Student literature presentations
   ii. Free discussions / specially invited talks
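As an illustration of the dynamic programming approach named in item 3 of the syllabus (optimal alignments, Project 1), the following is a minimal sketch of global alignment scoring in the Needleman-Wunsch style. It is written in Python for brevity rather than the Perl introduced in the lab sessions, and the function name and the scoring parameters (`match`, `mismatch`, `gap`) are illustrative assumptions, not part of the course materials.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Uses a linear gap penalty; the default scores are arbitrary
    illustrative values, not ones prescribed by the course.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            # Three choices: align a[i-1] with b[j-1], or gap in either sequence.
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = score[i - 1][j] + gap
            gap_in_a = score[i][j - 1] + gap
            score[i][j] = max(diagonal, gap_in_b, gap_in_a)

    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

The table is filled in O(len(a) * len(b)) time and space. Recovering the alignment itself, rather than only its score, would additionally require a traceback through the table, which this sketch omits.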
Teaching schedule:
Examination outline: