INTRODUCTION AND DEVELOPMENT OF A PRACTICAL LESSON FOR IMPROVING THE COMPETENCE OF UNDERGRADUATE STUDENTS IN MASSIVE GENOTYPING DATA ANALYSIS: THE USEFULNESS OF TASSEL SOFTWARE
1 Universitat Politècnica de València, Instituto de Conservación y Mejora de la Agrodiversidad Valenciana (SPAIN)
2 Instituto de Biología Molecular y Celular de Plantas, CSIC-UPV (SPAIN)
About this paper:
Conference name: 16th International Technology, Education and Development Conference
Dates: 7-8 March, 2022
Location: Online Conference
Abstract:
An essential step in many genetic research projects is the genotyping of the human, animal, plant or microbial materials under study. Genotyping technologies are based on the identification of differences in genomic sequences that may lead to major changes in phenotype. As a result, high-throughput genotyping platforms generate an enormous amount of information, and some bioinformatics and big data fundamentals are required to analyse it properly. Several software packages are available for massive raw data analysis, but they are often command-line based and far from intuitive. The Trait Analysis by aSSociation, Evolution and Linkage (TASSEL) software is a very useful tool that allows user-friendly and visual analysis of the genotyped sequences. It is widely used for association analyses that combine genotypic and phenotypic data to exploit the natural diversity of a genome. However, it offers many more features, including the analysis of insertions/deletions, population and family structure, principal components and linkage disequilibrium, as well as missing data imputation, among others. Learning to use this tool can therefore be highly useful, since it integrates different theoretical concepts. In addition, results are visualized graphically, which is crucial for understanding and interpreting them and helps to establish and reinforce knowledge. Training undergraduate students (e.g. biotechnology students) in the use of massive data analysis software in the field of genetics is highly relevant for addressing the challenges that they will surely face in any research laboratory. For this reason, we propose the introduction of a practical session in the “Genomics” subject based on the analysis of raw data from a plant breeding population using the TASSEL software. As plant material, we propose a collection of genotyped eggplant (Solanum melongena L.) accessions, owing to the broad genotypic and phenotypic variability observed in them. The practical lesson is designed to require a single three-hour session, divided into three blocks: i) raw data filtering and missing data imputation, ii) principal component analysis (PCA) for population structure identification, and iii) genome-wide association study (GWAS) for different agronomic traits. In addition, there will be one hour of autonomous work in which students will compile the results obtained in a report. During this practical session, students will learn how to properly manage, process, and analyse genetic data. The final goal is for students to acquire the skills needed to master a widely used analysis software package.
Keywords:
Genotyping, phenotyping, bioinformatics, raw data, diversity.