Plant Breeder & Data Analyst

Current Position - Genetics Intern - Syngenta

Summary:

Ph.D. in Plant Breeding and Genetics with 6 years of research experience in genetics, genomics, and data analysis of crop plants. Skilled in experimental design, statistical analysis, handling large genomic datasets, and developing and deploying R/Shiny packages for plant breeding applications. Interested in a research or industry role involving genetics, plant breeding, bioinformatics, or artificial intelligence for plant breeding.

Publication Statistics:

Cumulative Impact Factor: 34.171

Total Citations: 39 (Google Scholar)

Technical Skills:

Programming Languages: Proficient in R, Python, Linux command line; experience with Shiny, Google Colab.
Phenomics and Genomics: Expertise in molecular breeding techniques, NGS data analysis, QTL/GWAS, genomic selection, designing field trials, managing phenotyping pipelines, and analyzing high-throughput phenotypic and genotypic data.
Data Science: Multivariate analysis, machine learning, large-scale genomic data analysis and visualization.

🎓 Education

Ph.D., Genetics and Plant Breeding, Tamil Nadu Agricultural University (FEB 2021 - Thesis submitted on 31st OCT 2023)
M.Sc., Genetics and Plant Breeding, Tamil Nadu Agricultural University (AUG 2017 - DEC 2020)
B.Sc., Agriculture, College of Agricultural Technology (JUL 2013 - JUL 2017)

🔬 Research Experience

Ph.D. research @TNAU (2021 - 2023)

Developed and applied advanced tissue culture techniques like embryo culture to rapidly generate somaclonal variants of rice with improved salinity stress tolerance.
Identified and validated novel genomic regions (QTLs) governing salinity tolerance through molecular mapping in rice F₂ populations along with thorough phenotypic and biochemical characterization.
Employed multi-omics approaches including GC-MS metabolomics to uncover key genes and metabolites linked to superior osmotic/ionic adjustment under salt stress.

Research scholar @ ICRISAT (2018 - 2020)

Single-plant phenotyping of diverse germplasm accessions (sorghum, pearl millet, pigeon pea; 1,980 plants/samples genotyped in total) to examine intra- and inter-accession genetic diversity.
Single-plant genotyping of accessions using DArTSeq-based SNPs.
Combined whole-genome genotyping, simulations, and predictive modeling to quantify genomic diversity within and between isolated subpopulations across a species’ range, and developed statistical frameworks to guide sustainable sampling regimes that limit genetic drift.

💼 Work experience

Data Science Consultant @ Fiverr (December 2020 - Present)

5⭐ rated freelancer with over 200 hours of data analysis and visualization projects completed in R, and over six years of experience specializing in genomic data analysis and bioinformatics.
Proficient in handling big data and performing complex modelling using popular R libraries and packages such as tidyr, data.table, dplyr, plyr, tensorflow, ggplot2, ggdendro, ggtree, ggheatmap, and circos.
Received positive feedback from clients for knowledge, professionalism, and mastery of R programming. Demonstrated ability to deliver high-quality work as evidenced by the portfolio available on the Profile.

💻 Programming and data analysis skills

Proficient in full stack development of R packages using modular coding practices.
Created production-grade Shiny web applications for interactive data analysis and visualization; experienced with the golem framework for scalable deployment.
Machine learning model building for image classification/segmentation tasks; trained CNNs and other deep learning architectures in R and Python (PyCharm).
Multivariate data analysis of large-scale omics datasets including genomics, phenomics and metabolomics using cutting-edge bioinformatics tools.
Experience with analysis of next-generation sequencing data including quality control, read mapping, variant calling, expression quantification, metagenomic profiling, and associated statistical analysis using standard workflows in R and Python.
Advanced visualization for multi-dimensional biological data through Circos, ggtree, ggtreeExtra, Cytoscape and other platforms.

Additional Skills:

Git/GitHub for version control and collaborative coding.
High performance computing on clusters for scalable data analysis.
Bioconductor for analyzing genomic/transcriptomic experiments.
Workflow automation to improve reproducibility and speed up analyses.

🌱 R Packages developed

✅ PBGeno

GitHub

Developed pbgeno, an R package to streamline data analysis workflows for plant breeders. The package provides functions for calculating genetic distance matrices, structure-based clustering of genotypes, estimating diversity statistics and polymorphism, creating publication-quality visualizations, automating routine tasks, and converting proprietary marker genotypes into standardized formats for genome-wide association mapping.

✅ PBPerfect

PBPerfect (Visit Page) is an interactive web tool enabling reproducible multivariate analysis with visualization of phenotypic and genotypic data. It features basic statistics, experimental designs, SSR workflows, multivariate analysis, mating designs, and dynamic graphics, with outputs exported as publication-standard tables and graphics requiring no further formatting.

✅ PBMLT: Plant Breeding Multilocation Trial Data Analysis Software

PBMLT (Visit Page) is a comprehensive and user-friendly platform that provides plant breeders with an all-in-one solution for analyzing multi-environment trial data through:

Powerful Analysis of Variance (ANOVA) to explore significance of variation across locations and treatments.
Additive Main Effects and Multiplicative Interaction (AMMI) analysis for in-depth genotype-environment interaction studies.
Calculation of essential AMMI-based stability indices like ASV, ASTAB, ASI, MASI, SIPC, ZA for identifying adaptable lines.
Evaluation of overall productivity using metrics such as mean AVAMGE for high-yielding genotype selection.
Scaled stability measures like SSIASTAB and ASI_SSI for ranking lines based on trait stability.
Interactive visualizations including Biplots, GGE plots, WASS plots for straightforward interpretation of complex data.
Centralized platform integrating meta-analysis, statistical analytics, and genotype-environment interaction analysis.

✅ PBlinkagemap

PBlinkagemap (Visit Page) enables easy creation of linkage maps and identification of associated quantitative trait loci (QTLs) from genomic and phenotypic datasets.

It allows users to:

Import chromosome, marker, map distance and trait score data
Interactively explore results on linkage maps
Visualize QTL locations and effects

By handling computationally intensive linkage analysis and mapping behind the scenes, PBlinkagemap makes it simple for users to go from datasets to QTL discovery through an intuitive interface.

✅ PB-GWAS 🧬

PB-GWAS (Visit Page) makes powerful genome-wide association studies accessible through an easy-to-use web app 👩‍💻

Key features:

📥 File upload in 4 clicks
🏃‍♂️ One-click GWAS launch
⚙️ Adjust parameters via sidebar
📈 Interactive result plotting
📄 Full PDF report downloading

By eliminating coding barriers, PB-GWAS allows both new and advanced users to leverage GAPIT workflows with no programming expertise required. Whether you want to map simple or complex traits, PB-GWAS provides the automated analysis to accelerate discoveries 🔬

✅ PBHaploMineR 🧬

PBHaploMineR (Visit Page) provides a toolkit to streamline pangenome haplotype mining and comparison from next-generation sequencing data. This R package aims to make large-scale haplotype analysis efficient and accessible for species with reference pangenomes.

Key Features:

Sequence Import - Functions to import raw reads from multiple platforms and store in standardized schema
Haplotype Calling - Optimized algorithms for pangenome-wide haplotype calling, incorporating structural variation
HapViz - Interactive visualization system to explore and compare haplotypes in context of pangenome structure
HapCompare - Statistically compare haplotypes between groups of samples/accessions and identify associated genomic signatures
Parallelization - Built-in parallelization to scale analyses across HPC infrastructure

PBHaploMineR is still under development and testing. ETA for first stable version is Q1 2024.

🎤 Workshops and Conferences

Attended the workshop on “A15 DArTSeq Data Analysis” at CIMMYT, Mexico
Attended the international conference on “Neglected and Underutilized Crop Species for Food, Nutrition, Energy and Environment” at NIPGR, New Delhi. Awarded a travel grant in recognition of contribution.
Attended the workshop on “HPLC: Principles and Applications in Plant Metabolomics” at TNAU, Coimbatore.
Attended the workshop on Molecular Modelling and Docking at TNAU, Coimbatore.
Presented poster titled “Genes for salt tolerance revealed by functional metagenomics in rice” at the 6th National Conference, AC&RI, TNAU, Trichy.

✍️ Articles & Blogs

Medium Articles

📜 Publications

Allan Victor, Mani Vetriventhan, Ramachandran Senthil, S. Geetha, Santosh Deshpande, Abhishek Rathore, Vinod Kumar, Prabhat Singh, Surender Reddymalla, and Vânia CR Azevedo. “Genome-wide DArTSeq genotyping and phenotypic based assessment of within and among accessions diversity and effective sample size in the diverse sorghum, pearl millet, and pigeonpea landraces.” Frontiers in Plant Science 11 (2020): 587426. (DOI: https://doi.org/10.3389/fpls.2020.587426)
Allan, V. (2023) ‘PB-Perfect: A Comprehensive R-Based Tool for Plant Breeding Data Analysis’, PB - Perfect. Available at: https://allanbiotools.shinyapps.io/pbperfect/.
Backiyalakshmi, C., Mani Vetriventhan, Santosh Deshpande, C. Babu, Allan Victor, D. Naresh, Rajeev Gupta, and Vania CR Azevedo. “Genome-wide assessment of population structure and genetic diversity of the global finger millet germplasm panel conserved at the ICRISAT Genebank.” Frontiers in Plant Science 12 (2021): 692463. (DOI: https://doi.org/10.3389/fpls.2021.692463)
Vetriventhan, Mani, Hari D. Upadhyaya, Vania CR Azevedo, Allan Victor, and Seetha Anitha. “Variability and trait‐specific accessions for grain yield and nutritional traits in germplasm of little millet (Panicum sumatrense Roth. Ex. Roem. & Schult.).” Crop Science 61, no. 4 (2021): 2658-2679. (DOI: https://doi.org/10.1002/csc2.20527)
Jagadesh, M., Duraisamy Selvi, Subramanium Thiyageshwari, Thangavel Kalaiselvi, Allan Victor, Munmun Dash, Keisar Lourdusamy, Ramalingam Kumaraperumal, Pushpanathan Raja, and U. Surendran. “Exploration of microbial signature and carbon footprints of the Nilgiri Hill Region in the Western Ghats global biodiversity hotspot of India.” Applied Soil Ecology (2023): 105176. (DOI: https://doi.org/10.1016/j.apsoil.2023.105176)
Jagadesh, M., Cherukumalli Srinivasarao, Duraisamy Selvi, Subramanium Thiyageshwari, Thangavel Kalaiselvi, Aradhna Kumari, Santhosh Kumar Singh, and Allan Victor. “Quantifying the Unvoiced Carbon Pools of the Nilgiri Hill Region in the Western Ghats Global Biodiversity Hotspot—First Report.” Sustainability 15, no. 6 (2023): 5520. (DOI: https://doi.org/10.3390/su15065520)
Jagadesh, M., Duraisamy Selvi, Subramanium Thiyageshwari, Cherukumalli Srinivasarao, Thangavel Kalaiselvi, Keisar Lourdusamy, Ramalingam Kumaraperumal, and Victor Allan. “Soil Carbon Dynamics Under Different Ecosystems of Ooty Region in the Western Ghats Biodiversity Hotspot of India.” Journal of Soil Science and Plant Nutrition 23, no. 1 (2023): 1374-1385. (DOI: https://doi.org/10.1007/s42729-023-01129-2)
Allan Victor, N. Meenakshi Ganesan, R. Saraswathi, R. Gnanam, and C. N. Chandrasekhar. “Exploring the phenotypic diversity of rice: A multivariate analysis of local landraces and elite cultivars of Tamil Nadu and Exotic Lines.” Electronic Journal of Plant Breeding 14, no. 3 (2023): 857-866. (DOI: https://doi.org/10.37992/2023.1403.099)
Allan Victor, S. Geetha, Mani Vetriventhan, and Vânia CR Azevedo. “Genetic diversity analysis of geographically diverse landraces and wild accessions in sorghum.” Electronic Journal of Plant Breeding 11, no. 03 (2020): 760-764. (DOI: https://doi.org/10.37992/2020.1103.125)

📚 References


Name:	Dr. Vania de Azevedo
Position:	Former Head, Plant Genetic Resources
Organization:	ICRISAT, Hyderabad, India
E-mail:	azevedovcr@gmail.com
LinkedIn:	Visit Page


Name:	Dr. Mani Vetriventhan
Position:	Senior Scientist, Plant Genetic Resources
Organization:	ICRISAT, Hyderabad, India
E-mail:	M.Vetriventhan@cgiar.org
LinkedIn:	Visit Page


Name:	Mr. Rajaguru Bohar
Position:	Regional Genotyping Coordinator (South Asia) / Senior Scientist (Project management)
Organization:	CIMMYT
E-mail:	wishmeguru@gmail.com
LinkedIn:	Visit Page

📞 Contact

Name	Allan. V
E-mail	albertoogy@gmail.com
LinkedIn	Visit Page