Categories
conference
-
Presentation of multi-genome synteny with ntSynt, and ancestry prediction with ntRoot at ISMB 2024 in Montreal
conference ·The 32nd International Conference on Intelligent Systems for Molecular Biology (ISMB 2024) is held in Montreal, Quebec, Canada from July 12-16, 2024. We are excited to announce that the Birol Lab will be presenting our latest research advancements at the conference, including the DNA sequence minimizer-based multi-genome synteny utility ntSynt and the sequence alignment-free ancestry inference technology...
-
ntHits, unikseq and Stash presented at RECOMB 2023
conference ·The 27th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2023) was held in Istanbul, Turkey from April 14-19, 2023. The Bioinformatics Technology Lab presented various new algorithms and data structures developed by the group in the past year, including unique+conserved region detection in genome sequences with unikseq, k-mer repeat profiler ntHits and data structure...
-
GoldRush, Stash and miBF mapper presented at ISMB 2022
conference ·The 30th conference on Intelligent Systems for Molecular Biology (ISMB 2022) will be held in a hybrid format (in Madison, Wisconsin, USA and virtually) from July 10-14, 2022. The Bioinformatics Technology Lab will be presenting various new algorithms and data structures developed by the group in the past year. We are introducing our new de novo long read assembly...
-
Scientific abstracts describing long sequencing read genome assembler GoldRush and data structure Stash, accepted for oral presentations at ISMB 2022
conference ·The 30th conference on Intelligent Systems for Molecular Biology (ISMB 2022) is taking place in Madison, WI (USA) July 10-14, 2022 and the BTL group will attend in person to showcase our work on de novo long read genome assembler with linear time complexity, GoldRush, and its key components (golden path algorithm GoldRush-Path, long read scaffolder GoldRush-Link and long...
-
Genome assembly repeat resolution, long read genome scaffolding & polishing, AMP discovery and SARS-CoV-2 variant time maps at ISMB/ECCB 2021
conference ·The joint 29th conference on Intelligent Systems for Molecular Biology and the 20th European Conference on Computational Biology (ISMB/ECCB 2021) meeting is held online July 25-30 and the BTL group is presenting several new bioinformatics technologies and analysis workflows developed by our lab. These include ABySS 2.5 genome assembly, LongStitch genome scaffolding, ntEdit/Sealer genome polishing,...
-
Meta-NanoSim, ntJoin, Physlr, RNA-Scoop and RResolver at ISMB 2020
conference ·The annual International Conference on Intelligent Systems for Molecular Biology (ISMB 2020) is held online July 13-16 and the BTL group will be there, presenting several new bioinformatics technologies developed by our lab, including Meta-NanoSim, ntJoin, Physlr, RNA-Scoop and RResolver.
-
ABySS 2.0, Tigmint, ARKS, and Bloom Filters at RECOMB-Seq
conference ·RECOMB 2018 will be taking place in Paris, France, from April 21-24th. RECOMB-Seq is one of its four satellite workshops this year, taking place from April 19-20th, bringing together researchers in computational genomics and bioinformatics to discuss new frontiers in gene sequencing.
jobs
-
Research Programmer (C++) position available
jobs ·The Research Programmer:
• Develops and implements new algorithms
• Works with the BTL’s large C/C++ code base (create/improve modules, etc.)
• Works on complex biological problems in which analysis of DNA and RNA sequence data requires in-depth evaluation
More info available here
-
Research Programmer (C++) position available
jobs ·The Research Programmer:
• Develops and implements new algorithms
• Works with the BTL’s large C/C++ code base (improve existing modules, create new ones, etc.)
• Works on complex biological problems in which analysis of sequence data requires in-depth evaluation
More info available here
news
-
Introducing AMPd-Up: Enhancing De Novo Antimicrobial Peptide Design with a Recurrent Neural Network
news ·We are excited to announce the publication of “De novo Synthetic Antimicrobial Peptide Design with a Recurrent Neural Network” in the journal Protein Science. In this study, we present AMPd-Up, a novel recurrent neural network tool developed for de novo antimicrobial peptide (AMP) design. AMPd-Up leverages in silico sequence generation to efficiently explore the vast sequence space of...
-
Advancing Peptide Toxicity Prediction with the Structure-Aware Deep Learning Model, tAMPer
news ·We are pleased to announce the publication of tAMPer, a cutting-edge deep learning model for predicting peptide toxicity. Published as “Structure-Aware Deep Learning Model for Peptide Toxicity Prediction” in the journal Protein Science, tAMPer integrates amino acid sequence composition with ColabFold-predicted peptide structures through graph and recurrent neural networks. This model aims to expedite antimicrobial peptide (AMP) discovery...
-
Evidence of HLA-C*04:01 Link to COVID-19 Severity
news ·In our Letter to the Editor published in the journal HLA: Immune Response Genetics, we delve into the statistically significant association between the HLA-C*04:01 allele and COVID-19 severity. This association, initially reported by our group in 2020 and 2021, has been replicated in multiple studies, including the extensive CanCOGeN CGEn HostSeq COVID-19 patient cohort (n=9,460). Our...
-
Recursive amino acid sequence hashing algorithm (aaHash) published in Bioinformatics Advances
news ·Our manuscript showcasing aaHash has been published in the Bioinformatics Advances journal. Hashing algorithm aaHash adapts the ntHash algorithm for amino acid sequences and features different hashing levels to represent the biochemical similarities of amino acids. In our tests, aaHash is ∼10X faster than generic string hashing algorithms.
-
Black spruce genome published in G3 Genes|Genomes|Genetics
news ·Our manuscript presenting and analyzing the black spruce genome has been published in the G3 journal. Black spruce (Picea mariana [Mill.] B.S.P.) is a dominant conifer species in the North American boreal forest that plays important ecological and economic roles and its genome assembly (18.3 Gbp) and annotation (66,332 protein-coding sequences predicted) are valuable resources for forest genetics research...
-
New study published: genomic virulence features of Beauveria bassiana as a biocontrol agent for the mountain pine beetle population
news ·Our study on the genomic and transcriptomic characterization of Beauveria bassiana, an entomopathogenic fungus used as a biological agent in agriculture and forestry, was just published in BMC Genomics. B. bassiana is of particular interest in regulating the proliferation of the invasive mountain pine beetle (MPB) Dendroctonus ponderosae, a wood-boring insect native to western North America that attacks a...
-
RNA-Bloom2 manuscript featured as a Nature Communications Editors’ Highlight
news ·Our scientific article introducing the reference-free long read transcriptome assembler, RNA-Bloom2, was short-listed by Aline Lueckgen, editor at Nature Communications, and featured in the Editors’ Highlights under “Biotechnology and methods”.
-
Manuscript describing unikseq and its application to species monitoring using environmental DNA (eDNA), published
news ·In our study published in the Environmental DNA journal we present unikseq, a comparative genomics utility that uses words of length k (k-mers) to quickly identify unique regions in genome sequences, which can be used to yield highly specific quantitative real-time polymerase chain reaction (qPCR)-based eDNA assays. In our manuscript, we illustrate its application within an animal...
-
De novo long read genome and transcriptome assemblers GoldRush and RNA-Bloom2, published in Nature Communications
news ·Our de novo long read genome assembler, GoldRush, and reference-free long read transcriptome assembler, RNA-Bloom2, were published today. Please refer to the GoldRush Nat. Commun. and RNA-Bloom2 Nat. Commun. manuscripts, respectively. The GoldRush long read assembler marks a paradigm shift in long read de novo assembly of large genomes, generating highly contiguous assemblies using an order...
-
ntLink protocol paper published in Current Protocols
news ·After introducing ntLink, our minimizer mapping-based long-read genome assembly scaffolder, as part of the LongStitch pipeline, we added multiple important new features to the tool, which are described in our recently published paper in Current Protocols. These new ntLink features include overlap detection, gap-filling and liftover-based iterations, each of which enables users to generate higher quality final assemblies....
-
Metagenome long read simulator Meta-NanoSim published in GigaScience
news ·This is our third publication on the NanoSim tool, describing a functionality to simulate nanopore reads for metagenome sequencing experiments. Meta-NanoSim is published in GigaScience and is an integral part of the NanoSim project, freely available from GitHub.
-
Our study relating the antimicrobial activity and predicted structure of AMPs is published
news ·Our manuscript, “Associating Biological Activity and Predicted Structure of Antimicrobial Peptides from Amphibians and Insects”, has been published in Antibiotics. Antimicrobial peptides (AMPs) hold great potential as effective alternatives to small molecule antibiotics in the race against antibiotic resistance. In our manuscript we present the initial discovery of 88 AMPs using our in-house predictors rAMPage...
-
Manuscript describing btllib, our common code library for efficient genomic sequence processing, published
news ·We just published on btllib: a C++ library with Python bindings for efficient genomic sequence processing in The Journal of Open Source Software (JOSS). The btllib library is implemented in C++, includes a high-level & easy-to-use Python interface, and is freely available on GitHub. The btllib common code library includes specialized DNA/RNA/protein (amino acid) sequence-processing algorithms with efficiency...
-
Manuscript describing recursive spaced seed hashing for nucleotide sequences (ntHash2) published
news ·Our manuscript presenting ntHash2: recursive spaced seed hashing for nucleotide sequences has been published in Bioinformatics. ntHash2 builds on top of our popular k-mer and spaced seed nucleotide sequence hashing algorithm, ntHash, with a faster and improved implementation. ntHash2 is freely available on GitHub.
-
Manuscript describing rAMPage, a novel AMP discovery pipeline, published today
news ·Our manuscript presenting rAMPage: Rapid Antimicrobial Peptide Annotation and Gene Estimation has been published in Antibiotics. Antimicrobial peptides (AMPs) hold great potential as effective alternatives to small molecule antibiotics in the race against antibiotic resistance. In this manuscript, we present rAMPage, a scalable high-throughput bioinformatics pipeline for AMP discovery, and demonstrate its utility in the discovery of 7 active...
-
Spruce giga-genomes published in The Plant Journal
news ·Our manuscript presenting and analyzing four spruce giga-genomes has been published in The Plant Journal. Spruce trees are widespread in the northern hemisphere, and have great importance both economically and in carbon sequestration. In this manuscript, we assembled and annotated the large and highly repetitive genomes of Sitka spruce, Engelmann spruce, white spruce and interior spruce. Comparative analysis of...
-
Manuscript describing Physlr, a tool to generate next-generation physical maps of genomes, published today
news ·Our manuscript, published in the peer-reviewed journal DNA, presents Physlr, a tool that leverages long-range information provided by linked read sequencing technologies to construct next-generation physical maps. These maps have many potential applications in genome assembly and analysis, including, but not limited to, scaffolding. In our study, using experimental linked-read datasets from two humans, we used Physlr to construct...
-
Manuscript describing the genome polishing protocol ntEdit+Sealer, just published
news ·Our manuscript, published in the peer-reviewed journal Current Protocols, presents a fast, scalable and memory-efficient methodology for targeted error resolution and automated finishing of long-read genome assemblies. The ntEdit+Sealer protocol, available on GitHub, was initially designed to polish (any) draft genome assemblies with short sequencing reads. It is being extended to also work with k-mers sourced from erroneous...
-
Spruce weevil genome published
news ·In our peer-reviewed manuscript, just published in the journal G3: Genes, Genomes, Genetics, we present the nuclear and mitochondrial genomes and associated annotations of the forest insect pest Pissodes strobi, commonly known as the spruce weevil or white pine weevil, a major pest of spruce and pine forests in North America. We also describe the genome of an apparent...
-
AMPlify, our machine learning method for predicting putative antimicrobial peptides is published
news ·In our peer-reviewed manuscript, just published in the journal BMC Genomics, we present a novel and robust attentive deep learning model, named AMPlify. In the research manuscript, we show how AMPlify was used to predict antimicrobial peptides (AMPs) from the Rana [Lithobates] catesbeiana (bullfrog) genome and demonstrate the bioactivity of these AMPs against multiple species of bacteria, including multi-drug...
-
RNA-Scoop, our interactive visualization tool for transcripts in single-cell transcriptomes, published in NAR Genomics and Bioinformatics
news ·Our manuscript describing an interactive visualization tool for transcripts in single-cell transcriptomes, RNA-Scoop, was just published in NAR Genomics and Bioinformatics. In the manuscript, we show that RNA-Scoop allows users to examine differential transcript expression across clusters and investigate how usage of specific transcript expression mechanisms varies across cell groups. RNA-Scoop is freely available from GitHub.
-
LongStitch, our pipeline for correcting and scaffolding genome assemblies using long reads, published in BMC Bioinformatics
news ·Our manuscript describing the newly developed long read correction and scaffolding pipeline, LongStitch, was just published in BMC Bioinformatics. LongStitch incorporates multiple tools developed by our group and runs in up to three stages, which include initial assembly correction (Tigmint-long), followed by two incremental scaffolding stages (ntLink and ARKS-long). Tigmint-long and ARKS-long are misassembly correction and scaffolding utilities, respectively,...
-
Manuscript reporting associations between COVID-19 disease severity and the human leukocyte antigen (HLA) published.
news ·In our peer-reviewed manuscript, just published in the journal PeerJ, we report on observations linking host HLA alleles with COVID-19 disease severity in a New York cohort, some of which (e.g. C*04:01 and HLA11:01) are corroborated by different research groups and in transcriptome data from different COVID-19 patient cohorts.
-
Our peer-reviewed research on the application of deep learning approaches to the gap-filling problem in genome assembly (GapPredict) was just published
news ·In our peer-reviewed manuscript, just published in the journal IEEE-TCBB, we report on a novel application, GapPredict, for resolving gaps in genome sequence assemblies. This proof-of-concept study demonstrates the practical utility of deep learning models for this task.
-
Manuscript describing Straglr, a tool for detecting tandem repeat expansion using long-read sequences, just published at Genome Biology
news ·Our manuscript describing Straglr, a tool for both targeted tandem repeat genotyping and novel expansion detection using long-read sequences, was just published in the peer-reviewed journal Genome Biology.
-
Interactive SARS-CoV-2 mutation timemaps, published in F1000Research and available online
news ·A report describing our SARS-CoV-2 variant analysis and interactive SVG mutation maps, freely available to the community for browsing and sharing, was just published in F1000Research. It was announced earlier on arXiv. The maps report nucleotide changes in over 2.5 million SARS-CoV-2 coronavirus genomes (and their effect on gene products) over time and in different continents and...
-
Our report on the HLA profiles of COVID-19 patients at the early stage of the 2020 pandemic, now published in Bioinformatics
news ·Our Letter to the Editor, just published in the peer-reviewed journal Bioinformatics, reports on the HLA profiles derived from the metatranscriptomic RNA-Seq samples of eight COVID-19 patients at the pandemic onset. Our study highlights the central role of HLA in vaccine development and host immunity in the current context, and adds perspective to host susceptibility to SARS-CoV-2, the coronavirus...
-
Manuscript describing RNA-Bloom, single-cell transcriptome assembler, just published at Genome Research
news ·Our study describing RNA-Bloom, a utility for reference-free and reference-guided sequence assembly of single-cell transcriptomes, was just published in the peer-reviewed journal Genome Research.
-
Multiindex Bloom Filter (miBF) data structure manuscript published in PNAS
news ·Our manuscript describing miBF, a probabilistic data structure we developed for alignment-free sequence classification tasks, was published in PNAS. Alignment-free methods, including miBF, have applications ranging from transcript expression analysis and metagenome characterization to de novo assembly, to name a few. They are usually faster than alignment-based methods, but often limited in their sensitivity and memory requirements. In the manuscript...
-
Transcriptome long read simulator trans-NanoSim published in GigaScience
news ·This is our second publication on the NanoSim tool, describing a functionality to simulate nanopore reads for transcriptome sequencing experiments. Trans-NanoSim was published in GigaScience and is an integral part of the NanoSim project, freely available from GitHub.
-
P. sitchensis (Sitka spruce) mitochondrial genome published in Genome Biology and Evolution
news ·Our study describing the sequencing of the Sitka spruce mitochondrial genome was accepted for publication in Genome Biology and Evolution. In the manuscript we present the complete 5.5 Mb genome, one of the largest mitochondrial genomes of a gymnosperm, assembled from Oxford Nanopore long reads, and describe its complex physical structure.
-
Reference-guided scaffolder ntJoin published in Bioinformatics
news ·Our manuscript describing ntJoin, a fast and lightweight reference-guided scaffolder, was published in Bioinformatics. Instead of alignments, ntJoin uses a mapping approach based on a graph data structure generated from ordered minimizer sketches. ntJoin can be used in a variety of different research applications, including improving a draft assembly with a reference-grade genome, a short-read assembly with a draft...
-
Manuscript describing FusionBloom for detecting transcript fusions, just published
news ·Our study describing FusionBloom, a tool for detecting transcript fusions with utility in cancer diagnostics, was just published in the peer-reviewed journal Bioinformatics.
-
Spruce chloroplast genome manuscripts published
news ·Two studies reporting on the complete chloroplast genomes and gene annotations of white and Engelmann spruce are now published in the peer-reviewed journal Microbiology Resource Announcements.
-
Manuscript describing our fast genome polishing tool ntEdit published
news ·Our study describing ntEdit, a fast and scalable technology to polish and ‘haploidize’ genome sequences, was just published in the peer-reviewed journal Bioinformatics.
-
Bioinformatics container environment for education and research (ORCA), published
news ·We introduce ORCA, a Docker image with hundreds of bioinformatics tools/dependencies, simplifying software installs. ORCA was recently published in the peer-reviewed journal Bioinformatics.
-
Study on bullfrog antimicrobial peptides discovery published
news ·Our latest work on antimicrobial peptide (AMP) discovery in bullfrogs was just published in the peer-reviewed journal Scientific Reports. The paper describes the AMP discovery bioinformatics pipeline and functional assays to test their efficacy against microbes.
-
Genome misassembly correction using linked reads (Tigmint), published
news ·Our study presenting Tigmint, a bioinformatics utility to detect and correct errors in genome assemblies using linked reads, is published in the peer-reviewed journal BMC Bioinformatics.
-
ARKS, linked read kmer genome scaffolder, published
news ·Our latest genome assembly scaffolder, ARKS, was just published in the peer-reviewed journal BMC Bioinformatics. The paper describes a new read alignment-free methodology that employs kmer-based read mapping and improves assembly runtime. In the paper we present benchmarks on human genome draft assemblies and show how linked reads can further improve drafts assembled with the same sequencing data.