Non-bioinformaticians can now quickly and efficiently create knowledge networks, a powerful way for biologists to visualize deep connections between genes and phenotypes, thanks to the integration of Rothamsted Research’s open-source KnetMiner software into the Genestack platform.
These new software tools make it easier for plant breeders and others to mine genomics data to find novel ways to improve the performance of all kinds of crops.
“Genotype to phenotype analysis is at the core of what biologists do. With KnetMiner, we have created software that enables biologists to take their own high-throughput experimental data and to see them in the context of all the public knowledge that is out there. This can help them interpret their own data faster and more effectively,” Dr. Keywan Hassani-Pak, head of bioinformatics at Rothamsted Research and leader of the KnetMiner project, explained.
“For a particular target species, such as a crop plant, KnetMiner integrates all the relevant genomics and 'omics' information that is present in more than 25 sources under a multitude of formats. KnetMiner brings it together in the form of a heterogeneous knowledge network. We don’t only integrate the data; we also create new relationships based, for example, on co-occurrences of genes and phenotypes in the scientific literature. We are the first in the U.K. to develop such detailed networks and make them mineable. We are talking about up to a million nodes here,” Hassani-Pak said.
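To make the idea of literature-derived relationships concrete, the sketch below counts how often a gene name and a phenotype term appear in the same text and keeps the pairs that co-occur often enough. It is only an illustration and not KnetMiner's actual method: the function name `cooccurrence_edges`, the `min_count` threshold, the gene and phenotype names, and the sample sentences are all hypothetical, and real text mining would need proper entity recognition rather than simple substring matching.

```python
from collections import Counter
from itertools import product


def cooccurrence_edges(documents, genes, phenotypes, min_count=2):
    """Return {(gene, phenotype): count} for pairs found together in
    at least `min_count` documents (naive, case-insensitive matching)."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        # Substring matching is a simplifying assumption for this sketch.
        found_genes = {g for g in genes if g.lower() in lowered}
        found_phenotypes = {p for p in phenotypes if p.lower() in lowered}
        # Every gene/phenotype pair seen in the same document co-occurs once.
        counts.update(product(found_genes, found_phenotypes))
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical toy corpus; none of these sentences come from real papers.
DOCUMENTS = [
    "GeneA is associated with drought tolerance in wheat.",
    "Loss of geneA reduces drought tolerance.",
    "GeneB affects grain size.",
]
GENES = ["geneA", "geneB"]
PHENOTYPES = ["drought tolerance", "grain size"]

new_edges = cooccurrence_edges(DOCUMENTS, GENES, PHENOTYPES, min_count=2)
```

With this toy corpus, only the pair that appears together in two documents passes the threshold; the pair seen once is dropped. Raising or lowering `min_count` trades fewer spurious links against fewer discovered ones.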
Plant scientists and others saw the potential of KnetMiner and approached Rothamsted to help them create a secure system they could use with their own data.
KnetMiner is an exciting visualization tool, but it could take many months for each network to be created for a new species, and it was complex to use. With the benefit of Innovate U.K. funding, Rothamsted worked with Cambridge, U.K.-based Genestack to migrate KnetMiner onto the Genestack platform.
Dr. Misha Kapushesky of Genestack noted, “The Rothamsted researchers could spend months collecting all the data that was available for a particular organism, cleaning the data and writing scripts to transfer it into a format that was usable in KnetMiner and then presenting it so that other scientists could use the information. We migrated the visualization software and automated the collection process by making it part of our Genestack ecosystem. It is now possible to simply ‘point and click’ on data that is in the public domain to create a network and then overlay your own data, using KnetMiner to visualize it. You can build your own network with collaborators in a secure environment. It is no longer a fixed set of data on the Rothamsted website but a dynamic tool that can be made commercially available.”
Genestack now hosts more than 40 plant and crop networks as well as a prototype human disease network. Although it originated in agricultural research, network mining for gene discovery is generic, and Genestack provides an environment for building and distributing these large-scale knowledge networks.
Knowledge networks visually show the connections between the phenotypes and the genotype of a given species. Nodes take different shapes to represent various biological entities (such as genes, publications or pathways) and are connected by relevant relationships (such as encodes, published_in, interacts_with). They are a very good way to show complex and highly interconnected biological data, the announcement said.
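The structure described above, typed nodes joined by labeled relationships, can be sketched as a small graph. The example below is a minimal illustration under stated assumptions, not KnetMiner's data model or API: the node identifiers, the `NODES` and `EDGES` names, and the helper functions are hypothetical, and only the relationship labels (encodes, published_in, interacts_with) are taken from the description. It finds the shortest chain of relationships linking a gene to a phenotype, which is the kind of evidence trail a network view makes visible.

```python
from collections import defaultdict, deque

# Hypothetical toy network: node id -> entity type.
NODES = {
    "GENE1": "Gene",
    "PROT1": "Protein",
    "PROT2": "Protein",
    "PUB1": "Publication",
    "TRAIT1": "Phenotype",
}

# Labeled relationships as (source, relation, target) triples.
EDGES = [
    ("GENE1", "encodes", "PROT1"),
    ("PROT1", "interacts_with", "PROT2"),
    ("GENE1", "published_in", "PUB1"),
    ("TRAIT1", "published_in", "PUB1"),
]


def build_adjacency(edges):
    """Index edges by node, treating every relationship as traversable both ways."""
    adjacency = defaultdict(list)
    for source, relation, target in edges:
        adjacency[source].append((relation, target))
        adjacency[target].append((relation, source))
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the path as (node, relation, node) steps,
    or an empty list if start equals goal or no path exists."""
    queue = deque([start])
    previous = {start: None}  # node -> (parent, relation) used to reach it
    while queue:
        current = queue.popleft()
        if current == goal:
            break
        for relation, neighbour in adjacency[current]:
            if neighbour not in previous:
                previous[neighbour] = (current, relation)
                queue.append(neighbour)
    if goal not in previous:
        return []
    steps = []
    node = goal
    while previous[node] is not None:
        parent, relation = previous[node]
        steps.append((parent, relation, node))
        node = parent
    return list(reversed(steps))


adjacency = build_adjacency(EDGES)
evidence = shortest_path(adjacency, "GENE1", "TRAIT1")
```

Here the gene and the phenotype are linked through a shared publication node, so the returned steps spell out why the two are considered related. A production system would also weigh edge types and evidence quality, which this sketch ignores.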
“There are a lot of tools out there that will return a list of ranked genes when you are conducting a gene candidate analysis, and of course, KnetMiner also does that with its evidence-based gene rank algorithm. Most of them also stop there,” Hassani-Pak explained. “KnetMiner is unique as it allows users to see how and why the prediction was made. They can fully understand the results because the process is completely transparent, and the provenance is visualized. There is no black box approach here.”
“Network view” allows users to leverage information present in the network for new discoveries and hypotheses; this, in turn, can spur ideas for further analysis.
Genestack is a bioinformatics company offering a platform for complex multi-omics data and meta-data management, analysis and visualization. The platform transforms the way genomics research and development is done by eliminating routine tasks, tackling inefficiencies and helping its users to overcome the challenges of bioinformatics.
Rothamsted Research is the oldest agricultural research institute in the world. Through independent science and innovation, it makes significant contributions to improving agri-food systems in the U.K. and internationally.