In this upgrade of ggKbase, we replaced the listed_features table (the core component that supports lists, genome summaries, and binning) with the Elasticsearch search engine. Is this a big deal? It’s HUGE!
- Lists automatically stay up to date across all projects, including newly added ones.
- No need to update lists when you add more projects to the genome summaries.
- More accurate and up-to-date features that match list criteria.
- List management is simplified (with more UI improvements to come).
- Facets, which split a search result into categories, are provided to filter list results. They are also available on the search results page.
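To give a feel for how facets work, here is a minimal sketch of the kind of request body Elasticsearch accepts for a faceted search: a full-text match plus one terms aggregation per facet. The field names (annotation, project, organism) and both helper functions are hypothetical and are not taken from the ggKbase code; the facet fields are assumed to be mapped as keyword fields. The second helper does the same bucketing locally, just to show what "splitting a result into categories" means.

```python
from collections import Counter


def build_faceted_query(text, facet_fields, size=10):
    """Build a search request body with one terms aggregation per facet.

    "annotation" is a hypothetical full-text field; each entry in
    facet_fields is assumed to be a keyword-mapped field.
    """
    return {
        "query": {"match": {"annotation": text}},
        "aggs": {
            f"by_{field}": {"terms": {"field": field, "size": size}}
            for field in facet_fields
        },
    }


def facet_counts(hits, field):
    """Count hits per category for one field (a local stand-in for a facet)."""
    return Counter(hit[field] for hit in hits if field in hit)


if __name__ == "__main__":
    # Request body that could be sent to a search endpoint.
    print(build_faceted_query("hydrogenase", ["project", "organism"]))

    # Made-up hits, only to illustrate the bucketing.
    sample_hits = [
        {"annotation": "NiFe hydrogenase", "project": "A"},
        {"annotation": "FeFe hydrogenase", "project": "A"},
        {"annotation": "hydrogenase maturation factor", "project": "B"},
    ]
    print(facet_counts(sample_hits, "project"))
```

Each bucket in the aggregation response comes back with a key and a document count, which is what a facet sidebar would display next to each category.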
On the tech side, the listed_features table was growing larger by the day as more projects went into ggKbase, and it was no longer a scalable solution. Elasticsearch should provide us with great speed and plenty of room to scale.
With the listed_features table gone, we can now support using custom annotation content for list building (i.e. HMMs!). Watch for updates soon to the data ingestion pipeline for adding KEGG HMMs to your projects.