GitHub recently announced the release of Blackbird, its reworked code search engine written in Rust. The new engine promises to improve on purely text-based search techniques for code queries, giving developers a faster and more comprehensive way to explore software repositories. It changes how programmers navigate, understand, and contextualize their code, with the aim of raising productivity.
Blackbird, developed from scratch in Rust, is GitHub’s answer to the limitations of existing open-source solutions like Apache Cassandra, Solr, or Elasticsearch. Because GitHub needed to operate at a scale far beyond what these solutions could handle, it built Blackbird to improve on its existing Elasticsearch-based search without expanding that cluster’s resource demands. The decision paid off: Blackbird surpassed its predecessors, supporting up to 640 queries per second and indexing around 120,000 documents per second, which lets it process 15.5 billion documents in about 36 hours.
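The 36-hour figure follows directly from the other two numbers. As a quick sanity check, the short calculation below uses only the figures quoted above; the variable names are ours.

```python
# Figures quoted in the article; the calculation itself is plain arithmetic.
DOCS_TOTAL = 15_500_000_000   # documents in the corpus
INGEST_RATE = 120_000         # documents indexed per second

seconds = DOCS_TOTAL / INGEST_RATE
hours = seconds / 3600

print(f"{seconds:,.0f} seconds, or about {hours:.1f} hours")
# prints: 129,167 seconds, or about 35.9 hours
```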
The choice to develop Blackbird in Rust was significant. Rust, one of the most admired programming languages among developers, was chosen for its memory safety features and its suitability for performance-sensitive back-end services. Its compile-time ownership and borrowing checks rule out whole classes of memory bugs without relying on a garbage collector, which reduces defects and improves security, making it a strong choice for critical components such as Blackbird.
Blackbird’s capabilities are not limited to speed. It relies on precomputed search indices, which map keys to the documents that contain them, and it supports substring queries, regular expressions, and symbol search. It contextualizes code, ranks the most important results first, and comes with a redesigned interface that combines search, browsing, and code navigation. Blackbird currently provides partial coverage, indexing about 45 million of GitHub’s roughly 200 million repositories. That amounts to about 115 terabytes of code and 15.5 billion documents, written in languages such as Java, Python, and JavaScript.
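One common way to answer substring queries from a precomputed index is an n-gram inverted index: every three-character sequence (trigram) in a document becomes a key, and each key maps to a sorted list of the documents that contain it. The sketch below is a minimal illustration of that general technique under our own assumptions, not GitHub’s implementation. The function names and sample documents are hypothetical.

```python
from collections import defaultdict

def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

def build_index(docs):
    """Map each trigram to a sorted list of IDs of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in trigrams(text):
            postings[gram].add(doc_id)
    return {gram: sorted(ids) for gram, ids in postings.items()}

def search(index, docs, query):
    """Find documents containing query as a substring (3+ characters)."""
    grams = trigrams(query)
    if not grams:
        raise ValueError("query must be at least 3 characters long")
    # Intersect posting lists: a match must contain every trigram of the query.
    candidates = None
    for gram in grams:
        ids = set(index.get(gram, []))
        candidates = ids if candidates is None else candidates & ids
    # Sharing all trigrams is necessary but not sufficient, so verify each hit.
    return sorted(d for d in candidates if query in docs[d])

# Hypothetical sample documents keyed by document ID.
docs = {
    1: "fn parse_config(path: &str) -> Config",
    2: "def parse_args(argv): return argv[1:]",
    3: "let config = parse_config(\"app.toml\");",
}
index = build_index(docs)
print(search(index, docs, "parse_config"))  # [1, 3]
print(search(index, docs, "argv"))          # [2]
```

The index narrows the search to a small candidate set before any document text is scanned, which is why this style of index suits very large corpora. The final verification step is still needed because a document can contain every trigram of the query without containing the query itself.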
GitHub’s Blackbird has been praised as a ‘Google Scholar for Code.’ Its proponents say it surpasses both the previous default GitHub search and alternatives such as Sourcegraph, which they regard as expensive and limited in search capability. With precise filtering, Blackbird makes it easy to locate specific text across repositories, which is invaluable when tracking down the part of an application that produced an error message or finding the values associated with specific keys in configuration files.
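To make the two use cases above concrete, here is a small sketch of the kind of filtered lookup they imply: finding the line that emits an error message, and collecting the value of a configuration key across repositories. It runs over a tiny in-memory corpus and does not use GitHub’s search API. The repository names, file contents, and function names are all hypothetical.

```python
import re

# Hypothetical in-memory corpus: (repository, path) -> file contents.
files = {
    ("acme/api", "src/db.py"): 'raise RuntimeError("connection pool exhausted")',
    ("acme/api", "config/app.yaml"): "timeout: 30\nretries: 5\n",
    ("acme/web", "config/app.yaml"): "timeout: 10\n",
}

def find_text(files, needle, path_suffix=""):
    """Return (repo, path, line_number) for every line containing needle."""
    hits = []
    for (repo, path), text in sorted(files.items()):
        if not path.endswith(path_suffix):
            continue  # path filter: skip files outside the requested suffix
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((repo, path, lineno))
    return hits

def find_config_values(files, key, path_suffix=".yaml"):
    """Return {(repo, path): value} for simple 'key: value' lines."""
    pattern = re.compile(rf"^{re.escape(key)}:\s*(.+)$", re.MULTILINE)
    values = {}
    for (repo, path), text in files.items():
        if path.endswith(path_suffix):
            match = pattern.search(text)
            if match:
                values[(repo, path)] = match.group(1).strip()
    return values

print(find_text(files, "connection pool exhausted"))
# [('acme/api', 'src/db.py', 1)]
print(find_config_values(files, "timeout"))
# {('acme/api', 'config/app.yaml'): '30', ('acme/web', 'config/app.yaml'): '10'}
```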
With Blackbird, GitHub has taken a major step in improving the developer experience, particularly in code search and navigation. The new engine is expected to change how developers interact with their code, helping them find essential information scattered across a codebase, put that information in context, and work more productively. It is no exaggeration to say that Blackbird addresses one of GitHub’s long-standing problems, making it a significant productivity boost for GitHub and a powerful tool for the development community.