Simplifying Documentation Updates with AI-Generated Release Notes

Writing comprehensive (and user-friendly) release notes has long been a time-consuming and labor-intensive process for development and product teams, requiring meticulous documentation of new features, bug fixes, and improvements. Recent advances in generative AI, however, have opened up new possibilities. After iterative trials, we have successfully automated the creation of user-friendly release notes with precise formatting by leveraging OpenAI’s cutting-edge GPT-4 model.

Fully Documenting YugabyteDB’s Evolution

YugabyteDB is a distributed SQL database that is fully PostgreSQL compatible. However, adapting every PostgreSQL feature to operate efficiently in a distributed environment necessitates a rigorous re-engineering process. Our engineers meticulously design and enhance these features, resulting in a substantial volume of code modifications for each release. In some instances, a single release can encompass up to 1000 distinct changes.

Documenting these enhancements in a concise, user-friendly manner, with all essential details included, poses a significant challenge. Expecting all engineers to adhere to a uniform documentation format is impractical. Although product managers diligently track major features, the sheer volume of modifications directly committed by engineers can be overwhelming, making it challenging to maintain a comprehensive overview.

Consequently, engineers, product managers, and release managers often find themselves dedicating a significant amount of time to creating accurate and informative release notes before each deployment cycle.

How We Did It – Input Data

Engineers often include extensive details in issue summaries and commit messages beyond what the title indicates. Therefore, we extracted all relevant text for each issue, cleaned it up by removing user names, customer details, and other internal data, and capped the context at 5,000 characters. NOTE: Keeping the context length small is important to get a high-quality output. Using this curated data, we employed Large Language Models (LLMs) to create clear and concise release notes.
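To give a flavor of that curation step, here is a minimal Python sketch. The cleanup patterns (stripping @mentions and dropping lines with internal-looking prefixes) are illustrative stand-ins for our actual rules; only the 5,000-character cap comes from the description above.

```python
import re

MAX_CONTEXT_CHARS = 5000  # keep the context small for higher-quality output

def clean_issue_text(raw_text: str) -> str:
    """Strip internal details from an issue/commit body and cap its length.

    A minimal sketch: the patterns below are hypothetical, not the exact
    rules used internally.
    """
    text = raw_text
    # Drop @mentions of user names
    text = re.sub(r"@[A-Za-z0-9_-]+", "", text)
    # Drop lines that look like internal references (illustrative prefixes)
    text = "\n".join(
        line for line in text.splitlines()
        if not line.lower().startswith(("customer:", "support ticket:"))
    )
    # Collapse whitespace and cap the context length
    text = re.sub(r"\s+", " ", text).strip()
    return text[:MAX_CONTEXT_CHARS]
```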

For an issue with the title “[#18748] CDCSDK: Change default checkpointing type to EXPLICIT while stream creating“, we can now generate a user-friendly release note such as “Alters the default checkpoint type to `EXPLICIT` during stream creation, ensuring no upgrade or rollback issues due to alterations in the default proto field value.“

Check out one of our Release Notes ChangeLog sections to see how consistent and user-friendly these release note lines are now.

It Takes a Village to Design the Right Generative AI Prompt

The prompt is a set of instructions that tells the LLM how to process the text and provide the output in a specific format. Crafting the right prompt was pivotal in generating release notes that aligned with our specifications. Designing the optimal prompt required collaboration among multiple individuals and numerous discussions. This endeavor posed more of a linguistic puzzle than an engineering hurdle. Without the involvement of my co-workers, who are more proficient in English than I am, it would have been impossible to ensure the accuracy and effectiveness of the generated release notes.

We assigned the LLM a persona: “You are a summary generator for a YugabyteDB code change“. To keep the summary short, we added “Limit the summary to 150 characters“. To keep explanations of the actual code out of the release notes while still capturing other significant information, we added “Do not include implementation details but include any other specific GFlag or configuration adjustments“. To limit the wild usage of adjectives, we added “Do not change the meaning of the text“. To make it user-friendly, we added “Use clear and concise grammar and be user-focused and emphasize how the change will benefit the user“.

Action Verb: When we first started working on this, the generated release notes began with phrases like “This yugabytedb change does/fixes …“. They were full sentences with repetitive openings. We wanted the kind of sentence fragments that release notes typically use, such as “Fixed the issue“, but I just could not get the LLM to return the output in that format. We then decided to ask the LLM to start the sentences with an action verb in the present tense. That worked like magic, and the sentences started with a variety of verbs such as Allows, Mitigates, Enhances, Solves, and Introduces.

Present Tense: Traditionally, release notes were prefaced with terms such as “Fixed” or “Introduced,” as they were written from the engineers’ viewpoint, documenting changes in the past tense. Recognizing an opportunity to enhance the narrative, we pivoted to position the Release itself as the central focus, adopting the present tense. This strategic shift conveys that when users choose a specific release, it actively resolves an issue or introduces a new feature, fostering a more user-centric and actionable understanding of the release’s impact.
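Putting those instructions together, a simplified version of the call looks roughly like the sketch below, using the OpenAI Python client. The full production prompt is linked in the next paragraph, so treat the instruction strings and the summarize_change helper here as an illustrative approximation rather than the exact implementation.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = "\n".join([
    "You are a summary generator for a YugabyteDB code change.",
    "Limit the summary to 150 characters.",
    "Do not include implementation details but include any other specific "
    "GFlag or configuration adjustments.",
    "Do not change the meaning of the text.",
    "Use clear and concise grammar, be user-focused, and emphasize how the "
    "change will benefit the user.",
    "Start the summary with an action verb in the present tense.",
])

def summarize_change(issue_title: str, issue_context: str) -> str:
    """Generate a single release-note line for one code change."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"{issue_title}\n\n{issue_context}"},
        ],
        temperature=0,  # keep the output consistent across runs
    )
    return response.choices[0].message.content.strip()
```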

Check out the full prompt on GitHub. Yes, we could have perfected this by continuing to fine-tune, but we stuck to the age-old saying that perfect is the enemy of good, and saved ourselves some time.

Human Correction

We successfully automated the process for all changes included in a release, and the text generated by AI proved to be high quality. Nonetheless, recognizing that LLMs are still a relatively new technology, we established a spreadsheet containing additional metadata linked to each change. This enabled product managers and release teams to review, edit, and refine the release notes before publication. Over time, we gained confidence in the quality of the AI-generated text, and manual reviews are now rarely needed.
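As a rough sketch of that review workflow, the snippet below writes each generated note and its metadata to a CSV file that reviewers can open as a spreadsheet. The column names and the shape of the changes records are assumptions for illustration, not our exact pipeline.

```python
import csv

def write_review_sheet(changes: list[dict], path: str = "release_notes_review.csv") -> None:
    """Dump AI-generated notes plus metadata so reviewers can edit before publishing."""
    fieldnames = ["issue_id", "commit_sha", "component", "generated_note", "reviewer_note"]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for change in changes:
            writer.writerow({
                "issue_id": change["issue_id"],
                "commit_sha": change["commit_sha"],
                "component": change.get("component", ""),
                "generated_note": change["generated_note"],
                "reviewer_note": "",  # filled in by product/release managers
            })
```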

Comparison of LLMs

We tested multiple LLMs for this project. The prominent ones were GPT-4 (OpenAI), Claude-2 (Anthropic), Mixtral Instruct (Mistral), Llama 2 (Meta), Solar (Upstage), and Bard (Google). An LLM is only as good as its ability to follow instructions, and zero-shot prompting is critical for automation. Only GPT-4 and Claude-2 demonstrated the capability to initiate sentences with action verbs without requiring additional prompting. However, Claude-2 struggled to capture critical information for most of the issues we provided. GPT-4 emerged as the sole model that could consistently generate responses aligned with our specified prompt structure. It’s worth noting that while Claude-3, released after the completion of this project, exhibits impressive performance, it was not evaluated during the initial assessment phase.

Conclusion

It’s a rare occurrence for technology to significantly streamline tasks for engineers, product teams, and release teams while also delivering benefits to end users. Using AI to automatically extract key information from issues and process it in a specific way produced user-friendly release notes, saving days of effort across multiple teams.

Ready to experience the power and simplicity of YugabyteDB for yourself?

Sign up for our free cloud DBaaS offering today at cloud.yugabyte.com. No credit card required, just pure database innovation waiting to be unleashed!


Let’s get started!
