Posts

Showing posts from 2025

Spring AI's Text-to-Speech API

Photo by Jeffrey Hamilton on Unsplash. Spring AI's Text-to-Speech (TTS) API converts text into audio files. Currently, Spring AI supports integration only with the OpenAI TTS API. TTS has applications in many fields, including e-learning and multilingual support, and numerous industries are adopting it to improve communication, accessibility, and operational efficiency. Spring AI provides a good abstraction on top of the underlying OpenAI TTS services. Let's learn more. TTS API Key Classes: Before we build a program using Spring AI's TTS API, let's get familiar with a few important components of the library. The Spring AI framework supports auto-configuration for various model client classes, allowing integration with the underlying LLM service APIs. This functionality is facilitated through special configuration properties, which we will ...

Spring AI's Transcription API

Photo by Elijah Merrell on Unsplash. Some large language models (LLMs) can transcribe audio into text. Businesses are rapidly adopting this technology and reaping productivity benefits. We've seen glimpses of it in Zoom, Microsoft Teams, and other collaboration and communication tools, where call transcriptions can be generated automatically. The entertainment industry is also adopting it rapidly in movies, advertisements, and other fields. Spring AI aims to provide a unified transcription API that integrates with LLM providers like OpenAI and Azure OpenAI. This article explores how the Transcription API can be used to transcribe an audio file using OpenAI. Transcription API Key Classes: We'll start by learning some important Spring AI Transcription API classes. We can divide the components into two groups, one specific to the underlying LLM service provider and the other...

Spring AI Image Model API

Generated by AI. Large Language Models (LLMs) such as GPT-4, Claude 2, and Llama 2 generate text outputs when invoked with text prompts. Image models such as DALL-E, Midjourney, and Stable Diffusion, by contrast, generate images from user prompts, which can be either text or images. Every model provider has its own API, and switching between them becomes challenging without a common interface. Luckily, the Spring AI library offers a framework that helps integrate with the underlying models. In this article, we'll learn some important components of Spring AI's Image Model API and implement a demo program that generates an image from a prompt. Important Components of Image Model API: First, let's look at the important components of the Image Model API that help integrate with underlying model providers such as OpenAI and Stability AI. Interfaces such as ImageOptions and ImageModel, as well as classes su...

Spring AI Advisors API

Designed by Freepik. Most enterprise applications have to deal with cross-cutting concerns like governance, compliance, auditing, and security. Most development frameworks address these challenges with the interceptor pattern. Spring AI Advisors bring this pattern to AI applications that interact with Large Language Models (LLMs). These advisors act as interceptors for LLM requests and responses. They can intercept outgoing requests to LLM services, modify them, and then forward them to the LLM. Similarly, they can intercept incoming LLM responses, process them, and then deliver them to the application. The advisors can be useful for numerous use cases: sanitize and validate data before sending it to LLMs; monitor and control LLM usage costs; audit and log LLM requests and responses; adhere to governance and compliance requirements as part of the respo...
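
The interceptor idea described in the Advisors excerpt can be illustrated with a minimal plain-Java sketch. It deliberately does not use Spring AI's own types: the `Interceptor` interface, the `chain` helper, the `AUDIT_LOG` list, and the echoing stand-in for the model are all hypothetical names introduced here for illustration. The sketch only shows how a request can be modified on the way out and a response observed on the way back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class AdvisorChainSketch {

    /** Hypothetical interceptor contract; NOT Spring AI's actual advisor interface. */
    interface Interceptor {
        String around(String request, Function<String, String> next);
    }

    /** In-memory audit trail, used only to make the interception visible. */
    static final List<String> AUDIT_LOG = new ArrayList<>();

    /** Wraps the model call so that the first interceptor in the list runs outermost. */
    static Function<String, String> chain(List<Interceptor> interceptors,
                                          Function<String, String> model) {
        Function<String, String> current = model;
        for (int i = interceptors.size() - 1; i >= 0; i--) {
            Interceptor interceptor = interceptors.get(i);
            Function<String, String> next = current;
            current = request -> interceptor.around(request, next);
        }
        return current;
    }

    static String run(String userInput) {
        // Modifies the outgoing request before it is forwarded.
        Interceptor sanitizer = (request, next) -> next.apply(request.trim());

        // Records the request, forwards it, then records the response.
        Interceptor auditor = (request, next) -> {
            AUDIT_LOG.add("request: " + request);
            String response = next.apply(request);
            AUDIT_LOG.add("response: " + response);
            return response;
        };

        // Stand-in for a real LLM call (assumption: it simply echoes the prompt).
        Function<String, String> fakeModel = prompt -> "ECHO: " + prompt;

        return chain(List.of(sanitizer, auditor), fakeModel).apply(userInput);
    }

    public static void main(String[] args) {
        System.out.println(run("  hello  "));
        AUDIT_LOG.forEach(System.out::println);
    }
}
```

Because the sanitizer is listed first, it runs outermost, so the auditor logs the already-trimmed request. Reordering the list changes what each interceptor sees, which is the main design decision in any chain of this kind.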