
Ollama - Testcontainers Module in Java

Photo by Stephanie Cantu on Unsplash

Testcontainers simplifies integration testing by providing lightweight, disposable containers for various tools and services. These containerized integrations are referred to as modules in the Testcontainers library. Developers can use them in their JUnit tests to start application environments as needed.

Ollama is a unified platform for running large language models, such as Llama 3.3, DeepSeek R1, and Mistral Small 3.1, in a local test environment. The Testcontainers library now includes an Ollama module, which allows developers to launch Ollama containers within JUnit tests. This is especially useful for testing code that interacts with language models.

In this article, we'll learn how to use the Ollama module on a local laptop with a CPU.

Prerequisites

In this section, we'll discuss the prerequisites before we can use the Testcontainers library.

Container Runtime

Our local environment, where we'll run the test cases, must have a Docker-API compatible container runtime such as Docker, Podman, Colima, or Rancher Desktop.

We'll run the test program on a Windows machine, so we must have Docker Desktop running on our laptop:

Docker Desktop


Moreover, Docker Desktop also supports macOS and Linux environments.

Maven Dependencies

The Testcontainers library provides a Maven bill of materials (BOM). We specify the BOM version while declaring it under the dependencyManagement tag:

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.testcontainers</groupId>
            <artifactId>testcontainers-bom</artifactId>
            <version>1.21.1</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

Now, we're ready to import the Maven dependency for Testcontainers' Ollama module without the version information:

<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>ollama</artifactId>
    <scope>test</scope>
</dependency>

The Ollama module's version is then resolved from the BOM.

JUnit Test Case Integration with Ollama Module

The Testcontainers library includes the OllamaContainer class as part of the Ollama module. Typically, the JUnit test program consists of three stages. First, in the initialization stage, we set up the container and pull the required image from the Docker registry just before the test cases execute. Next, the test program interacts with the LLM running in the Ollama container and executes a prompt. Finally, the container is stopped once the test program finishes.

Let's first discuss the initialization and cleanup stages:

import java.io.IOException;

import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.BeforeAll;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.testcontainers.ollama.OllamaContainer;

public class OllamaContainerLiveTest {
    private static final Logger logger = LoggerFactory.getLogger(OllamaContainerLiveTest.class);

    static OllamaContainer ollamaContainer;
    private static final String OLLAMA_IMAGE = "ollama/ollama:latest";
    private static final String LLM = "tinyllama:1.1b";

    @BeforeAll
    static void init() throws IOException, InterruptedException {
        ollamaContainer = new OllamaContainer(OLLAMA_IMAGE);
        // force CPU-only execution by clearing any GPU device requests
        ollamaContainer.withCreateContainerCmdModifier((cmd) ->
          cmd.getHostConfig().withDeviceRequests(null));

        ollamaContainer.start();

        // pull the model inside the running Ollama container
        ollamaContainer.execInContainer("ollama", "pull", LLM);
    }

    @AfterAll
    static void cleanUp() {
        if (ollamaContainer != null) {
            try {
                ollamaContainer.stop();
            } catch (Exception e) {
                logger.error("Error stopping Ollama container: {}", e.getMessage());
            }
        }
    }
}

We define the initialization and tear-down stages with the help of the JUnit @BeforeAll and @AfterAll annotations.

First, in the init() method, we instantiate the OllamaContainer class with the ollama/ollama:latest image. Then, we set the device requests to null to force the container to run on the laptop's CPU; removing this modifier allows the container to use the laptop's GPU instead. When we call the OllamaContainer#start() method, the Docker engine pulls the image, and we finally see the Ollama container running:

Docker Desktop
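As mentioned above, removing the device-request modifier lets the container use a GPU when one is available. For completeness, a minimal sketch of explicitly requesting a GPU, assuming an NVIDIA GPU and the NVIDIA container toolkit on the host, could look like this. The DeviceRequest class comes from the docker-java API that Testcontainers builds on, and the snippet also needs java.util.List:

// Sketch: explicitly request one NVIDIA GPU instead of forcing CPU-only execution
// (assumes an NVIDIA GPU and the NVIDIA container toolkit are present on the host)
ollamaContainer = new OllamaContainer(OLLAMA_IMAGE)
  .withCreateContainerCmdModifier(cmd -> cmd.getHostConfig()
    .withDeviceRequests(List.of(new DeviceRequest()
      .withDriver("nvidia")
      .withCount(1)                                   // number of GPUs to expose
      .withCapabilities(List.of(List.of("gpu"))))));  // equivalent of docker run --gpus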
Towards the end, we call the OllamaContainer#execInContainer() method to pull the LLM we wish to use. For this example, we use the tinyllama:1.1b model. We can use other LLMs by replacing the value of the LLM variable in the program. However, while experimenting with other LLMs, we must consider our local environment's compute and storage capacity.
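Pulling the model on every test run can also be slow. To cache it, the Ollama module lets us commit the running container, with the model already pulled, to a local image that later runs can reuse. Here's a rough sketch, where tinyllama-ollama:latest is a hypothetical local image name and DockerImageName comes from the org.testcontainers.utility package:

// Sketch: bake the pulled model into a reusable local image (hypothetical image name)
String imageWithModel = "tinyllama-ollama:latest";
ollamaContainer.commitToImage(imageWithModel);

// Subsequent runs can start from the cached image and skip the pull,
// declaring it compatible with the official ollama/ollama image.
OllamaContainer cachedContainer = new OllamaContainer(
  DockerImageName.parse(imageWithModel).asCompatibleSubstituteFor("ollama/ollama"));
cachedContainer.start();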

After the test program completes, the cleanUp() method tears down the environment by calling the OllamaContainer#stop() method to stop the container.
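Alternatively, instead of stopping the container ourselves, we could delegate the lifecycle to the Testcontainers JUnit 5 extension. A minimal sketch, assuming the separate org.testcontainers:junit-jupiter dependency is on the test classpath and using the @Testcontainers and @Container annotations from the org.testcontainers.junit.jupiter package:

// Sketch: lifecycle managed by the Testcontainers JUnit 5 extension
// (requires the org.testcontainers:junit-jupiter test dependency)
@Testcontainers
class OllamaContainerExtensionLiveTest {

    @Container
    static OllamaContainer ollamaContainer = new OllamaContainer("ollama/ollama:latest")
      .withCreateContainerCmdModifier(cmd -> cmd.getHostConfig().withDeviceRequests(null));

    // The extension starts the shared container before the tests run
    // and stops it automatically afterwards, so no @AfterAll cleanup is needed.
}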

Finally, let's look at the test program that invokes the LLM with a prompt:

@Test
void test() throws IOException, InterruptedException {
    String prompt = """
      Context:
      The sun is a star located at the center of our solar system.
      It provides the Earth with heat and light, making life possible.
      The sun is composed mostly of hydrogen and helium
      and generates energy through nuclear fusion.
      Question: What two gases make up most of the sun?
      Instructions:
      Please answer strictly from the context provided in the prompt
      and no other additional information should be provided.
      Also, keep the answer short and concise.
      """;
    // run the model with the prompt via the ollama CLI inside the container
    Container.ExecResult execResult =
      ollamaContainer.execInContainer("ollama", "run", LLM, prompt);
    assertEquals(0, execResult.getExitCode(), "Exit code should be 0");
    logger.info("Exec Result: {}", execResult.getStdout());
}

The test invokes the OllamaContainer#execInContainer() method to run the ollama CLI, which sends the prompt to the underlying LLM service. The prompt strictly instructs the LLM to stick to the provided context while answering the question about the gases found in the sun. Further, the test checks the resulting exit code to confirm successful execution. Finally, it prints the output by calling the ExecResult#getStdout() method:

Exec Result: Answer: Most of the sun is made up of hydrogen and helium gas.
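Instead of shelling into the container, we could also call the Ollama HTTP API directly from the test. A minimal sketch using Java's built-in HttpClient against the /api/generate endpoint, assuming the model has already been pulled as shown earlier and with imports of java.net.URI and java.net.http.HttpClient, HttpRequest, and HttpResponse:

// Sketch: query the model over HTTP rather than through the ollama CLI
HttpClient client = HttpClient.newHttpClient();
String body = """
  {"model": "tinyllama:1.1b", "prompt": "What two gases make up most of the sun?", "stream": false}
  """;
HttpRequest request = HttpRequest.newBuilder()
  .uri(URI.create(ollamaContainer.getEndpoint() + "/api/generate"))  // mapped host and port
  .header("Content-Type", "application/json")
  .POST(HttpRequest.BodyPublishers.ofString(body))
  .build();
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
logger.info("HTTP Response: {}", response.body());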

Conclusion

In this article, we learned about the Testcontainers Ollama module. This module reduces the cost of running LLMs during the development stage. Moreover, it saves development effort by allowing developers to experiment with open-source LLMs in their local environment.

Visit our GitHub repository to access the article's source code.
