
Ollama - Testcontainers Module in Java

Photo by Stephanie Cantu on Unsplash

Testcontainers simplifies integration testing by providing lightweight, disposable containers for various tools and services. These containerized integrations are referred to as modules in the Testcontainers library. Developers can use them in their JUnit tests to start application environments as needed.

Ollama is a unified platform for running large language models, such as Llama 3.3, DeepSeek R1, and Mistral Small 3.1, in a local test environment. The Testcontainers library now includes an Ollama module, which allows developers to launch Ollama containers within JUnit tests. This is especially useful for testing code that interacts with language models.

In this article, we'll learn how to use the Ollama module on a local laptop with a CPU.

Prerequisites

In this section, we'll discuss the prerequisites before we can use the Testcontainers library.

Container Runtime

Our local environment, where we'll run the test cases, must have a Docker-API compatible container runtime such as Docker, Podman, Colima, or Rancher Desktop.

We'll run the test program on a Windows machine, so we must have Docker Desktop running on our laptop:

Docker Desktop


Moreover, Docker Desktop also supports macOS and Linux environments.

Maven Dependencies

The Testcontainers library provides a Maven bill of materials (BOM). We specify the BOM version while declaring it under the dependencyManagement tag:

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.testcontainers</groupId>
            <artifactId>testcontainers-bom</artifactId>
            <version>1.21.1</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

Now, we're ready to import the Maven dependency for Testcontainers' Ollama module without the version information:

<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>ollama</artifactId>
    <scope>test</scope>
</dependency>

The Ollama module's version is then resolved from the BOM.

JUnit Test Case Integration with Ollama Module

The Testcontainers library includes the OllamaContainer class as part of the Ollama module. Typically, the JUnit test program consists of three stages. First, in the initialization stage, we set up the container and pull the required image from the Docker registry just before the test cases execute. Next, the test program interacts with the LLM running in the Ollama container and executes a prompt. Finally, the container is stopped once the test program finishes.

Let's first discuss the initialization and cleanup stages:

import java.io.IOException;

import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.BeforeAll;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.testcontainers.ollama.OllamaContainer;

public class OllamaContainerLiveTest {
    private static final Logger logger = LoggerFactory.getLogger(OllamaContainerLiveTest.class);

    static OllamaContainer ollamaContainer;
    private static final String OLLAMA_IMAGE = "ollama/ollama:latest";
    private static final String LLM = "tinyllama:1.1b";

    @BeforeAll
    static void init() throws IOException, InterruptedException {
        ollamaContainer = new OllamaContainer(OLLAMA_IMAGE);
        // force CPU-only execution by clearing any GPU device requests
        ollamaContainer.withCreateContainerCmdModifier((cmd) ->
          cmd.getHostConfig().withDeviceRequests(null));

        ollamaContainer.start();

        // pull the model inside the running Ollama container
        ollamaContainer.execInContainer("ollama", "pull", LLM);
    }

    @AfterAll
    static void cleanUp() {
        if (ollamaContainer != null) {
            try {
                ollamaContainer.stop();
            } catch (Exception e) {
                logger.error("Error stopping Ollama container: {}", e.getMessage());
            }
        }
    }
}

We define the initialization and tear-down stages with the help of the JUnit @BeforeAll and @AfterAll annotations.

First, in the init() method, we instantiate the OllamaContainer class with the ollama/ollama:latest image. Then, we set the device requests to null to force the container to run on the laptop's CPU; removing this modifier allows the container to use the laptop's GPU instead. When we call the OllamaContainer#start() method, the Docker engine pulls the image, and we finally see the Ollama container running:

Docker Desktop
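As mentioned above, removing the device-request modifier lets the container use a GPU when one is available. For completeness, a minimal sketch of explicitly requesting a GPU, assuming an NVIDIA GPU and the NVIDIA container toolkit on the host, could look like this. The DeviceRequest class comes from the docker-java API that Testcontainers builds on, and the snippet also needs java.util.List:

// Sketch: explicitly request one NVIDIA GPU instead of forcing CPU-only execution
// (assumes an NVIDIA GPU and the NVIDIA container toolkit are present on the host)
ollamaContainer = new OllamaContainer(OLLAMA_IMAGE)
  .withCreateContainerCmdModifier(cmd -> cmd.getHostConfig()
    .withDeviceRequests(List.of(new DeviceRequest()
      .withDriver("nvidia")
      .withCount(1)                                   // number of GPUs to expose
      .withCapabilities(List.of(List.of("gpu"))))));  // equivalent of docker run --gpus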
Towards the end, we call the OllamaContainer#execInContainer() method to pull the LLM we wish to use. For this example, we use the tinyllama:1.1b model. We can use other LLMs by replacing the value of the LLM variable in the program. However, while experimenting with other LLMs, we must consider our local environment's compute and storage capacity.
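Pulling the model on every test run can also be slow. To cache it, the Ollama module lets us commit the running container, with the model already pulled, to a local image that later runs can reuse. Here's a rough sketch, where tinyllama-ollama:latest is a hypothetical local image name and DockerImageName comes from the org.testcontainers.utility package:

// Sketch: bake the pulled model into a reusable local image (hypothetical image name)
String imageWithModel = "tinyllama-ollama:latest";
ollamaContainer.commitToImage(imageWithModel);

// Subsequent runs can start from the cached image and skip the pull,
// declaring it compatible with the official ollama/ollama image.
OllamaContainer cachedContainer = new OllamaContainer(
  DockerImageName.parse(imageWithModel).asCompatibleSubstituteFor("ollama/ollama"));
cachedContainer.start();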

After the test program completes, the cleanUp() method tears down the environment by calling the OllamaContainer#stop() method to stop the container.
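Alternatively, instead of stopping the container ourselves, we could delegate the lifecycle to the Testcontainers JUnit 5 extension. A minimal sketch, assuming the separate org.testcontainers:junit-jupiter dependency is on the test classpath and using the @Testcontainers and @Container annotations from the org.testcontainers.junit.jupiter package:

// Sketch: lifecycle managed by the Testcontainers JUnit 5 extension
// (requires the org.testcontainers:junit-jupiter test dependency)
@Testcontainers
class OllamaContainerExtensionLiveTest {

    @Container
    static OllamaContainer ollamaContainer = new OllamaContainer("ollama/ollama:latest")
      .withCreateContainerCmdModifier(cmd -> cmd.getHostConfig().withDeviceRequests(null));

    // The extension starts the shared container before the tests run
    // and stops it automatically afterwards, so no @AfterAll cleanup is needed.
}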

Finally, let's look at the test program that invokes the LLM with a prompt:

@Test
void test() throws IOException, InterruptedException {
    String prompt = """
      Context:
      The sun is a star located at the center of our solar system.
      It provides the Earth with heat and light, making life possible.
      The sun is composed mostly of hydrogen and helium
      and generates energy through nuclear fusion.
      Question: What two gases make up most of the sun?
      Instructions:
      Please answer strictly from the context provided in the prompt
      and no other additional information should be provided.
      Also, keep the answer short and concise.
      """;
    // run the model with the prompt via the ollama CLI inside the container
    Container.ExecResult execResult =
      ollamaContainer.execInContainer("ollama", "run", LLM, prompt);
    assertEquals(0, execResult.getExitCode(), "Exit code should be 0");
    logger.info("Exec Result: {}", execResult.getStdout());
}

The test invokes the OllamaContainer#execInContainer() method to run the ollama CLI, which sends the prompt to the underlying LLM service. The prompt strictly instructs the LLM to stick to the provided context while answering the question about the gases found in the sun. Further, the test checks the resulting exit code to confirm successful execution. Finally, it prints the output by calling the ExecResult#getStdout() method:

Exec Result: Answer: Most of the sun is made up of hydrogen and helium gas.
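Instead of shelling into the container, we could also call the Ollama HTTP API directly from the test. A minimal sketch using Java's built-in HttpClient against the /api/generate endpoint, assuming the model has already been pulled as shown earlier and with imports of java.net.URI and java.net.http.HttpClient, HttpRequest, and HttpResponse:

// Sketch: query the model over HTTP rather than through the ollama CLI
HttpClient client = HttpClient.newHttpClient();
String body = """
  {"model": "tinyllama:1.1b", "prompt": "What two gases make up most of the sun?", "stream": false}
  """;
HttpRequest request = HttpRequest.newBuilder()
  .uri(URI.create(ollamaContainer.getEndpoint() + "/api/generate"))  // mapped host and port
  .header("Content-Type", "application/json")
  .POST(HttpRequest.BodyPublishers.ofString(body))
  .build();
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
logger.info("HTTP Response: {}", response.body());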

Conclusion

In this article, we learned about the Testcontainers Ollama module. This module reduces the cost of running LLMs during the development stage. Moreover, it saves development effort by allowing developers to experiment with open-source LLMs in their local environment.

Visit our GitHub repository to access the article's source code.
