AI-ToolLab: LLM API Access

Overview

Members of Goethe University can obtain OpenAI-compatible API access to various Large Language Models (LLMs) via the AI-ToolLab.

This service enables experimental access to AI technologies in a protected environment and is provided by studiumdigitale.


Further Information about the Offer

General Information

Endpoints

We use LiteLLM as an API gateway to provide an OpenAI-compatible interface.
The endpoints are as follows:

Azure-compatible:
https://litellm.s.studiumdigitale.uni-frankfurt.de/

OpenAI-compatible:
https://litellm.s.studiumdigitale.uni-frankfurt.de/v1/

Authentication

For API access, you need an individual API key, which is provided to you upon request.

You can use this key in combination with the endpoint URL in your API requests.

Available Models

The API currently provides access to 65 different models from three hosting categories.

The offering includes a mix of commercial and open-source models in various sizes, optimized for different use cases.

The following models are currently available:

🇪🇺 Azure OpenAI (EU Data Zone)

Hosting: Microsoft Azure in the EU
Costs: Paid (budget-based)

The GU uses the Azure OpenAI Service in the EU Data Zone to provide access to commercial models from OpenAI. These models are generally powerful and offer a wide range of functions.

Models:

gpt-5
gpt-5-chat
gpt-5-mini
gpt-5-nano
GPT-5 model series, from models for demanding tasks down to very fast variants for simple ones
gpt-4.1
gpt-4.1-mini
gpt-4.1-nano
GPT-4.1 model series
gpt-4o
gpt-4o-mini
GPT-4o model series
o3
o3-mini
o3 model series
codex-mini
Coding model
text-embedding-3-large
text-embedding-3-small
Embedding models
🇩🇪 GWDG/KissKI

Hosting: in Germany at the GWDG via the KissKI project and their service chat-ai.academiccloud.de
Costs: Free under fair use conditions

Through the cooperation with GWDG and KissKI, we can offer a variety of open-source and commercial models. These models are hosted locally in Germany and offer high availability.
We have no influence on which models are offered or how many, as these are provided by the GWDG.
If a model becomes unavailable, we cannot guarantee that it will return.

Overview of the GWDG models: GWDG website

Models:

openai-gpt-oss-120b
Recommended: open-source model from OpenAI
mistral-large-3-675b-instruct-2512
Recommended: large new Mistral model
gemma-3-27b-it
medgemma-27b-it
Google’s Gemma 3, including a model adapted for medical scenarios
glm-4.7
devstral-2-123b-instruct-2512
qwen2.5-coder-32b-instruct
Coding models
internvl3.5-30b-a3b
Vision model
llama-3.1-sauerkrautlm-70b-instruct
llama-3.3-70b-instruct
meta-llama-3.1-8b-instruct
Llama models
qwen2.5-vl-72b-instruct
qwen3-235b-a22b
qwen3-30b-a3b-instruct-2507
qwen3-30b-a3b-thinking-2507
qwen3-32b
qwen3-coder-30b-a3b-instruct
qwen3-omni-30b-a3b-instruct
qwen3-vl-30b-a3b-instruct
qwq-32b
Various models from the Qwen2.5 and Qwen3 series with different focuses (reasoning, thinking, vision, etc.)
teuken-7b-instruct-research
deepseek-r1-distill-llama-70b
apertus-70b-instruct-2509
Other available models

🏛️ Goethe University / studiumdigitale

Hosting: Locally on a server at studiumdigitale at the GU
Costs: Free

Chat models:

llama3.1:8b
Latest Llama version
llama3:8b
Proven all-purpose model
llama2:7b
Small text model
mistral:7b
Compact language model
codellama:7b
Code-specialized
ollama_default
Standard model (Llama 3.1 8B)

Embedding models:

all-minilm:33m
bge-large:335m
bge-m3:567m
granite-embedding:278m
mxbai-embed-large:335m
nomic-embed-text:v1.5
paraphrase-multilingual:278m
snowflake-arctic-embed2:568m
snowflake-arctic-embed:335m

We update the list depending on availability, demand, and our capabilities.

Instructions: Retrieving current model information

Since the available models change regularly, it is recommended to retrieve the current information directly via the API.

For the following queries, you need basic technical knowledge of how to use APIs.

/models – Short model overview

Content: Simple list of all available model IDs
Usage: Quick overview of which models are available

Request via CURL:
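A minimal request sketch; the environment variable name AITOOLLAB_API_KEY is illustrative, substitute your individual API key:

```shell
# List the IDs of all currently available models.
curl -s https://litellm.s.studiumdigitale.uni-frankfurt.de/v1/models \
  -H "Authorization: Bearer $AITOOLLAB_API_KEY"
```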

Example response:
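The response follows the OpenAI model-list format; the sketch below is abbreviated and the model IDs shown are examples taken from the list above:

```json
{
  "object": "list",
  "data": [
    { "id": "gpt-4o-mini", "object": "model" },
    { "id": "qwen3-32b", "object": "model" },
    { "id": "llama3.1:8b", "object": "model" }
  ]
}
```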

/model/info – Detailed model information

Content: Complete information on all models including hosting and description
Usage: Decision support for model selection

Request via CURL:
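A minimal request sketch, analogous to the /models query above; the environment variable name AITOOLLAB_API_KEY is illustrative:

```shell
# Retrieve detailed information (hosting, description, costs) for all models.
curl -s https://litellm.s.studiumdigitale.uni-frankfurt.de/model/info \
  -H "Authorization: Bearer $AITOOLLAB_API_KEY"
```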

Example response (abbreviated):
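An illustrative sketch built from the fields listed below; the exact nesting and values may differ in the live response:

```json
{
  "data": [
    {
      "model_name": "llama3.1:8b",
      "model_info": {
        "description": "Latest Llama version",
        "hosted_by": "studiumdigitale, Goethe University Frankfurt",
        "input_cost_per_token": 0,
        "output_cost_per_token": 0
      }
    }
  ]
}
```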

Important information from the API response:

  • model_name: The exact name for API calls
  • description: Short description of the model
  • hosted_by: Hosting provider and location
  • input_cost_per_token/output_cost_per_token: Cost structure (0 = free)

Note: Hosting locations and data security

⚠️ Important note:
Pay special attention to the hosted_by field in the /model/info route, as there are three different hosting locations:

  • Azure OpenAI Service in EU Data Zone – Microsoft Azure (EU)
  • KissKI/GWDG in Göttingen – Germany (GWDG)
  • studiumdigitale, Goethe University Frankfurt – Local (Germany)

Depending on the hosting location, different data protection regulations and security guidelines apply. Choose the model that is suitable for your application according to the sensitivity of your data.

Note: Budget for the use of paid models

A starting budget of €50 is currently provided for each API key issued. The allocation of API keys is still in the testing phase. Further rules for budget usage (e.g., top-ups) and an option to view the remaining budget will follow shortly.

The use of GWDG and GU models is free of charge; only the Azure OpenAI models are subject to a fee.

Usage

You can use the API like a standard OpenAI API. For this you need your individual API key and the endpoint.

Usage in OpenAI-compatible tools and WebApps

If you are using an OpenAI-compatible tool that lets you configure an API key and an OpenAI proxy/server/endpoint, you can use your individual API key and the endpoint there. Try both endpoints (with and without /v1/ at the end) to see which one works.

Make sure that you only use the API key with trusted applications, as this grants access to your LLM API account.


Usage in different programming languages

You can use the API from any programming language that supports HTTP requests. Below are two Python examples: one with the requests library and one with the openai library.

Python with the requests library
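A minimal sketch of a chat completion call via plain HTTP; the API key, the model name, and the helper function names are placeholders:

```python
import requests

# Placeholders: substitute your individual API key.
API_KEY = "sk-..."
BASE_URL = "https://litellm.s.studiumdigitale.uni-frankfurt.de/v1"

def build_payload(prompt: str, model: str = "gpt-4o-mini") -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str, model: str = "gpt-4o-mini") -> str:
    """Send one chat message and return the assistant's reply text."""
    response = requests.post(
        f"{BASE_URL}/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json=build_payload(prompt, model),
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```

Any model ID from the lists above can be passed as the model argument.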
Python with the openai library

Other libraries

For examples of using the API in other programming languages or libraries, please visit the LiteLLM documentation.


Support

If you have any problems or questions, please email the AI-ToolLab team.