LLM API service and gpt.ntnu.no

Table of Contents

    IDUN HPC cluster provides access to LLMs ( running locally ).

    Do you want web interface to work with LLM modes? Visit https://gpt.ntnu.no and read more here https://i.ntnu.no/wiki/-/wiki/Norsk/GPT+NTNU

    Do you need LLM API for research or development, or access to alternative web interfaces? The this page is for you.

    API details:
    Base URL: https://llm.hpc.ntnu.no/
    API provider: LiteLLM or OpenAI compatible
    API Key: send email to help@hpc.ntnu.no to create new.

    LLM models

    Updated: 2026-06-26

    Model nameInput formatCreated byCountryLicenseParametersContext WindowMax inputMax output
    moonshotai/Kimi-K2.6image and textMoonshot AIChinaModified MIT1000B26214417476287381
    NorwAI/NorwAI-Magistral-24B-reasoningtextNorwAI, NTNUNorwayNorLLM License by NTNU24B29491219660898304
    norallm/normistral-11b-thinkingtextNordic Language
    Processing Laboratory (NLPL)
    NorwayApache 2.011B983046553632768
    NbAiLab/borealis-27btextNational Library of NorwayNorwayNB-License27B1310708738043690
    openai/gpt-oss-120btextOpenAIUSAApache 2.0117B1310728738043690
    Qwen/Qwen3-Embedding-8BtextAlibaba CloudChinaApache 2.08B409602048020480
    Qwen/Qwen3.6-27B-FP8image and textAlibaba CloudChinaApache 2.027B26214419660865536
    nvidia/GLM-5.2-NVFP4textZ.aiChinaMIT753B20275213516867584
    mistralai/Mistral-Medium-3.5-128Bimage and textMistral AI SASFranceModified MIT128B26214417476287381
    MiniMaxAI/MiniMax-M3-MXFP8video and image and testMiniMaxChinaminimax-community428B26214417476287381

    NOTE 1: MiniMax-M3 license has commercial use limitations. Please read license before use.

    NOTE 2: context window = max_input_tokens + max_output_tokens. We configure 66% / 33%. For example for model mistralai/Mistral-Large-3-675B-Instruct-2512-NVFP4 with context window 294912 use:
    - max_input_tokens: 196608
    - max_output_tokens: 98304

    How to access

    IDUN LLM models are available from NTNU networks or NTNU VPN.

    You can use IDUN LLM models via Desktop AI applications like:
    - VS Code with extensions (Cline, Kilo Code, Zoo Code )
    - ZED
    - OpenCode
    - Claude Code

    Web chat/agent:
    - Open WebUI: https://idun-llm.hpc.ntnu.no
    - LibreChat: https://ai.hpc.ntnu.no/chat/
    Login with your NTNU short username.

    iPhone and Android applications like Apollo (tested)

    API key:
    We create personal API key for each user. Send e-mail to: help@hpc.ntnu.no

    See more details and examples in this document.

    Rate Limits

    We are launching gpt.ntnu.no and it is using the same LLM modes. API users can make LLMs slow.

    Please do not run heavy research LLM workloads during working hours. Best time for run big jobs:

    • works days between 18:00 – 06:00 (evening, night)
    • weekends without time limits

    3 LLM models with rate limits now:

    • openai/gpt-oss-120b
    • moonshotai/Kimi-K2.6
    • mistralai/Mistral-Large-3-675B-Instruct-2512-NVFP4

    Rate limits (defaults):

    • TPM: 300000 (tokens per minute)
    • RPM: 20 (requests per minute)

    Contact us if you need to change your rate limits.

    What about sensitive data?

    All LLM models on IDUN are running locally and data is not leaving NTNU network.

    • API calls go directly into the model, and they are not stored. (vLLM is started with "--no-enable-prefix-caching" and LLM proxy is started with "cache: False" option)
    • Use API with desktop AI applications.
    • Web interfaces Open WebUI and LibreChat have feature "Temporary Chat". These chats are not saved. "Temporary Chat" sis not enabled by default so users’ questions and answers are stored for user convenience. And user can delete saved conversations manually.
    • There is a plan to officially approve for "røde data", but we have not done the formal assessment yet.

    Temporary chat toggle is located in the top right corner:

    Usage statistics: https://ai.hpc.ntnu.no/stats

    Hardware:

    Server with 8 x B300 288GB (NVFP4 support):
    4 x B300 - moonshotai/Kimi-K2.6
    2 x B300 - nvidia/GLM-5.2-NVFP4
    Server with 8 x H200 141GB
    4 x H200 - MiniMaxAI/MiniMax-M3-MXFP8
    1 x H100 - 
    1 x H100 - openai/gpt-oss-120b
    2 x H100 - mistralai/Mistral-Medium-3.5-128B
    Server with 4 x A100 80GB
    1 x A100 - NbAiLab/borealis-27b
    1 x A100 - NorwAI/NorwAI-Magistral-24B-reasoning
    1 x A100 - google/gemma-4-31B-it
    1 x A100 - Qwen/Qwen3.6-27B-FP8
    Server with 1 x A100 40GB:
    1 x A100 - Qwen/Qwen3-Embedding-8B and intfloat/multilingual-e5-large-instruct

    We use "--kv-cache-dtype fp8" only in Gemma 4 because of the bug and in the Qwen3.6-27B-FP8 because we want to provide full context window on 80GB VRAM GPU.

    Gemma 4 has several unresolved issue with vLLM in production:
    - Gemma 4 hangs with long context: https://github.com/vllm-project/vllm/issues/40002 # --kv-cache-dtype fp8 solve this issue for now
    - Gemma 4 infinite loop answer: https://github.com/vllm-project/vllm/issues/40080 # it happens with original model, but it does not happen with RedHatAI/gemma-4-31B-it-FP8-block

    Web interface (Open WebUI)

    Open WebUI: https://idun-llm.hpc.ntnu.no
    LibreChat: https://ai.hpc.ntnu.no/chat/

    Open WebUILibreChat

    API (OpenAI Compatible)

    Send email and we'll generate new personal API token for you: help@hpc.ntnu.no

    https://llm.hpc.ntnu.no is a LLM gateway to access all LLMs on IDUN HPC cluster. It provides consistent openai compatible API.

    You can see all endpoints on this web page https://llm.hpc.ntnu.no/

    Visual Studio Code - extensions

    There are several popular open source VS Code extensions:
    - Cline (recommended)
    - Kilo Code
    - Zoo Code

    NOTE: Current experience with Roo Code. Extension is getting updates every 1-3 days. And that cases somtimes issue. Things that worked yesterday can stop working today. This was my experience with Roo Code and Mistral models. You can downgrade extension and stop auto update.

    NOTE 1: Some LLM models works better (no errors) with Cline some are better with Roo Code.

    NOTE 2: You can use API Provider "Open API compatible" or "LiteLLM". In most cases they are the same. But for example with Mistral Large 3 in Roo Code "Open API compatible" provider shows sometimes errors. This can be fixed in the next Roo Code update release.

    ClineRoo Code

    VS Code - built-in "agent mode" in Copilot Chat

    You can connect IDUN LLM models to Copilot Chat in VS Code.

    1.

    2.

    3.

    4.

    5. Example configuration:

    [
    	{
    		"name": "IDUN",
    		"vendor": "customendpoint",
    		"apiKey": "${input:chat.lm.secret.-YOUR SECRET HERE}",
    		"apiType": "chat-completions",
    		"models": [
    			{
    				"id": "nvidia/GLM-5.2-NVFP4",
    				"name": "nvidia/GLM-5.2-NVFP4",
    				"url": "https://llm.hpc.ntnu.no",
    				"toolCalling": true,
    				"vision": true,
    				"maxInputTokens": 135168,
    				"maxOutputTokens": 67584
    			},
    			{
    				"id": "mistralai/Mistral-Medium-3.5-128B",
    				"name": "mistralai/Mistral-Medium-3.5-128B",
    				"url": "https://llm.hpc.ntnu.no",
    				"toolCalling": true,
    				"vision": true,
    				"maxInputTokens": 174762,
    				"maxOutputTokens": 87381
    			},
    			{
    				"id": "moonshotai/Kimi-K2.6",
    				"name": "moonshotai/Kimi-K2.6",
    				"url": "https://llm.hpc.ntnu.no",
    				"toolCalling": true,
    				"vision": true,
    				"maxInputTokens": 174762,
    				"maxOutputTokens": 87381
    			},
    			{
    				"id": "MiniMaxAI/MiniMax-M3-MXFP8",
    				"name": "MiniMaxAI/MiniMax-M3-MXFP8",
    				"url": "https://llm.hpc.ntnu.no",
    				"toolCalling": true,
    				"vision": true,
    				"maxInputTokens": 174762,
    				"maxOutputTokens": 87381
    			}
    		]
    	}
    ] 

    Comparing LLM models undenstanding and coding capabilities. Text prompt (Source https://z.ai/blog/glm-4.7):

    Design a richly crafted voxel-art environment featuring an ornate pagoda set within a vibrant garden.
    Include diverse vegetation—especially cherry blossom trees—and ensure the composition feels lively, colorful, and visually striking.
    Use any voxel or WebGL libraries you prefer, but deliver the entire project as a single, self-contained HTML file that I can paste and open directly in Chrome.

    Desktop AI applications

    BYOK (Bring Your Own Keys) AI applications. Test several applications. Looking for features:

    • Application can connect to LLM models via API on IDUN
    • Open Source
    • Can work with local documents
    • Can use local web search
    • Can send images for recognition

    These applications was tested:

    Desktop AI applicationLicenseComment
    Open WebUIpermissive license
    with branding protection
    It can be installed on a local computer. And accessed
    via web browser http://localhost:8080.
    Uses embedding model to work with local documents.
    WitsyAGPL-3.0 licenseUses embedding model to work with local documents.
    Cherry StudioAGPL-3.0 licenseUses embedding model to work with local documents.
    Anything LLMMIT LicenseUses embedding model to work with local documents.
    ZEDOpen sourceZed is a minimal code editor with AI support out of the box.
    It is designed to work with code. It is not using embedding model
    but it understand question and finds answer in local files with LLM model.
    Visual Studio CodeMIT licenseVS Code is a code editor. It can connec to LLM API with
    extensions like Cline, Roo Code, Kilo Code....
    It is not using embedding model but it understand question
    and finds answer in local files with LLM model.
    Click to show: Configuration example local Open WebUI

    Install guide: https://docs.openwebui.com/getting-started/quick-start

    Settings location: Click User icon > Admin panel > Settings

    Screenshot main interface http://localhost:8080:

    Connect to IDUN LLMs:

    Configure search engine:

    Configure Embedding model for local documents:

    Add local documents to a knowledge base:

    Add image generation model:

    Click to show: Configuration example Witsy

    Main interface:

    Connect to IDUN LLM

    Add embedding model for local documents:

    Click to show: Configuration example Cherry Studio

    Main interface:

    Connect to IDUN LLM

    Configure web search engine:

    Add directory to knowledge base:

    Click to show: Configuration example Anything LLM

    Main interface

    Connect to IDUN LLM:

    Add embedding model:

    Configure web search:

    Add directory to knowledge base:

    Zed Editor

    Main interface:

    Connect IDUN LLM:

    iPhone (iOS) and Android application

    Example configuration application Apollo:

    API examples with curl and Python

    curl

    Test API token - get model list:

    curl https://llm.hpc.ntnu.no/v1/models -H "Authorization: Bearer sk-..MY..PESONAL..API..TOKEN.."

    Test chat response:

    curl https://llm.hpc.ntnu.no/v1/chat/completions -H "Authorization: Bearer sk-..MY..PESONAL..API..TOKEN.." -H "Content-Type: application/json" -d '{
        "model": "openai/gpt-oss-120b",
        "messages": [
          {"role": "user", "content": "Who are you?"}
        ]
      }'

    Example with curl command - embedding:

    curl https://llm.hpc.ntnu.no/v1/embeddings -H "Authorization: Bearer sk-..MY..PESONAL..API..TOKEN.." -H "Content-Type: application/json" -d '{
        "model": "Qwen/Qwen3-Embedding-8B",
        "input": ["hello world", "this is another sentence"]
      }'

    Python - openai module

    This example will use Python module openai. First create Python virtual environment and install openai module:

    python3 -m venv venv-openai
    source venv-openai/bin/activate
    pip install openai

    Create file chat-tools.py with code example with tool calling:

    import openai
    import json
    import datetime
    
    client = openai.OpenAI(
        base_url="https://llm.hpc.ntnu.no/v1",
        api_key="sk-..MY..PESONAL..API..TOKEN.."
    )
    
    def get_current_time():
        current_datetime = datetime.datetime.now()
        return f"Current Date and Time: {current_datetime}"
    
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_current_time",
                "description": "Get current date and time"
            },
        }
    ]
    
    response = client.chat.completions.create(
        model="openai/gpt-oss-120b",
        messages=[{"role": "user", "content": "What's the time right now?"}],
        tools=tools
    )
    
    # Process the response
    response_message = response.choices[0].message
    
    if response_message.tool_calls:
        for tool_call in response_message.tool_calls:
            function_name = tool_call.function.name
            function_args = json.loads(tool_call.function.arguments)
    
            if function_name == "get_current_time":
                time_info = get_current_time()
                print(f"Tool call executed: {function_name}() -> {time_info}")
            else:
                print(f"Unknown tool call: {function_name}")
    else:
        print(f"Model response (no tool call): {response_message.content}")

    Example output:

    $ python3 chat-tools.py
    Tool call executed: get_current_time() -> Current Date and Time: 2025-12-29 12:40:34.581986

    Python - litellm

    Example to with Python module litellm. First create Python virtual environment and install litellm module:

    python -m venv venv-litellm
    source venv-litellm/bin/activate
    pip install litellm

    Set environment variables:

    export LITELLM_MODEL=openai/openai/gpt-oss-120b
    export LITELLM_API_BASE=https://llm.hpc.ntnu.no/v1
    export LITELLM_API_KEY=sk-v...MY...API...KEY...Q

    Create script file question.py:

    import os
    import litellm
    
    MODEL = os.getenv("LITELLM_MODEL")
    API_BASE = os.getenv("LITELLM_API_BASE")
    API_KEY = os.getenv("LITELLM_API_KEY")
    
    response = litellm.completion(
        model=MODEL,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Where is located Nidarosdomen?"}
        ],
        api_base=API_BASE,
        api_key=API_KEY,
        temperature=0,
        drop_params=True,
    )
    
    print("### RESPONSE  ###")
    print(response)
    print("### CONTENT   ###")
    print(response.choices[0].message.content)
    print("### REASONING ###")
    print(response.choices[0].message.reasoning_content)

    Example output:

    $ python question.py
    ### RESPONSE  ###
    ModelResponse(id='chatcmpl-b278f109ed072fe8', created=1779180464, model='openai/gpt-oss-120b', object='chat.completion', system_fingerprint=None, choices=[Choices(finish_reason='stop', index=0, message=Message(content='Nidarosdomen, also known as Nidaros Cathedral, is located in the city of **Trondheim** in central **Norway**. It sits on the banks of the River Nidelva, right in the historic center of Trondheim, near the old city gate (Munkholmen) and the Archbishop?s Palace. The cathedral is a major landmark and pilgrimage site, built on the burial place of Saint\u202fOlav, Norway?s patron saint.', role='assistant', tool_calls=None, function_call=None, reasoning_content='The user asks: "Where is located Nidarosdomen?" They likely want location of Nidaros Cathedral (Nidarosdomen). It\'s in Trondheim, Norway. Provide answer.', provider_specific_fields={'refusal': None, 'reasoning': 'The user asks: "Where is located Nidarosdomen?" They likely want location of Nidaros Cathedral (Nidarosdomen). It\'s in Trondheim, Norway. Provide answer.', 'reasoning_content': 'The user asks: "Where is located Nidarosdomen?" They likely want location of Nidaros Cathedral (Nidarosdomen). It\'s in Trondheim, Norway. Provide answer.'}), provider_specific_fields={})], usage=Usage(completion_tokens=143, prompt_tokens=88, total_tokens=231, completion_tokens_details=None, prompt_tokens_details=None), service_tier=None)
    ### CONTENT   ###
    Nidarosdomen, also known as Nidaros Cathedral, is located in the city of **Trondheim** in central **Norway**. It sits on the banks of the River Nidelva, right in the historic center of Trondheim, near the old city gate (Munkholmen) and the Archbishop?s Palace. The cathedral is a major landmark and pilgrimage site, built on the burial place of Saint?Olav, Norway?s patron saint.
    ### REASONING ###
    The user asks: "Where is located Nidarosdomen?" They likely want location of Nidaros Cathedral (Nidarosdomen). It's in Trondheim, Norway. Provide answer.

    OpenCode - configuration example

    Instead of hard coding your API key in to the configuration file add your API key into the environment variables with command:

    export NTNU_API_KEY=sk-...Personal...API.KEY...

    Create config file: ~/.config/opencode/config.json

    {
        "provider": {
            "idun-llm": {
                "npm": "@ai-sdk/openai-compatible",
                "name": "NTNU LLM",
                "options": {
                    "baseURL": "https://llm.hpc.ntnu.no/v1",
                    "apiKey": "{env:NTNU_API_KEY}"
                },
                "models": {
                    "Qwen/Qwen3.5-122B-A10B-FP8": {
                        "name": "Qwen3.5 122B A10B FP8",
                        "modalities": {
                            "input": ["text", "image"],
                            "output": ["text"]
                        }
                    },
                    "mistralai/Mistral-Large-3-675B-Instruct-2512-NVFP4": {
                        "name": "Mistral Large 3 675B Instruct",
                        "modalities": {
                            "input": ["text", "image"],
                            "output": ["text"]
                        }
                    },
                    "zai-org/GLM-4.7-FP8": {
                        "name": "GLM 4.7 FP8"
                    },
                    "moonshotai/Kimi-K2.6": {
                        "name": "Kimi K2.6",
                        "modalities": {
                            "input": ["text", "image"],
                            "output": ["text"]
                        }
                    }
                }
            }
        }
    }

    Extended example with custom agents:

    {
      "$schema": "https://opencode.ai/config.json",
        "agent": {
          "custom_reproducible": {
            "mode": "primary",
            "temperature": 0,
            "top_p": 1.0,
            "top_k": 1,
            "min_p": 0.0,
            "presence_penalty": 0.0,
            "repetition_penalty": 1.0,
            "seed": 42,
            "prompt": "You are an expert software engineer."
          },
          "custom_qwen35_precise_coding": {
            "mode": "primary",
            "temperature": 0.6,
            "top_p": 0.95,
            "top_k": 20,
            "min_p": 0.0,
            "presence_penalty": 0.0,
            "repetition_penalty": 1.0,
            "prompt": "You are an expert software engineer."
          }
        },
        "provider": {
            "idun-llm": {
                "npm": "@ai-sdk/openai-compatible",
                "name": "IDUN LLM",
                "options": {
                    "baseURL": "https://llm.hpc.ntnu.no/v1",
                    "apiKey": "{env:NTNU_API_KEY}",
                    "timeout": 600000
                },
                "models": {
                    "mistralai/Mistral-Large-3-675B-Instruct-2512-NVFP4": {
                        "name": "Mistral Large 3 675B Instruct",
                        "temperature": true,
                        "modalities": {
                            "input": ["text", "image"],
                            "output": ["text"]
                        }
                    },
                    "zai-org/GLM-4.7-FP8": {
                        "name": "GLM 4.7 FP8",
                        "temperature": true
                    },
                    "moonshotai/Kimi-K2.6": {
                        "name": "Kimi K2.6",
                        "temperature": true,
                        "modalities": {
                            "input": ["text", "image"],
                            "output": ["text"]
                        }
                    },
                    "Qwen/Qwen3.5-122B-A10B-FP8": {
                        "name": "Qwen/Qwen3.5-122B-A10B-FP8",
                        "temperature": true,
                        "modalities": {
                            "input": ["text", "image"],
                            "output": ["text"]
                        }
                    }
                }
            }
        }
    }

    Claude Code - configuration example

    Install Claude Code. Instruction: https://code.claude.com/docs/en/quickstart.

    I used this commands on Mac:

    curl -fsSL https://claude.ai/install.sh | bash
    echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zshrc && source ~/.zshrc

    IMPORTANT (to start without subscription): only first time, add this environment variable before starting claude:

    export ANTHROPIC_AUTH_TOKEN="ollama"

    One model configuration

    Create config file settings.json in the newly created .claude directory in the home directory .claude/settings.json with these lines:

    {
    "env": {
      "ANTHROPIC_BASE_URL": "https://llm.hpc.ntnu.no",
      "ANTHROPIC_AUTH_TOKEN": "sk-...YOUR_API_KEY....",
      "API_TIMEOUT_MS": "3000000",
      "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": 1,
      "ANTHROPIC_MODEL": "zai-org/GLM-4.7-FP8",
      "ANTHROPIC_SMALL_FAST_MODEL": "zai-org/GLM-4.7-FP8",
      "ANTHROPIC_DEFAULT_SONNET_MODEL": "zai-org/GLM-4.7-FP8",
      "ANTHROPIC_DEFAULT_OPUS_MODEL": "zai-org/GLM-4.7-FP8",
      "ANTHROPIC_DEFAULT_HAIKU_MODEL": "zai-org/GLM-4.7-FP8"
      }
    }

    Test. Create directory "garden" and change directory:

    mkdir garden
    cd garden

    Start Claude Code:

    claude

    Multi-model - configuration:

    {
    "env": {
      "ANTHROPIC_BASE_URL": "https://llm.hpc.ntnu.no/",
      "ANTHROPIC_AUTH_TOKEN": "sk-...YOUR_API_KEY....",
      "API_TIMEOUT_MS": "3000000",
      "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": 1,
      "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS": "1",
      "ANTHROPIC_MODEL": "zai-org/GLM-4.7-FP8",
      "ANTHROPIC_DEFAULT_OPUS_MODEL": "moonshotai/Kimi-K2.6",
      "ANTHROPIC_DEFAULT_SONNET_MODEL": "zai-org/GLM-4.7-FP8",
      "ANTHROPIC_DEFAULT_HAIKU_MODEL": "mistralai/Devstral-Small-2-24B-Instruct-2512",
      "ANTHROPIC_CUSTOM_MODEL_OPTION": "Qwen/Qwen3.5-122B-A10B-FP8"
      }
    }

    NOTE: In this example Claude Code can automatically switch between models. For example for /plan mode it will switch to configured OPUS_MODEL.

    Mistral Vibe - configuration example

    Add you API key to the shell environment befor starting vibe.

    export OPENAI_API_KEY="your_api_key_here"

    Start vibe first time: type random API key, exit. This will create configuration file: ~/.vibe/config.toml. Edit config.toml

    # Modify this line to change default model:
    
    active_model = "zai-org/GLM-4.7-FP8"
    
    # Add this lines to add provider and models
    
    [[providers]]
    name = "IDUN"
    api_base = "https://llm.hpc.ntnu.no/"
    api_key_env_var = "OPENAI_API_KEY"
    api_style = "openai"
    backend = "generic"
    reasoning_field_name = "reasoning_content"
    project_id = ""
    region = ""
    
    [[models]]
    name = "zai-org/GLM-4.7-FP8"
    provider = "IDUN"
    alias = "zai-org/GLM-4.7-FP8"
    input_price = 0.4
    output_price = 2.0
    thinking = "off"
    auto_compact_threshold = 200000
    
    [[models]]
    name = "Qwen/Qwen3.5-122B-A10B-FP8"
    provider = "IDUN"
    alias = "Qwen/Qwen3.5-122B-A10B-FP8"
    input_price = 0.4
    output_price = 2.0
    thinking = "off"
    auto_compact_threshold = 200000

    Pi.dev (pi-mono)

    Pi is a minimal terminal coding harness. Install Node.js befor installing Pi.

    Node.js install: https://nodejs.org/en/download

    Install Pi: Follow steps on pi.dev or git:
    https://pi.dev
    https://github.com/badlogic/pi-mono/

    Create configuration file ~/.pi/agent/models.json

    Example:

    {
      "providers": {
        "NTNU IDUN HPC": {
          "baseUrl": "https://llm.hpc.ntnu.no/",
          "api": "openai-completions",
          "apiKey": "sk-B....YOUR...API...KEY.....Q",
          "models": [
            { "id": "nvidia/GLM-5.2-NVFP4" },
            { "id": "mistralai/Mistral-Medium-3.5-128B" },
            { "id": "MiniMaxAI/MiniMax-M3-MXFP8" },
            { "id": "moonshotai/Kimi-K2.6" },
            { "id": "Qwen/Qwen3.6-27B-FP8" }
          ]
        }
      }
    }

    Test prompts:

    Prompt - 3D lighthouse
    Design a richly crafted voxel-art environment featuring a lighthouse on an island. There should be seagulls and boats. Include diverse vegetation on the island - and ensure the composition feels lively, colorful, and visually striking. Use any voxel or WebGL libraries you prefer, but deliver the entire project as a single, self-contained HTML file that I can open directly in web browser.

    Prompt - 3D solar system:
    Create single HTML file with a 3d rotating solar system to open in web browser. Show planet text information when mouse is ower the planet. It will be possible to zoom and rotate with mouse.

    Prompt - 3D racing game:
    Design and create a 3D highway racing game. The game must feature 3D graphics in any style you choose. A Start Screen that allows the user to select the car they will use. The user may select from three potential options as follows: A Sports Car, A Sedan, An option of your choosing. Each Car must have realistic limitations on its performance (e.g., top speed, acceleration, handling), which should also be displayed graphically on the car selection screen. Once the car is selected and the game starts, the player's car will begin driving on a busy highway. The player must navigate through dynamic traffic, swerving between lanes to avoid collisions. There MUST be a visible "nitrous oxide" or speed boost effect when used, as well as functional damage implementation for the player's car from collisions. If the player successfully navigates through the traffic for a set distance/time, the level repeats with increased difficulty (e.g., denser traffic, higher speeds, adverse weather). If the player's car sustains critical damage or crashes, the vehicle becomes uncontrollable (e.g., spins out, rolls over), and the screen returns to the home screen following a 2-second black screen. You may use any library for this implementation, but it must be contained within a single script, and be able to be opened and played in the Chrome browser.

    Prompt - rotating 3D globe:
    Create a single HTML file that sets up a basic Three.js scene with a rotating 3D globe. The globe should have high detail (64 segments), use a placeholder texture for the Earth's surface, and include ambient and directional lighting for realistic shading. Implement smooth rotation animation around the Y-axis, handle window resizing to maintain proper proportions, and use antialiasing for smoother edges. Explanation: Scene Setup: Initializes the scene, camera, and renderer with antialiasing. Sphere Geometry: Creates a high-detail sphere geometry (64 segments). Texture: Loads a placeholder texture using THREE. TextureLoader. Material & Mesh: Applies the texture to the sphere material and creates a mesh for the
    globe. Lighting: Adds ambient and directional lights to enhance the scene's realism. Animation: Continuously rotates the globe around its Y-axis. Resize Handling : Adjusts the renderer size and camera aspect ratio when the window is resized.

    Raspberry PI - Connect to LLM API from outside campus network.

    Alternative to VPN is sshuttle command line tool. It allows to access remote network via SSH protocol.

    NOTE 1: sshuttle requires root or sudo access.

    NOTE 2: terminal with running sshuttle command should stay opened to keep connecting alive.

    Install sshuttle on Raspberry PI:

    sudo apt install sshuttle

    Connect for students:

    sudo sshuttle -r USERNAME@login.stud.ntnu.no 129.241.121.16/32

    Connect for employees:

    sudo sshuttle -r USERNAME@login.ansatt.ntnu.no 129.241.121.16/32

    129.241.121.16 is an IP address for llm.hpc.ntnu.no. So you will be able for reach https://llm.hpc.ntnu.no/… endpoints.

    RAG with Open WebUI

    Open WebUI: https://idun-llm.hpc.ntnu.no

    RAG - Retrieval-Augmented Generation. It is an AI framework that connects Large Language Models to external knowledge sources for example documents.

    Create new Folder in the Open WebUI:

    Git folder a name and upload documents:

    Open WebUI is using model: Qwen/Qwen3-Embedding-8B for embedding.

    Ask you question inside that folder:

    Scroll to Top