dubesor.de

Here are links to some of my current projects:

Guestbook

Dubesor LLM Benchmark table

Small-scale manual LLM performance comparison benchmark

NOPE - a Jailbreak puzzle

Can you get the bot to respond properly?

GPT-Chars

Pre-prompted Chatbot characters

LLM Chess

AI models playing chess
(using your own API keys)

Small experiments

Reasoning Token Usage

Long-Chain-of-Thought Reasoning Models and how "Thinking" affects token usage

LLM Chess tournament 2

Using minimalistic raw PGN movetext continuation

LLM Chess tournament

Single-elimination chess tournament between 15 AI models

Short Creative comparison (SOTA)

Respond to an LM Arena armchair critic

Impossible Instructions (SOTA)

Washing hands without arms

Impossible thinking task

Concede, or keep wasting tokens? (DeepSeek-R1 & Sonnet 3.7 Thinking)

Short RP example comparison (SOTA)

Robbing the gas station

0-Shot Web design UI Demo page 3

Creating a simple AI benchmarking website template, how meta

0-Shot Web design UI Demo page 2

Creating a Steins;Gate themed terminal page

0-Shot flappy bird game

(warning: bugs!)