Show HN: Local LLM Notepad – run a GPT-style model from a USB stick
github.com

What it is: A single 45 MB Windows .exe that embeds llama.cpp and a minimal Tk UI. Copy it (plus any .gguf model) to a flash drive, double-click on any Windows PC, and you're chatting with an LLM, with no admin rights, cloud, or network required.
Why I built it: Existing "local LLM" GUIs assume you can pip install, pass long CLI flags, or download GBs of extras.
I wanted something my less-technical colleagues could run during a client visit by literally plugging in a USB drive.
How it works: A PyInstaller one-file build bundles the Python runtime, llama_cpp_python, and the UI into a single PE.
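A one-file build like that can be driven from PyInstaller's Python entry point; this is a sketch only, and the script name `main.py` and exe name are assumptions, not the project's actual values:

```python
# Hypothetical build script using PyInstaller's documented Python API.
# Entry point "main.py" and the exe name are placeholders.
import PyInstaller.__main__

PyInstaller.__main__.run([
    "--onefile",    # bundle runtime + deps into a single PE
    "--noconsole",  # GUI app: suppress the console window
    "--name", "LocalLLMNotepad",
    "main.py",
])
```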
On first launch, it memory-maps the .gguf; subsequent prompts stream at ~20 tok/s on an i7-10750H with gemma-3-1b-it-Q4_K_M.gguf (0.8 GB).
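The streaming side can be sketched like this, assuming llama_cpp_python's completion-style call (`stream=True` yielding `{"choices": [{"text": ...}]}` chunks); the generator is decoupled from the real `Llama` object so any callable with that shape works:

```python
def stream_tokens(llm, prompt, max_tokens=256):
    """Yield text pieces as the model produces them.

    `llm` is any callable with llama_cpp_python's completion shape:
    invoked with stream=True, it yields dicts like
    {"choices": [{"text": "..."}]}.
    """
    for chunk in llm(prompt, max_tokens=max_tokens, stream=True):
        yield chunk["choices"][0]["text"]

# Real usage would look roughly like (not run here):
# from llama_cpp import Llama
# llm = Llama(model_path="gemma-3-1b-it-Q4_K_M.gguf", use_mmap=True)
# for piece in stream_tokens(llm, "Hello"):
#     print(piece, end="", flush=True)
```

`use_mmap=True` is what gives the cheap first launch: the weights are paged in on demand rather than read up front.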
Tick-driven render loop keeps the UI responsive while llama.cpp crunches.
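A minimal sketch of that tick loop, assuming the usual worker-thread-plus-queue pattern (the 30 ms interval is a guess, not the app's actual value); only the non-blocking drain is shown as runnable code, with the Tk wiring in comments:

```python
import queue

TICK_MS = 30  # UI poll interval; an assumed value


def drain(q):
    """Pull every token currently waiting in the queue, without blocking."""
    parts = []
    while True:
        try:
            parts.append(q.get_nowait())
        except queue.Empty:
            return "".join(parts)

# Tk wiring (sketch): the llama.cpp worker thread put()s tokens, and the
# UI thread re-schedules itself with after(), so typing stays responsive:
#
# def tick(root, text_widget, q):
#     piece = drain(q)
#     if piece:
#         text_widget.insert("end", piece)
#     root.after(TICK_MS, tick, root, text_widget, q)
```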
A parser bold-underlines every token that originated in the prompt; Ctrl+click pops a “source viewer” to trace facts. (Helps spot hallucinations fast.)
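The idea behind that parser can be approximated at word level (the real app presumably works on model tokens, so this is a stand-in, not its implementation):

```python
import re


def mark_prompt_words(prompt, output):
    """Tag each word of `output` with whether it appeared in `prompt`.

    Word-level stand-in for the token-level parser: returns
    (word, came_from_prompt) pairs the UI could render bold-underlined,
    making it easy to see which parts of an answer are grounded.
    """
    prompt_words = {w.lower() for w in re.findall(r"\w+", prompt)}
    return [(w, w.lower() in prompt_words)
            for w in re.findall(r"\w+", output)]
```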
> walk up to any computer
Windows users seem to think their OS is ubiquitous. But in fact for most hackers reading this site, using Windows is a huge step backwards in productivity and capability.
The numbers say otherwise, though: Windows sits at 70%+ of desktop market share globally versus 4.1% for Linux. https://gs.statcounter.com/os-market-share/desktop/worldwide
Surely you're hinting at Linux, in which case this runs fine with WINE
Interesting, will definitely try it. What performance can be expected? What other models perform OK with this?
Why not llamafile? Runs on everything from toothbrushes to toasters...
Wonder if you can use/interface with those Coral accelerator boards.