Infra

A Jetson Orin Nano single-board computer on a home lab desk with a monitor in the background showing an AI model benchmark comparison chart, cool ambient LED lighting, no people.

Frontier Models Are Overkill for Most Production Workloads

Topics: AI Models, Open Source, Ollama, Production AI, Infrastructure

The trading bot running on my Jetson Orin Nano uses llama3.2:3b for its daily summary task. Not because it was the first model I tried. deepseek-r1:14b at 9GB does not fit the 7.4GB unified memory pool. llama3.1:8b mostly fits and crashes at the edge. llama3.2:3b stays stable at roughly 2GB and writes the summary well. The model writes one paragraph per day: what position the bot holds, what the P&L is, what the trailing stop did. It does that task well. The fact that it is several capability tiers below GPT-5.5 does not show up anywhere in the output. ...
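The memory reasoning in this excerpt can be sketched as a simple fit check. This is only an illustration: the model sizes and the 7.4GB pool come from the excerpt, while the 1.5GB headroom (for the OS, KV cache, and the bot itself) and the function names are assumptions, not measurements from the post.

```python
# Unified memory pool on the Jetson Orin Nano, as quoted in the excerpt.
POOL_GB = 7.4

# Assumption: memory reserved for the OS, KV cache, and the bot process.
# The post does not give this number; 1.5 is a placeholder.
HEADROOM_GB = 1.5

# Approximate model sizes quoted in the excerpt.
MODEL_SIZES_GB = {
    "deepseek-r1:14b": 9.0,
    "llama3.2:3b": 2.0,
}


def fits(model_gb, pool_gb=POOL_GB, headroom_gb=HEADROOM_GB):
    """Return True if the model plus headroom fits in the memory pool."""
    return model_gb + headroom_gb <= pool_gb


def usable_models(sizes):
    """Return the names of models that fit with headroom to spare."""
    return [name for name, gb in sizes.items() if fits(gb)]


if __name__ == "__main__":
    print(usable_models(MODEL_SIZES_GB))
```

A model whose weights alone nearly fill the pool passes a naive size check but leaves nothing for the headroom, which is one plausible reading of "mostly fits and crashes at the edge".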

A home developer workstation at night with dual monitors showing API routing configuration code and a cost comparison spreadsheet, dim desk lamp, personal lab setup, no people.

Claude Code on DeepSeek: 17x Cheaper

Topics: Claude Code, DeepSeek, AI Costs, Developer Tools, Open Source

Claude Code’s tool ecosystem and the model it runs on are two separate things. A project called DeepClaude treats them that way. DeepClaude intercepts API calls from Claude Code and routes them to DeepSeek V4 instead of Anthropic’s models. The tool layer (file editing, bash execution, session context, autonomous loops) stays intact. The inference backend changes. The cost difference is approximately 17x. ...

A terminal monitor in a dark server room displaying API pricing comparison data in green text, server rack hardware with blinking LEDs in the background, dim ambient lighting.

DeepSeek V4 Broke the Pricing Argument

Topics: AI Models, Open Source, Enterprise Costs, API Pricing

Claude Opus 4.7 costs $5 per million input tokens and $25 per million output tokens. GPT-5.5 is $5 input and $30 output. DeepSeek V4, released as open weights on Friday, costs $1.74 input and $3.48 output, runs a 1 million token context window, and scores within a few benchmark points of both on math and Q&A. The pricing argument for closed frontier models just got harder to make. ...
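To make the per-token prices concrete, here is a minimal cost calculation using the figures quoted in this excerpt. The workload size (100M input tokens, 20M output tokens) is a hypothetical chosen for illustration, and the function names are not from the post.

```python
# USD per million tokens (input, output), as quoted in the excerpt.
PRICES = {
    "Claude Opus 4.7": (5.00, 25.00),
    "GPT-5.5": (5.00, 30.00),
    "DeepSeek V4": (1.74, 3.48),
}


def workload_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in USD for a workload, given per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def compare(input_tokens, output_tokens):
    """Return the workload cost for every model in PRICES."""
    return {
        name: workload_cost(input_tokens, output_tokens, in_price, out_price)
        for name, (in_price, out_price) in PRICES.items()
    }


if __name__ == "__main__":
    # Hypothetical monthly workload: 100M input tokens, 20M output tokens.
    for name, cost in compare(100_000_000, 20_000_000).items():
        print(f"{name}: ${cost:,.2f}")
```

Under that assumed workload the totals come to about $1,000 for Claude Opus 4.7, $1,100 for GPT-5.5, and $243.60 for DeepSeek V4. The ratio shifts with the input/output mix, because the gap is much wider on output tokens than on input tokens.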

I Built a Trading Bot That Runs Its LLM on a Jetson in My Closet

Topics: Python, Alpaca, Ollama, Jetson Orin Nano, Trading Automation

The trading bot watches XNDU every five minutes during market hours. XNDU is a photonic quantum computing company. Photonics means room temperature operation. The cooling infrastructure that makes quantum computing prohibitively expensive at scale is not part of the design. XNDU had solid financials this week and got upgraded to a strong buy. I queued 100 paper shares for the 9:31 AM open on April 30, with a 10% trailing stop and a $5,000 position cap. ...
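The two risk controls mentioned here, a 10% trailing stop and a $5,000 position cap, can be sketched in a few lines. This is a local illustration under stated assumptions, not the bot's actual code and not Alpaca's order API: the `TrailingStop` class, `capped_shares` helper, and the example prices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrailingStop:
    """Tracks the highest price seen and a stop a fixed percentage below it."""

    trail_pct: float   # e.g. 0.10 for a 10% trailing stop
    high_water: float  # highest price seen so far (starts at the entry price)

    @property
    def stop_price(self):
        return self.high_water * (1 - self.trail_pct)

    def update(self, price):
        """Record a new price; return True if the stop has been hit."""
        self.high_water = max(self.high_water, price)
        return price <= self.stop_price


def capped_shares(desired, price, cap_usd):
    """Limit the share count so that shares * price stays within the cap."""
    return min(desired, int(cap_usd // price))


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    stop = TrailingStop(trail_pct=0.10, high_water=20.0)
    for price in (21.0, 22.0, 20.5, 19.0):
        print(price, round(stop.stop_price, 2), stop.update(price))
    print(capped_shares(100, 60.0, 5_000))
```

The stop only ratchets upward: a new high raises the stop price, while a pullback leaves it where it was until the price falls through it.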