A Jetson Orin Nano single-board computer on a home lab desk with a monitor in the background showing an AI model benchmark comparison chart, cool ambient LED lighting, no people.

Frontier Models Are Overkill for Most Production Workloads

Topics: AI Models, Open Source, Ollama, Production AI, Infrastructure The trading bot running on my Jetson Orin Nano uses llama3.2:3b for its daily summary task. Not because it was the first model I tried. deepseek-r1:14b at 9GB does not fit the 7.4GB unified memory pool. llama3.1:8b mostly fits and crashes at the edge. llama3.2:3b stays stable at roughly 2GB and writes the summary well. The model writes one paragraph per day: what position the bot holds, what the P&L is, what the trailing stop did. It does that task well. The fact that it is several capability tiers below GPT-5.5 does not show up anywhere in the output. ...

May 12, 2026 · Mario Martinez Jr.

The Ethical AI Company Billed You for Using Competitor Tools

Topics: Anthropic, Claude Code, AI Ethics, Billing, Vendor Trust Anthropic’s detection logic found “hermes.md” in a user’s git commit history. The user was on the $200/month Claude Max plan with 86% of their usage allocation untouched and no active session running. Anthropic billed $200.98 in extra charges. When the user reported it, support acknowledged the billing error three times and refused the refund. The post reached 1.4 million views. Anthropic then issued the refund plus one month of credit. ...

May 10, 2026 · Mario Martinez Jr.
A home developer workstation at night with dual monitors showing API routing configuration code and a cost comparison spreadsheet, dim desk lamp, personal lab setup, no people.

Claude Code on DeepSeek: 17x Cheaper

Topics: Claude Code, DeepSeek, AI Costs, Developer Tools, Open Source Claude Code’s tool ecosystem and the model it runs on are two separate things. A project called DeepClaude treats them that way. DeepClaude intercepts API calls from Claude Code and routes them to DeepSeek V4 instead of Anthropic’s models. The tool layer, file editing, bash execution, session context, autonomous loops, stays intact. The inference backend changes. The cost difference is approximately 17x. ...

May 7, 2026 · Mario Martinez Jr.
An empty hospital emergency triage station at night, medical monitors showing vital signs, a diagnostic computer terminal on the desk, dim clinical lighting, no people present.

AI Outperformed ER Doctors in a Harvard Trial

Topics: AI, Healthcare, Clinical Trials, Emergency Medicine Listen to this article Harvard ran a controlled trial of AI performance in emergency triage and published the results this week. The AI outperformed emergency physicians on diagnostic accuracy. Most of the conversations that follow a result like this focus immediately on liability. That conversation is worth having. It is not the most important one. What Emergency Triage Actually Tests Emergency triage is decision-making under a specific set of conditions: incomplete information, time pressure, high consequence, and compounding cognitive load from case volume. A physician who has seen 40 patients in a shift is making probabilistic judgments under fatigue in a way that a physician at the start of a shift is not. ...

May 6, 2026 · Mario Martinez Jr.

The 47 Percent Debugging Skill Drop

Topics: AI Coding Agents, Developer Skills, Claude Code, Software Engineering Anthropic published research this year showing that developers who leaned heavily on AI coding agents experienced a 47% drop in debugging skills. The finding that made it uncomfortable is in the same document: supervising an AI coding agent effectively requires the exact debugging skills that atrophy from using one. You need the skill to catch what the agent gets wrong. Using the agent is what costs you the skill. ...

May 5, 2026 · Mario Martinez Jr.
A terminal monitor in a dark server room displaying API pricing comparison data in green text, server rack hardware with blinking LEDs in the background, dim ambient lighting.

DeepSeek V4 Broke the Pricing Argument

Topics: AI Models, Open Source, Enterprise Costs, API Pricing Claude Opus 4.7 costs $5 per million input tokens and $25 per million output tokens. GPT-5.5 is $5 input and $30 output. DeepSeek V4, released as open weights on Friday, costs $1.74 input and $3.48 output, runs a 1 million token context window, and scores within a few benchmark points of both on math and Q&A. The pricing argument for closed frontier models just got harder to make. ...

May 3, 2026 · Mario Martinez Jr.

I Built a Trading Bot That Runs Its LLM on a Jetson in My Closet

Topics: Python, Alpaca, Ollama, Jetson Orin Nano, Trading Automation The trading bot watches XNDU every five minutes during market hours. XNDU is a photonic quantum computing company. Photonics means room temperature operation. The cooling infrastructure that makes quantum computing prohibitively expensive at scale is not part of the design. XNDU had solid financials this week and got upgraded to a strong buy. I queued 100 paper shares for the 9:31 AM open on April 30, 10% trailing stop, $5,000 position cap. ...

April 30, 2026 · Mario Martinez Jr.
Dark terminal screen showing a root shell prompt, with kernel source code diff on a second monitor, lit only by cold monitor glow.

CVE-2026-31431: The Optimization That Opened Root

A 732-byte Python script dropped today gives any unprivileged user a root shell on every mainstream Linux distribution running a kernel built after 2017. No race condition. No kernel-specific offsets. A straight logic flaw in code that has been shipping on your servers, your CI/CD runners, and your cloud instances for eight years. The vulnerability is CVE-2026-31431. The researchers named it Copy Fail. Here is what happened. The AF_ALG Interface In 2003, the Linux kernel crypto API grew a socket interface: AF_ALG. The idea was sound — expose kernel crypto primitives to userland without requiring applications to link against third-party crypto libraries. You open an AF_ALG socket, set the algorithm, feed it data, get results back. Clean separation between userland and kernel. ...

April 29, 2026 · Mario Martinez Jr.
Terminal screen displaying x86 assembly opcodes alongside their mnemonics, with one instruction highlighted as the current execution point.

TryHackMe: x86 Assembly Crash Course

Author: Mario Martinez Jr. (ku5e / Gary7) | TryHackMe USA Rank #76 | Top 1% Difficulty: Easy Topics: x86 Assembly, Opcodes, MOV/LEA/NOP, Arithmetic Instructions, Logical Instructions, Flags, Conditionals, Branching, Stack Operations, Function Calls Link: x86 Assembly Crash Course on TryHackMe Answers are redacted within the narrative to allow you to complete the tasks on your own, but a full table of answers is available at the end of this walkthrough. Assembly is the lowest level of human-readable language and the highest level a compiled binary can be reliably decompiled to. When you open a malware sample in Ghidra or x64dbg, you are reading assembly. There is no layer above it. This room covers the instructions you will see on every analysis: MOV, LEA, NOP, ADD, SUB, XOR, CMP, TEST, JMP, PUSH, POP, and CALL. Complete the x86 Architecture Overview room first if you have not already. ...

April 13, 2026 · Mario Martinez Jr.
Debugger terminal displaying x86-64 register values with a coffee mug and handwritten notes in the foreground.

TryHackMe: x86 Architecture Overview

Author: Mario Martinez Jr. (ku5e / Gary7) | TryHackMe USA Rank #76 | Top 1% Difficulty: Easy Topics: CPU Architecture, x86 Registers, Memory Layout, Stack Analysis, Malware Analysis Fundamentals Link: x86 Architecture Overview on TryHackMe Answers are redacted within the narrative to allow you to complete the tasks on your own, but a full table of answers is available at the end of this walkthrough. This room gives you the mental model that makes malware analysis readable. Before you open a binary in Ghidra or step through a sample in x64dbg, you need to know what the CPU is actually doing with its registers and memory. The room covers Von Neumann architecture, x86 registers from EAX down to the segment registers, the four-section memory layout, and the stack. It takes about an hour. If you plan to do any serious reverse engineering, that hour is not optional. ...

April 12, 2026 · Mario Martinez Jr.