Case Studies · Mar 2026 · ~10 min read

Case Study: Drafting 4 Technical Specs in 25 Minutes—From a Coffee Shop, on Mobile

When diagnosis and execution separate, your phone becomes a logic-layer workstation

If you're new here

QEEK is a codebase-aware workspace: Architect Chat reasons over your repositories, and you can turn those threads into durable specs, briefs, and implementation plans.

This post isn't a product tour—it's a show-your-work case study. The spec excerpts are what we actually wrote before opening a full IDE.

How this post was made

The storyline—and the two figures you'll see below—started as a first draft from QEEK, in the same Architect session that produced the four technical specs. In other words: after the navigator finished spelunking the repos, we asked it to explain what had just happened, and it came back with prose plus quick visuals (the latte timeline and the diagnosis-vs-execution sketch).

What you're reading now is that draft after a normal editorial pass: contrast and spacing on the graphics, tighter headings, and the spec cards rewritten for scannability and verification—without pretending the session was more polished than it was.

Meta, but sincere: if your product can draft the post about the work it just helped do, the "documentation tax" starts to feel optional.

The scenario

It's a familiar developer moment: you're away from your desk with nothing but a phone—and work still shows up. Usually that means waiting for a laptop or squinting at an IDE. This time, the goal wasn't to ship commits from the couch; it was to turn four messy production issues into reviewable technical specs while the espresso machine screamed in the background.

Timeline from ordering a latte through four completed specs over 25 minutes
The same clock that bounded the coffee bounded the work: a short window forces clarity—what is the smallest spec that still lets a team implement without guesswork?

The challenge (four real bugs, four specs)

Each item below became its own spec: problem statement, proposed file-level edits, and a verification checklist—not marketing bullets.

  1. Architect chat "hang": broad questions drove tool calls, but the thread could end with no user-visible answer—until you nudged it to continue.
  2. The share wall: sharing required generated specs; chat-only sessions hit "No specs found."
  3. Sync ghost: retry-heavy jobs and strict pipeline gates produced misleading file counts and sometimes threw away work that had actually succeeded.
  4. Image-only attachments: PDFs and text artifacts never made it past the UI filter—so the model never saw them.

The philosophy: diagnosis vs. execution

The shift wasn't "I edited code on a phone." It was separating diagnosis from execution: keep your attention on intent, acceptance criteria, and failure modes—then let the navigator traverse the tree, read the noisy files, and propose a patch list you can reject or refine.

  • You (architect): why this matters, what "done" means, what we're not solving today.
  • AI (navigator): where in the repos, which configs, which services—and the concrete diffs you'll still review on a real screen.

Diagram comparing the Architect logic layer and the Navigator implementation layer
Two layers, one session: stolen time becomes a spec-shaped artifact, not a stash of half-remembered URLs.

Operating at the logic layer on mobile is viable when the output is a spec—because a spec is structured thought, not keystrokes.

What we actually wrote down

The following summaries are tightened from the four internal drafts produced in that 25-minute window. They're enough for an engineer to sanity-check direction before implementation.

1. Architect chat tool-call loop and hang

Broad architectural questions triggered many tool calls; synthesis could return no visible text; frontend/backend iteration limits didn’t always line up—so users saw a “hang” until they continued the thread.

Proposed direction

  • Raise backend maxSteps (e.g. toward 20) for primary agents + tighten prompts so the model synthesizes after smaller tool batches.
  • Harden streaming middleware: more descriptive synthesis fallback text; higher timeout; truncate oversized tool payloads before Gemini synthesis.
  • Raise frontend MAX_TOOL_ITERATIONS so the UI doesn’t bail early; if the stream ends with tools but no text, show an explicit “continue” affordance.

How we'd validate: Stress-test a broad question (“explain auth + projects across repos”); confirm 10+ tool calls can still end with an inline answer; force a no-text path and read the fallback.
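
The frontend half of that direction can be sketched as a small guard. The names here (`StreamResult`, `needsContinueAffordance`) are hypothetical, and the real AI SDK stream types are richer; this is a sketch of the detection logic, not the shipped code:

```typescript
// Hypothetical shape: the real stream result carries much more.
interface StreamResult {
  toolCallCount: number; // tool calls observed during the stream
  text: string;          // synthesized answer text, possibly empty
}

// Raised iteration cap so the UI doesn't bail early (value illustrative).
const MAX_TOOL_ITERATIONS = 20;

// The "hang" case: the stream ended with tool activity but produced no
// user-visible answer. Surface an explicit "continue" affordance instead
// of leaving a silent thread.
function needsContinueAffordance(result: StreamResult): boolean {
  return result.toolCallCount > 0 && result.text.trim().length === 0;
}
```

Wired into the chat panel, a `true` return would render the continue affordance instead of letting the thread end silently.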

2. Chat-only sharing in Architect

Share flows assumed at least one spec artifact; chat-only sessions threw “No specs found” and disappeared from Library-oriented views.

Proposed direction

  • specBoardService: remove the hard error when no specs exist; derive title from the session and use the first user message as board-visible content.
  • architectService: stop filtering out sessions with zero specs/mockups/diagrams; descriptions should still read well from chat.
  • UI: share dialog + Library cards tolerate “0 files” (e.g. “Chat only”) and deep-link back into Architect cleanly.

How we'd validate: confirm no Firestore schema migration is needed (spec records may simply leave product/tech fields empty while keeping sourceChatId), then walk invite + “Shared with me” + reopen chat history.
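
A minimal sketch of the specBoardService fallback; `ChatSession` and `deriveBoardContent` are illustrative names under stated assumptions, not the real service API:

```typescript
// Hypothetical session shape; real records carry more metadata.
interface ChatSession {
  title?: string;
  messages: { role: "user" | "assistant"; content: string }[];
}

// When a session has no spec artifacts, derive a board title from the
// session and use the first user message as board-visible content,
// instead of throwing "No specs found."
function deriveBoardContent(session: ChatSession): { title: string; content: string } {
  const firstUserMessage = session.messages.find((m) => m.role === "user");
  return {
    title: session.title ?? firstUserMessage?.content.slice(0, 60) ?? "Untitled chat",
    content: firstUserMessage?.content ?? "",
  };
}
```

The hard error becomes a graceful fallback: every session yields something board-visible, even at "0 files."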

3. Reliable repo sync and idempotent progress

A sub-50% success gate could skip bulk load and discard embeddings that actually landed. Retried Cloud Run tasks double-counted progress in RTDB, inflating totals and breaking trust in sync stats.

Proposed direction

  • Workflow: always bulk-load when succeededCount > 0; surface partial completion via syncStats instead of pretending failure means “nothing to load.”
  • repo-file-processor: per-task index in RTDB (completedTasks/<index>); skip increments when the task was already counted; keep counters atomic.
  • Retention: park failed file reports under sync-logs/ (not silent delete) so engineers can debug outliers.

How we'd validate: Retry a flaky task index: counts move once. UI shows failedCount alongside completion. Successful embeddings survive even when many files error.

4. Non-image chat attachments

Architect chat filtered to images only—PDFs, markdown, and plain text never reached upload or model context.

Proposed direction

  • chat-panel + session hook: unified attachments pipeline (images vs generic files).
  • chatService: generalize upload; map to AI SDK parts—image, file (binary), or inline text for .md/.txt.
  • qeek-mastra: middleware/agent config accepts the expanded multimodal parts.

How we'd validate: Regression-test images plus PDF + .txt/.md; model references specific attachment content in replies.
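
A minimal sketch of the attachment-to-parts mapping; the `Attachment` and `Part` shapes here are assumptions, simplified from the AI SDK's actual message-part types:

```typescript
// Hypothetical attachment shape; real uploads carry more metadata.
interface Attachment {
  name: string;
  mimeType: string;
  data: string; // base64 for binary, raw text for .md/.txt
}

type Part =
  | { type: "image"; image: string }
  | { type: "file"; data: string; mimeType: string }
  | { type: "text"; text: string };

// Images stay image parts, plain text and markdown are inlined as text,
// everything else (e.g. PDFs) becomes a binary file part.
function toModelPart(a: Attachment): Part {
  if (a.mimeType.startsWith("image/")) {
    return { type: "image", image: a.data };
  }
  if (a.mimeType === "text/plain" || a.mimeType === "text/markdown") {
    return { type: "text", text: `Attachment ${a.name}:\n${a.data}` };
  }
  return { type: "file", data: a.data, mimeType: a.mimeType };
}
```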

Stolen-moment productivity

The win wasn't that a phone replaced a workstation. It's that ten quiet minutes—waiting, bored, offline-adjacent—became a high-fidelity architectural session because the navigator held the file graph in working memory.

You still land at your desk to implement. But you're not starting from a vague complaint in Slack; you're starting from a checklist someone can review.

Results (that session)

  • Time: ~25 minutes.
  • Artifacts: four specs with file touches, verification steps, and operational edge cases—not four paragraphs of vibes.
  • Hardware: smartphone only for drafting.
  • Location: a local coffee shop.

What still belongs at a desk

Specs compress ambiguity; they don't eliminate review. The next steps are normal engineering: diff size estimates, risk call on schema-adjacent flows, profiling anything touching sync, and proving the attachment path against your production MIME and size limits. The phone bought the clarity; the workstation still buys the merge.

Takeaways

  1. Credibility scales with specificity: readers trust file names, limits, and verification steps more than superlatives.
  2. Drafting beats typing on mobile: structured prompts toward specs beat trying to patch code through a glass keyboard.
  3. Keep the contract honest: show diagrams, show scope, show what still needs a human gate—the story gets more compelling, not less.

Closing

If your tooling can hold the shape of your system, the device in your hand becomes a place to think—not a place to fight your repo. The latte emptied; the stack traces didn't have to wait for Monday.

Next time you're out with only a phone, try ending the session with something another engineer could review. That's when the coffee shop stops being a detour—and becomes a studio.