
LoudScribe Under the Hood: How We Learn Your Voice

Yariv Levi·Feb 1, 2026·7 min read

Most people who use voice-learning AI tools have no idea what's actually happening when the system "learns" their voice. They run through an onboarding flow, answer a few questions, paste in some old posts, and a few days later the AI produces drafts that feel surprisingly like them. The process seems like magic, which is fine if you just want to use the tool, but a lot of the professionals we work with are curious or skeptical enough to want to look under the hood.

Phase 1 is the voice interview, and it's the part most people underestimate. The interview isn't onboarding friction. It's the most information-dense part of the whole process. We ask things like: What topics do you feel most qualified to have opinions on? When you read something you disagree with, is your first instinct to argue back publicly or to think it through privately? When you explain something complex to a colleague, do you reach for an analogy or walk through the logic step by step? What words feel natural to you professionally, versus words that feel like you're performing someone else's language? The answers build a map of your thinking style before we've analyzed a single piece you've published. We also use the interview to surface banned words and topics: things that are off-limits professionally, or that you find clichéd and cringe-worthy in your industry.
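To make that concrete, here's a deliberately simplified sketch of the kind of structured record the interview produces. The field names and values are illustrative, not our actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class InterviewProfile:
    """Simplified illustration of the voice interview's output."""
    expertise_topics: list[str]    # topics you're qualified to opine on
    disagreement_style: str        # argue publicly vs. think it through privately
    explanation_style: str         # analogy-first vs. step-by-step logic
    natural_vocabulary: list[str]  # words that feel like yours
    banned_words: set[str] = field(default_factory=set)   # industry clichés
    banned_topics: set[str] = field(default_factory=set)  # off-limits professionally

profile = InterviewProfile(
    expertise_topics=["B2B pricing", "sales ops"],
    disagreement_style="think it through privately",
    explanation_style="analogy-first",
    natural_vocabulary=["pipeline", "win rate"],
    banned_words={"synergy", "game-changer"},
)
```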

Phase 2 is writing sample analysis. If you give us 10-15 posts or articles you've written, we run them through an analysis pipeline that extracts patterns at three levels. Surface level: vocabulary range, sentence length distribution, punctuation habits, use of em-dashes versus commas versus parenthetical asides. Structural level: how you open posts, whether you typically build to a conclusion or state it upfront, how you handle evidence and examples, how often you use rhetorical questions. Conceptual level: the recurring ideas and frames you return to, the implicit assumptions in how you describe your industry, the relationship you seem to assume with your reader: expert-to-peer, mentor-to-mentee, challenger-to-status-quo. This last level is the hardest to quantify and the most important for generating content that sounds like you at the level of substance, not just style.
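As a rough illustration of the surface level, here's a toy version of the kind of features that pass extracts. Our real pipeline is more involved, but the shape is similar:

```python
import re
from statistics import mean, stdev

def surface_features(text: str) -> dict:
    """Toy surface-level pass: sentence length distribution and punctuation habits."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    n = max(len(sentences), 1)
    lengths = [len(s.split()) for s in sentences]
    return {
        "avg_sentence_len": mean(lengths) if lengths else 0,
        "sentence_len_stdev": stdev(lengths) if len(lengths) > 1 else 0.0,
        "em_dashes_per_sentence": text.count("\u2014") / n,   # em-dash character
        "parentheticals_per_sentence": text.count("(") / n,
        "questions_per_sentence": text.count("?") / n,
    }
```

The structural and conceptual levels can't be captured with counters like these; they rely on model-based analysis. But even crude surface statistics are enough to tell two writers apart surprisingly often.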

Phase 3 pulls in your LinkedIn history, if you grant access. This serves a different purpose than the writing samples you select. Writing samples are your best work: the things you chose to share as examples. Your LinkedIn post history is your actual track record, including the posts that underperformed, the angles that didn't land, the experiments. It tells us what you actually publish regularly, not what you're proud of. It's also the best source for learning your topic deduplication threshold: how long you typically wait before revisiting a subject, what angles you've already covered, where your blind spots might be.
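One illustrative piece of that: estimating how long you typically wait before revisiting a topic. A minimal sketch, assuming posts have already been labeled by topic in an earlier step:

```python
from collections import defaultdict
from datetime import date
from statistics import median

def revisit_gaps(posts: list[tuple[date, str]]) -> dict[str, float]:
    """Illustrative only: median days between returns to the same topic.
    `posts` is (publish_date, topic_label) pairs from your post history."""
    dates_by_topic = defaultdict(list)
    for published, topic in posts:
        dates_by_topic[topic].append(published)
    gaps = {}
    for topic, dates in dates_by_topic.items():
        dates.sort()
        if len(dates) > 1:
            gaps[topic] = median((b - a).days for a, b in zip(dates, dates[1:]))
    return gaps
```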

Phase 4 is where the continuous learning kicks in. Every time you edit a draft (changing a word, restructuring a paragraph, cutting something entirely) those edits are signals. A systematic word substitution (replacing "leverage" with "use" every single time) gets encoded into your vocabulary profile. A consistent structural pattern (you always move the conclusion to the top when drafts bury it) becomes an instruction to the generation pipeline. Every approval or rejection teaches the system about your threshold for specificity, your tolerance for nuance, your preferred degree of directness. After 20-30 posts, the drafts require measurably less editing than after 5-10. The system isn't being fine-tuned on your data in the model-training sense. It's building a richer and more structured representation of your preferences that guides generation at inference time.
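Here's a toy sketch of how one of those signals, the systematic word substitution, might be detected. The representation and threshold are simplified for illustration:

```python
from collections import Counter

def substitution_signals(edits: list[tuple[str, str]],
                         min_count: int = 3) -> dict[str, str]:
    """Illustrative: find word swaps you make consistently across drafts.
    `edits` is (draft_word, your_replacement) pairs extracted from edit diffs."""
    counts = Counter(edits)
    return {old: new for (old, new), n in counts.items() if n >= min_count}

# After a few drafts:
# substitution_signals([("leverage", "use")] * 4 + [("utilize", "use")])
# -> {"leverage": "use"}   (a one-off swap doesn't make the cut)
```

The point of the threshold is exactly the distinction in the paragraph above: a one-time edit might be noise, but the same substitution made draft after draft is a preference worth encoding.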

Phase 5 is cross-context adaptation. The same person sounds subtly different on LinkedIn versus X versus writing a newsletter. LinkedIn probably uses more business framing; the newsletter might be more personal; X is shorter and sharper. As you generate content across platforms, the system builds separate context profiles for each: same underlying voice, different register. This matters more than most people expect. A post calibrated perfectly for LinkedIn reads as overwrought on X. A great Twitter thread formatted for LinkedIn loses its rhythm.
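A simple way to picture the per-platform profiles: one shared voice, with a thin register layered on top. The fields and numbers below are illustrative, not our production values:

```python
from dataclasses import dataclass

@dataclass
class Register:
    """Per-platform overrides layered on the shared voice profile (illustrative)."""
    max_words: int
    framing: str
    formality: float  # 0 = casual, 1 = formal

REGISTERS = {
    "linkedin":   Register(max_words=300, framing="business", formality=0.7),
    "x":          Register(max_words=60,  framing="direct",   formality=0.3),
    "newsletter": Register(max_words=900, framing="personal", formality=0.5),
}

def generation_context(base_voice: dict, platform: str) -> dict:
    # Same underlying voice, platform-specific register on top.
    return {**base_voice, "register": REGISTERS[platform]}
```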

The Humanizer layer runs as a final pass before any draft surfaces for review. I get the most questions about this piece, and it's worth being precise about what it does. The Humanizer is not a detection-evasion tool. It's a quality filter that catches patterns that are artifacts of how AI language models work rather than expressions of your voice. Things like symmetric sentence structures that humans rarely produce naturally, transitional phrases that appear because they're statistically common in training data rather than because they're useful, hedge words added not because of genuine uncertainty but because the model learned to hedge. These patterns make AI-generated text feel slightly off even when readers can't name exactly why. Removing them produces prose that's more natural, because natural human writing is less statistically average, not more.
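As a caricature of the idea, here's a hand-coded filter that flags a few of those artifact patterns. The real Humanizer is learned from data rather than written as pattern lists, but this shows the category of thing it catches:

```python
import re

# Illustrative pattern lists; the actual filter is learned, not hand-coded.
STOCK_TRANSITIONS = ["furthermore", "moreover", "in today's fast-paced world",
                     "it's important to note that"]
REFLEXIVE_HEDGES = ["arguably", "it could be said"]

def humanizer_flags(text: str) -> list[str]:
    """Flag phrases that are artifacts of model statistics rather than voice."""
    lowered = text.lower()
    flags = [p for p in STOCK_TRANSITIONS + REFLEXIVE_HEDGES if p in lowered]
    # Symmetric structure check: "it's not X. It's Y." used more than once.
    if len(re.findall(r"\bit'?s not \w+[^.]*\. it'?s \w+", lowered)) > 1:
        flags.append("repeated 'not X, it's Y' symmetry")
    return flags
```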

What this isn't: fine-tuning a base model on your writing samples. The common assumption is that voice AI works by training a custom version of GPT or Claude on your posts until it produces outputs in your style. That's not what we do, for several reasons. Fine-tuning at the scale required for true voice differentiation is extremely expensive and slow. More importantly, it conflates style with substance. You don't want an AI that's learned to write sentences that sound like yours. You want a system that starts from your actual ideas and produces prose that expresses those ideas in your register. Your voice profile is structured context that shapes generation throughout the process, not training signal that changes the model's weights.
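Concretely, "structured context" means something closer to this sketch than to a training run. The keys and wording are invented for illustration:

```python
def build_generation_prompt(idea: str, voice: dict, register: dict) -> str:
    """Illustrative only: the profile steers generation as prompt context,
    never as a weight update. Keys are made-up, not our real schema."""
    return "\n".join([
        f"Write a {register['framing']} post of at most {register['max_words']} words.",
        f"Target sentence length: about {voice['avg_sentence_len']} words.",
        f"Words that sound like the author: {', '.join(voice['natural_vocabulary'])}.",
        f"Words to avoid: {', '.join(voice['banned_words'])}.",
        "",
        "Start from this idea; the substance comes from the author:",
        idea,
    ])

prompt = build_generation_prompt(
    idea="Most pricing advice ignores how deals actually close.",
    voice={"avg_sentence_len": 14,
           "natural_vocabulary": ["pipeline", "win rate"],
           "banned_words": ["synergy", "game-changer"]},
    register={"framing": "business", "max_words": 300},
)
```

Because the profile lives in context rather than in weights, it updates instantly when Phase 4 learns something new. No retraining cycle, no lag between your edits and the next draft.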

What AI can't do is have your experiences, carry your expertise, or form genuine opinions. The system can produce plausible-sounding commentary on any topic, but plausible and insightful are very different things. The quality ceiling for AI-assisted content is set entirely by the quality of the thinking you bring to it. When you give the system a strong perspective, grounded in real knowledge, and ask it to express that perspective in your voice, the results are good. When you give it a vague topic and ask it to come up with something interesting, the results are generic, regardless of how sophisticated the voice profile is.

This shapes how you should actually use the tool. The goal isn't to offload your thinking. It's to stop letting the difficulty of writing slow down the expression of thinking you've already done. There's a lot of genuine insight sitting in executives' heads that never makes it into the world because the writing process is too slow and too fragile to sustain consistently. That's what we're solving. The voice learning infrastructure is the mechanism for doing it without producing content that sounds like it came from a press release generator.


Yariv Levi

Founder of LoudScribe. Building AI that learns your voice so you can share your expertise without spending hours writing.
