What AI Actually Needs to Translate Well
By Volkan Güvenç, Founder — Alafranga Language Solutions
There is a common assumption that AI translation quality depends primarily on the model. In practice, it depends mostly on what you feed into it.
We have been running AI-assisted workflows since 2018 — first with neural MT post-editing, then with the controlled drafting approach we now call SmartEdit. The clearest lesson from that experience: a well-configured AI with good data beats a better model with no data, every time.
Here is what that data actually consists of.
▮Translation memories (TMs)
A translation memory stores every segment you have ever approved — matched to its source, tagged with domain, client, and date. When AI drafts against a populated TM, it is not working from general training data. It is working from your decisions.
For a client like Solplanet, where we have been running solar energy documentation across 20+ languages for four years, the TM is a record of thousands of terminology decisions made in context. An AI draft that ignores that history will produce output that is linguistically correct but contextually wrong.
TM integration is not optional. It is what separates controlled AI output from generic MT.
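To make the idea of drafting against a TM concrete, here is a minimal sketch of a lookup that finds the closest approved segment for a new source sentence. It is an illustration only: the `TMEntry` fields, the `best_match` helper, the 0.75 threshold, the client names, and the sample segments are all hypothetical, and the character-level similarity from Python's `difflib` is a stand-in for the proprietary fuzzy-match scoring that CAT tools use.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class TMEntry:
    """One approved segment (hypothetical structure, for illustration)."""
    source: str
    target: str
    domain: str
    client: str
    date: str  # ISO date on which the segment was approved


def best_match(segment, memory, client, threshold=0.75):
    """Return (score, entry) for the closest approved segment of this client,
    or None if nothing reaches the threshold."""
    best = None
    for entry in memory:
        if entry.client != client:
            continue  # only reuse decisions made for the same client
        score = SequenceMatcher(None, segment.lower(), entry.source.lower()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, entry)
    return best


# Invented sample data, not taken from any real project
memory = [
    TMEntry("Switch off the inverter before servicing.",
            "Schalten Sie den Wechselrichter vor der Wartung aus.",
            "solar", "client-a", "2023-05-02"),
    TMEntry("Press the emergency stop.",
            "Betätigen Sie den Not-Halt.",
            "machinery", "client-b", "2022-11-14"),
]

match = best_match("Switch off the inverter before cleaning.", memory, "client-a")
if match is not None:
    score, entry = match
    print(f"{score:.0%} match, approved translation: {entry.target}")
```

A match found this way can be passed to the drafting step as context, so the draft starts from an approved decision rather than from general training data.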
▮Glossaries and termbases
A glossary is not a dictionary. It is a list of decisions — which term is approved, which is not, which variant is used in which context.
For regulated content, this matters operationally. A machinery manual where "emergency stop" has been translated consistently across twelve documents needs to stay consistent in the thirteenth. The AI does not know that unless you tell it.
We maintain client-specific glossaries and update them after every project delivery. In memoQ and SDL Trados, these are enforced at the segment level: the AI draft is checked against the termbase before it reaches the reviewer.
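As a rough sketch of what a segment-level term check does, the example below flags a draft that misses the approved term or uses a rejected variant. It does not reproduce the QA logic of memoQ or Trados. The `TermEntry` structure, the `check_terms` helper, and the sample glossary entry are assumptions made for illustration, and plain substring matching ignores inflection, which a real termbase check would need to handle.

```python
import re
from dataclasses import dataclass, field


@dataclass
class TermEntry:
    """One glossary decision (hypothetical structure, for illustration)."""
    source_term: str
    approved: str
    forbidden: list = field(default_factory=list)


def contains(term, text):
    # Case-insensitive literal match; no stemming or inflection handling
    return re.search(re.escape(term), text, re.IGNORECASE) is not None


def check_terms(source, draft, glossary):
    """Return a list of term issues for one source/draft segment pair."""
    issues = []
    for entry in glossary:
        if not contains(entry.source_term, source):
            continue  # term does not occur in this segment
        if not contains(entry.approved, draft):
            issues.append(f"missing approved term: {entry.approved}")
        for variant in entry.forbidden:
            if contains(variant, draft):
                issues.append(f"forbidden variant used: {variant}")
    return issues


# Invented sample entry: one approved term and two rejected variants
glossary = [TermEntry("emergency stop", "Not-Halt", ["Notstopp", "Not-Aus"])]

issues = check_terms(
    "Press the emergency stop before opening the guard.",
    "Drücken Sie den Not-Aus, bevor Sie die Schutzhaube öffnen.",
    glossary,
)
for issue in issues:
    print(issue)
```

Running the check before review means the reviewer sees the term conflict as a flagged item instead of having to remember the decision from the previous twelve documents.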
▮Style guides
AI output without style guidance is fluent but anonymous. It has no voice.
A style guide tells the AI — and the reviewer — whether the client uses formal or informal register, British or American spelling, active or passive construction, numbered lists or prose. For brands with a specific tone, this is not cosmetic. A customer-facing product interface that suddenly shifts register breaks trust.
The shortest style guide we work with is two pages. The longest is forty. Both are used.
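Some style guide rules can be written down as data and checked automatically. The sketch below covers only the spelling-variant rule, and only for English target text. The `StyleProfile` class, the `style_issues` helper, and the three-word spelling table are hypothetical. Register, voice, and tone rules generally need a human reviewer or a far more capable checker than this.

```python
import re
from dataclasses import dataclass


@dataclass
class StyleProfile:
    """A tiny, hypothetical slice of a client style guide."""
    spelling: str  # "en-GB" or "en-US"


# Illustrative word pairs only; a real list would come from the client
US_TO_GB = {"color": "colour", "center": "centre", "organize": "organise"}


def style_issues(text, profile):
    """Flag words spelled in the variant the profile does not allow."""
    if profile.spelling == "en-GB":
        wrong = dict(US_TO_GB)  # US spelling -> preferred GB spelling
    else:
        wrong = {gb: us for us, gb in US_TO_GB.items()}
    issues = []
    for word in re.findall(r"[A-Za-z]+", text.lower()):
        if word in wrong:
            issues.append(f"spelling: use '{wrong[word]}' instead of '{word}'")
    return issues


profile = StyleProfile(spelling="en-GB")
issues = style_issues("Select the color of the center panel.", profile)
for issue in issues:
    print(issue)
```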
▮Bilingual reference files
Past deliverables — XLIFF, TMX, bilingual DOCX, SRT subtitle files — carry structural and contextual information that segment-level TM does not capture. How a table was handled. How a warning label was formatted. How a legal clause was broken across lines.
These files serve as practical benchmarks, particularly for document types that appear infrequently. When a client sends a new CE compliance filing after eighteen months, the reference file from the previous one is worth more than any general guidance.
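For readers who have not opened one of these files, the sketch below reads source and target pairs from a small TMX sample using Python's standard XML parser. The sample content and the `read_pairs` helper are invented for illustration. Real exports carry more metadata and often contain inline formatting tags, which this example flattens to plain text.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample in the TMX 1.4 layout: one translation unit, two languages
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Disconnect the power supply.</seg></tuv>
      <tuv xml:lang="de"><seg>Trennen Sie die Stromversorgung.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) text pairs for the requested language codes."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() joins text around any inline tags inside <seg>
                segs[tuv.get(XML_LANG, "").lower()] = "".join(seg.itertext())
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs


pairs = read_pairs(SAMPLE_TMX, "en", "de")
for source, target in pairs:
    print(source, "->", target)
```

Pairs extracted this way can be shown to the reviewer next to the new draft, which is often the quickest way to see how a rare document type was handled last time.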