# The computer-assisted translation process
The computer-assisted translation (CAT) process involves a series of structured steps designed to leverage technology in producing a final translation [49](#page=49).
### 1.1 Overview of the CAT process
CAT tools, also known by the acronym TAO in Spanish and French (*traducción asistida por ordenador*, *traduction assistée par ordinateur*), are designed to streamline the translation workflow. Establishing a routine and following defined steps is recommended for achieving the final translation effectively [49](#page=49).
### 1.2 Key stages in the CAT process
The CAT process can be broken down into several distinct stages, typically involving initial preparation, the core translation work, and post-translation refinement.
#### 1.2.1 Initial preparation and segmentation
* **File format check:** The process begins with verifying the compatibility and format of the source files [50](#page=50).
* **Resource assignment:** Relevant translation resources, such as translation memories (TMs) and termbases, are assigned to the project [50](#page=50).
* **Segmentation:** The source text is divided into smaller units, usually sentences or phrases, referred to as segments. This segmentation is crucial as it forms the basis for matching and translation within the CAT tool [50](#page=50) [51](#page=51).
#### 1.2.2 The translation phase
* **Translation:** Translators work on each segment, utilizing the CAT tool's features for assistance. This typically involves matching segments against existing translation memories and applying machine translation suggestions if available [50](#page=50).
#### 1.2.3 Post-translation and refinement
* **Resource update:** After the translation work is completed, the translation memory is updated with the new translations generated during the project [52](#page=52).
* **Revision:** A thorough revision of the translated segments is performed to ensure accuracy, consistency, and adherence to project guidelines [52](#page=52).
* **TM generation:** The final translation memory is generated from the revised segments so that it can be reused in future projects [52](#page=52).
* **Final review:** A final review of the entire translated document is conducted to catch any remaining errors or inconsistencies before delivery [52](#page=52).
> **Tip:** The effectiveness of the CAT process is significantly enhanced by having well-maintained and comprehensive translation memories and termbases [50](#page=50) [52](#page=52).
---
# Understanding translation memory files and their importance
Translation memory (TM) files are fundamental tools for translators, significantly enhancing efficiency and ensuring consistency in the translation process. These files, when loaded into a translation environment tool (CAT or TEnT tool), allow translators to leverage previously translated content [75](#page=75).
### 2.1 The importance of translation memory files
#### 2.1.1 Improving efficiency
TM files boost translator efficiency by automatically identifying and presenting previously translated segments. If a sentence or phrase in the current project has been translated before, the CAT tool will alert the translator to the existing translation, saving time and effort [75](#page=75).
#### 2.1.2 Ensuring consistency
A critical function of TM files is to maintain consistency across translations, especially when working on multiple projects for various clients. By utilizing "client-based" or "project-based" TM files, translators can ensure that specific terminology and phrases are applied accurately and uniformly throughout their work [75](#page=75).
### 2.2 Key industry standard file types: TMX and XLIFF
TMX (Translation Memory eXchange) and XLIFF (XML Localization Interchange File Format) are both industry-standard, XML-based file types used in translation workflows. While they share commonalities, including some inline markup elements, they possess distinct structures and purposes [76](#page=76).
#### 2.2.1 Differences between TMX and XLIFF
* **Purpose:** XLIFF was designed to store extracted text and facilitate data transfer within the localization process, whereas TMX was developed specifically for exchanging translation memory data between different tools [76](#page=76).
* **Language Support:** TMX can accommodate an unlimited number of languages within a single document, while XLIFF is structured for one source and one target language [76](#page=76).
* **Inline Code Handling:** TMX exclusively uses encapsulation methods for inline codes (where native codes are enclosed within elements). XLIFF, however, supports both encapsulation and a placeholder method, where native codes are removed to a separate "Skeleton file" and replaced by referencing elements similar to OpenTag's [76](#page=76).
* **File Structure and Reconstruction:** In TMX files, the collection of `<tu>` (translation unit) elements lacks a specific order and does not include mechanisms to rebuild the original source file. XLIFF, conversely, offers enhanced capabilities for reconstructing or rebuilding the original file [76](#page=76) [78](#page=78).
* **Additional Data Types:** XLIFF incorporates extra data types and fields not found in TMX, such as pretranslation, history, versioning, and binary objects [76](#page=76).
* **Temporal Data:** TMX files have the capability to store time and date data at the translation unit level, a feature not present in XLIFF files [76](#page=76).
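
The TMX properties listed above (unordered `<tu>` elements, several languages per unit, a timestamp on each unit) can be illustrated with a minimal sketch. It assumes the TMX 1.4 element and attribute names; the tool name, header values, and sample sentences are made up for illustration and do not come from any real project.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(units, srclang="en"):
    """Build a minimal TMX tree from (timestamp, {language: text}) pairs."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-tool",      # hypothetical tool name
        "creationtoolversion": "0.1",        # hypothetical version
        "segtype": "sentence",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": srclang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for timestamp, variants in units:
        # Each translation unit carries its own creation date.
        tu = ET.SubElement(body, "tu", creationdate=timestamp)
        # One <tuv> per language; the number of languages is not limited.
        for lang, text in variants.items():
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return tmx


tmx = build_tmx([
    ("20240101T120000Z", {
        "en": "Save the file.",
        "fr": "Enregistrez le fichier.",
        "de": "Speichern Sie die Datei.",
    }),
])
print(ET.tostring(tmx, encoding="unicode"))
```

Nothing in the resulting tree records where a segment sat in the source document, which is why a TMX file alone cannot be used to rebuild the original file.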
#### 2.2.2 Which format is better: TMX or XLIFF?
Both TMX and XLIFF are robust and widely supported formats by most translation software tools. The choice between them often depends on the specific project requirements or the translation tool being used. In many cases, translators may not need to "choose" as they can export their translation memory in either format from their CAT tool [77](#page=77).
> **Tip:** Regardless of the format chosen, using translation memory is vastly more beneficial than not using it at all [77](#page=77).
#### 2.2.3 Preferences for TMX and XLIFF
When given the choice for a new translation project, some translators prefer TMX for two primary reasons:
* **Time-Stamped Translation Units:** TMX allows for time stamping of translation units, which enables later productivity analysis of the work performed [78](#page=78).
* **Multiple Target Languages:** TMX can store multiple target languages within a single file [78](#page=78).
On the other hand, XLIFF is considered more powerful if the ability to reconstruct or rebuild the original file from the TM file is a significant requirement [78](#page=78).
---
# Translation-related file formats and resources
This section delves into file formats and resources crucial for managing and exchanging translation data.
### 3.1 Translation file formats
Translation workflows often involve various file formats designed to store and transfer linguistic data between different tools and stakeholders.
#### 3.1.1 TermBase eXchange (TBX)
* **Description:** TBX, also known as DXLT (Default XLT format), is a standard for exchanging terminology data [92](#page=92).
* **Purpose:** It facilitates the transfer of glossaries between translation tools [92](#page=92).
* **Underlying Standard:** The format is based on ISO 12200: MARTIF (Machine-Readable Terminology Interchange Format) [92](#page=92).
### 3.2 Terminology management resources
Effective terminology management is vital for maintaining consistency and accuracy in translations.
#### 3.2.1 Standards-based Access service to multilingual Lexicons and Terminologies (SALT)
* **Description:** SALT is a resource provided by BYU for maintaining organized terminology [92](#page=92).
### 3.3 Further reading and references
The document references several sources for further exploration of translation-related formats and concepts.
* Gargallo Cherta, Esther. *Guía de formatos para la traducción* [95](#page=95).
* Moorkens, Joss. *The Role of Metadata in Translation Memories* [95](#page=95).
* Information on Localization-Related Formats [95](#page=95).
* Resources from the World Wide Web Consortium (W3C) [95](#page=95).
---
# Internationalization and localization of websites
Internationalization and localization of websites are critical processes for adapting online content to diverse global audiences, ensuring both technical compatibility and cultural relevance.
### 4.1 Internationalization (i18n) principles
Internationalization refers to the design and development of a website in a way that facilitates its easy adaptation to various languages and cultural preferences. The core principles ensure a foundational readiness for localization efforts.
#### 4.1.1 Unicode standard
Utilizing the Unicode standard is paramount for ensuring compatibility with a wide array of writing systems. This allows for the accurate representation of characters and symbols from diverse languages.
#### 4.1.2 Separation of content and code
A fundamental principle is to maintain a clear separation between website content and its underlying source code. This practice enables simpler and more efficient translation processes, as translators can work with content files without needing to make extensive modifications to the code itself.
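
As a rough illustration of this separation, the sketch below keeps all user-facing strings in a per-locale catalog and lets the code refer to them only by key. The catalog layout, the keys, and the `translate` helper are hypothetical; real sites typically load such catalogs from resource files rather than from a dictionary in the source.

```python
# Translatable content lives in data, not in the program logic.
MESSAGES = {
    "en": {"greeting": "Welcome, {name}!", "cart": "Your cart"},
    "es": {"greeting": "¡Bienvenido, {name}!", "cart": "Tu carrito"},
}


def translate(key, locale, default_locale="en", **params):
    """Look up a message by key, falling back to the default locale."""
    catalog = MESSAGES.get(locale, {})
    template = catalog.get(key) or MESSAGES[default_locale][key]
    return template.format(**params)


print(translate("greeting", "es", name="Ana"))
print(translate("cart", "fi"))  # no Finnish catalog yet, so English is used
```

Adding a new language then means adding a catalog entry, with no change to `translate` or to the code that calls it.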
#### 4.1.3 Flexible user interface (UI) design
Designing a flexible user interface is essential to accommodate varying text lengths that can occur across different languages. Furthermore, it allows for the adaptation to languages that may have different reading directions, such as right-to-left scripts.
#### 4.1.4 Adaptation of date, time, and number formats
For cultural relevance and user experience, it is crucial to adapt date, time, and number formats to align with locale-specific conventions.
> **Tip:** Different regions have distinct ways of writing dates (e.g., MM/DD/YYYY vs. DD/MM/YYYY) and using decimal separators, which must be accounted for.
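
The sketch below shows one way such conventions might be applied. The `LOCALE_FORMATS` table is a small hand-written assumption covering three locales for illustration; production code would normally rely on a maintained locale database instead of a table like this.

```python
from datetime import date

# Illustrative conventions only; not a complete or authoritative table.
LOCALE_FORMATS = {
    "en-US": {"date": "%m/%d/%Y", "decimal": ".", "group": ","},
    "en-GB": {"date": "%d/%m/%Y", "decimal": ".", "group": ","},
    "de-DE": {"date": "%d.%m.%Y", "decimal": ",", "group": "."},
}


def format_date(value, locale):
    """Render a date using the pattern assumed for the locale."""
    return value.strftime(LOCALE_FORMATS[locale]["date"])


def format_number(value, locale):
    """Render a number with two decimals and locale-specific separators."""
    conv = LOCALE_FORMATS[locale]
    text = f"{value:,.2f}"  # e.g. "1,234,567.89"
    # Swap separators through a temporary marker so they do not collide.
    text = text.replace(",", "\x00").replace(".", conv["decimal"])
    return text.replace("\x00", conv["group"])


day = date(2024, 3, 5)
for locale in LOCALE_FORMATS:
    print(locale, format_date(day, locale), format_number(1234567.891, locale))
```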
#### 4.1.5 Culturally neutral images and icons
The selection of images and icons should aim for cultural neutrality. Alternatively, providing region-specific alternatives ensures that visual elements are appropriate and inclusive for different audiences.
### 4.2 Localization (L10n) process for websites
Localization is the process of adapting a website for a specific region or language by modifying it to reflect the local language, culture, and technical requirements.
#### 4.2.1 Translation of content
This involves converting all textual and multimedia elements into the target language. It requires not only linguistic accuracy but also a deep understanding of linguistic nuances and cultural sensitivities to convey the intended meaning effectively.
#### 4.2.2 Adaptation of graphics and multimedia
Beyond text, images, videos, and other multimedia assets need to be reviewed and adapted to ensure they are culturally appropriate and resonate with the target audience. This might involve modifying imagery or even choosing entirely different visual content for different regions.
#### 4.2.3 Adjustment of layout and design
The layout and design of a website may need to be adjusted to accommodate variations in text length, font styles, and other language-specific considerations. For example, longer text in one language might require more space on the page than in another.
#### 4.2.4 Integration of local regulations
A crucial aspect of localization is ensuring compliance with legal requirements and local regulations. This includes adherence to content restrictions, privacy laws, and accessibility standards specific to the target market.
#### 4.2.5 Testing and quality assurance
Rigorous testing is essential to guarantee the functionality, linguistic accuracy, and cultural appropriateness of the localized website. This step ensures that the website performs as intended and meets user expectations in the target market.
### 4.3 Web localization vs. other audiovisual products
While the fundamental principles of localization are consistent across various mediums, websites present distinct challenges compared to static products like applications or games.
#### 4.3.1 Dynamic content
Websites often feature dynamic content that requires real-time updates. This characteristic makes the localization process more complex than for static products, as content needs to be updated and localized promptly.
#### 4.3.2 SEO considerations
Effective localization of metadata, keywords, and tags is critical for search engine optimization (SEO). Proper localization directly impacts a website's visibility and ranking in search results within different regions.
#### 4.3.3 Cultural sensitivity
As websites often serve as the public face of an organization, meticulous attention to cultural nuances is required. This is vital to prevent misunderstandings or unintentional offense, which can damage brand reputation.
#### 4.3.4 Continuous updates
Websites are subject to frequent updates and revisions. This necessitates ongoing localization efforts to ensure that the content remains current and culturally relevant for the target audience over time.
---
# The role of humans in machine translation processes
Humans play a crucial role in various stages of the machine translation (MT) process, working alongside MT engines to optimize translation quality and efficiency. The primary human interventions are pre-editing and post-editing.
### 5.1 Pre-editing
Pre-editing involves revising technical documentation *before* it undergoes machine translation. The goal is to improve the source text in order to enhance the quality of the raw MT output. Effective pre-editing can significantly reduce or even eliminate the need for post-editing.
#### 5.1.1 The pre-editor's role
Ideally, a specialized human editor performs pre-editing. This editor analyzes the text from the perspective of an MT engine, anticipating potential errors in the machine-generated translation. The editor then modifies the source text to facilitate MT by:
* Reducing sentence length.
* Avoiding complex or ambiguous syntactic structures.
* Ensuring term consistency.
* Using articles appropriately.
> **Tip:** Thinking like an MT engine is key to effective pre-editing. Consider how the system might misinterpret certain structures or words.
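
One of the checks above, sentence length, is easy to automate. The following is a minimal sketch; the 20-word limit and the naive punctuation-based sentence splitting are assumptions chosen for illustration, not thresholds taken from the source.

```python
import re


def long_sentences(text, max_words=20):
    """Return the sentences whose word count exceeds max_words."""
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if len(s.split()) > max_words]


sample = (
    "Press the button. "
    "When the indicator light that is located on the left side of the front "
    "panel starts to blink, wait until the device has completed the restart "
    "before you open the cover."
)
for sentence in long_sentences(sample):
    print(len(sentence.split()), "words:", sentence)
```

A pre-editor would treat each hit as a candidate for splitting into shorter sentences, not as an automatic error.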
#### 5.1.2 Tools and techniques in pre-editing
Beyond structural text manipulation, pre-editors utilize automated revision tools. These include:
* **Spell-checking:** Verifying the source text against a project-specific glossary.
* **Grammar-checking:** Employing advanced grammar-checking tools.
* **Tagging untranslatable elements:** Identifying and marking parts of the source document that should not be translated by the MT engine.
> **Example:** A pre-editor might identify a specific brand name or a code that should remain in its original form and tag it accordingly, preventing the MT from attempting to translate it.
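
Tagging of this kind can be sketched as a simple protect-and-restore step around the MT call. The `[[DNT0]]` placeholder syntax, the function names, and the sample terms are hypothetical; actual MT engines and CAT tools each have their own mechanism for marking non-translatable content.

```python
import re


def protect_terms(text, terms):
    """Replace protected terms with numbered placeholders."""
    if not terms:
        return text, {}
    mapping = {}
    # Try longer terms first so a longer term wins over one of its substrings.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in ordered))

    def swap(match):
        token = f"[[DNT{len(mapping)}]]"  # hypothetical placeholder format
        mapping[token] = match.group(0)
        return token

    return pattern.sub(swap, text), mapping


def restore_terms(text, mapping):
    """Put the original terms back after machine translation."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    return text


source = "Install Acme Cloud with the ERR_42 flag."
protected, mapping = protect_terms(source, ["Acme Cloud", "ERR_42"])
print(protected)
print(restore_terms(protected, mapping))
```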
#### 5.1.3 Broader applications of pre-editing principles
The techniques employed in pre-editing are also beneficial for traditional human translation projects. Organizations producing extensive multilingual content often integrate similar practices into their localization workflows. Writing source material with MT facilitation in mind from the outset can lead to substantial improvements in overall quality and productivity downstream.
### 5.2 Post-editing
Post-editing is the other significant human role in MT processes, occurring after the MT engine has generated a translation. It typically involves a human linguist reviewing and correcting the MT output.
> **Tip:** Understanding the principles of pre-editing can inform how one approaches post-editing, as the issues addressed during pre-editing are the same ones a post-editor would likely encounter.
---
# Controlled language rules for writing
Controlled language rules aim to improve clarity and reduce ambiguity in written communication, particularly in technical and academic contexts. These rules focus on constructing sentences that are easy to understand, translate, and process. The following sections detail specific rules designed to achieve this objective.
### 6.1 Principles of sentence construction
This section outlines fundamental principles for constructing clear and unambiguous sentences, focusing on noun usage, article application, word choice, and spelling.
#### 6.1.1 Noun repetition versus pronoun use
A key rule in controlled language is to repeat nouns rather than using pronouns to refer back to them. This direct repetition eliminates potential confusion that can arise from pronoun antecedents, especially in complex sentences or during translation.
* **Write:** You must check the spelling of your text before you publish your text.
* **Do not write:** You must check the spelling of your text before publishing it.
> **Tip:** While pronouns are common in everyday writing, their overuse can lead to ambiguity in controlled language. Prioritizing noun repetition ensures that the subject of a sentence remains explicitly clear.
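
A rule like this can be supported by a simple checker that flags candidate pronouns for review. The sketch below is an assumption-heavy illustration: the pronoun list is deliberately tiny, and a plain word match cannot tell whether a given pronoun is actually ambiguous.

```python
import re

# Illustrative list only; a real style checker would be more thorough.
PRONOUNS = {"it", "they", "them"}


def find_pronouns(sentence):
    """Return the listed pronouns that occur in the sentence, in order."""
    words = re.findall(r"[A-Za-z']+", sentence.lower())
    return [word for word in words if word in PRONOUNS]


print(find_pronouns("You must check the spelling of your text before publishing it."))
print(find_pronouns("You must check the spelling of your text before you publish your text."))
```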
#### 6.1.2 Using articles to identify nouns
Controlled language mandates the use of articles (such as 'a', 'an', 'the') to clearly identify nouns. This practice distinguishes between general and specific references and ensures that nouns are explicitly introduced and understood, preventing omissions that could lead to misinterpretation.
* **Write:** Test the installation.
* **Do not write:** Test installation.
> **Example:** In a procedural document, specifying "Test the valve" is clearer than "Test valve" because the article "the" indicates a specific valve that should be tested.
#### 6.1.3 Using general dictionary words
To promote universal understanding and simplify translation, controlled language encourages the use of words commonly found in general dictionaries. This avoids jargon, specialized terminology, or archaic phrasing that might not be widely understood or easily translatable.
* **Write:** Avoid ambiguity.
* **Do not write:** Eschew obfuscation.
> **Tip:** When in doubt about a word's common usage or translatability, opt for a simpler, more widely recognized synonym.
#### 6.1.4 Ensuring correct spelling
Maintaining correct spelling is a fundamental rule in controlled language, as errors can significantly complicate comprehension and translation processes. Texts with spelling mistakes can introduce confusion and require additional effort to decipher, thereby undermining the goal of clear communication.
* **Write:** Texts that contain spelling errors complicate the translation process.
* **Do not write:** Texts that contein speling misstakes complicate the translation procces.
> **Example:** A document stating "The system requires regular maintance" contains a spelling error ("maintance" instead of "maintenance"). Correcting this ensures clarity and avoids any potential misinterpretation by readers or translation software.
---
# Guidelines for achieving different quality levels
This section outlines the guidelines for achieving different quality levels in post-editing, distinguishing between "good enough" and quality similar to human translation. The effort required for post-editing is primarily determined by the initial quality of the machine translation (MT) raw output and the desired end quality of the content.
### 7.1 Quality levels and post-editing approaches
Two main quality levels are discussed, each with a corresponding post-editing approach:
* **"Good enough" quality:** This level refers to content that is comprehensible and accurate, but not necessarily stylistically compelling. It might sound like it was computer-generated, with potentially unusual syntax or imperfect grammar, but the core message remains accurate. For this level, light post-editing is generally recommended.
* **Quality similar or equal to human translation:** This level is defined as comprehensible, accurate, and stylistically fine, although the style might not reach the level of a native-speaker human translator. Syntax, grammar, and punctuation are expected to be normal and correct. For this quality, full post-editing is typically recommended.
### 7.2 Guidelines for achieving "good enough" quality
To achieve "good enough" quality, the following guidelines should be followed:
* **Aim for semantically correct translation.**
* **Ensure that no information has been accidentally added or omitted.**
* **Edit any offensive, inappropriate, or culturally unacceptable content.**
* **Utilize as much of the raw MT output as possible.**
* **Apply basic rules regarding spelling.**
* **No need to implement corrections that are solely stylistic.**
* **No need to restructure sentences solely to improve the natural flow of the text.**
> **Tip:** For "good enough" quality, focus on conveying the core meaning accurately and ensuring comprehensibility, without investing effort in stylistic enhancements or perfect naturalness.
### 7.3 Guidelines for achieving quality similar or equal to human translation
To achieve a quality level similar or equal to human translation, the following guidelines apply:
* **Aim for grammatically, syntactically, and semantically correct translation.**
* **Ensure that key terminology is correctly translated and that untranslated terms belong to the client’s list of “Do Not Translate” terms.**
* **Ensure that no information has been accidentally added or omitted.**
* **Edit any offensive, inappropriate, or culturally unacceptable content.**
* **Utilize as much of the raw MT output as possible.**
* **Apply basic rules regarding spelling, punctuation, and hyphenation.**
* **Ensure that formatting is correct.**
> **Example:** When aiming for publishable quality, an editor would not only correct grammatical errors but also refine sentence structure for better readability and ensure the tone and style are appropriate for the target audience, much like a human translator would.
---
# Computer-assisted translation systems
Computer-assisted translation (CAT) systems are software tools designed to aid human translators in the translation process, enhancing efficiency and consistency [54](#page=54).
### 8.1 Overview of CAT systems
CAT systems provide an integrated environment for managing translation projects from inception to completion. They assist translators by offering features such as translation memory management, terminology databases, and revision tools [54](#page=54).
### 8.2 Key CAT tools and their features
Several prominent CAT systems are widely used in the industry, each offering distinct functionalities:
#### 8.2.1 RWS TRADOS
TRADOS is a comprehensive translation environment that supports the entire workflow of a translation project, from initial setup and translation memory creation to final review and editing. It is considered the most utilized system currently and is frequently requested by clients from their translators [54](#page=54).
#### 8.2.2 WORDFAST
WORDFAST is an online CAT system, though a desktop version is also available. Historically, it was a free CAT system. WORDFAST includes an integrated machine translation engine that suggests translations as the user progresses through the text. It also offers alignment and glossary functions [55](#page=55).
#### 8.2.3 MemoQ
MemoQ, launched in 2004 by Kilgray Translation Technologies, is a competitor to TRADOS that has gained significant traction. It offers various products tailored to different translator needs [56](#page=56).
#### 8.2.4 Déjà Vu X3
Déjà Vu X3, developed by Atril, is another CAT program with functionalities similar to the aforementioned systems. It enables project managers to evaluate, prepare, and control projects from start to finish across available language pairs [57](#page=57).
---
# File formats for translation data exchange
File formats for translation data exchange are crucial for interoperability between different localization tools and processes. These formats facilitate the transfer of various types of translation-related data, ensuring seamless workflows across the localization lifecycle [88](#page=88) [90](#page=90).
### 9.1 Translation Memory eXchange (TMX)
TMX is a standard format designed for the transfer of translation memories between different translation tools. A translation memory (TM) itself is a database storing source text segments and their corresponding translations in one or more target languages. TMX was developed by the OSCAR Special Interest Group at LISA (the Localisation Industry Standards Association) [89](#page=89).
### 9.2 XML Localisation Interchange File Format (XLIFF)
XLIFF is a format that enables the transfer of localizable data extracted from original files across various stages of the localization process. It supports moving data from one stage to the next and merging localized data back into its original format. XLIFF is standardized by the OASIS XLIFF Technical Committee [90](#page=90).
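
To make the structure concrete, the sketch below reads source/target pairs from a tiny XLIFF document. It assumes the XLIFF 1.2 namespace and element names; the sample file content and the `read_units` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2 documents.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Invented sample: one source language, one target language per <file>.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="index.html" source-language="en" target-language="fr" datatype="html">
    <body>
      <trans-unit id="1">
        <source>Welcome</source>
        <target>Bienvenue</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Contact us</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def read_units(xliff_text):
    """Return (id, source, target) tuples; target is None if untranslated."""
    root = ET.fromstring(xliff_text)
    units = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        source = unit.findtext("x:source", default="", namespaces=XLIFF_NS)
        target = unit.findtext("x:target", default=None, namespaces=XLIFF_NS)
        units.append((unit.get("id"), source, target))
    return units


for unit_id, source, target in read_units(SAMPLE):
    print(unit_id, source, "->", target)
```

The `original` attribute and the stable unit `id` values are what allow a tool to merge translated text back into the source file at a later stage.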
### 9.3 Open Lexicon Interchange Format (OLIF)
OLIF facilitates the exchange of terminological and lexical data between translation tools. While similar in purpose to TBX, OLIF is particularly geared towards Natural Language Processing (NLP) data, such as machine translation lexicons. The OLIF Consortium maintains this format [91](#page=91).
### 9.4 TermBase eXchange (TBX)
TBX, or TermBase eXchange format, is also referred to as DXLT (Default XLT format), where XLT stands for XML representations of Lexicons and Terminologies. This format allows for the transfer of glossaries between translation tools. TBX is based on the ISO 12200 standard, which is the Machine-Readable Terminology Interchange Format (MARTIF). The SALT (Standards-based Access service to multilingual Lexicons and Terminologies) group at BYU is responsible for its maintenance [92](#page=92).
---
# Understanding post-editing of machine translation
Post-editing is the process by which human translators correct machine-generated translations to achieve a satisfactory final output.
### 10.1 Definition and scope
Post-editing refers to the amendment of machine translation (MT) output by human translators to meet a predetermined quality standard. This process is distinct from editing human-generated text, which is typically known as revision in the translation field. Post-edited text may subsequently undergo revision and proofreading to correct minor errors and ensure linguistic quality.
> **Tip:** While post-editing improves MT output, it is a separate process from revision, which enhances human-generated text.
### 10.2 When post-editing is applied
Post-editing is employed when raw machine translation does not meet the required quality standards and full human translation is not deemed necessary. Industry recommendations suggest using post-editing when it can at least double the productivity compared to manual translation, with potential fourfold increases for light post-editing.
### 10.3 Efficiency and productivity gains
The efficiency of post-editing can be challenging to predict. While numerous academic and industry studies indicate that post-editing is generally faster than translating from scratch, regardless of language pairs or translator experience, there is no consensus on the actual time savings. Industry reports suggest time savings around 40%, whereas some academic research indicates more modest gains of 0–20% under real working conditions. Some professionals have even experienced negative productivity gains, where corrections took longer than translating from scratch.
> **Tip:** Be aware that claimed productivity gains from post-editing can vary significantly between industry reports and academic studies, and may not always materialize in practice.
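
The figures quoted in this section mix two measures, productivity multipliers and time savings, which are easy to confuse. The sketch below converts between them under one assumption that the source does not state explicitly: "productivity" means volume translated per unit of time, and "time saving" means the reduction in time needed for the same volume.

```python
def productivity_factor(time_saving):
    """Convert a fractional time saving (0.4 means 40%) into a throughput multiplier."""
    if time_saving >= 1:
        raise ValueError("a time saving of 100% or more is not meaningful")
    return 1 / (1 - time_saving)


def time_saving_for(factor):
    """Convert a throughput multiplier into the equivalent fractional time saving."""
    if factor <= 0:
        raise ValueError("the productivity factor must be positive")
    return 1 - 1 / factor


print(round(productivity_factor(0.40), 2))   # throughput at 40% time saving
print(time_saving_for(2))                    # time saving needed to double productivity
print(time_saving_for(4))                    # time saving needed for a fourfold increase
print(round(productivity_factor(-0.10), 2))  # post-editing 10% slower than translating
```

Under this reading, a 40% time saving corresponds to roughly 1.67 times the throughput, which falls short of the "at least double" recommendation; doubling requires a 50% time saving and a fourfold increase requires 75%.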
### 10.4 Post-editing strategies and quality levels
The approach to post-editing depends on project-specific requirements, with key considerations being time, quality, and cost. Strategies are built around these factors by selecting the appropriate method.
#### 10.4.1 Light post-editing
Light post-editing involves minimal intervention by the post-editor, sufficient only to make the machine-translated text understandable to the end user. This approach is typically used when the client needs the text quickly, for a short duration, or for inbound purposes only.
#### 10.4.2 Full post-editing
Full post-editing requires a higher level of intervention to achieve a quality level agreed upon by the client and post-editor. The goal is to produce text that is not only understandable but also stylistically appropriate, suitable for assimilation and dissemination for both inbound and outbound use.
> **Example:** For a company's internal knowledge base, light post-editing might suffice to ensure key information is conveyed. However, for marketing materials intended for public release, full post-editing would be necessary to ensure brand voice and stylistic appropriateness.
#### 10.4.3 Top-end quality expectations
At the highest end of full post-editing, the expectation is a quality level indistinguishable from human translation. Historically, it was believed that translating from scratch required less effort than post-editing MT output. However, advancements in machine translation and artificial intelligence are shifting this perception. For certain language pairs and tasks, especially with MT engines customized with domain-specific data, some clients are now requesting post-editing over original translation, aiming for comparable quality at a reduced cost.
### 10.5 Role of CAT tools
Practically all Computer-Assisted Translation (CAT) tools now offer support for post-editing of machine-translated output.
### 10.6 Controlled language
Pre-editing the source text, for instance by applying controlled language principles, can lead to better results when combined with post-editing the machine output.
---
# Computer-assisted translation tools and their functions
Computer-assisted translation (CAT) tools are software applications designed to support human translators, offering functionalities that enhance efficiency, consistency, and quality in the translation process [60](#page=60).
### 11.1 General principles of CAT tools
CAT tools aim to streamline translation workflows by providing various features for managing translation data and assisting the translator. These tools often integrate multiple functionalities, categorized as off-line and on-line functions [62](#page=62) [63](#page=63).
### 11.2 Off-line functions
Off-line functions are typically performed on a complete document or a collection of data before or after the primary translation phase.
#### 11.2.1 Import
The import function allows for the transfer of text and its translation from external text files into a translation memory (TM). This can be done from a raw format, where the source text and its translation are provided, or from a native format, which is the TM's proprietary file format for saving translation memories [62](#page=62).
#### 11.2.2 Analysis
Analysis involves several steps to prepare the text for translation and to estimate the translation effort.
##### 11.2.2.1 Textual parsing
Textual parsing focuses on correctly recognizing elements like punctuation to differentiate between, for instance, a full stop ending a sentence and one used in an abbreviation. This stage often involves markup, which is a form of pre-editing that identifies special text elements. Some elements, like proper names or codes, may not require translation, while others might need conversion to a native format [62](#page=62).
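
The full-stop problem described above can be illustrated with a small sketch that splits text into sentence segments while skipping known abbreviations. The abbreviation list and the `segment` function are illustrative assumptions; commercial CAT tools use configurable, language-specific segmentation rules that this example does not reproduce.

```python
# Illustrative list; real rule sets are language-specific and much longer.
ABBREVIATIONS = {"e.g.", "i.e.", "etc.", "dr.", "mr.", "no."}


def segment(text):
    """Split text into sentence segments, ignoring full stops in abbreviations."""
    segments, current = [], []
    for token in text.split():
        current.append(token)
        ends_sentence = token[-1] in ".!?" and token.lower() not in ABBREVIATIONS
        if ends_sentence:
            segments.append(" ".join(current))
            current = []
    if current:  # trailing text without final punctuation
        segments.append(" ".join(current))
    return segments


for seg in segment("Contact Dr. Smith today. Use a CAT tool, e.g. Trados. Done"):
    print(seg)
```

A known limitation of this approach is that an abbreviation that really does end a sentence, such as a final "etc.", will not trigger a split.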
##### 11.2.2.2 Linguistic parsing
Linguistic parsing involves techniques to normalize word order and identify phrases. Base-form reduction, in which inflected words are reduced to their base forms, is used to prepare lists of words for automatic term retrieval from a term bank. Syntactic parsing can identify multi-word terms or phraseology by analyzing which words can form a phrase [62](#page=62).
#### 11.2.3 Retrieval
Translation memories offer different types of matches for source segments, aiding in reuse of previous translations.
* **Exact match (100% match):** Occurs when the source segment in the current document is identical, character by character, to a segment already stored in the TM [64](#page=64).
* **In-Context Exact (ICE) match or Guaranteed Match:** An exact match that also appears in the same context within a document, often considering surrounding sentences and document attributes like file name, date, and permissions [64](#page=64).
* **Fuzzy match:** This applies when a segment is not an exact match but shares similarities with a stored segment. Systems may assign percentages to fuzzy matches, indicating the degree of similarity (greater than 0% and less than 100%). These percentages are not universally comparable across different CAT tools unless the scoring method is specified [64](#page=64).
* **Concordance:** This function allows translators to search the TM for specific words or phrases within a source segment, retrieving all segment pairs that match the search criteria. It is particularly useful for finding translations of terms and idioms when a dedicated terminology database is unavailable [64](#page=64).
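
As a rough illustration of the match types listed above, the sketch below models a TM as a plain dictionary and scores fuzzy matches with Python's `difflib.SequenceMatcher`. The scoring method, the 0.75 threshold, and the sample segments are assumptions for demonstration; commercial CAT tools use their own scoring, which is why percentages are not comparable across tools.

```python
from difflib import SequenceMatcher


def best_match(segment: str, tm: dict[str, str], threshold: float = 0.75):
    """Return (kind, score, source, target) for the closest TM entry, or None."""
    if segment in tm:
        # Character-by-character identical source segment: exact (100%) match.
        return ("exact", 1.0, segment, tm[segment])
    best = None
    for source, target in tm.items():
        score = SequenceMatcher(None, segment, source).ratio()
        if score >= threshold and (best is None or score > best[1]):
            best = ("fuzzy", score, source, target)
    return best


def concordance(term: str, tm: dict[str, str]) -> list[tuple[str, str]]:
    """Return every segment pair whose source side contains the search term."""
    needle = term.lower()
    return [(s, t) for s, t in tm.items() if needle in s.lower()]


# Made-up sample data for illustration only.
tm = {
    "Press the red button.": "Pulse el botón rojo.",
    "Close the cover.": "Cierre la tapa.",
}
print(best_match("Press the green button.", tm))
print(concordance("button", tm))
```

An ICE match would additionally compare the neighbouring segments or document attributes, which this sketch does not model.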
#### 11.2.4 Updating
As translators accept new translations, the TM is updated. The process of updating a database can involve modifying or deleting existing entries. Some systems permit saving multiple translations for the same source segment [64](#page=64).
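
The update operations mentioned above (adding, modifying, deleting, and keeping several translations per source segment) can be sketched with a toy in-memory structure. The class and method names are hypothetical and do not correspond to any particular TM product.

```python
class TranslationMemory:
    """Toy TM that may hold several target variants per source segment."""

    def __init__(self) -> None:
        self.units: dict[str, list[str]] = {}

    def add(self, source: str, target: str) -> None:
        # Store a new variant unless this exact translation already exists.
        variants = self.units.setdefault(source, [])
        if target not in variants:
            variants.append(target)

    def modify(self, source: str, old: str, new: str) -> None:
        # Replace one stored variant in place.
        variants = self.units[source]
        variants[variants.index(old)] = new

    def delete(self, source: str, target: str) -> None:
        # Remove one variant; drop the entry once no variants remain.
        self.units[source].remove(target)
        if not self.units[source]:
            del self.units[source]


tm = TranslationMemory()
tm.add("Save", "Guardar")
tm.add("Save", "Grabar")
print(tm.units)
```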
### 11.3 On-line functions
On-line functions are typically integrated into the real-time translation process, assisting the translator as they work.
#### 11.3.1 Segmentation
Segmentation divides the source text into meaningful translation units, often sentences, to identify the most useful segments for matching in the TM. This process is monolingual and uses superficial parsing. Alignment is based on segmentation, and if the program does not learn from manual corrections to segmentation, the same segmentation errors can recur [63](#page=63).
#### 11.3.2 Alignment
Alignment establishes correspondences between segments in the source and target texts. A good alignment algorithm should ideally provide feedback to segmentation and be capable of correcting initial segmentation errors [63](#page=63).
#### 11.3.3 Term extraction
Term extraction can utilize a pre-existing dictionary or employ parsing based on text statistics to identify unknown terms. Text statistics are also crucial for estimating the workload of a translation project, aiding in planning and scheduling. Translation statistics often count words and assess the degree of repetition within a text [63](#page=63).
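
A minimal sketch of such translation statistics follows. It assumes the text has already been segmented and counts a segment as repeated when it is identical to an earlier one after trimming and lowercasing; real tools apply more refined rules.

```python
from collections import Counter


def translation_stats(segments: list[str]) -> dict[str, float]:
    """Count segments, words, and repeated segments in a segmented text."""
    counts = Counter(s.strip().lower() for s in segments)
    total_words = sum(len(s.split()) for s in segments)
    # Every occurrence after the first one counts as a repetition.
    repeated_segments = sum(n - 1 for n in counts.values())
    return {
        "segments": len(segments),
        "words": total_words,
        "repeated_segments": repeated_segments,
        "repetition_rate": repeated_segments / len(segments) if segments else 0.0,
    }


print(translation_stats(["Click OK.", "Click OK.", "Close the window."]))
```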
#### 11.3.4 Export
Export allows the transfer of text from the TM into an external text file. Ideally, export and import functions should be inverse operations [63](#page=63).
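
The "inverse operations" idea can be checked with a round trip. The sketch below assumes a simple tab-separated raw format (an assumption made for illustration, not a format named in the source) and uses Python's `csv` module so that tabs and quotes inside segments survive the trip.

```python
import csv
import io


def export_tm(tm: dict[str, str]) -> str:
    """Write source/target pairs as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for source, target in tm.items():
        writer.writerow([source, target])
    return buffer.getvalue()


def import_tm(text: str) -> dict[str, str]:
    """Read tab-separated source/target pairs back into a dictionary."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) == 2}


tm = {'Press "OK"': 'Pulse "Aceptar"', "Close the cover.": "Cierre la tapa."}
# Export followed by import should give back the original data.
print(import_tm(export_tm(tm)) == tm)
```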
#### 11.3.5 Automatic translation and substitution
CAT tools often offer automatic retrieval and substitution of translations. As a translator moves through a document, TM systems can automatically search for matches and display their results. With automatic substitution, if an exact match is found for a segment in a new version of a document, the software will automatically insert the previously stored translation. However, if the translator does not verify the accuracy of the automatically substituted translation against the source, errors from the previous translation may be replicated [65](#page=65).
> **Tip:** Be vigilant about automatically substituted translations. Always cross-reference with the source text to prevent the propagation of prior errors.
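
One way to act on this advice is to mark every automatically substituted segment for review rather than treating it as final. The sketch below is a hypothetical pre-translation pass; the status labels `needs_review` and `untranslated` are illustrative names, not terms from any specific tool.

```python
def pretranslate(segments: list[str], tm: dict[str, str]) -> list[dict[str, str]]:
    """Insert exact matches from the TM and flag them for human verification."""
    result = []
    for source in segments:
        if source in tm:
            # Exact match: reuse the stored translation, but keep it reviewable.
            result.append({"source": source, "target": tm[source], "status": "needs_review"})
        else:
            result.append({"source": source, "target": "", "status": "untranslated"})
    return result


tm = {"Close the cover.": "Cierre la tapa."}
for row in pretranslate(["Close the cover.", "Remove the battery."], tm):
    print(row)
```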
#### 11.3.6 Networking
Networking features allow multiple translators to collaborate on a single text, increasing efficiency. Sentences and phrases translated by one team member become available to others, fostering faster completion times compared to individual work. Sharing translation memories before finalization also provides an opportunity for team members to correct each other's mistakes [65](#page=65).
#### 11.3.7 Text memory and Translation memory
* **Text memory:** This concept, forming the basis of standards like the LISA OSCAR xml:tm, encompasses both author memory and translation memory [65](#page=65).
* **Translation memory (specifically in the context of xml:tm):** This system remembers unique identifiers during translation to ensure the target document is precisely aligned at the text unit level. If the source document is later modified, unchanged text units can be directly transferred to the new target version without translator intervention, embodying the concept of "exact" or "perfect" matching. xml:tm also supports in-document leveraged and fuzzy matching [65](#page=65).
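
The identifier-based transfer described above can be sketched as follows. This is not the xml:tm format itself: text units are modelled as plain dictionaries keyed by a hypothetical unit ID, and a unit is carried over only when both its ID and its source text are unchanged.

```python
from typing import Optional


def carry_over(
    old_source: dict[str, str],
    old_target: dict[str, str],
    new_source: dict[str, str],
) -> dict[str, Optional[str]]:
    """Reuse translations of unchanged text units; mark the rest as None."""
    new_target: dict[str, Optional[str]] = {}
    for unit_id, text in new_source.items():
        if old_source.get(unit_id) == text and unit_id in old_target:
            new_target[unit_id] = old_target[unit_id]  # unchanged unit: direct transfer
        else:
            new_target[unit_id] = None  # new or edited unit: needs a translator
    return new_target


old_src = {"u1": "Open the lid.", "u2": "Insert the disc."}
old_tgt = {"u1": "Abra la tapa.", "u2": "Inserte el disco."}
new_src = {"u1": "Open the lid.", "u2": "Insert the card.", "u3": "Close the lid."}
print(carry_over(old_src, old_tgt, new_src))
```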
---
# Metadata in translation memory software
Metadata plays a crucial role in translation memory (TM) software, enabling efficient management, retrieval, and reuse of translation data [84](#page=84) [85](#page=85).
### 12.1 Understanding metadata
Metadata is defined as "data about data," providing additional information about digital content and processes. It describes not only the content itself but also how it was created, managed, and structured [85](#page=85).
#### 12.1.1 Types of metadata
There are three primary types of metadata:
* **Descriptive metadata:** This type describes the content of the data [85](#page=85).
* **Structural metadata:** This describes the organization of objects or components [85](#page=85).
* **Administrative metadata:** This describes technical information, such as file types [85](#page=85).
### 12.2 Metadata within translation memory software
In the context of TM software, metadata is embedded within translation units, allowing for granular tracking and management of translated segments [84](#page=84).
#### 12.2.1 Segment-level metadata
TM software divides texts into segments. Each of these segments is associated with metadata that can trace it back to a specific translator, the date, and the time of translation. This granular information is vital for translators and language service providers [84](#page=84).
#### 12.2.2 Leveraging metadata for efficiency
The metadata associated with translation segments offers several benefits:
* **Prioritizing recent material:** Translators can choose to leverage more recent translations, ensuring that terminology is up-to-date [84](#page=84) [87](#page=87).
* **Managing outdated content:** Segments containing outdated terminology can be identified and deleted [84](#page=84).
* **Effective resource management:** Language service providers can manage their TM resources more effectively by understanding the origin and recency of translated content [84](#page=84).
* **Filtering for trustworthiness:** Metadata can help filter previous translations, ensuring that more trustworthy material is reused [87](#page=87).
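
The benefits listed above amount to filtering and sorting translation units by their metadata. The sketch below assumes a hypothetical `TranslationUnit` record with a translator name and a timestamp; actual TM formats store comparable attributes under their own field names.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TranslationUnit:
    source: str
    target: str
    translator: str
    translated_at: datetime


def filter_units(
    units: list[TranslationUnit], trusted: set[str], not_before: datetime
) -> list[TranslationUnit]:
    """Keep units from trusted translators that are recent enough, newest first."""
    kept = [
        u for u in units
        if u.translator in trusted and u.translated_at >= not_before
    ]
    return sorted(kept, key=lambda u: u.translated_at, reverse=True)


# Made-up sample data for illustration only.
units = [
    TranslationUnit("Save", "Guardar", "ana", datetime(2023, 5, 1)),
    TranslationUnit("Save", "Salvar", "bob", datetime(2015, 1, 1)),
    TranslationUnit("Open", "Abrir", "ana", datetime(2024, 2, 1)),
]
for unit in filter_units(units, trusted={"ana"}, not_before=datetime(2020, 1, 1)):
    print(unit.source, unit.target, unit.translated_at.date())
```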
#### 12.2.3 Translation process and metadata
TM software manages the translation process by aligning source language segments with their corresponding target language segments. When a segment is encountered that has been translated previously, the TM software automatically proposes that previous translation to the current translator. This automatic suggestion is a direct result of the stored metadata within the translation unit [86](#page=86).
> **Tip:** The ability to leverage metadata for filtering and prioritization is key to the cost and time savings TM software aims to provide in specialized translation and localization projects [87](#page=87).
#### 12.2.4 Fuzzy matching and metadata
TM systems can also suggest partial or ‘fuzzy’ matches. These suggestions are based on a calculated percentage of similarity between a new source segment and existing source language segments already stored in the TM. While not directly metadata in the sense of descriptive attributes, the underlying process of matching and scoring relies on the stored translation units, which are rich in metadata [86](#page=86).
> **Example:** If a translator encounters a sentence that is 90% similar to a previously translated sentence, the TM software, utilizing its stored data (including metadata about the original translation), will propose a partial match, saving the translator significant effort [86](#page=86).
### 12.3 Challenges with metadata
A significant challenge related to metadata in TM software is the potential loss of important information when transferring data between different formats or software tools. This issue of software interoperability can restrict users to a particular TM software if they are concerned about losing valuable metadata [84](#page=84).
---
# Translation memory alignment from legacy text
This topic covers the process of creating translation memories from existing translated documents that are not already in TM format.
### 13.1 Introduction to translation memory alignment
Translation Memory (TM) systems are crucial in the translation and localization industry for reusing previously translated content. These systems store translations as "translation units" that can be leveraged in new projects, offering either 100% or fuzzy matches. This leads to significant savings in time and money, as well as improved consistency across different platforms and versions of a product [97](#page=97).
### 13.2 The need for alignment of legacy text
There are several scenarios where legacy translation material might not be available in TM format [98](#page=98):
* Translations were completed by in-country offices lacking access to TM systems [98](#page=98).
* Linguistic vendors did not provide a TM as part of the project handover [98](#page=98).
* Linguistic vendors provided a TM, but its quality was poor. Subsequent improvements were made only to the translated files, not to the original TM [98](#page=98).
In such cases, existing translated work is not lost; TMs can be created from legacy text through a process called "alignment" [98](#page=98).
### 13.3 What is alignment?
Alignment is the process of matching segments between a source file and its corresponding translation file. This process builds a repository of translation units that can then be saved as a TM for use in future translation projects [99](#page=99).
### 13.4 The alignment process
The initial alignment is typically performed using automated alignment tools available on the market. The process involves loading a set of source and target files into the tool, linking them based on their filenames, and then running an automatic alignment on each file pair [100](#page=100).
These tools analyze the structure of both the source and target files to match source text segments with their probable translations on a sentence-by-sentence basis. Modern alignment tools have become highly sophisticated, and the results of the automated process are generally very good [100](#page=100).
Some alignment tools can also generate reports that include a quality score. This score, based on internal algorithms, provides an indication of the alignment's success rate [100](#page=100).
> **Tip:** The quality of the alignment process is highly dependent on the quality and structure of the source and target files provided.
> **Example:** Reports from alignment tools might indicate a high match rate for well-structured documents, suggesting a robust TM can be created. Conversely, documents with significant structural differences or inconsistencies between source and target may result in lower quality scores and require more manual post-alignment editing.
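
To make the idea of sentence-by-sentence alignment with a quality score concrete, here is a deliberately naive sketch. It pairs segments by position and counts a pair as plausible when the two lengths differ by at most a factor of `max_ratio`. Both the positional pairing and the scoring rule are assumptions for illustration; commercial aligners rely on more sophisticated structural and statistical analysis.

```python
def align(
    source_segments: list[str],
    target_segments: list[str],
    max_ratio: float = 2.0,
) -> tuple[list[tuple[str, str]], float]:
    """Pair segments by position and return the pairs with a rough quality score."""
    pairs = list(zip(source_segments, target_segments))
    plausible = 0
    for src, tgt in pairs:
        longer = max(len(src), len(tgt))
        shorter = max(1, min(len(src), len(tgt)))
        if longer / shorter <= max_ratio:
            plausible += 1
    # Segments left unpaired on either side lower the score.
    expected = max(len(source_segments), len(target_segments))
    score = plausible / expected if expected else 0.0
    return pairs, score


pairs, score = align(
    ["Open the door.", "Close it."],
    ["Abra la puerta.", "Ciérrela."],
)
print(pairs, score)
```

A low score here would signal that the file pair needs manual post-alignment editing before it is saved as a TM.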
### 13.5 Alignment reports
Alignment tools often provide reports to evaluate the success of the automated alignment. These reports can offer insights into the reliability of the generated TM.
---
# Localization goes beyond translation to adapt content for local audiences
Localization is a process that extends beyond simple translation to adapt content for specific local audiences, considering cultural nuances, local laws, and regional dialects to foster trust and connection with the target market.
### 14.1 Understanding the difference: Translation vs. Localization
Translation is the conversion of content from a source language to a target language, adhering to grammatical rules and syntax while preserving the original meaning. It is crucial for documents like user manuals, medical texts, and scientific journals, requiring translators to produce accurate work.
Localization, on the other hand, is a more comprehensive adaptation of a message to resonate with local audiences. It is widely applied to digital content such as websites, mobile apps, software, video games, multimedia, and voiceovers.
### 14.2 The necessity of localization for market success
Simply translating content is often insufficient for a business to succeed in local markets. Localization is vital for gaining the trust of the local public, as entering a foreign country involves more than overcoming language barriers; it requires a customized message tailored for each audience.
Cultural barriers can significantly impede the understanding of an original message.
> **Example:** KitKat's launch in Japan illustrates effective localization. Instead of merely translating "Have a break, have a KitKat," they adapted it to "Kitto Katsu," meaning "surely win" in Japanese. They also introduced exotic chocolate flavors to cater to local tastes, resulting in a successful campaign that utilized local language expressions for resonance.
### 14.3 Adapting to cultural expectations while maintaining brand identity
Effective globalization requires localizing content to align with the culture of each country, while simultaneously preserving a unique brand voice that is recognizable worldwide.
Coca-Cola exemplifies this by maintaining a singular global message while adapting campaigns to local markets. Their brand colors are universally recognized, but their marketing strategies differ by country to meet public expectations.
> **Example:** In China, Coca-Cola became "kekou kele," which translates to "delicious happiness." This adaptation allowed the beloved drink to successfully penetrate the market.
### 14.4 The comprehensive localization process
Localization is not merely about translating content or changing labels; it involves a deeper cultural integration. Brands engage local experts and teams of specialists to develop new names and localized marketing strategies. The goal is to sell the brand's image while demonstrating respect for local culture, which can be vastly different from Western norms.
Localization entails a cultural approach where websites or apps are reshaped to make the local public feel as though the content was specifically created for them. This includes working with local marketers and consultants to ensure compliance with local laws and cultural sensitivities.
> **Tip:** When localizing, remember that languages can have regional variations. Spanish spoken in Argentina, Mexico, and Spain, or English spoken in the US, Australia, and Canada, each have distinct characteristics that need consideration.
---
# Translation for SEO and effective international web presence
This topic explores the crucial role of skilled translators in enhancing a client's global online visibility through effective Search Engine Optimization (SEO) for website localization.
### 15.1 Key considerations for optimizing content for international markets
#### 15.1.1 Keyword research
Thorough keyword research in the target language and region is essential to identify terms and phrases relevant to the local audience. This includes considering linguistic variations, synonyms, and colloquial expressions that users might employ in their search queries.
#### 15.1.2 Cultural relevance
Understanding cultural nuances and preferences is vital for choosing keywords that resonate with the target audience. Literal translations that might not capture the intended meaning or sound unnatural in the target language should be avoided.
#### 15.1.3 Localized content
Translated content must be not only linguistically accurate but also culturally appropriate. Adapting content to align with local customs, traditions, and market trends enhances its relevance for the target audience.
#### 15.1.4 Metadata optimization
Translating and optimizing meta titles, meta descriptions, and URL slugs is a critical aspect of SEO. Crafting compelling and concise meta descriptions that incorporate relevant keywords encourages click-throughs.
#### 15.1.5 Multilingual link building
Collaboration with web developers and marketers is necessary to build a network of high-quality, multilingual backlinks. Identifying reputable local websites and influencers for potential collaborations contributes to improved search engine rankings.
#### 15.1.6 Content structure and formatting
Ensuring that translated content maintains a user-friendly structure and formatting is important. Search engines value well-organized content, so utilizing headers, bullet points, and other formatting elements enhances readability and SEO.
#### 15.1.7 Mobile optimization
The growing importance of mobile search necessitates ensuring that translated content is mobile-friendly. Optimizing images and other media for fast loading times on mobile devices positively impacts SEO rankings.
#### 15.1.8 Regular updates
Staying informed about changes in search engine algorithms and adapting SEO strategies accordingly is crucial. Regularly updating translated content to reflect current trends ensures sustained visibility in international markets.
#### 15.1.9 Analytics and reporting
Close collaboration with clients to monitor website analytics and assess the performance of localized content is key. Providing insights and recommendations based on data analysis allows for the continuous refinement of SEO strategies.
#### 15.1.10 Communication with clients
Establishing clear communication channels with clients is essential to understand their business goals, target audience, and specific SEO objectives. Collaborating on a strategy that aligns translation efforts with broader marketing initiatives ensures a comprehensive international SEO approach.
---
# The significance of web page metadata and professional translators in foreign market optimization
Web page metadata and the strategic involvement of professional translators are crucial for effective optimization in foreign markets, ensuring both search engine visibility and user engagement.
### 16.1 The role of web page metadata
Metadata, which includes elements like meta titles and descriptions embedded within a webpage's code, is fundamental for improving a website's discoverability and search engine ranking. Search engine algorithms rely on accurate metadata to understand and index webpage content.
### 16.2 The significance of professional translators in foreign market optimization
Professional translators are indispensable allies in optimizing web pages for international audiences. Their role extends beyond simple translation to ensuring linguistic accuracy, cultural relevance, and strategic keyword integration.
#### 16.2.1 Enhancing search engine visibility
Translating metadata ensures that a website's content is understandable and indexable by search engines in foreign languages, thereby improving its visibility in international search results.
#### 16.2.2 Impacting user click-through rates
Skilled translators craft meta titles and descriptions that are not only linguistically correct but also compelling and culturally resonant for the target audience. This effectively encourages users to click on the search result links.
#### 16.2.3 Ensuring local relevance and appeal
A deep understanding of cultural nuances and audience preferences allows translators to localize metadata, making the content more appealing and relevant to users in specific foreign markets.
#### 16.2.4 Optimizing keyword usage
Professional translators are proficient in keyword research for target languages. They can optimize metadata by incorporating region-specific terms, significantly increasing the chances of the webpage appearing in relevant search queries.
> **Tip:** Effective keyword integration by translators can dramatically improve a webpage's ranking for local searches in foreign markets.
#### 16.2.5 Maintaining global brand consistency
For businesses operating internationally, translators ensure that the translated metadata aligns with the brand's established tone and message, presenting a unified and cohesive global brand image.
#### 16.2.6 Adhering to character limits
Search engines impose character limitations on meta titles and descriptions. Translators possess the skill to create concise yet impactful translations that fit within these constraints, preventing content truncation and ensuring full visibility in search results.
> **Example:** A meta title that is too long might be cut off in search results, e.g., "Best Italian Restaurants in Rome, Italy for Food Lovers..." might appear as "Best Italian Restaurants in Rome, Italy for Food..." if it exceeds the limit, losing valuable information. A translator would ensure the most critical information is present.
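
A translated title can be checked against a length budget before publication. The sketch below assumes a limit of 60 characters, which is a placeholder: the source does not state a figure, and actual display limits vary by search engine and are often measured in pixels rather than characters. The function simulates how an over-long title might be cut at a word boundary.

```python
def fit_meta_title(title: str, limit: int = 60) -> str:
    """Return the title unchanged if it fits, else cut it at a word boundary."""
    if len(title) <= limit:
        return title
    # Reserve one character for the ellipsis, then drop the last partial word.
    cut = title[: limit - 1].rsplit(" ", 1)[0]
    return cut.rstrip(",;:- ") + "…"


title = "Best Italian Restaurants in Rome, Italy for Food Lovers and Curious Travelers"
print(len(title), fit_meta_title(title))
```

In practice a translator would rewrite the title so that the key information comes first, instead of relying on automatic truncation.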
#### 16.2.7 Building credibility and trust
Inaccurate or poorly executed translations of metadata can severely damage a website's credibility. Professional translators safeguard the integrity of the content, fostering trust with users in foreign markets.
#### 16.2.8 Adapting to market trends
The linguistic and cultural landscape of markets is dynamic. Translators who stay current with these shifts can update and adapt translated metadata to reflect contemporary trends, contributing to sustained optimization efforts.
### 16.3 The essence of the collaboration
Ultimately, a holistic approach to web page optimization in foreign markets necessitates a strong collaboration between professional translators and web developers or marketers. Translators effectively bridge linguistic and cultural divides, ensuring that metadata is not just translated but strategically optimized for the specific nuances of each target audience, thereby enhancing a website's global competitiveness.
---
# Managing glossaries in WFA
This section details how to effectively use, add to, and manage glossaries within the WFA tool to enhance translation consistency and efficiency.
### 17.1 Using a glossary
When WFA identifies a term present in its glossary within a source segment, it highlights the term with a blue background. These highlighted terms are treated as placeables. This means they can be navigated using the "Previous" and "Next" icons or their corresponding keyboard shortcuts (Ctrl+Alt+Left and Ctrl+Alt+Right). Clicking directly on the term or typing its initial letter followed by Tab also allows for interaction.
A key benefit of using glossaries is that when you copy a glossary term to the target segment, it is the *translation* of that term that is copied. The Auto-suggest feature, which is enabled by default, is an efficient way to copy target terms. Auto-suggest proposes target terms when you type either the first letter of the target term or the first three letters of the source term.
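
The Auto-suggest behaviour described above can be approximated in a few lines. This is a hypothetical sketch of the matching rule only (a target-term prefix, or at least three letters of the source term); it is not WFA's implementation, and the glossary is modelled as a plain dictionary.

```python
def suggest(typed: str, glossary: dict[str, str], segment: str) -> list[str]:
    """Propose target terms for glossary entries that occur in the source segment."""
    typed_l = typed.lower()
    if not typed_l:
        return []
    hits = []
    for source, target in glossary.items():
        if source.lower() not in segment.lower():
            continue  # only terms found in the current source segment are offered
        by_target = target.lower().startswith(typed_l)
        by_source = len(typed_l) >= 3 and source.lower().startswith(typed_l)
        if by_target or by_source:
            hits.append(target)
    return hits


glossary = {"framework": "marco", "knowledge": "conocimiento"}
segment = "An epistemological framework for knowledge."
print(suggest("m", glossary, segment))
print(suggest("fra", glossary, segment))
```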
#### 17.1.1 Accessing glossary information
You can view the translations of highlighted glossary terms in advance by activating the glossary panel. This can be done via the keyboard shortcut Ctrl+Alt+H or by navigating to the View tab and selecting "Show/Hide Glossary".
Furthermore, to gain more insight into a specific glossary term, such as comments or information stored in F1, F2, and F3 fields, you can simply hover your mouse pointer over the source term. This action will display the associated information in a pop-up bubble.
### 17.2 Adding terms to the glossary
WFA allows for dynamic addition of terms to the glossary, integrated directly into the translation process, without needing to exit the application. This functionality is invaluable for reinforcing your linguistic memory and reducing the need to re-research terms encountered during translation.
#### 17.2.1 Dynamic term addition process
The process of adding a new term to the glossary is as follows:
1. **Select the source term:** Identify and select the specific term within the source text that you wish to add to the glossary. If it's a single word, you can click on it directly or use the Tab key (or Shift+Tab) to navigate to it. The selected source term will be highlighted with a red border.
2. **Select the target term:** Subsequently, select the corresponding translated term within the target segment. This selected target term will appear with a blue background.
3. **Invoke the Glossary Dialog Box:** Open the Glossary Dialog Box by using the keyboard shortcut Ctrl+Alt+T or by clicking the "Add Term" button.
4. **Populate term fields:** If the selected terms consist of a single word, they should automatically populate the "Source" and "Target" fields within the dialog box. For terms comprising multiple words, you might need to paste the text from your clipboard or manually input it into these fields.
5. **Add supplementary information:** You have the option to add a comment to the glossary entry. This comment can serve as a reminder of the context in which a particular translation was used. Additionally, the F1, F2, and F3 fields can be utilized to store information such as the word's role, context, grammatical form, or any other pertinent text-based details.
6. **Save the term:** Once all the necessary information has been entered, click "Save" to confirm and add the term to your glossary.
> **Tip:** Dynamically adding terms as you encounter them is a highly effective strategy for building a personalized and comprehensive glossary, significantly streamlining future translation tasks.
>
> **Example:** If you encounter a technical term, "epistemological framework," and determine its translation is "marco epistemológico," you would select both phrases in source and target, invoke the glossary dialog, and potentially add a note in the comment field like "Used in the context of philosophical research on knowledge."
---
# Controlled natural languages for technical documentation
Controlled natural languages (CNLs) are restricted versions of natural languages designed to minimize ambiguity and complexity by limiting grammar and vocabulary. They are broadly categorized into two types: those enhancing readability for humans, particularly non-native speakers, and those facilitating reliable automatic semantic analysis. The former, often termed "simplified" or "technical" languages, are employed in industries to elevate the quality of technical documentation and potentially simplify its semi-automatic translation. Examples include Caterpillar Technical English, Simplified Technical English, and IBM's Easy English. These languages impose writing constraints such as advocating for short sentences, avoiding pronouns, using only approved vocabulary, and preferring the active voice.
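
The writing constraints named above lend themselves to automated checking. The sketch below is a crude, assumption-laden checker: the 20-word limit, the pronoun list, and the "form of *be* followed by a word ending in -ed" passive heuristic are all illustrative choices, not rules taken from any published controlled language.

```python
import re

# Illustrative word lists; a real controlled language defines its own.
PRONOUNS = {"it", "they", "them", "this", "these", "those", "he", "she"}
BE_FORMS = {"is", "are", "was", "were", "be", "been", "being"}


def check_sentence(sentence: str, approved: set[str], max_words: int = 20) -> list[str]:
    """Return a list of rule violations found in one sentence."""
    words = re.findall(r"[A-Za-z']+", sentence.lower())
    issues = []
    if len(words) > max_words:
        issues.append(f"too long: {len(words)} words (limit {max_words})")
    for word in words:
        if word in PRONOUNS:
            issues.append(f"pronoun: {word}")
        elif word not in approved:
            issues.append(f"unapproved word: {word}")
    # Very rough passive-voice heuristic: "is turned", "was removed", ...
    for prev, word in zip(words, words[1:]):
        if prev in BE_FORMS and word.endswith("ed"):
            issues.append(f"possible passive voice: {prev} {word}")
    return issues


approved = {"turn", "the", "valve", "clockwise", "is", "turned", "by", "operator"}
print(check_sentence("Turn the valve clockwise.", approved))
print(check_sentence("The valve is turned by it.", approved))
```

The heuristic will miss irregular participles and flag some adjectives, so its output is best treated as a prompt for a human editor.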
### 18.1 Principles and types of controlled natural languages
CNLs aim to achieve clarity and enable machine processing through systematic restrictions.
#### 18.1.1 Types of controlled languages
* **Readability-focused CNLs:** These versions aim to make technical documentation more accessible to a wider audience, including individuals who are not native speakers of the language.
* **Machine-processable CNLs:** These focus on enabling accurate and reliable automatic analysis of the text's meaning.
> **Tip:** The distinction between these types is not always rigid; many CNLs serve both purposes to varying degrees.
#### 18.1.2 Examples of controlled natural languages
A wide array of CNLs exist, reflecting efforts across different sectors and research areas:
* ASD Simplified Technical English
* Attempto Controlled English
* Aviation English
* Basic English
* ClearTalk
* Common Logic Controlled English
* Distributed Language Translation Esperanto
* E-Prime
* Français fondamental
* Gellish Formal English
* Interlingua-IL sive Latino sine flexione (Giuseppe Peano)
* ModeLang
* Newspeak
* Processable English (PENG)
* Seaspeak
* Semantics of Business Vocabulary and Business Rules
* Special English
* Plain language movement (Lenguaje claro)
#### 18.1.3 Controlled languages in industry
Many companies have developed or adopted CNLs to standardize their technical communication and translation processes.
* **Avaya:** Avaya Controlled English (ACE)
* **Boeing:** Simplified Technical English (STE), ASD-STE100
* **Caterpillar:** Caterpillar Technical English (CTE), Caterpillar Fundamental English (CFE)
* **Dassault Aerospace:** Français Rationalisé
* **European Aeronautic Defence and Space Company (EADS):** Simplified Technical English (STE), ASD-STE100
* **Ericsson:** Ericsson English
* **General Motors (GM):** Controlled Automotive Service Language (CASL)
* **IBM:** Easy English
* **Kodak:** International Service Language
* **Nortel:** Nortel Standard English (NSE)
* **Océ:** Controlled English
* **Rolls-Royce:** Simplified Technical English (STE), ASD-STE100
* **Saab Systems:** Simplified Technical English (STE), ASD-STE100
* **Scania:** Scania Swedish
* **Sun Microsystems:** Sun Controlled English
* **Xerox:** Xerox Multilingual Customized English
> **Example:** Caterpillar Inc. uses Caterpillar Fundamental English, which restricts its vocabulary to approximately 850 words, to ensure consistency and quality in the vast amount of technical documentation for its complex products.
### 18.2 Rules for controlled languages
The specific grammar rules for controlled languages vary by language. However, general principles can significantly reduce ambiguity in texts across many languages, making them more suitable for machine translation.
> **Tip:** The goal of these rules is not to create a universal grammar but to establish a framework that minimizes linguistic uncertainty for both human readers and automated systems.
#### 18.2.1 CLOUT rule set
The CLOUT™ rule set, an acronym for Controlled Language Optimized for Uniform Translation, was developed by Uwe Muegge. These rules exemplify the types of restrictions implemented in CNLs to enhance translation accuracy and consistency.
> **Example:** While specific CLOUT rules are not detailed in the provided text, they would typically include guidelines on sentence structure, vocabulary usage, and the avoidance of ambiguous linguistic constructs.
---
# Evolution of technology in the translation industry
The evolution of technology in the translation industry has been driven by an increasing volume of documentation and the need for greater efficiency and consistency in handling similar and updated texts.
### 19.1 General overview and historical context
The development of translation memory (TM) tools was significantly influenced by several key factors [8](#page=8):
* The growing volume of documentation accompanying technical equipment [8](#page=8).
* The economic development of companies leading to increased documentation needs [8](#page=8).
* The rise in the number of similar and updated documents requiring translation [8](#page=8).
* The necessity to translate documents with diverse communicative functions [8](#page=8).
* The increasing demand for reproducing electronic content in various formats [8](#page=8).
> **Tip:** Understanding these underlying pressures is crucial for grasping why technological advancements in translation became so essential.
The provided document content outlines a journey through the history and impact of technology on the translation industry. It begins with a look at the industry's origins, questioning the extent of actual improvement, and then presents a general overview of the background that necessitated these technological shifts [2](#page=2) [6](#page=6) [7](#page=7) [8](#page=8).
### 19.2 History of translation memory systems
While the document mentions the "History of Translation Memory systems" as a key outline point, the provided content details the initial driving forces behind their development rather than the granular history of their first steps or commercial developers. The background provided strongly suggests that the need for handling large volumes of similar and evolving documentation was the primary catalyst for the creation of TM systems [2](#page=2) [8](#page=8).
> **Tip:** The "History of Translation Memory systems" section would typically delve into early concepts, pioneer developers, and significant milestones in TM software. The current document focuses more on the *why* than the *how* of TM's historical development.
### 19.3 Translation workflow and industry challenges
The document lists "Translation workflow" and "Challenges in Language Industry" as topics covered. However, the provided pages do not elaborate on the specifics of these sections. The general overview focuses on the context leading to the development of TM tools, implying that these tools are part of a broader workflow aimed at addressing challenges in the language industry, such as managing large volumes and ensuring consistency [2](#page=2) [8](#page=8).
> **Tip:** Future sections of this study guide would likely explore how technologies like TM have reshaped translation workflows and helped overcome specific industry challenges.
---
# Definition and basic components of computer-assisted translation
Computer-assisted translation (CAT) refers to a suite of computer applications specifically designed to efficiently assist translators in their tasks. The primary goal of a CAT system is to provide translators with all the resources they might need for their work, automatically and quickly [32](#page=32).
### 20.1 The role and benefits of CAT tools
CAT tools significantly enhance a translator's productivity by streamlining various aspects of the translation process. The ability to leverage existing translations, particularly through features that identify repetitions, can lead to substantial time and cost savings, impacting how clients are billed. Proficiency in CAT tools is a sought-after skill in the translation industry, with institutions like the European Commission emphasizing the importance of mastering these technologies, along with terminology and common office software [34](#page=34) [35](#page=35).
> **Tip:** CAT tools are not about replacing the translator but empowering them to work more efficiently and consistently.
### 20.2 Differentiating CAT from Machine Translation (MT)
It is crucial to distinguish between Machine Translation (MT) and Computer-Assisted Translation (CAT) [36](#page=36).
* **Machine Translation (MT):** The translation is performed entirely by a machine, such as Google Translate [36](#page=36).
* **Computer-Assisted Translation (CAT):** The translation is performed by a human translator who is aided by various translation tools [36](#page=36).
While distinct, MT can be integrated into CAT workflows, as many CAT systems have access to MT engines [36](#page=36).
### 20.3 Essential CAT tool components
Essential CAT tools can be broadly categorized, with project management software and translation memory software being fundamental [37](#page=37) [38](#page=38).
#### 20.3.1 Translation project management software
This type of software is vital for overseeing the entire translation workflow. Key functions include:
* **Information flow control:** Managing the movement of data and project files [37](#page=37).
* **Translation assignment:** Distributing tasks to translators [37](#page=37).
* **Quality control:** Implementing checks to ensure the accuracy and consistency of translations [37](#page=37).
* **Content analysis:** Examining the source text for specific requirements [37](#page=37).
* **Report generation:** Producing summaries of translation progress, including metrics like full matches, fuzzy matches, and repetitions within and across files [37](#page=37).
* **Word count:** Calculating the volume of text to be translated [37](#page=37).
* **Final delivery:** Managing the handover of the translated content to the client [37](#page=37).
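The word count and match metrics listed above can be illustrated with a minimal sketch. It assumes exact string comparison against a set of stored TM source segments; the function name `analyze_segments` and the report keys are hypothetical, and commercial tools compute these figures with their own, more elaborate rules (including fuzzy-match bands).

```python
def analyze_segments(segments, tm_sources):
    """Build a simple analysis report for a list of source segments.

    segments   -- source segments in document order
    tm_sources -- set of source segments already stored in the TM
    """
    report = {"segments": len(segments), "words": 0,
              "full_matches": 0, "repetitions": 0, "new": 0}
    seen = set()  # segments already encountered in this file
    for seg in segments:
        report["words"] += len(seg.split())
        if seg in tm_sources:
            report["full_matches"] += 1   # identical segment exists in the TM
        elif seg in seen:
            report["repetitions"] += 1    # repeated inside the file itself
        else:
            report["new"] += 1            # needs translating from scratch
        seen.add(seg)
    return report


report = analyze_segments(
    ["Press the button.", "Open the lid.", "Press the button.", "Close the lid."],
    {"Open the lid."},
)
print(report)
```

A report of this kind is what allows repetitions and matches to be billed differently from new text.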
#### 20.3.2 Translation memory (TM) software
Translation memory software is designed to store previously translated segments, enabling their reuse and ensuring consistency. Its core functions are:
* **Storing translations:** Creating a database of translated sentence pairs (source and target) [38](#page=38).
* **Establishing consistency:** Maintaining uniformity in terminology and phrasing across a project or for a specific client [38](#page=38).
* **Retrieving translation units:** Facilitating the quick recall of stored translations, which significantly boosts productivity [38](#page=38).
Common file extensions for translation memories include:
* `.tmx` (Translation Memory eXchange): An open, standard format compatible with most CAT tools like Trados, MemoQ, and Wordfast [38](#page=38).
* `.sdltm`: A proprietary format specific to SDL Trados Studio [38](#page=38).
* `.txt` / `.csv`: Sometimes used for exporting memories in plain text or for manipulation in spreadsheet software like Excel [38](#page=38).
> **Example:** When translating a new document for a client, a CAT tool with a translation memory can instantly flag sentences or phrases that have been translated before for that same client, suggesting the previous translation for reuse. This ensures consistency and saves the translator time from re-translating identical or similar content.
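The lookup described in this example can be sketched as follows. This is an illustration only: it uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure, and the `threshold` value and the function name `best_match` are assumptions. Real CAT tools use their own scoring methods, which the source does not describe.

```python
from difflib import SequenceMatcher


def best_match(source, memory, threshold=0.75):
    """Return (score, stored_source, stored_target) for the closest TM entry.

    memory maps stored source segments to their translations.
    Returns None when no entry reaches the threshold.
    """
    best = None
    for stored_source, stored_target in memory.items():
        score = SequenceMatcher(None, source.lower(), stored_source.lower()).ratio()
        if best is None or score > best[0]:
            best = (score, stored_source, stored_target)
    if best is not None and best[0] >= threshold:
        return best
    return None


memory = {"Press the green button.": "Drücken Sie die grüne Taste."}
print(best_match("Press the green button.", memory))  # exact match
print(best_match("Press the red button.", memory))    # fuzzy match
print(best_match("Completely unrelated text", memory))
```

A score of 1.0 corresponds to a full match; lower scores above the threshold correspond to fuzzy matches that the translator edits before reuse.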
---
# Automated alignment process
The automated alignment process is the initial step in preparing translation memories by matching sentences or segments from source and target language files [100](#page=100).
### 21.1 Overview of the automated alignment process
Automated alignment tools take a set of source and target files, linking them based on filenames. The tools analyze the structure of both files and sequentially match source text with its probable translations. Modern alignment tools have become highly sophisticated, generally producing very good results. Some tools also generate reports with a quality score, calculated using internal algorithms, to indicate the success of the alignment [100](#page=100).
### 21.2 Importance and development of alignment evaluation
The significance of sentence alignment has led to international projects focused on developing evaluation metrics. These include:
* **Project ARCADE (1995-1996):** Aimed to create a bilingual French-English corpus suitable for alignment tasks and evaluation.
* **MULTEXT-East Project:** Aligned six translations of George Orwell's novel "1984" with the English original, with manual validation of the alignments.
* **Egypt Statistical Machine Translation Toolkit:** Contributed to the field of alignment tools.
* **GIZA++:** Used for training statistical translation models.
### 21.3 Linguistic verification after automated alignment
Following the automated alignment, a linguist reviews each segment. This involves approving correct matches and correcting or deleting incorrect ones.
> **Example:** An incorrect match can occur if two English source sentences are translated into a single German sentence for better flow. The alignment tool might not recognize this, leading to subsequent matches being out of sync. Once the linguist makes the necessary corrections, they can re-run the automatic alignment from that point forward, updating any incorrect matches. After completion, the approved segments from all files are exported into a Translation Memory (TM) format.
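The two-into-one situation in this example can be reproduced with a deliberately naive sketch. It pairs segments strictly one-to-one in sequence and flags pairs whose character lengths differ by more than an assumed ratio (`max_ratio`, a hypothetical parameter). Actual alignment tools use more sophisticated methods that the source does not detail.

```python
from itertools import zip_longest


def align_sequentially(source_segments, target_segments, max_ratio=1.8):
    """Pair segments 1:1 in order and flag pairs with implausible length ratios."""
    pairs = []
    for src, tgt in zip_longest(source_segments, target_segments, fillvalue=""):
        longer = max(len(src), len(tgt))
        shorter = max(min(len(src), len(tgt)), 1)  # avoid division by zero
        pairs.append({
            "source": src,
            "target": tgt,
            "needs_review": longer / shorter > max_ratio,
        })
    return pairs


source = ["Open the lid.", "Remove the filter.", "Close the lid."]
# The first two English sentences were merged into one German sentence.
target = ["Öffnen Sie den Deckel und entfernen Sie den Filter.",
          "Schließen Sie den Deckel."]

for pair in align_sequentially(source, target):
    print(pair)
```

The first and last pairs are flagged, but the middle pair passes the length check even though it is misaligned, which is one reason the linguistic review step remains necessary.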
### 21.4 Factors influencing alignment results
Several factors can improve the outcomes of the automated alignment process.
#### 21.4.1 File format consistency
Source and target files should be in the same format. Differences in formatting (e.g., InDesign vs. Word) are converted into tags by Translation Memory systems. While alignment tools can use these tags as guides, inconsistencies between source and target files can hinder accurate segment matching.
#### 21.4.2 Version control of files
It is crucial that the source and target files are the same version. If a source file is updated with new information or deletions after the initial translation, and the target file is not similarly updated, the alignment process becomes more complex.
#### 21.4.3 Quality of translated files
The automated alignment process typically does not include a linguistic review of existing translations. Therefore, it is essential that the client is satisfied with the quality of the existing translations before they are aligned. While a linguist can review files during the alignment process, this will increase the time required for the task.
### 21.5 Available alignment tools
A variety of automated alignment tools are available on the market. Some notable examples include [100](#page=100):
* WinAlign
* SDL Align
* FarkasAndras' open source aligner
* AlignFactory light (part of MemoQ's built-in aligner)
* Abbyy Aligner
* LF Aligner
* Hunalign
* Transit NXT aligner
* Plus Tools
* TextAlign
* Bligner
* bitext2tmx
---
# Localization is essential for global business success and market penetration
Localization is crucial for global business success and market penetration, as it involves adapting content and products to resonate with local audiences beyond mere translation.
### 22.1 Understanding localization versus translation
Translation is the process of converting text from a source language to a target language, adhering to grammar and syntax, and ensuring the original meaning is preserved. It is a fundamental component of localization but does not encompass its full scope.
Localization, on the other hand, is a more comprehensive process that adapts a message to local audiences. This involves more than just rewriting text; it requires understanding and integrating cultural nuances, local laws, and consumer preferences. Localization is applied to websites, mobile apps, software, video games, multimedia content, and voiceovers.
> **Tip:** While translation focuses on linguistic accuracy, localization focuses on cultural relevance and user experience, aiming to make the target audience feel the content was created specifically for them.
### 22.2 Key differences and reasons for localization
#### 22.2.1 Beyond language barriers
Selling in a foreign country involves more than just overcoming language barriers. It requires a customized message tailored for each local audience to gain their trust. Cultural barriers can impede the understanding of the original message.
#### 22.2.2 Adapting to local tastes and expressions
Languages often have local versions and dialects that need consideration in marketing strategies, similar to how English varies across regions. Companies must go beyond direct translation to connect with local sentiment.
> **Example:** KitKat's Japanese campaign replaced the slogan "Have a break, have a KitKat" with "Kitto Katsu," meaning "surely win," and introduced exotic chocolate flavors to appeal to local tastes.
#### 22.2.3 Meeting cultural expectations
Localization ensures that a business aligns with the local culture of each target country. While maintaining a consistent brand voice globally is important for brand recognition, marketing campaigns need adaptation to meet local expectations.
> **Example:** Coca-Cola became "kekou kele" in China, translating to "delicious happiness," to better resonate with the market while maintaining its brand identity. This involved collaborating with local experts to develop a new name and marketing strategy that respected the local culture.
### 22.3 Elements requiring localization
To achieve global success, businesses must localize various aspects of their products and communications beyond just text. This includes preparing websites to be appealing to diverse audiences by considering a wide range of details that can break cultural barriers and improve usability.
#### 22.3.1 Visual and design elements
* **Colors:** Different colors carry varied meanings across cultures. For instance, red might signify danger in one region, while white could represent death in another. Thorough research is necessary before targeting new audiences.
* **Layout:** Languages differ in their conciseness. Some require more space than others to convey the same meaning, necessitating a flexible layout design.
* **Visuals:** Images and photos must be adapted to local cultures. A blonde mother hugging her children might not appeal to a Chinese audience and could even offend consumers in the Middle East.
#### 22.3.2 Practical and regulatory considerations
* **Units of Measurement:** Most countries use the metric system. Converting units of measurement ensures content is easily understood.
* **Currency units:** Currency amounts need localization, converting from one currency to another and showing equivalent values. For example, changing from 100 dollars to 65 pounds sterling.
* **Paper size:** Document formatting can be affected by different paper sizes used in different regions, such as European A4 versus American letter size.
* **Date formats:** Variations in date formats, such as day/month/year versus month/day/year, can lead to crucial misunderstandings.
* **Text length:** Translations can cause text to expand significantly (30% to 100%), requiring flexible text areas in products and documents.
* **Contracts and Agreements:** Compliance with local regulations and laws is essential when conducting business in foreign countries to avoid legal issues, penalties, or website bans.
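Two of the items above, date formats and text expansion, lend themselves to a short sketch. The pattern table `DATE_PATTERNS`, the locale codes, and both function names are illustrative assumptions; a production system would normally take formatting rules from locale data rather than a hand-written table.

```python
from datetime import date

# Illustrative patterns only, not an exhaustive or authoritative table.
DATE_PATTERNS = {
    "en-US": "%m/%d/%Y",  # month/day/year
    "en-GB": "%d/%m/%Y",  # day/month/year
    "de-DE": "%d.%m.%Y",  # day.month.year
}


def format_date(value, locale_code):
    """Format a date using the pattern assumed for the given locale."""
    return value.strftime(DATE_PATTERNS[locale_code])


def fits_layout(source_text, translated_text, max_expansion=1.0):
    """Check that a translation is at most (1 + max_expansion) times the source length.

    max_expansion=1.0 allows 100% growth, the upper end of the range cited above.
    """
    return len(translated_text) <= len(source_text) * (1 + max_expansion)


d = date(2024, 3, 4)
print(format_date(d, "en-US"))  # 03/04/2024
print(format_date(d, "en-GB"))  # 04/03/2024
print(fits_layout("Save", "Speichern"))
```

The same calendar day renders as 03/04/2024 and 04/03/2024, which is exactly the ambiguity the date-format item warns about.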
### 22.4 The role of localization in market penetration and growth
Localization is critical for increasing engagement and fostering customer identification with a brand's message before purchase. By tailoring marketing efforts to meet local expectations, businesses can maximize their investment and enhance their chances of increasing sales and achieving global business growth. It is a strategic imperative in today's globalized marketplace.
---
# Understanding translation memory file formats
A translation memory (TM) file is a structured text file, typically in XML format, that stores translation and linguistic data, enabling translators to leverage prior work for improved efficiency and consistency [66](#page=66).
### 23.1 What is a translation memory file?
A translation memory file is fundamentally a structured text file, often an XML (Extensible Markup Language) file. XML provides a well-defined structure that allows for the representation of complex data structures. These files are not proprietary and can be opened and understood using standard text editors [66](#page=66).
### 23.2 Information stored in translation memory files
Translation memory files store various types of linguistic and contextual data, including:
* **Main Information:**
  * Segments (source and target text) [67](#page=67).
  * Language of the segments [67](#page=67).
  * Creation dates and times [67](#page=67).
* **Additional Data (optional):**
  * Author of the translation [67](#page=67).
  * Usage count of a translation unit [67](#page=67).
  * Change dates and times [67](#page=67).
  * Creation tool used [67](#page=67).
  * Domain or field of the content [67](#page=67).
  * Alternate translations [67](#page=67).
  * Notes related to the translation [67](#page=67).
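As a rough illustration, the fields listed above can be modelled as a single record. The class and attribute names below are hypothetical and do not correspond to any particular file format; the sketch only separates the main information (required) from the additional data (optional).

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class TranslationUnit:
    # Main information
    source_lang: str
    target_lang: str
    source_text: str
    target_text: str
    created: datetime
    # Additional (optional) data
    author: Optional[str] = None
    usage_count: int = 0
    changed: Optional[datetime] = None
    creation_tool: Optional[str] = None
    domain: Optional[str] = None
    alternates: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)


unit = TranslationUnit(
    source_lang="en",
    target_lang="de",
    source_text="Open the lid.",
    target_text="Öffnen Sie den Deckel.",
    created=datetime(2024, 1, 15, 9, 30),
    domain="household appliances",
)
print(unit)
```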
### 23.3 Typical formats of translation memory files
The most prevalent industry-standard formats for translation memory files are XLIFF and TMX, both of which are based on XML. While other formats like spreadsheet files (Excel - XLS) and comma-separated value text files (CSV) can also store translation memory data, they typically offer less detailed information per translation unit, such as only storing the segment and language [68](#page=68).
The adoption of XML for XLIFF and TMX offers several advantages over raw text files:
* **Parsability:** XML's well-defined structure makes it easy to parse [68](#page=68).
* **Semantic Meaning:** XML tag names describe the data they enclose, giving that data semantic context [68](#page=68).
* **Tooling Support:** A wide array of software tools are built to validate, import, parse, and search XML files [68](#page=68).
* **Interoperability:** XML's well-defined structure facilitates data exchange between different applications and systems [68](#page=68).
#### 23.3.1 File structure: Header and Body
Translation memory files are generally divided into two main sections: the header and the body [69](#page=69).
##### 23.3.1.1 Header
The header contains metadata about the file and the localization process. The semantic naming of XML tags in the header makes these files human-readable and understandable even without referring to the specification [69](#page=69).
* **TMX Header Example:** (figure not reproduced) [70](#page=70).
* **XLIFF Header Example:** (figure not reproduced) [71](#page=71).
##### 23.3.1.2 Body
The body of the file contains the core translation data, specifically the translation units and their corresponding segments [72](#page=72).
* **TMX Body Example:** (figure not reproduced) [73](#page=73).
* **XLIFF Body Example:** (figure not reproduced) [74](#page=74).
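The header and body split described above can be sketched by generating and reading back a minimal TMX 1.4 document with Python's standard `xml.etree.ElementTree`. This is not a reproduction of the original figures: the attribute values (such as `creationtool="example-script"`) are placeholders, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree spells the xml:lang attribute with its namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(units, srclang="en"):
    """Serialize translation units to TMX text.

    units -- list of dicts mapping a language code to the segment text
    """
    tmx = ET.Element("tmx", version="1.4")
    # Header: metadata about the file (placeholder values).
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": srclang,
        "datatype": "plaintext",
    })
    # Body: one <tu> per translation unit, one <tuv> per language.
    body = ET.SubElement(tmx, "body")
    for unit in units:
        tu = ET.SubElement(body, "tu")
        for lang, text in unit.items():
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


def read_tmx(tmx_text):
    """Parse TMX text back into a list of {language: segment} dicts."""
    root = ET.fromstring(tmx_text)
    return [
        {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.findall("tuv")}
        for tu in root.iter("tu")
    ]


units = [{"en": "Hello", "de": "Hallo", "fr": "Bonjour"}]
tmx_text = build_tmx(units)
print(tmx_text)
print(read_tmx(tmx_text))
```

Because each unit may carry any number of language variants, one file can hold more than two languages.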
### 23.4 Importance of translation memory files
Translation memory files are crucial for professional translation workflows for two primary reasons:
* **Improved efficiency:** By loading a TM file into a Computer-Assisted Translation (CAT) or Translation Environment Tool (TEnT), translators can leverage their previous work. The tool automatically alerts the translator to exact or partial matches with existing translations, significantly speeding up the translation process [75](#page=75).
* **Ensured consistency:** TM files help maintain consistency in terminology and phrasing across projects and for different clients. Using client-based or project-based TMs ensures accuracy and adherence to specific requirements throughout a translator's career [75](#page=75).
### 23.5 Differences between TMX and XLIFF
Both TMX and XLIFF are industry-standard, XML-based file types with commonalities, but they possess distinct structures and elements tailored to slightly different purposes [76](#page=76).
Key differences include:
* **Purpose:** XLIFF was designed to store extracted text and facilitate data transfer throughout the localization process, while TMX was created specifically for exchanging translation memory data between tools [76](#page=76).
* **Language Handling:** TMX can accommodate multiple languages within a single document. In contrast, XLIFF is structured to handle one source and one target language at a time [76](#page=76).
* **Inline Code Handling:** TMX primarily uses encapsulation methods for inline codes. XLIFF supports both encapsulation (similar to TMX) and a placeholder method where native codes are moved to a separate "Skeleton" file and replaced by short referring elements, akin to OpenTag's approach [76](#page=76).
* **Order and Rebuilding:** A collection of `<tu>` elements in a TMX file has no specific order, and TMX has no mechanism to reconstruct the original file structure. XLIFF, however, often provides better support for this [76](#page=76).
* **Additional Data Fields:** XLIFF includes data types and fields not present in TMX, such as pre-translation, history tracking, versioning, and support for binary objects [76](#page=76).
* **Time and Date Data:** TMX files can store time and date information at the translation unit level, a capability not present in XLIFF files [76](#page=76).
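To make the language-handling difference concrete, the following sketch parses a small hand-written XLIFF 1.2 fragment. The sample content and the function name `read_xliff` are assumptions for illustration; the element and attribute names follow the XLIFF 1.2 vocabulary.

```python
import xml.etree.ElementTree as ET

XLIFF_URI = "urn:oasis:names:tc:xliff:document:1.2"
NS = {"x": XLIFF_URI}

SAMPLE_XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="manual.txt" source-language="en" target-language="de" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Open the lid.</source>
        <target>Öffnen Sie den Deckel.</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Close the lid.</source>
        <target>Schließen Sie den Deckel.</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def read_xliff(xliff_text):
    """Return, per <file>, its language pair and its (id, source, target) units."""
    root = ET.fromstring(xliff_text)
    result = []
    for file_el in root.findall("x:file", NS):
        languages = (file_el.get("source-language"), file_el.get("target-language"))
        units = [
            (tu.get("id"),
             tu.findtext("x:source", namespaces=NS),
             tu.findtext("x:target", namespaces=NS))
            for tu in file_el.iter("{%s}trans-unit" % XLIFF_URI)
        ]
        result.append({"languages": languages, "units": units})
    return result


print(read_xliff(SAMPLE_XLIFF))
```

The language pair is declared once on the `<file>` element, so every unit inside it shares the same source and target language, and the units keep their document order and identifiers.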
---
# Controlled languages in companies and their rules
Controlled languages are a crucial tool for companies aiming to improve the clarity, consistency, and machine-translatability of their technical documentation. These languages employ a specific set of rules to restrict linguistic complexity and ambiguity, thereby facilitating easier comprehension and more accurate automated translation. Several major companies have developed or adopted their own controlled languages, including Kodak's International Service Language, Nortel's Nortel Standard English (NSE), Océ's Controlled English, Rolls-Royce and Saab Systems' Simplified Technical English (STE) (also known as ASD-STE100), Scania's Scania Swedish, Sun Microsystems' Sun Controlled English, and Xerox's Xerox Multilingual Customized English.
### 24.1 Principles of controlled languages
Grammatical rules are language-specific, so no universal rule set guarantees optimal results for every language. Nevertheless, applying a set of controlled language rules can significantly reduce ambiguity in texts across many languages, which makes such texts well suited to machine translation. The CLOUT™ (Controlled Language Optimized for Uniform Translation) rule set, developed by Uwe Muegge, exemplifies such a set of rules.
### 24.2 Key rules for controlled language
The following rules are examples of those found in controlled language sets, designed to enhance clarity and support machine translation:
#### 24.2.1 Sentence length
**Rule 1: Write sentences that are shorter than 25 words.**
This rule promotes conciseness and avoids the complexity often associated with lengthy sentences.
> **Example:**
> **Write:** The author performs the following tasks: Collect the necessary information. Analyze and evaluate the information. Write a structured draft.
> **Do not write:** Authors will approach any writing project by collecting the necessary information first, and after carefully analyzing and evaluating it, they will create a structured draft.
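Rule 1 is simple enough to check mechanically. The sketch below assumes naive sentence splitting on `.`, `!` and `?` and whitespace-based word counting; the function name `long_sentences` is hypothetical, and a real controlled-language checker would need more robust tokenization.

```python
import re


def long_sentences(text, limit=25):
    """Return the sentences that contain `limit` or more words."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [s for s in sentences if len(s.split()) >= limit]


good = ("The author performs the following tasks: Collect the necessary information. "
        "Analyze and evaluate the information. Write a structured draft.")
bad = ("Authors will approach any writing project by collecting the necessary "
       "information first, and after carefully analyzing and evaluating it, "
       "they will create a structured draft.")

print(long_sentences(good))  # no sentence reaches the limit
print(long_sentences(bad))   # the 25-word sentence is reported
```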
#### 24.2.2 Single idea per sentence
**Rule 2: Write sentences that express only one idea.**
Breaking down complex thoughts into distinct sentences improves readability and simplifies parsing for translation systems.
> **Example:**
> **Write:** Authors who optimize their texts for easy comprehension facilitate the translation process. These texts enable machine translation systems to produce better translation results.
> **Do not write:** By optimizing their texts for easy comprehension, authors facilitate the translation process, and doing so enables machine translation systems to create better translation results.
#### 24.2.3 Consistent phrasing for identical content
**Rule 3: Write the same sentence if you want to express the same content.**
Consistency in phrasing for similar ideas or instructions helps users recognize patterns and understand information more readily, especially in procedural texts.
> **Example:**
> **Write:** Printer Installation. 1) Remove the printer from the carton. 2) Remove the plastic wrapping.
> **Do not write:** Instructions for installing the printer. After unpacking the printer from the shipping carton, take the printer out of the plastic bag.
#### 24.2.4 Grammatical completeness
**Rule 4: Write sentences that are grammatically complete.**
Ensuring sentences have a subject and verb prevents them from being ambiguous or appearing as fragments.
> **Example:**
> **Write:** Do you wish to continue the installation of the software?
> **Do not write:** Continue installing software?
#### 24.2.5 Simple grammatical structure
**Rule 5: Write sentences that have a simple grammatical structure.**
Avoiding complex clauses, inversions, or convoluted phrasing makes sentences easier to understand and process.
> **Example:**
> **Write:** Show that you can organize your thoughts by using a simple sentence structure in your texts.
> **Do not write:** You, in your texts, to show that you can organize your thoughts, should use a simple sentence structure.
#### 24.2.6 Active voice
**Rule 6: Write sentences in the active form.**
The active voice generally leads to more direct, clear, and concise sentences compared to the passive voice.
> **Example:**
> **Write:** The program manager will send a summary of all questions to the responsible coworkers.
> **Do not write:** A summary of questions will be sent to the responsible individuals.
#### 24.2.7 Noun repetition over pronouns
**Rule 7: Write sentences that repeat the noun instead of using a pronoun.**
This rule aims to eliminate potential ambiguity that can arise from pronoun references, particularly in complex sentences or across multiple sentences.
> **Example:**
> **Write:** You must check the spelling of your text before you publish your text.
> **Do not write:** You must check the spelling of your text before publishing it.
#### 24.2.8 Use of articles
**Rule 8: Write sentences that use articles to identify nouns.**
The inclusion of articles (like "a," "an," "the") clarifies whether a noun is specific or general, preventing potential misinterpretations.
> **Example:**
> **Write:** Test the installation.
> **Do not write:** Test installation.
---
# Understanding and utilizing term bases and glossaries in translation
A term base, also known as a glossary, is a database that stores single words or expressions pertinent to a specific subject matter, often in a bilingual or multilingual format.
### 25.1 What is a term base?
A term base is essentially a specialized database designed to manage terminology. It contains individual words or phrases that are relevant to a particular field or subject. These terms are typically presented in a bilingual or multilingual arrangement, meaning they can include translations into multiple languages.
### 25.2 How term bases work
Term bases are frequently integrated as a feature within many Computer-Assisted Translation (CAT) tools. Users have the capability to import pre-existing glossaries into these systems. Furthermore, as translation work progresses, new terms can be added or existing ones updated within the term base. It's also possible to consolidate several bilingual glossaries into a single multilingual one. Term bases also allow for the designation of "forbidden terms," which are expressions or words that should not be used in translations.
### 25.3 Utilizing a term base
When constructing a term base, the initial step involves identifying key terminology that needs to be managed. To ensure the quality of a term base, it is crucial to utilize final versions of source texts, translations that have already been approved, and thoroughly researched contextual information.
A term base can either be created specifically for a new translation project or imported from previous translation endeavors. Once a term base is established, ongoing maintenance is necessary to incorporate any changes in source texts, translations, or contextual data. Neglecting this maintenance can lead to a decline in translation quality.
> **Tip:** Always use the most up-to-date and approved source materials when building or updating a term base to ensure accuracy.
### 25.4 Benefits of using a term base
The adoption of a term base offers several significant advantages in the translation process:
* **Increased consistency:** A well-constructed term base ensures that the core message remains consistent across multiple translation projects, which is particularly vital when an organization involves several collaborators.
* **Improved translation quality:** By effectively managing terminology and specifying forbidden terms, organizations can prevent the use of undesirable words or expressions by translators.
* **Accelerated translation process:** The term base functionality within CAT tools is engineered for straightforward access and use of terminology resources, thus speeding up the translation workflow.
* **Correct usage and spelling of corporate terminology:** Language and translation are inherently subjective, but a term base guarantees the correct spelling of product or company names, which are often case-sensitive. It also informs translators about terms that should not be translated and should remain in the source language.
### 25.5 Terminology management in the translation process
Effective terminology management is a cornerstone of an efficient translation process. Language Service Providers (LSPs) should furnish translators with term bases and specific instructions to align translations with the provided glossary. LSPs can also mandate that translators add any new terms discovered during their work to the glossary for subsequent review. This transforms the glossary into a valuable asset that might even become a product the client can purchase in the future. When translators have all the necessary terminology readily available, they are more likely to adopt the client's preferred terms, leading to savings in proofreading time and costs, and an increase in client satisfaction due to consistent translation output.
On extensive projects, it's improbable that the Project Manager will possess expertise in every language involved. However, a term base can be used to verify terminological consistency, even for languages the Project Manager does not understand. If a translation deviates from the glossary, Project Managers can send it back to the translator with a request for alignment. This process can further reduce the time and expense associated with proofreading.
> **Example:** An LSP sends a project to a translator along with a glossary of industry-specific terms. The translator is instructed to use these terms exclusively. If the translator introduces new terminology not found in the glossary, they are to add it for review by the LSP. This ensures consistency and builds a valuable terminological resource.
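A terminology check of this kind does not require knowledge of the target language, which the following sketch tries to show. It assumes plain case-insensitive substring matching, so it will miss inflected forms and does not enforce case-sensitive spellings; the function name `check_terminology` and the message wording are hypothetical.

```python
def check_terminology(source, target, glossary, forbidden=()):
    """Return a list of terminology issues for one source/target segment pair.

    glossary  -- maps source terms to the required target terms
    forbidden -- target-language terms that must not appear
    """
    issues = []
    src, tgt = source.lower(), target.lower()
    for src_term, tgt_term in glossary.items():
        # The source uses a glossary term, but the required target term is absent.
        if src_term.lower() in src and tgt_term.lower() not in tgt:
            issues.append(f"expected '{tgt_term}' for '{src_term}'")
    for term in forbidden:
        if term.lower() in tgt:
            issues.append(f"forbidden term '{term}'")
    return issues


glossary = {"printer": "Drucker"}
print(check_terminology("Remove the printer.", "Entfernen Sie den Drucker.", glossary))
print(check_terminology("Remove the printer.", "Entfernen Sie das Gerät.",
                        glossary, forbidden=["Gerät"]))
```

Segments that produce a non-empty list are candidates for sending back to the translator.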
### 25.6 Wordfast Anywhere and glossary usage
The misuse of terminology can severely undermine an otherwise excellent translation. Many clients possess well-defined glossaries, often containing industry-specific jargon. By providing these glossaries to translators, clients can enforce the use of particular terminology. This approach, common in technical translation, aims to blend the translator's linguistic skills with the client's specific terminological demands.
Occasionally, clients may request translators to compile a glossary of terms that emerge from research conducted during the translation process. In such instances, the translator is responsible for creating and populating the glossary with specific terminology. This glossary creation can occur either before the translation begins (during an initial terminology research phase) or concurrently with the translation work itself.
However, it is more common for clients to supply a bilingual glossary that has already been compiled from prior translation projects. In these cases, the translator must adhere strictly to the provided glossary and, where appropriate, contribute their own terms.
For general translation work, or when a translator is still developing their vocabulary in a source language, the glossary function of Wordfast Anywhere (WFA) can be used to catalog terminology encountered during the translation process. The glossary is typically a simple tab-delimited text document that, like translation memories (TMs), can be uploaded to and downloaded from WFA and shared with other CAT programs as needed.
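Because the glossary is described as a tab-delimited text document, it can be read with standard tooling. The sketch below assumes a two-column layout (source term, then target term) with no header row; the actual column layout used by Wordfast Anywhere is not described in the source, and `load_glossary` is a hypothetical name.

```python
import csv
import io


def load_glossary(tsv_text):
    """Parse tab-delimited glossary text into a {source_term: target_term} dict."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    glossary = {}
    for row in reader:
        # Skip blank or incomplete lines.
        if len(row) >= 2 and row[0].strip():
            glossary[row[0].strip()] = row[1].strip()
    return glossary


sample = "printer\tDrucker\ncarton\tKarton\n\nplastic wrapping\tPlastikfolie\n"
print(load_glossary(sample))
```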
### 25.7 Demonstrating glossary operations
The process of working with glossaries in tools like Wordfast Anywhere often involves several key operations:
* Opening an existing glossary.
* Adding a new glossary.
* Merging multiple glossaries.
* Adding individual terms to an existing glossary.
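The merge operation can be sketched as combining several bilingual glossaries that share a source language into one multilingual structure. The data layout and the name `merge_bilingual` are assumptions for illustration and do not reflect how any particular tool stores its glossaries.

```python
def merge_bilingual(glossaries):
    """Merge bilingual glossaries into one multilingual glossary.

    glossaries -- maps a target language code to {source_term: target_term}
    Returns {source_term: {language: target_term}}.
    """
    merged = {}
    for lang, entries in glossaries.items():
        for source_term, target_term in entries.items():
            merged.setdefault(source_term, {})[lang] = target_term
    return merged


merged = merge_bilingual({
    "de": {"printer": "Drucker"},
    "fr": {"printer": "imprimante", "carton": "carton"},
})
print(merged)
```

Terms missing from one of the bilingual glossaries simply have fewer language entries, which makes gaps easy to spot.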
### 25.8 Interacting with a glossary during translation
When using a CAT tool like Wordfast Anywhere, a glossary can significantly streamline the translation process. Whenever the tool identifies a term from the glossary within the source segment, it highlights it visually, typically with a blue background.
These highlighted terms are often treated as "placeables," meaning they can be easily managed using navigation icons (Previous/Next) or keyboard shortcuts like `Ctrl+Alt+Right` and `Ctrl+Alt+Left`. Clicking on the highlighted term or typing its initial letter followed by `Tab` can also facilitate interaction. The key benefit is that when you use the "Copy" icon or the `Ctrl+Alt+Down` shortcut, the corresponding translation from the glossary is inserted into the target segment. The most intuitive method for copying a target term is often through the "Auto-suggest" feature, which is usually enabled by default. Target terms are suggested by typing the first letter of the target term or the first three letters of the source term.
> **Tip:** Familiarize yourself with the keyboard shortcuts for glossary term insertion in your CAT tool to maximize efficiency.
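The auto-suggest trigger described above (first letter of the target term, or first three letters of the source term) can be modelled as a small prefix check. This is only a sketch of the stated rule, not Wordfast Anywhere's implementation; the function name `suggest_terms` and the data layout are hypothetical.

```python
def suggest_terms(typed, segment_terms):
    """Suggest target terms for the characters typed so far.

    segment_terms -- {source_term: target_term} for glossary terms
                     found in the current source segment
    """
    typed = typed.lower()
    if not typed:
        return []
    suggestions = []
    for source_term, target_term in segment_terms.items():
        # One typed letter is enough when it starts the target term.
        by_target = target_term.lower().startswith(typed)
        # A source-term prefix only counts from three letters onward.
        by_source = len(typed) >= 3 and source_term.lower().startswith(typed)
        if by_target or by_source:
            suggestions.append(target_term)
    return suggestions


terms = {"printer": "Drucker", "carton": "Karton"}
print(suggest_terms("d", terms))    # matches the target term "Drucker"
print(suggest_terms("car", terms))  # matches the source term "carton"
print(suggest_terms("pr", terms))   # too short to match a source term
```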
---
# Considerations for localization across different languages and cultures
Localization is the process of adapting a product or content to a specific locale or market, considering linguistic, cultural, and technical requirements. This involves more than just translation; it requires a deep understanding of the target audience's nuances to ensure the product resonates locally and avoids costly misunderstandings.
### 26.1 Linguistic considerations
Adapting content for different languages involves several key linguistic aspects:
#### 26.1.1 Writing systems and directionality
Different writing systems use distinct scripts or characters, which can be symbols, logograms, syllabograms, or letters. Writing direction also varies: European languages are typically written left-to-right, while Arabic and Hebrew are written right-to-left. Boustrophedon writing alternates direction from line to line, and some Asian languages can be written vertically.
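Software can often infer writing direction from the characters themselves. The sketch below uses the bidirectional categories that Python's standard `unicodedata` module exposes; the majority-vote heuristic and the function name `dominant_direction` are assumptions for illustration and are far simpler than the full Unicode bidirectional algorithm.

```python
import unicodedata

def dominant_direction(text: str) -> str:
    """Guess 'rtl' or 'ltr' by counting strongly directional characters."""
    # 'R' covers Hebrew-style letters, 'AL' Arabic letters, 'L' left-to-right letters.
    rtl = sum(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)
    ltr = sum(unicodedata.bidirectional(ch) == "L" for ch in text)
    return "rtl" if rtl > ltr else "ltr"

for sample in ("Hello", "שלום", "مرحبا"):
    print(sample, dominant_direction(sample))
```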
#### 26.1.2 Text complexity and layout
Some languages require complex text layouts where characters change shape based on context. Capitalization rules differ, with some languages mandating it while others do not. Additionally, text sorting rules vary across different writing systems and languages.
#### 26.1.3 Numerals and grammar
Translators must account for different numeral systems used in various languages. Pluralization and other grammatical rules also exhibit significant variation, necessitating careful attention to detail.
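Pluralization is a good example of why grammar cannot be hard-coded. The sketch below contrasts English (two forms) with Russian (three forms for whole numbers). The rules are a simplified rendering, for integers only, of the plural categories commonly used in localization libraries; the function names and the word forms are illustrative.

```python
def plural_category_en(n: int) -> str:
    """English: singular for exactly 1, plural otherwise."""
    return "one" if n == 1 else "other"

def plural_category_ru(n: int) -> str:
    """Russian (integers only): the form depends on the last digits."""
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    return "many"

# Illustrative forms of the word for "file".
FORMS_EN = {"one": "file", "other": "files"}
FORMS_RU = {"one": "файл", "few": "файла", "many": "файлов"}

for n in (1, 2, 5, 11, 21, 22):
    print(n, FORMS_EN[plural_category_en(n)], FORMS_RU[plural_category_ru(n)])
```

A string such as "{n} files" therefore needs a separate variant per plural category in each target language, not a single translated sentence.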
#### 26.1.4 Punctuation
The use of punctuation can differ. For instance, French commonly uses guillemets (« ») where English uses double quotation marks.
> **Tip:** Always verify the correct usage of punctuation and quotation marks for the target language, as standard English conventions may not apply.
### 26.2 Technical and economic considerations
Beyond language, several technical and economic conventions need localization:
#### 26.2.1 Economic conventions
Varying economic conventions include differences in paper sizes, preferred storage media, broadcast TV systems, phone number formats, delivery services, and postal code and address formats.
#### 26.2.2 Currency and measurement
Localization efforts must address currency symbols, their position, and the use of currency markers. Measurement systems, battery sizes, and standards for electric current and voltage also require adaptation.
> **Tip:** Format currency amounts according to the target locale's conventions, including the currency symbol or code, its position relative to the amount, and the appropriate separators (e.g., "$100" versus "100 €").
#### 26.2.3 Service and data presentation
Variations may exist in providers of payment services, weather reports, and the presentation of online maps from third-party providers.
#### 26.2.4 Time zones
Translators must carefully consider variations in time zones when localizing content.
### 26.3 Legal and regulatory considerations
Legal requirements vary significantly by country and can necessitate product customization or complete redesign:
#### 26.3.1 Compliance and disclaimers
This includes compliance with privacy laws, additional disclaimers on packaging or websites, and different consumer labeling requirements.
#### 26.3.2 Export and censorship regulations
Regulations on encryption and export restrictions must be considered, as well as conformity with subpoena procedures or internet censorship.
#### 26.3.3 Accessibility
Accessibility requirements are also a critical legal consideration.
#### 26.3.4 Tax collection
Localization must account for tax collections, including customs duties, value-added tax (VAT), and sales tax.
### 26.4 Cultural and social considerations
Cultural sensitivity is paramount to successful localization:
#### 26.4.1 Political and geographical sensitivities
Localization efforts must be sensitive to political issues like disputed borders and geographical naming disputes.
#### 26.4.2 Government-assigned numbers
Consideration should be given to numbers assigned by other governments, such as national identification numbers, Social Security Numbers, and passport numbers.
#### 26.4.3 Personalization and aesthetics
Translators should consider local holiday customs, title conventions, personal name conventions, aesthetics, the appropriateness of colors and images, local architecture, people's socioeconomic status, clothing, and ethnicity.
#### 26.4.4 Local customs and taboos
Care must be taken with local customs, including superstitions, local religions, and social taboos.
> **Example:** A color considered auspicious in one culture might signify mourning in another. Similarly, gestures or imagery that are acceptable in one region may be offensive elsewhere.
### 26.5 The importance of localization for websites and software
Localization is crucial for global businesses looking to succeed in new markets:
#### 26.5.1 Website localization
Adapting brand/product websites to local markets allows businesses to appear local and avoid standing out as foreign. This is vital when entering markets with different cultures, languages, consumers, and socioeconomic and political conditions.
#### 26.5.2 Mobile app and software localization
Localizing mobile apps helps gain traction and attract more users in other markets. Similarly, localizing most software ensures users can easily follow instructions, navigate, and use the program.
### 26.6 Consequences of localization failures
Mistakes in localization can lead to significant financial losses and reputational damage:
#### 26.6.1 HSBC's rebranding campaign
HSBC's tagline "Assume Nothing" was mistranslated as "Do Nothing" in various countries, and the bank's 2009 rebranding campaign to rectify the damage cost millions.
#### 26.6.2 Pepsi's slogan in China
PepsiCo's slogan "Pepsi Brings You Back to Life" was translated in China as "Pepsi Brings Your Ancestors Back from the Grave," causing backlash and forcing a strategic reevaluation.
#### 26.6.3 NASA's Mars Orbiter metric mix-up
The failure of NASA's Mars Climate Orbiter mission in 1999 was due to a mix-up between metric and imperial units, resulting in a USD 125 million loss.
#### 26.6.4 Canadian Maple Leaf Coin error
The Royal Canadian Mint mistakenly engraved "Souverain du Canada" (Sovereign of Canada) instead of "Souvenir du Canada" (Memory of Canada) on coins, requiring the recall of approximately 30 million Canadian dollars worth of currency.
#### 26.6.5 London Olympics ticket website error
The 2012 London Olympics ticket website mistranslated "See Tickets" as "Gweld Tocynnau" in Welsh, directing users to the wrong website and causing confusion and financial losses.
#### 26.6.6 Siri's gender-biased responses
Siri faced criticism for gender-biased responses in various languages, reinforcing stereotypes, such as implying only men could hold certain job positions in Chinese.
### 26.7 Principles for internationalization and localization
Effective internationalization lays the groundwork for seamless localization:
#### 26.7.1 Unicode Standard
Utilizing the Unicode Standard ensures compatibility with various writing systems and the representation of diverse languages.
#### 26.7.2 Separation of content and code
Keeping content separate from source code allows for easier translation without extensive coding changes.
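A minimal sketch of this principle keeps every user-facing string in a per-locale catalog and lets the code refer to messages only by key. The catalog layout, the `translate` function, and the fallback to English are assumptions for illustration; real projects usually load such catalogs from external resource files.

```python
# Hypothetical message catalogs; in practice these live outside the code.
MESSAGES = {
    "en": {"greeting": "Welcome, {name}!", "save": "Save"},
    "fi": {"greeting": "Tervetuloa, {name}!", "save": "Tallenna"},
}

def translate(key: str, locale: str, default_locale: str = "en", **params) -> str:
    """Look up a message by key, falling back to the default locale."""
    catalog = MESSAGES.get(locale, {})
    template = catalog.get(key) or MESSAGES[default_locale][key]
    return template.format(**params)

print(translate("greeting", "fi", name="Aino"))
print(translate("save", "de"))  # no German catalog, so the English text is used
```

Adding a language then means adding a catalog, with no change to the program logic.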
#### 26.7.3 Flexible user interface (UI)
Designing a flexible UI accommodates varying text lengths and different reading directions.
#### 26.7.4 Locale-specific formats
Adapting date, time, and number formats to locale-specific conventions is crucial for cultural relevance.
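The sketch below shows why date formats need attention: the same calendar date produces strings that a reader in another locale could misread. The pattern table is a small, simplified assumption for illustration (real locale data is richer, for example regarding zero-padding), and `format_date` is a hypothetical helper.

```python
from datetime import date

# Simplified, illustrative patterns; not complete locale data.
DATE_PATTERNS = {
    "en-US": "%m/%d/%Y",  # month first
    "en-GB": "%d/%m/%Y",  # day first
    "ja-JP": "%Y/%m/%d",  # year first
}

def format_date(value: date, locale: str) -> str:
    """Format a date with the pattern registered for the locale."""
    return value.strftime(DATE_PATTERNS[locale])

fourth_of_march = date(2025, 3, 4)
for locale in DATE_PATTERNS:
    print(locale, format_date(fourth_of_march, locale))
```

The US string for 4 March is identical to what a British reader would interpret as 3 April, which is exactly the kind of misunderstanding localization is meant to prevent.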
#### 26.7.5 Culturally neutral imagery
Choosing culturally neutral images and icons, or providing alternatives for different regions, ensures inclusivity.
> **Tip:** Internationalization (i18n) is the process of designing and developing a product or website to allow for easy adaptation to different languages and cultural preferences, making subsequent localization efforts more efficient and cost-effective.
---
# Localization requires adapting various elements beyond text
Localization is the comprehensive process of adapting a product or service to a specific target locale or audience, extending far beyond mere text translation to ensure cultural relevance and user experience. The ultimate goal is to make a product feel as though it was created specifically for the target market, irrespective of its actual origin.
### 27.1 Key elements requiring localization
Beyond translation, numerous elements must be localized to effectively engage with diverse audiences and break down cultural barriers.
#### 27.1.1 Visual and design elements
* **Colors:** The symbolic meaning of colors varies significantly across cultures. For example, red might signify danger in one region, while white could represent death, and orange might symbolize mourning. Thorough research is crucial before targeting new audiences.
* **Layout:** Different languages require varying amounts of space to convey the same concepts. A flexible layout is necessary to accommodate text of diverse lengths, which can expand by 30% to 100% after translation from English.
* **Visuals:** Images and photographs must be adapted to local cultures. For instance, images of "blond moms hugging their kids" may not resonate with a Chinese audience and could even be offensive in the Middle East.
* **Logos and images with text:** These may require alterations or replacement with more generic icons if they contain text that needs translation.
#### 27.1.2 Content and informational elements
* **Units of Measurement:** Most countries adopt the metric system. Localization necessitates converting units of measurement to ensure content is easily understood.
* **Currency Units:** Currency amounts and their representation require localization. This includes converting from one currency to another, such as from dollars to pounds sterling, and presenting equivalent amounts clearly, e.g., "100 dollars (65 pounds)".
* **Date Formats:** Differences in how dates are presented (e.g., month/day/year vs. day/month/year) can lead to critical misunderstandings.
* **Number Formats:** Conventions for grouping digits and decimal separators vary.
* **Time and Date Formats:** Beyond date formats, localization must consider time differences and potentially different calendar systems.
* **Time Zones:** Translators must carefully account for variations in time zones.
* **Writing Systems and Scripts:** Different languages employ distinct scripts, which can be symbols, logograms, syllabograms, or letters. The direction of writing also differs, with some languages flowing left-to-right, others right-to-left, and some utilizing boustrophedon or vertical writing.
* **Text Presentation:** Complex text layouts are common, where characters change shape based on context. Some languages require capitalization, while others do not. Text sorting rules also vary.
* **Numeral Systems:** Some languages use entirely different numeral systems.
* **Grammar Rules:** Pluralization and other grammatical rules differ significantly between languages, requiring attention to detail.
* **Punctuation:** The usage of punctuation marks can vary, with examples like French guillemets used in place of English double quotes.
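The number-format item in the list above can be illustrated with a short sketch. The separator table is a small assumption covering three locales, and `format_number` is a hypothetical helper; production code would normally rely on a locale database instead of a hand-written table.

```python
# (grouping separator, decimal separator); illustrative values only.
SEPARATORS = {
    "en-US": (",", "."),
    "de-DE": (".", ","),
    "fr-FR": ("\u202f", ","),  # narrow no-break space as the grouping mark
}

def format_number(value: float, locale: str, decimals: int = 2) -> str:
    """Format a number with the locale's grouping and decimal separators."""
    group, decimal = SEPARATORS[locale]
    formatted = f"{value:,.{decimals}f}"  # English-style baseline
    # Swap separators through a placeholder so they do not clobber each other.
    return formatted.replace(",", "\0").replace(".", decimal).replace("\0", group)

for locale in SEPARATORS:
    print(locale, format_number(1234567.891, locale))
```

The same digits with swapped separators can differ by a factor of a thousand in meaning, so number formats deserve the same care as translated text.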
#### 27.1.3 Technical and economic elements
* **Paper Size:** Document formatting can be affected by differences in standard paper sizes, such as A4 versus American letter size, impacting layout and page breaks.
* **Economic Conventions:** These vary widely and include preferences for paper sizes, storage media, broadcast TV systems, phone number formats, delivery services, and postal code and address formats.
* **Payment Service Providers:** Variations may exist in providers of payment services.
* **Third-Party Provider Presentations:** This can include weather reports and the presentation of online maps from third-party providers.
* **Storage Media:** Preferences for storage media can differ by country.
* **Broadcast TV Systems:** Localization must consider regional standards for broadcast television.
* **Phone Number Formats:** Adapting phone number formats is essential for local usability.
* **Delivery Services:** Local delivery service conventions may influence product design or information presentation.
* **Postal Codes and Addresses:** Standard formats for postal codes and addresses need to be localized.
* **Measurement Systems:** Localization must address differing measurement systems.
* **Battery Sizes:** Preferences or standards for battery sizes can vary.
* **Electric Current and Voltage Standards:** Localization must conform to local electrical standards.
#### 27.1.4 Legal and regulatory elements
* **Contracts and Agreements:** Compliance with local regulations is paramount when conducting business in foreign countries to avoid legal issues, penalties, or website bans.
* **Legal Requirements:** Customization or complete product changes may be necessary to meet specific country regulations. This includes:
    * Privacy laws compliance.
    * Additional disclaimers on packaging or websites.
    * Different consumer labeling requirements.
    * Regulations on encryption and export restrictions.
    * Conformity with subpoena procedures or internet censorship.
    * Accessibility requirements.
    * Tax collections (customs duties, value-added tax, sales tax).
### 27.2 The distinction between localization and translation
Translation is a component of localization but is not synonymous with it. While translation focuses on linguistic conversion, localization encompasses a broader adaptation to fit local beliefs, traditions, and expectations. Because it helps the audience identify with and understand the message, localization is essential for increasing engagement and maximizing the return on an investment, thereby improving sales and business growth globally.
### 27.3 Types of content requiring localization
Localization is applied across a wide range of digital and physical products and services:
* Websites.
* Video games.
* Movies.
* Product information.
* Mobile applications.
* Software.
* Whitepapers.
* Tech support pages.
* Help files.
* Newsletters.
> **Tip:** Understanding the nuances of cultural values, aesthetic preferences, and regulatory landscapes is as critical as linguistic accuracy in successful localization.
> **Example:** A marketing campaign that uses humor reliant on specific cultural references in one country might fall flat or even cause offense in another, highlighting the need for non-textual adaptation.
---
# Challenges in the translation industry
The translation industry faces significant challenges driven by globalization, rapid technological advancements, and evolving market demands for speed and efficiency.
### 28.1 Globalization and market demands
Multinational companies are increasingly developing products for global markets, requiring them to appear simultaneously on all local markets (simship). This necessitates faster time-to-market schedules. To accommodate this, products must be designed in a way that avoids re-design for each local market, a process known as internationalisation (I18N). Furthermore, products and their accompanying documentation need to be adapted to the language and culture of the target markets, a process called localisation (L10N) [21](#page=21).
### 28.2 Technological advancements and software
The development and specialization of computer software, particularly in general office applications and translation and localisation software, presents ongoing challenges. These tools encompass functionalities such as Translation Memory (TM), alignment, terminology management, terminology extraction, software localisation, project management, and machine translation. The integration of plug-ins and interfaces, the increasing number of features and variants, and the frequency and speed of software updates create compatibility problems. This leads to a "necessity of continuous upgrading" for both the software and the user [22](#page=22).
### 28.3 Evolution of electronic file formats
The proliferation of electronic file formats, including those for office applications, desktop publishing (DTP), markup languages, and software, poses a considerable challenge. The continuous development of new file formats and modifications to existing ones in new software versions requires translators to constantly update their technical know-how. File preparation and post-processing have emerged as new fields of activity for translators, necessitating adaptation of old workflows and modification of translation strategies due to new tools [23](#page=23).
### 28.4 Computer-assisted translation (CAT) tools
Computer-assisted translation (CAT), or *traducción asistida por ordenador* (TAO), is a crucial development in addressing the challenges within the translation industry [25](#page=25) [27](#page=27).
#### 28.4.1 Introduction and objectives of CAT
The primary goal of CAT tools is to efficiently assist translators by automatically and rapidly providing them with the necessary resources for their work. The objectives of understanding CAT include defining what it is, recognizing its core components, comprehending the translation process using a CAT system, and gaining an initial understanding of major machine translation systems [30](#page=30) [32](#page=32).
#### 28.4.2 Definition and basic components of CAT
CAT is defined as a set of computer applications specifically designed to efficiently assist translators in their tasks. A key aspect is how CAT tools handle repetitions in texts; clients generally do not pay the same rate for repetitive content, making repetition management important for productivity. The adoption of CAT tools significantly enhances translator productivity. The European Commission emphasizes proficiency in CAT and terminology tools, along with common office software, as essential translation capabilities [32](#page=32) [33](#page=33) [34](#page=34) [35](#page=35).
> **Tip:** It's important to distinguish between Machine Translation (MT) and Computer-Assisted Translation (CAT). MT is when a machine performs the translation (e.g., Google Translate), while CAT involves a human translator using various tools. MT can be integrated into CAT systems [36](#page=36).
#### 28.4.3 Essential CAT tools
Several types of CAT tools are considered indispensable for translators [37](#page=37) [41](#page=41) [43](#page=43) [46](#page=46) [47](#page=47):
1. **Translation project management software:** This software controls information flow, assigns translations, manages quality control, analyzes content, generates reports (including full and fuzzy matches, and intra- and cross-file repetitions), performs word counts, and facilitates final delivery to the client [37](#page=37).
2. **Translation memory (TM) software:** These tools store previously translated segments, ensuring terminological and phraseological consistency and enabling the rapid retrieval of translation units for increased productivity. Common file extensions for TMs include `.tmx` (a standard open format compatible across most CAT tools), `.sdltm` (a proprietary format for SDL Trados Studio), and `.txt` / `.csv` for plain text exports [38](#page=38).
3. **Terminology management software:** This software is used to create glossaries from ongoing translations. Popular examples include RWS Trados MultiTerm, Wordfast, and Memsource. Standard formats for terminology databases include `.tbx` (TermBase eXchange), an open standard for glossary exchange, and `.sdltb`, a proprietary format for SDL Trados Studio [41](#page=41).
4. **Alignment software:** This tool is used to create translation memories from existing pairs of original texts and their translations. It identifies corresponding segments, which then require manual confirmation. Alignment is particularly useful when working with older documents translated in formats like Word, allowing the creation of a reusable TM [43](#page=43) [46](#page=46).
5. **Localization tools:** These specialized tools are designed for translating software, video games, and websites [47](#page=47).
#### 28.4.4 The translation process with a CAT system
A recommended routine for using a CAT system involves a structured approach to arrive at the final translation. The process typically includes [49](#page=49):
* **File format check:** Ensuring the file format is compatible with the CAT tool [50](#page=50).
* **Resource assignment:** Assigning relevant translation memories, termbases, and other resources to the project [50](#page=50).
* **Segmentation:** Breaking down the source text into manageable segments, often sentence by sentence [50](#page=50).
* **Translation:** The translator works on each segment, leveraging the CAT tool's features like TM lookups and termbase suggestions [50](#page=50).
---
# The key to successful post-editing is quick decision making
This section explores the concept that efficient post-editing in translation relies heavily on the translator's ability to make rapid decisions. This efficiency is not just about speed but also about leveraging technology to support informed and swift choices, ultimately enhancing productivity and consistency.
### 29.1 The evolution and purpose of translation technologies
The history of translation technology reveals a progression towards systems that actively assist translators. Early approaches, like the text-related glossary approach used by the Federal Armed Forces Translation Agency (1960-1965), focused on identifying unknown words and providing equivalents. Similarly, the European Coal and Steel Community (1960-1965) developed automatic dictionary look-up with context inclusion, where the computer searched for similar sentences to the input [11](#page=11) [9](#page=9).
More advanced systems emerged, such as Alan Melby's Interactive Translation System (ITS) in the 1980s, which featured a three-level approach to Computer-Aided Translation (CAT), integrating editing, terminology management, text analysis, and even machine translation systems [13](#page=13).
The development of commercial systems further revolutionized the field:
* **TRADOS** (established 1984) transitioned from a translation provider to a software developer, launching its first Translation Memory (TM) tool, TED, in 1988, followed by terminology management software (MultiTerm) and the Translator's Workbench [14](#page=14).
* **STAR** (established 1984) also developed software alongside translation services, releasing Transit (with TermStar) in 1991 [15](#page=15).
* **ATRIL** launched its first Windows-based TM tool in 1993, later redesigning it as an integrated translation environment [16](#page=16).
* **IBM Germany** introduced Translation Manager/2 (TM/2) in 1992, notably including linguistic resources for nineteen languages [17](#page=17).
These systems, including current market players like Across, Déjà Vu, memoQ, MultiTrans, SDL Trados, SDLX, and Wordfast, aim to improve translator productivity and consistency [18](#page=18).
The core principle of Computer-Aided Translation (CAT) is to provide translators with tools that efficiently deliver needed resources. This is crucial because multinational companies operate with rapid product introduction cycles (time-to-market) and require simultaneous global launches (simship). Products are designed for internationalization (I18N) and localization (L10N) to adapt them to local languages and cultures. The increasing complexity of software, file formats, and frequent updates necessitates continuous upgrading of technical know-how and adaptation of translation strategies [21](#page=21) [22](#page=22) [23](#page=23) [32](#page=32).
> **Tip:** Understanding the historical development of CAT tools helps appreciate the current functionalities and their underlying principles.
### 29.2 The translation workflow with CAT systems
The translation process within a CAT system involves a structured workflow designed for efficiency. When a source text is opened or imported, it is segmented into "translation units" based on punctuation and user-defined rules [19](#page=19).
The segment currently being processed is automatically looked up in the Translation Memory (TM) [20](#page=20).
* If an identical or similar segment is found, the associated translation is displayed and can be selected or modified by the translator [20](#page=20).
* If no match is found, the translator enters a new translation, which is then stored in the TM, making it available for future identical or similar source segments. This iterative process gradually builds the TM [20](#page=20).
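The segment-then-look-up loop described above can be sketched in a few lines. This is a simplified illustration, not how any particular CAT tool works: the punctuation-based splitting rule, the similarity measure from Python's standard `difflib`, the 0.75 threshold, and the function names are all assumptions.

```python
import re
from difflib import SequenceMatcher

def segment(text: str) -> list[str]:
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def lookup(segment_text: str, tm: dict[str, str], threshold: float = 0.75):
    """Return (source, target, score) for the best TM match, or None."""
    best_source, best_score = None, 0.0
    for source in tm:
        score = SequenceMatcher(None, segment_text, source).ratio()
        if score > best_score:
            best_source, best_score = source, score
    if best_source is None or best_score < threshold:
        return None
    return best_source, tm[best_source], best_score

# Hypothetical English -> Finnish memory with a single stored unit.
tm = {"Open the file.": "Avaa tiedosto."}
for seg in segment("Open the file. Open the files. Restart the computer now!"):
    match = lookup(seg, tm)
    if match:
        print(f"{seg!r}: {match[2]:.0%} match -> {match[1]!r}")
    else:
        print(f"{seg!r}: no match, translate and store the new unit")
```

In a real workflow, each confirmed translation would be written back to the memory, which is how the TM grows over time.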
The recommended routine for using a CAT system typically includes:
1. **File format check**: Ensuring the input file is compatible [50](#page=50).
2. **Resource assignment**: Loading relevant TMs and termbases [50](#page=50).
3. **Segmentation**: Dividing the text into translatable units [50](#page=50) [51](#page=51).
4. **Translation**: The core task of rendering the source text into the target language, leveraging TM matches and terminology tools [50](#page=50).
5. **TM update**: Storing new translations in the TM for future use [52](#page=52).
6. **Revision**: Reviewing the translated text for accuracy and quality [52](#page=52).
7. **Final review**: A final check before delivery [52](#page=52).
> **Tip:** The efficiency gains from CAT tools stem from automating repetitive tasks and providing quick access to previously translated content and terminology.
### 29.3 Key components of Computer-Aided Translation (CAT) tools
CAT tools are defined as specialized software applications designed to efficiently assist translators. Their primary goal is to provide translators with necessary resources automatically and rapidly. The Commission of the European Union highlights proficiency in CAT and terminology tools as crucial capabilities for translators [32](#page=32) [35](#page=35).
Essential CAT tools include:
1. **Project management software**: Manages the translation workflow, including information flow, assignment, quality control, content analysis, reporting (e.g., full/fuzzy matches, repetitions), word count, and final delivery [37](#page=37).
2. **Translation Memory (TM) software**: Stores previous translations to ensure terminological and phraseological consistency and enable quick retrieval of translation units, thus boosting productivity. TMs store segments (source and target), language, creation dates, and potentially author, usage count, change dates, creation tool, domain, alternate translations, and notes. Common file extensions include `.tmx` (Translation Memory eXchange, an open standard) and `.sdltm` (SDL Trados proprietary format), with `.txt` and `.csv` also used for simpler exports [38](#page=38) [67](#page=67).
3. **Terminology management software**: Essential for creating and managing glossaries based on ongoing translations. Popular examples include RWS Trados MultiTerm, Wordfast, and Memsource. Standard formats include `.tbx` (TermBase eXchange) and proprietary formats like `.sdltb` [41](#page=41).
4. **Alignment software**: Used to create TMs from existing source texts and their translations. It identifies correspondences between segments, which are then manually confirmed. This is useful for building a TM from documents previously translated in simpler formats like Word [43](#page=43) [46](#page=46).
5. **Localization tools**: Specifically designed for translating software, video games, or websites [47](#page=47).
### 29.4 Understanding Translation Memory (TM) files
A Translation Memory (TM) file is fundamentally a structured text file, often in XML format (e.g., XLIFF or TMX), containing translation and linguistic data. While file extensions like `.tmx` or `.xliff` might suggest specialized software is needed, they can be opened with standard text editors [66](#page=66).
TM files store key information such as:
* Segments (source and target) [67](#page=67).
* Language [67](#page=67).
* Creation dates and times [67](#page=67).
* Additional metadata: Author, usage count, change dates/times, creation tool, domain, alternate translations, and notes [67](#page=67).
The most popular TM file formats are XLIFF and TMX, both based on XML. XML's well-defined structure makes files easy to parse, its semantic tags indicate data meaning, and it facilitates data exchange between different applications and systems [68](#page=68).
**Typical TM file structure includes:**
* **Header**: Contains metadata about the file and the localization process. Examples of headers for TMX and XLIFF files illustrate their human-readable nature due to semantic XML tags [69](#page=69) [70](#page=70) [71](#page=71).
* **Body**: Contains the most crucial data – the translation units and segments. Examples of TMX and XLIFF bodies show how translation units are structured [72](#page=72) [73](#page=73) [74](#page=74).
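To make the header/body structure tangible, the sketch below parses a tiny hand-written TMX document with Python's standard XML library. The sample content (tool name, date, segments) is invented for illustration, and `read_tmx` is a hypothetical helper; real TMX files carry many more attributes.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="ExampleTool" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu creationdate="20240115T093000Z">
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="fi"><seg>Tallenna tiedosto.</seg></tuv>
    </tu>
  </body>
</tmx>"""

def read_tmx(tmx_text: str):
    """Return the header attributes and a list of {language: segment} units."""
    root = ET.fromstring(tmx_text)
    header = dict(root.find("header").attrib)
    units = []
    for tu in root.iter("tu"):
        units.append({tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.iter("tuv")})
    return header, units

header, units = read_tmx(SAMPLE_TMX)
print(header["srclang"], units)
```

The semantic tag names are what make such a file readable in a plain text editor as well as easy to parse.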
### 29.5 The importance and differences between TMX and XLIFF
Translation Memory files are vital because they significantly improve efficiency and ensure consistency. By loading TMs into CAT tools, translators can leverage prior work, with the tool automatically alerting them to matches or partial matches. This also helps maintain consistency across projects and clients by utilizing "client-based" or "project-based" translation memories [75](#page=75).
**Key differences between TMX and XLIFF:**
* **Purpose**: XLIFF was designed to store extracted text and facilitate data transfer within the localization process, while TMX was created specifically for exchanging TM data between tools [76](#page=76).
* **Languages**: TMX supports multiple languages within a single document, whereas XLIFF is designed for one source and one target language [76](#page=76).
* **Inline Codes**: TMX uses encapsulation methods, while XLIFF offers both encapsulation and a placeholder method where native codes are moved to a "Skeleton file" [76](#page=76).
* **Order and Reconstruction**: In TMX, translation units (`<tu>` elements) may not have a specific order and lack a mechanism to rebuild the original file. XLIFF, however, can be more powerful for reconstructing or rebuilding the original file [76](#page=76) [78](#page=78).
* **Data Types**: XLIFF includes additional data types and fields not present in TMX, such as pretranslation, history, versioning, and binary objects [76](#page=76).
* **Timestamping**: TMX files can store time and date data at the translation unit level, allowing for productivity analysis, while XLIFF files cannot [76](#page=76) [78](#page=78).
Both TMX and XLIFF are industry-standard, XML-based formats supported by most translation software. The choice between them often depends on the specific project, tool, or provided TM file. Using TM, regardless of format, is vastly more beneficial than not using it. Some prefer TMX for its timestamping and multi-language capabilities, while XLIFF is preferred for its reconstruction potential [77](#page=77) [78](#page=78).
### 29.6 Metadata and its role in CAT tools
Metadata, defined as "data about data," provides additional information about digital content and processes. In CAT tools, metadata associated with segments allows for tracing them back to a specific translator, date, and time. This enables translators to leverage more recent translations or discard segments with outdated terminology. Language service providers can effectively manage their TM resources using this metadata [84](#page=84) [85](#page=85).
There are three main types of metadata:
* **Descriptive metadata**: Describes content.
* **Structural metadata**: Describes the organization of objects or components.
* **Administrative metadata**: Describes technical information, such as file type [85](#page=85).
The potential loss of important metadata during format transfers can restrict users to specific software due to interoperability issues [84](#page=84).
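As a sketch of how segment metadata supports the decisions described in this section, the example below picks the most recently changed translation among several candidates for the same source segment. The field names (`target`, `author`, `changed`), the timestamp layout, and the sample records are assumptions for illustration.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y%m%dT%H%M%SZ"  # assumed layout, e.g. 20240115T093000Z

def latest_translation(candidates: list[dict]) -> dict:
    """Return the candidate with the most recent change timestamp."""
    return max(candidates, key=lambda c: datetime.strptime(c["changed"], TIMESTAMP_FORMAT))

# Hypothetical candidates for one source segment.
candidates = [
    {"target": "Tallenna tiedosto.", "author": "translator_a", "changed": "20190301T120000Z"},
    {"target": "Tallenna asiakirja.", "author": "translator_b", "changed": "20240115T093000Z"},
]
best = latest_translation(candidates)
print(best["target"], "by", best["author"])
```

If the timestamp or author fields are lost in a format conversion, this kind of filtering is no longer possible, which is why metadata loss matters.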
---
Successful post-editing hinges on the ability to make swift and accurate decisions regarding translation choices. This involves efficiently evaluating segments, identifying potential issues, and implementing appropriate corrections or confirmations, all while maintaining a steady workflow. The core of this process is rapid assessment and decisive action to ensure both quality and productivity [86](#page=86).
### 29.1 The role of quick decisions in post-editing
In the context of Computer-Assisted Translation (CAT) tools, post-editing involves reviewing and refining machine-generated translations or translations assisted by Translation Memory (TM) systems. The efficiency and quality of this process are directly linked to the post-editor's speed in making decisions [97](#page=97) [98](#page=98).
#### 29.1.1 Evaluating and confirming segments
When a segment is presented to a post-editor, whether it's a direct machine translation or a fuzzy match from a TM, a rapid decision is required. This decision involves determining if the translation is accurate, contextually appropriate, and meets quality standards. If the translation is acceptable, the post-editor quickly confirms it, allowing the process to move to the next segment [86](#page=86) [97](#page=97).
#### 29.1.2 Identifying and correcting errors
When a translation segment contains errors, such as grammatical mistakes, mistranslations, or stylistic inconsistencies, the post-editor must quickly identify these issues and apply the necessary corrections. This requires not only linguistic proficiency but also an understanding of the subject matter and the client's specific requirements. The faster these errors are spotted and rectified, the more efficient the post-editing process becomes.
#### 29.1.3 Handling fuzzy matches
In TM systems, "fuzzy matches" are suggested translations that are similar to, but not identical with, a previous translation. The post-editor must quickly decide whether to accept the fuzzy match as is, modify it, or disregard it and translate the segment from scratch. This decision depends on the percentage of similarity and the perceived quality of the fuzzy match [86](#page=86).
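The three-way choice can be sketched as a similarity score plus thresholds. The example uses `difflib.SequenceMatcher` from Python's standard library as a stand-in scorer. Commercial CAT tools use their own scoring methods, and the threshold values here are assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher


def similarity_percent(new_source, stored_source):
    """Character-level similarity between two source segments, as 0-100."""
    return round(100 * SequenceMatcher(None, new_source, stored_source).ratio())


def suggest_action(score, accept_at=95, edit_at=75):
    """Map a similarity score to one of the three choices (hypothetical thresholds)."""
    if score >= accept_at:
        return "accept as is"
    if score >= edit_at:
        return "modify the match"
    return "translate from scratch"


stored = "Click the Save button to store your changes."
incoming = "Click the Save button to keep your changes."
score = similarity_percent(incoming, stored)
print(score, suggest_action(score))
```

A score like this covers only the similarity half of the decision. The perceived quality of the stored translation still has to be judged by the post-editor.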
#### 29.1.4 The impact of decision speed on workflow
A post-editor who can make quick decisions allows for a smoother and faster workflow. Delays in decision-making, whether due to hesitation, uncertainty, or the need for extensive research, can significantly slow down the entire project. This has a direct impact on project timelines and, consequently, on costs.
### 29.2 Factors influencing decision-making speed
Several factors contribute to a post-editor's ability to make rapid decisions:
#### 29.2.1 Familiarity with the subject matter
A deep understanding of the domain or subject matter being translated allows the post-editor to quickly grasp the meaning of the source text and evaluate the accuracy of the target text. This reduces the need for time-consuming research into specialized terminology.
#### 29.2.2 Proficiency in both languages
Strong command of both the source and target languages is fundamental. This includes not only grammatical correctness but also an awareness of idiomatic expressions, cultural nuances, and stylistic conventions, enabling quicker assessment of translation quality.
#### 29.2.3 Effective use of CAT tools and resources
Familiarity with the CAT tool's features, including Translation Memory (TM) and Term Bases (TB), is crucial. The ability to quickly access and utilize these resources for consistency and terminology verification speeds up the decision-making process.
#### 29.2.4 Clear project instructions and guidelines
Well-defined project instructions, style guides, and glossaries provide post-editors with clear parameters for making decisions. When expectations are unambiguous, post-editors can confirm or correct segments more rapidly.
#### 29.2.5 Experience in post-editing
As post-editors gain experience, they develop an intuition for common errors, effective solutions, and efficient workflows, leading to faster decision-making [98](#page=98).
### 29.3 Consequences of delayed decision-making
Hesitation or delays in the post-editing process can lead to several negative outcomes:
#### 29.3.1 Reduced productivity
When post-editors spend too much time on each segment, the overall output of translated content decreases, impacting project completion times.
#### 29.3.2 Increased project costs
Slower turnaround times can result in higher project costs, especially for clients who pay by the hour or for projects with strict deadlines.
#### 29.3.3 Inconsistent quality
Indecision, or a rushed attempt to compensate for it, can lead to inconsistencies in the final translation, as the post-editor may overlook errors or apply corrections unevenly.
#### 29.3.4 Frustration for project managers and clients
Delays and potential quality issues stemming from slow decision-making can lead to frustration for project managers and clients alike.
> **Tip:** Cultivating a mindset of decisive action, coupled with robust linguistic and subject matter expertise, is paramount for efficient and high-quality post-editing.
### 29.4 The relationship between post-editing and alignment
While not directly about decision-making speed, the alignment process, which creates Translation Memories from existing source and target texts, indirectly supports faster post-editing. High-quality alignment provides cleaner TM data, reducing the ambiguity and potential for errors that post-editors need to resolve. The accuracy of the alignment process itself relies on effective matching of segments, which mirrors the decision-making required in post-editing [100](#page=100) [99](#page=99).
### 29.5 Terminology management and rapid decision-making
The use of Term Bases (TB) and glossaries is another critical factor that supports quick decisions. When a post-editor encounters a term, a well-maintained term base can instantly confirm the correct translation or highlight a forbidden term, eliminating the need for ad-hoc research. This readily available, authoritative information allows for immediate confirmation of terminology, significantly speeding up the decision-making for those specific segments.
### 29.6 Localization and decision complexity
While the core principles of quick decision-making apply to general translation post-editing, localization introduces additional layers of complexity. Adapting content to local cultures, laws, and preferences requires more nuanced decisions than simple linguistic correction. However, even within localization, efficient post-editors will still strive for quick, informed decisions regarding cultural appropriateness, color choices, layout adjustments, and currency conversions. The speed at which these localized decisions are made directly impacts the overall localization workflow.
---
## 29 The key to successful post-editing is quick decision making
Successful post-editing hinges on the linguist's ability to make swift and accurate decisions regarding the machine translation (MT) output. This involves quickly assessing the quality of the MT and determining whether it is more efficient to post-edit the generated text or to translate the segment from scratch.
### 29.1 Understanding post-editing
Post-editing is the process where human translators revise machine-generated translations to produce an acceptable final product. It is distinct from editing human-generated text, which is typically known as revision in the translation field. Post-editing is employed when raw machine translation is insufficient, but full human translation is not mandated.
#### 29.1.1 Post-editing strategies and quality levels
The required level of post-editing varies per project and is influenced by three main considerations: time, quality, and cost. Two primary levels of post-editing are recognized:
* **Light post-editing**: This involves minimal intervention, focusing solely on making the text understandable for the end-user. It is typically used for inbound purposes, when the text is needed urgently, or for short-term use. Guidelines for "good enough" quality emphasize semantic correctness, ensuring no information is added or omitted, editing offensive content, using as much raw MT output as possible, and basic spelling rules. Stylistic improvements or sentence restructuring for flow are not prioritized.
* **Full post-editing**: This requires a greater degree of intervention to achieve a negotiated level of quality. The outcome is a text that is not only understandable but also stylistically appropriate for assimilation and dissemination, suitable for both inbound and outbound purposes. At the highest end of full post-editing, the quality should be indistinguishable from human translation. Guidelines for achieving human-level quality include aiming for grammatically, syntactically, and semantically correct translations, ensuring correct key terminology and adherence to "Do Not Translate" lists, preventing information loss or addition, editing inappropriate content, using raw MT output where possible, applying correct spelling, punctuation, and hyphenation, and ensuring correct formatting.
#### 29.1.2 Efficiency and productivity in post-editing
Post-editing can potentially double, or even quadruple (in the case of light post-editing), the productivity of manual translation. However, predicting post-editing efficiency is challenging. While studies generally suggest it is faster than translating from scratch, there is no consensus on the exact time savings: industry reports cite around 40%, while some academic studies suggest 0–20%. Professionals have also reported productivity losses when corrections take longer than re-translating.
> **Tip:** The effectiveness of post-editing is significantly influenced by the quality of the raw MT output and the expected final quality of the content.
### 29.2 The importance of quick decision making in post-editing
The core of successful post-editing lies in the linguist's promptness in deciding between post-editing and re-translating from scratch.
* **Decision point 1: Post-edit or re-translate?** After reviewing the MT output and comparing it to the source, the linguist must quickly determine if post-editing the MT suggestion is more efficient than deleting it and starting anew. If the MT quality is good and requires minor adjustments, post-editing is the preferred approach. Conversely, if the MT output is poor and would take longer to fix than to re-translate, the linguist should opt for translating from scratch.
* **Decision point 2: Edit or move on?** Another critical decision point is determining if a machine-translated segment requires any editing at all. Some language service providers encourage linguists to move to the next segment if they cannot identify any issues within a very short timeframe, such as 3 seconds, to maintain efficiency.
> **Tip:** The principle of productive and time-saving post-editing relies on the linguist utilizing as much of the MT output as possible.
### 29.3 Avoiding over-editing and under-editing
Making necessary amendments is crucial, but over-editing and under-editing can both hinder efficiency and quality.
* **Over-editing**: This involves making purely preferential or unnecessary amendments, which goes against the principle of maximizing MT output utilization. Examples include replacing a word with a synonym when both are viable, or reordering words in languages with flexible sentence structures when this is not strictly required.
* **Under-editing**: This is equally detrimental and involves leaving errors in the target text. This includes failing to correct mistranslations, overlooking punctuation errors, leaving the translation sounding unnatural or robotic, and not ensuring approved terminology is used.
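One rough way to see how much raw MT output survives post-editing is to compare the two versions word by word. The sketch below uses `difflib.SequenceMatcher` for the comparison. The function name and the sample sentences are invented, and the resulting share is only an indicator. Poor MT legitimately needs heavy edits, and a high share does not rule out under-editing.

```python
from difflib import SequenceMatcher


def retained_share(raw_mt, post_edited):
    """Fraction of raw MT words that survive unchanged, in order, after editing."""
    mt_words = raw_mt.split()
    pe_words = post_edited.split()
    if not mt_words:
        return 0.0
    # autojunk=False keeps frequent words from being ignored in longer texts.
    matcher = SequenceMatcher(None, mt_words, pe_words, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(mt_words)


raw = "The printer is not connect to network."
edited = "The printer is not connected to the network."
print(f"{retained_share(raw, edited):.0%} of the MT words were kept")
```

A reviewer could use a figure like this to pick segments for a closer look, for example those where almost nothing of the MT output was kept even though the raw output looked usable.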
### 29.4 Post-editing in the language industry
Post-editing is described as a "nascent profession" with overlaps with translating and editing. The ideal post-editor is often considered a translator trained in specific skills, though some believe bilingual individuals without a translation background might be easier to train. Information about the demographics and working conditions of post-editors remains limited.
Many professional translators express dislike for post-editing, partly due to lower payment rates compared to conventional translations. Translation efficiency gains can be tracked using translation management systems and CAT tools that record post-editing times and quality assessment results.
The market size of post-editing within the translation industry is not clearly defined. While a significant percentage of language service providers offer post-editing services, it often accounts for a small portion of their overall throughput. The increasing integration of MT with translation memory platforms and the use of crowdsourcing portals are driving the growth of post-editing. Advances in MT, partly fueled by feedback from post-edited texts, continuously improve MT quality, which in turn is expected to further increase the demand and adoption of post-editing.
---
Glossary
| Term | Definition |
|------|------------|
| Computer-Assisted Translation (CAT) System | A technological tool designed to aid human translators in the translation process, offering features such as translation memory, terminology management, and quality assurance checks to improve efficiency and consistency. |
| Translation Memory (TM) | A database that stores previously translated segments of text, allowing translators to reuse existing translations for similar content, thereby saving time and ensuring consistency in terminology and style. |
| Segmentation | The process of breaking down a source text into smaller, manageable units, typically sentences or phrases, which are then presented to the translator for translation and stored in the translation memory. |
| Resource Assignment | The allocation of necessary resources, such as translation memories, termbases, and style guides, to a specific translation project to ensure the translator has access to all relevant tools and information. |
| File Format Check | An initial step in the computer-assisted translation process that verifies the compatibility and integrity of the source file format to ensure it can be processed correctly by the CAT tool. |
| Resource Update | The process of incorporating new translations, terminology, and other linguistic data into the existing translation memory and termbases, thereby enriching the resources for future projects. |
| Final Review | The last stage of the translation process where the translated text is thoroughly checked for accuracy, consistency, grammar, style, and adherence to project requirements before delivery. |
| TM Generation | The creation of a Translation Memory database from a set of completed translations, which can then be used to assist in the translation of future, similar texts. |
| CAT tool | Stands for Computer-Assisted Translation tool, which is software designed to assist human translators by providing features like translation memory integration, terminology management, and quality assurance checks. |
| TEnT tool | Stands for Translation Environment Tool, a broader category that encompasses CAT tools and other software used in the localization process to manage and facilitate translation workflows. |
| Segment | A unit of text, typically a sentence or a phrase, that is translated and stored within a translation memory. |
| Consistency | The uniformity of terminology, style, and phrasing across a translation project, which is crucial for maintaining brand voice and ensuring clarity for the target audience. |
| Client-based translation memory | A translation memory file specifically created or curated for a particular client, ensuring that all translations adhere to their unique terminology, style guides, and preferences. |
| Project-based translation memory | A translation memory file tailored for a specific translation project, containing terminology and previously translated segments relevant only to that particular job. |
| TMX (Translation Memory eXchange) | An industry-standard, XML-based file format designed for exchanging translation memory data between different translation tools, allowing for interoperability and data sharing. |
| XLIFF (XML Localization Interchange File Format) | An industry-standard, XML-based file format primarily used to store extracted text and facilitate the transfer of data throughout the localization process, supporting features like pretranslation and versioning. |
| XML-based file type | A file format that uses Extensible Markup Language (XML) to structure and encode data in a human-readable and machine-readable format, commonly used in data exchange and web technologies. |
| Inline markup elements | Special tags or codes embedded within the text of a translation memory file that represent formatting, tags from the source document, or other non-translatable content. |
| Encapsulation methods | A technique used in file formats to enclose or wrap specific data, such as inline codes, within distinct elements or tags to define their boundaries and properties. |
| TBX | TermBase eXchange format, also known as DXLT (Default XLT format), which is an XML representation of lexicons and terminologies. This format facilitates the transfer of glossaries between different translation tools and is based on the ISO 12200 standard, MARTIF (Machine-Readable Terminology Interchange Format). |
| DXLT | Default XLT format, an alternative name for TBX (TermBase eXchange format). It represents lexicons and terminologies using XML and is designed for the exchange of glossary data between translation software. |
| ISO 12200 | The international standard for Machine-Readable Terminology Interchange Format (MARTIF), which serves as the foundation for the TBX file format. This standard aims to enable the consistent exchange of terminology data between different systems. |
| MARTIF | Machine-Readable Terminology Interchange Format, the standard upon which the TBX file format is based. It provides a framework for representing and exchanging terminology information in a machine-readable way. |
| SALT | Standards-based Access service to multilingual Lexicons and Terminologies, an initiative at BYU focused on maintaining and providing access to multilingual lexical and terminological resources. |
| W3C | World Wide Web Consortium, an international community that develops open standards to ensure the long-term growth of the Web. While not directly a file format, it is a significant resource for web-related standards that can impact translation workflows and resources. |
| Internationalization (i18n) | The process of designing and developing a website in a manner that facilitates its easy adaptation to various languages and cultural preferences without requiring fundamental re-engineering. |
| Unicode Standard | A character encoding standard that ensures compatibility with diverse writing systems, enabling the representation of a wide range of languages and symbols across different platforms and applications. |
| Separation of Content and Code | A development principle where website content is kept distinct from the underlying source code, allowing for easier translation and modification of text without impacting the website's functionality or requiring extensive coding changes. |
| Flexible User Interface (UI) | A user interface designed to accommodate variations in text length, reading direction (e.g., left-to-right or right-to-left), and other linguistic requirements, ensuring usability across different languages. |
| Localization (L10n) | The process of adapting a website to a specific target locale or market, which includes translating content, adjusting graphics, modifying layout, and ensuring compliance with local regulations and cultural norms. |
| Translation of Content | The act of converting textual and multimedia information from the source language into the target language, paying close attention to linguistic accuracy, cultural nuances, and the intended meaning. |
| Adaptation of Graphics and Multimedia | The process of modifying visual elements such as images, videos, and icons to ensure they are culturally appropriate, relevant, and appealing to the target audience in a specific locale. |
| Adjustment of Layout and Design | Modifying the visual arrangement and aesthetic presentation of a website to accommodate differences in text length, font styles, reading directions, and other language-specific design considerations. |
| Integration of Local Regulations | Ensuring that a website complies with the legal requirements, industry standards, and specific regulations of a target region concerning content, data privacy, accessibility, and other relevant aspects. |
| Testing and Quality Assurance | A critical phase in the localization process involving rigorous evaluation of the localized website to verify its functionality, linguistic accuracy, cultural appropriateness, and overall user experience. |
| Dynamic Content | Content on a website that changes frequently or is generated in real-time, presenting a unique challenge for localization due to the need for continuous updates and consistent adaptation. |
| SEO Considerations | The strategic adaptation of website elements like metadata, keywords, and tags to optimize search engine visibility and performance within specific regional search engines and language markets. |
| Pre-editing | The process of revising technical documentation before it undergoes machine translation (MT) to improve the source text and enhance the quality of the raw output. Effective pre-editing aims to reduce or eliminate the subsequent post-editing effort. |
| Post-editing | A human-driven process that involves reviewing and correcting the output generated by a machine translation (MT) engine. The goal is to ensure accuracy, fluency, and adherence to stylistic requirements, thereby improving the overall quality of the translated text. |
| Machine Translation (MT) | An automated process that uses computer software to translate text or speech from one language to another. MT systems aim to produce translations without direct human intervention, though human involvement is often crucial for quality assurance. |
| Source Text | The original text in the source language that is intended to be translated into another language. The quality and clarity of the source text significantly impact the effectiveness of both machine translation and human translation processes. |
| Raw Output | The initial translation produced by a machine translation (MT) system without any subsequent human review or correction. This output often requires significant editing to meet quality standards. |
| Specialized Human Editor | An individual with expertise in translation and an understanding of how machine translation (MT) engines process text. This professional can analyze source material from an MT perspective to anticipate and mitigate potential translation errors. |
| Sentence Length | A linguistic characteristic referring to the number of words in a sentence. Reducing sentence length is a pre-editing technique used to simplify text and improve its suitability for machine translation, as shorter sentences are often easier for MT engines to process accurately. |
| Syntactic Structures | The grammatical arrangement of words and phrases in sentences. Complex or ambiguous syntactic structures can pose challenges for machine translation (MT) systems, and pre-editing often involves simplifying these to improve translation quality. |
| Term Consistency | The uniform and accurate use of specific terminology throughout a document or project. Ensuring term consistency is a key aspect of pre-editing to prevent machine translation (MT) systems from producing varied or incorrect translations of key terms. |
| Automated Revision Tools | Software applications designed to assist in the review and correction of text. Examples include spell-checkers and grammar-checkers, which are used during pre-editing to identify and rectify errors in the source text before machine translation. |
| Project-Specific Glossary | A curated list of terms and their approved translations relevant to a particular project or organization. This glossary is used by automated revision tools and human editors to ensure accurate and consistent terminology in translations. |
| Tagging Elements | The process of marking specific parts of a source document that should not be translated by a machine translation (MT) system. This is often applied to elements like codes, specific formatting, or proper nouns that require special handling. |
| Pronoun Avoidance | A controlled language rule that dictates repeating the noun instead of using a pronoun to enhance clarity and avoid potential ambiguity in written text. |
| Article Usage | A controlled language rule emphasizing the use of articles (e.g., "a," "an," "the") to clearly identify nouns within sentences, thereby improving precision. |
| General Dictionary Words | A controlled language rule promoting the use of common words found in a general dictionary to ensure broad comprehension and avoid specialized or obscure terminology. |
| Spelling Accuracy | A controlled language rule mandating the use of only correctly spelled words to prevent confusion and facilitate smoother processes, such as translation. |
| Full Post-editing | A type of post-editing typically recommended to achieve quality similar to high-quality human translation and revision, often referred to as "publishable quality." This involves extensive corrections to ensure accuracy, style, and fluency. |
| Light Post-editing | A type of post-editing usually recommended for lower quality standards, often described as "good enough" or "fit for purpose." This focuses on essential corrections to ensure comprehensibility and accuracy without extensive stylistic improvements. |
| "Good enough" quality | A quality level defined as comprehensible and accurate, meaning the main content is understandable and the meaning matches the source text. However, the text may not be stylistically compelling, might sound computer-generated, and could have unusual syntax or imperfect grammar. |
| Quality similar or equal to human translation | A quality level defined as comprehensible, accurate, and stylistically fine, though potentially not as polished as a native-speaker human translation. Syntax is normal, and grammar and punctuation are correct. |
| Comprehensible | The quality of a text that allows the reader to understand the main content of the message without significant difficulty. |
| Accurate | The quality of a translation that faithfully communicates the same meaning as the original source text, ensuring no critical information is lost or misrepresented. |
| Stylistically fine | Refers to the quality of writing that is appropriate and pleasing to the reader, adhering to grammatical and syntactic norms without sounding awkward or unnatural. |
| Key terminology | Important words or phrases that are specific to a particular field or subject matter and must be translated consistently and correctly to maintain the integrity of the message. |
| "Do Not Translate" terms | A client-specified list of terms that should not be translated and should be retained in their original language within the translated text. |
| Computer-assisted translation (CAT) systems | Software applications designed to aid human translators in the translation process. These systems typically offer features such as translation memory, terminology management, and machine translation integration to improve efficiency and consistency. |
| Terminology Database (Termbase) | A specialized database used to store and manage specific terms, their definitions, and their translations across different languages. This ensures that consistent terminology is used throughout a project, especially in technical or specialized fields. |
| Machine Translation (MT) Engine | A component within a CAT system that automatically translates text from one language to another. While not always perfect, MT engines can provide a first draft translation that a human translator can then refine and edit. |
| Trados | A widely used translation environment that supports the entire translation project lifecycle, from initial project setup and the creation of translation memories and terminology databases to final review and editing. |
| Wordfast | An online CAT system, also available in a desktop version, that integrates a machine translation engine. It offers features for text translation, alignment, and glossary management, and was historically available for free. |
| MemoQ | A CAT tool developed in 2004 to compete with Trados, which has gained significant popularity. It is offered by Kilgray Translation Technologies and provides various products tailored to different translator needs. |
| Déjà Vu X3 | A translation program developed by Atril, similar in functionality to other CAT tools. It enables project managers to evaluate, prepare, and control translation projects from inception to completion across various language pairs. |
| TMX | Translation Memory eXchange. This format facilitates the transfer of translation memories between different translation tools, enabling the reuse of previously translated content. A translation memory itself is a database storing source text segments and their corresponding translations in various target languages. |
| XLIFF | XML Localization Interchange File Format. This standard allows for the transfer of localizable data extracted from original files through different stages of the localization workflow, including merging the translated content back into the original format. |
| OLIF | Open Lexicon Interchange Format. This format is designed for the transfer of terminological and lexical data between translation tools, with a particular emphasis on data relevant to Natural Language Processing (NLP) applications, such as machine translation lexicons. |
| Glossary | A collection of terms and their definitions, often specific to a particular domain or project, used to maintain terminological consistency in translations. |
| Post-editor | A person who performs the task of post-editing machine translation output. |
| Controlled Language | A subset of a natural language with restricted grammar and vocabulary, designed to improve the clarity and consistency of text, particularly for machine translation. |
| Editing | The process of improving human-generated text, often referred to as revision in the translation field. |
| Revision | A process of reviewing and improving human-generated text to ensure quality and correctness. |
| Computer Assisted Translation (CAT) Tools | Software applications that assist human translators in the translation process, often including features for post-editing machine translation output. |
| Raw Machine Translation | The direct output from a machine translation engine without any human post-editing or revision. |
| Manual Translation | The process of translating text from one language to another solely by human translators without the aid of machine translation. |
| Import | This function is used to transfer a text and its translation from a text file into a Translation Memory (TM). It can be performed from a raw format, where an external source text and its translation are available, or from a native format, which is the format the TM uses to save translation memories. |
| Textual Parsing | A process within analysis that focuses on correctly recognizing punctuation to differentiate between punctuation marks at the end of sentences and those within abbreviations. This is a form of pre-editing, often found in materials processed by translator aid programs, where mark-up is used to identify and handle special text elements that may or may not require translation or conversion. |
| Linguistic Parsing | A process used to reduce words to their base form, preparing lists of words and texts for automatic retrieval from a term bank. It can also involve syntactic parsing to extract multi-word terms or phraseology from a source text, normalizing word order variations to identify potential phrases. |
| Alignment | The task of establishing translation correspondences between source and target texts. Effective alignment should provide feedback to segmentation and be capable of correcting initial segmentation errors, ensuring accurate pairings between text units. |
| Term Extraction | A function that identifies terms in a text, either by taking a previous dictionary as input or by using parsing based on text statistics to find unknown terms. Its results, together with word counts and an assessment of repetition, help estimate the workload of a translation job and support planning and scheduling. |
| Export | The function that transfers text from a Translation Memory (TM) into an external text file. Ideally, export operations should be the inverse of import operations, allowing for seamless data transfer in both directions. |
| Exact Match | A type of retrieval from a TM where the current source segment matches a stored segment character by character, often referred to as a "100% match." This indicates that the exact same sentence has been translated previously. |
| In-Context Exact (ICE) Match | An exact match that occurs in the identical context, specifically at the same location within a paragraph. Context is typically defined by surrounding sentences and attributes like document file name, date, and permissions, ensuring a precise contextual correspondence. |
| Fuzzy Match | A match that is not exact, where the similarity between the current source segment and a stored segment falls between 0% and 100%. Some systems assign percentage scores to these matches, but these figures are only comparable across systems if the scoring methodology is specified. |
| Concordance | A feature where, upon selecting one or more words in a source segment, the system retrieves segment pairs that satisfy the search criteria. This is particularly useful for locating translations of terms and idioms when a dedicated terminology database is not available. |
| Updating | The process of incorporating a new translation into a TM after it has been accepted by the translator. This involves managing existing database content, with some systems allowing modification or deletion of entries and the storage of multiple translations for the same source segment. |
| Metadata | Data that describes other data, providing context and supplementary information about digital content and associated processes. |
| Translation Memory (TM) Software | Computer-assisted translation (CAT) tools that store previously translated segments of text, enabling translators to reuse these translations for new projects, thereby saving time and ensuring consistency. |
| Translation Unit (TU) | A pair of source and target language segments that are stored together in a translation memory, representing a completed translation of a specific piece of text. |
| Descriptive Metadata | A type of metadata that focuses on describing the content of digital resources, helping users to identify and understand what the resource is about. |
| Structural Metadata | A type of metadata that describes the organization and arrangement of digital objects or components, outlining how different parts of a resource are structured and related. |
| Administrative Metadata | A type of metadata that provides technical information about a file or digital resource, such as its format, creation date, or access rights, aiding in its management and preservation. |
| Interoperability | The ability of different software systems or tools to exchange and use data effectively, which is crucial for seamless data transfer and collaboration in translation workflows. |
| Leveraging | The process of re-using or utilizing existing translated text from a Translation Memory (TM) in new translation projects. |
| Legacy Material | Previous translations that are not already in a Translation Memory (TM) format, often existing as standalone translated files. |
| Translation Unit | A pair of matched source and target segments that is saved into a Translation Memory (TM) database. |
| Automated Alignment Tools | Software applications designed to automatically match source text segments with their probable translations based on file structure and content. |
| Quality Score | A metric generated by some alignment tools, often based on internal algorithms, to indicate the success and reliability of the automated alignment process. |
| Translation | The process of converting content from a source language into a target language, adhering to grammar rules and syntax, while ensuring the original meaning is preserved. It is not a word-for-word conversion but a complex adaptation to language standards. |
| Localization | A comprehensive process that goes beyond mere translation to adapt content for specific local audiences. This involves considering cultural nuances, local laws, dialects, and preferences to ensure the message resonates effectively with each target market. |
| Source Language | The original language of a text or content that is intended for translation or localization. |
| Target Language | The language into which content is translated or adapted during the localization process. |
| Cultural Aspects | The various elements of a society's way of life, including customs, traditions, values, and social norms, that need to be considered and respected when adapting content for a local audience. |
| Local Versions | Specific adaptations of a language that cater to the unique dialects, idioms, and cultural references of a particular region or country, even if they share a common official language. |
| Marketing Strategy | A plan developed by a company to promote and sell its products or services in a specific market, which often needs to be customized for different local audiences during localization. |
| Cultural Barriers | Obstacles arising from differences in cultural understanding, values, or practices that can hinder the effective communication and comprehension of a message across different groups. |
| Search Engine Optimization (SEO) | The practice of enhancing a website's visibility in search engine results pages to attract more organic traffic. This involves optimizing content, technical aspects, and building authority. |
| Keyword Research | The process of identifying relevant terms and phrases that potential customers use when searching for products or services online, specifically in the target language and region. |
| Cultural Relevance | The degree to which content or keywords align with the customs, preferences, and nuances of a specific target audience, ensuring it resonates naturally and effectively. |
| Localized Content | Website content that has been translated and adapted to be culturally appropriate and linguistically accurate for a specific target market, going beyond simple word-for-word translation. |
| Metadata Optimization | The process of translating and refining meta titles, meta descriptions, and URL slugs to be compelling, keyword-rich, and encourage user clicks from search engine results pages. |
| Multilingual Link Building | The strategic effort to acquire backlinks from reputable websites in various languages, which helps to improve a website's authority and search engine rankings in different international markets. |
| Content Structure and Formatting | The organization and presentation of translated content using elements like headers, bullet points, and paragraphs to enhance readability and user experience, which search engines also value. |
| Mobile Optimization | Ensuring that translated website content is designed and formatted to perform optimally on mobile devices, including fast loading times for media and responsive design. |
| Analytics and Reporting | The process of monitoring website performance data, such as traffic and user engagement, for localized content to assess effectiveness and inform future SEO strategy adjustments. |
| International Web Presence | The overall visibility and accessibility of a website across different countries and languages, achieved through strategic localization and SEO efforts. |
| Meta Titles | A specific type of metadata that provides a concise and descriptive title for a webpage, appearing in search engine results and browser tabs, and significantly impacting user click-through rates when well-crafted and translated. |
| Meta Descriptions | A brief summary of a webpage's content, also appearing in search engine results, that aims to entice users to click on the link by accurately and compellingly describing what the page offers. |
| Search Engine Visibility | The degree to which a webpage is discoverable and appears in search engine results pages (SERPs) for relevant queries, which is directly influenced by accurate and optimized metadata. |
| User Click-Through Rates (CTR) | The percentage of users who click on a specific link in search engine results or other online advertisements after viewing it, a metric heavily influenced by the attractiveness and relevance of meta titles and descriptions. |
| Keyword Optimization | The strategic incorporation of relevant keywords into webpage content and metadata to improve its ranking in search engine results for specific queries, a process that requires understanding region-specific terms. |
| Global Brand Consistency | The practice of maintaining a unified brand identity, tone, and message across all international markets, ensuring that translated metadata aligns with the overall brand strategy and presents a cohesive global image. |
| Character Limits | Restrictions imposed by search engines on the maximum length of meta titles and descriptions, requiring translators to craft concise yet impactful translations that avoid truncation in search results. |
| Credibility and Trust | The level of belief and confidence users place in a website and its content, which can be significantly enhanced by accurate and professional translations of metadata, and conversely, harmed by errors. |
| Market Trends | Evolving linguistic and cultural shifts within a specific market that can influence user search behavior and content preferences, necessitating the adaptation of translated metadata for sustained optimization. |
| Source Segment | The original text or portion of text that is being translated within Wordfast Anywhere (WFA). |
| Target Segment | The translated version of the source segment within the WFA system. |
| Placeables | Terms recognized by WFA from the glossary within the source segment, which are highlighted and can be manipulated using specific shortcuts or mouse actions. |
| Auto-suggest Feature | A default setting in WFA that proposes target terms as the user types the first few letters of either the source or target term, facilitating faster translation. |
| Glossary Panel | A panel within WFA that displays the translations of highlighted glossary terms, accessible via a keyboard shortcut or the View tab. |
| Glossary Dialog Box | A pop-up window invoked by a specific keyboard shortcut or button click, used for adding new terms and their translations to the glossary. |
| F1, F2, F3 Fields | Designated areas within the Glossary Dialog Box used to store additional information about a term, such as its role, context, or grammatical form, beyond the basic translation. |
| Controlled Natural Language (CNL) | A subset of a natural language that has restricted grammar and vocabulary to minimize or eliminate ambiguity and complexity, often used to improve readability for humans or enable reliable automatic semantic analysis. |
| Simplified Technical English | A type of controlled language designed to improve the quality of technical documentation and to facilitate its (semi-)automatic translation by imposing restrictions on writers, such as limits on sentence length, pronoun usage, and approved vocabulary. |
| Caterpillar Fundamental English (CFE) | A controlled language developed by Caterpillar Inc. that restricts vocabulary to approximately 850 words to ensure consistency and high quality in the authoring and translation of technical documents for their complex products. |
| CLOUT™ | An acronym for Controlled Language Optimized for Uniform Translation, representing a set of grammar rules developed to reduce ambiguities in texts across many languages, thereby making them more suitable for machine translation. |
| Ambiguity | The quality of being open to more than one interpretation. Controlled natural languages aim to reduce or eliminate it through grammatical and lexical restrictions. |
| Semantic Analysis | The process of understanding the meaning of language, which controlled natural languages are designed to facilitate reliably through their structured and unambiguous nature. |
| Computer-Assisted Translation (CAT) Tools | Software applications designed to assist human translators in the translation process, often by leveraging features like translation memory and terminology management. |
| Technical Equipment Documentation | The written materials, manuals, and guides that accompany and explain the operation, maintenance, and specifications of technical machinery and devices. |
| Electronic Content | Information that exists in a digital format and can be accessed, stored, and reproduced across various electronic media and platforms. |
| Communicative Functions | The specific purposes or intentions behind a piece of communication, such as informing, persuading, instructing, or entertaining, which influence the language and style used. |
| Computer-Assisted Translation (CAT) | A set of specialized computer applications designed to efficiently assist translators in their tasks, aiming to provide them with all necessary resources automatically and quickly. |
| Translation Memory eXchange (TMX) | An open, standard format for Translation Memories that is compatible with most translation tools, allowing for data exchange between different software. |
| Project Management Software | Software used in CAT to control information flow, assign translation tasks, manage quality control, analyze content, generate reports (including full and fuzzy matches, and repetitions), count words, and handle final delivery to clients. |
| Full Matches | Segments of text in a new document that exactly match segments already stored in a Translation Memory. |
| Fuzzy Matches | Segments of text in a new document that are similar but not identical to segments already stored in a Translation Memory, requiring human review and editing. |
| Intra-file Repetitions | Identical segments of text that appear multiple times within the same document. |
| Cross-file Repetitions | Identical segments of text that appear in different documents, often within the same project or client repository. |
| Automated Alignment Process | A method that uses specialized software tools to automatically match and link corresponding segments of text between source and target language files, typically for translation memory creation. |
| Source Files | The original documents or texts that serve as the input for the alignment process, usually in the original language. |
| Target Files | The translated documents or texts that correspond to the source files and are used in the alignment process to find sentence-level matches. |
| Sentence Alignment | The process of identifying and pairing equivalent sentences or text segments between a source document and its translation in a target document. |
| Linguistic Verification | The manual review and correction of automatically generated alignments by a human linguist to ensure accuracy and fix any discrepancies or errors. |
| Tags | Special markers within files that represent formatting or variable information, which alignment tools can use as guides to match segments between source and target files. |
| Corpus | A large and structured set of texts, often used for linguistic research and the development of language technologies, such as alignment tools and machine translation models. |
| Market Penetration | The extent to which a product or service is recognized and bought by customers in a particular market, often measured by sales volume or market share. Localization is crucial for achieving successful market penetration in foreign countries. |
| User Experience | The overall experience a person has when interacting with a product, system, or service, including aspects like usability, accessibility, and desirability. Localization significantly impacts user experience by making content more relevant and intuitive for local users. |
| Localized Marketing Strategy | A marketing approach tailored to the specific cultural, legal, and consumer expectations of a particular geographic market, often involving adaptations in messaging, visuals, and product offerings to enhance appeal and effectiveness. |
| Units of Measurement | Standardized quantities used to express physical properties, such as length, weight, or volume. Localizing these units (e.g., converting from imperial to metric) is essential for clarity and ease of understanding in different regions. |
| Currency Units | The standard medium of exchange for a particular country or region. Localization requires adapting currency symbols and values to reflect local denominations and conducting necessary conversions for international transactions. |
| Date Formats | The standardized way in which dates are written, which can vary significantly between countries (e.g., MM/DD/YY vs. DD/MM/YY). Localizing date formats is vital to prevent misinterpretation and ensure accurate communication. |
| Translation Memory (TM) File | A structured text file, typically in XML format, that stores translation and linguistic data, including segments, language information, and metadata. |
| XML (Extensible Markup Language) | A markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable, providing a well-defined structure for representing complex data. |
| CAT (Computer-Assisted Translation) Tool | Software used by translators to improve efficiency by leveraging translation memories, providing automatic suggestions for matching or partially matching segments. |
| TEnT (Translation Environment Tool) | An alternative term for Computer-Assisted Translation (CAT) tools, referring to software environments that assist translators in their work. |
| Localization Process | The process of adapting a product or content to a specific locale or market, which often involves translation and other modifications. |
| Encapsulation Method | A technique used in file formats to enclose inline codes within specific elements, ensuring their proper interpretation and handling. |
| International Service Language | A specific type of controlled language developed by Kodak, designed to facilitate communication across international service teams. |
| Nortel Standard English (NSE) | A controlled language variant developed by Nortel, aiming to standardize English usage within the company for clearer technical communication. |
| Simplified Technical English (STE) | A controlled language, defined by the ASD-STE100 specification, used to simplify technical writing, making it more accessible and easier to translate. |
| CLOUT™ Rule Set | The rule set of CLOUT™ (Controlled Language Optimized for Uniform Translation), developed by Uwe Muegge to minimize ambiguities in texts, thereby enhancing machine translation quality. |
| Machine Translation | The use of computer software to translate text or speech from one language to another. Controlled languages are particularly beneficial for improving the accuracy of machine translation. |
| Active Form | A grammatical construction where the subject of the sentence performs the action. Controlled languages often mandate the use of the active form for clearer and more direct communication. |
| Pronoun | A word that substitutes for a noun or noun phrase. Controlled languages may restrict pronoun usage to avoid potential confusion with antecedents, recommending noun repetition instead. |
| Article | A word (like "a," "an," or "the") that precedes a noun and indicates whether the noun is specific or unspecific. Controlled languages often require the use of articles for clarity. |
| Term Base | A database that stores single words or phrases related to a specific subject matter, often in a bilingual or multilingual format, designed to aid in translation consistency and accuracy. |
| CAT Tools | Computer-Assisted Translation tools that integrate features like term bases and translation memories to streamline the translation process and improve efficiency. |
| Forbidden Terms | Specific words or phrases that translators are instructed not to use in their translations, typically defined within a term base or glossary to maintain brand integrity or avoid specific connotations. |
| Terminology Management | The systematic process of identifying, collecting, organizing, and disseminating specialized vocabulary relevant to a particular field or client, crucial for high-quality translation. |
| Writing Systems | Diverse methods of representing language visually, employing scripts or characters that can be symbols, logograms, syllabograms, or letters, and may differ in writing direction (e.g., left-to-right, right-to-left, vertical). |
| Complex Text Layout | A property of languages in which character shapes change based on context, capitalization rules vary, and text sorting follows different conventions, requiring careful handling during localization. |
| Pluralization Rules | Grammatical variations in how nouns and verbs change form to indicate more than one item, which differ significantly across languages and must be accurately localized. |
| Economic Conventions | Local variations in practices and standards related to commerce and daily life, including paper sizes, storage media preferences, broadcast systems, phone number formats, and postal address structures. |
| Legal Requirements | Varying laws and regulations specific to a country or region that may necessitate customization of a product or service to ensure compliance, such as privacy laws, labeling standards, and tax collection procedures. |
| Political Issues | Sensitive topics like disputed borders and geographical naming disputes that require careful consideration and sensitivity during localization to avoid offense or misunderstanding. |
| Cultural Conventions | Local customs, traditions, beliefs, and social norms, including superstitions, religions, and taboos, which must be understood and respected to ensure appropriate content and messaging. |
| User Experience (UX) | The overall experience a person has when interacting with a product, system, or service, which localization aims to enhance by making it more intuitive and appealing to the target audience. |
| Currency Conversion | The process of exchanging one currency for another, often necessary in localization to display equivalent monetary values in the target market's currency. |
| Text Expansion | The phenomenon where translated text becomes longer than the original text due to linguistic differences (the reverse, text contraction, also occurs), necessitating flexible layout designs in localized content. |
| Regulatory Compliance | Adherence to the laws, regulations, and legal requirements of a specific country or region, which is a critical aspect of localization for businesses operating internationally. |
| Simship | Simultaneous shipment, a practice where products are released concurrently in all local markets to maintain a unified global launch strategy. |
| Time-to-market | The duration from product conception to its availability on the market, with faster schedules becoming a significant challenge in product development and translation. |
| Internationalisation (I18N) | The process of designing products and systems in a way that eliminates the need for redesign for each local market, facilitating easier adaptation to different regions. |
| Localisation (L10N) | The adaptation of products and documentation to the specific language, culture, and other requirements of target markets, ensuring relevance and user experience. |
| Terminology Extraction | The automated or semi-automated process of identifying and extracting key terms from source texts, which are then used to build or enrich terminology databases. |
| Software Localisation | The adaptation of software applications to specific target languages and cultures, including text, graphics, and user interface elements, to make the software usable and appealing to local users. |
| Project Management (Translation) | The oversight and coordination of all aspects of a translation project, from initial client request to final delivery, including resource allocation, quality control, and deadline management. |
| Computer-Aided Translation (CAT) | A system that utilizes software tools to assist human translators in the translation process, encompassing features like translation memory, terminology management, and project management. |
| Exact Match (100% Match) | A situation where a source segment in the current document exactly matches a segment already stored in the translation memory, allowing for direct reuse of the previous translation. |
| Term Base (Glossary) | A database containing single words or expressions, often bilingual or multilingual, related to a specific subject area, used to ensure consistent application of terminology. |
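
The glossary entries on exact and fuzzy matches describe how a new source segment is compared against segments stored in a translation memory. The sketch below illustrates that idea under stated assumptions: the translation memory is a small hypothetical dictionary, similarity is computed with Python's standard `difflib.SequenceMatcher` (real CAT tools use their own, usually undocumented, scoring methods), and the 75% fuzzy threshold is an illustrative choice rather than a standard value. The function name `classify_match` is also hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source segment -> stored translation.
TM = {
    "Save the file.": "Enregistrez le fichier.",
    "Close the window.": "Fermez la fenêtre.",
}


def classify_match(segment, tm, fuzzy_threshold=0.75):
    """Return (match_type, suggested_translation, score) for a source segment.

    match_type is "exact", "fuzzy", or "no match". The score is a
    similarity ratio between 0.0 and 1.0 (an assumption: difflib's ratio
    stands in for whatever scoring a real tool applies).
    """
    best_source, best_score = None, 0.0
    for source in tm:
        score = SequenceMatcher(None, segment, source).ratio()
        if score > best_score:
            best_source, best_score = source, score

    # Below the threshold, the stored segment is too different to be useful.
    if best_source is None or best_score < fuzzy_threshold:
        return ("no match", None, best_score)

    # An exact match requires character-by-character identity.
    match_type = "exact" if segment == best_source else "fuzzy"
    return (match_type, tm[best_source], best_score)


if __name__ == "__main__":
    for text in ["Save the file.", "Save the files.", "Restart the computer now."]:
        print(text, "->", classify_match(text, TM))
```

Because the score depends entirely on the chosen similarity measure, percentages produced this way are not comparable with those reported by a particular CAT tool, which is the point the fuzzy match entry makes about specifying the scoring methodology.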