# Translation process with a CAT system
This section outlines the systematic approach to translation when employing a Computer-Assisted Translation (CAT) system.
### 1.1 Overview of the translation process with a CAT system
Using a CAT system involves a structured workflow to achieve the final translated output. Following a defined routine and set of steps is recommended for efficient and accurate translation [49](#page=49).
### 1.2 Key stages in the CAT translation process
The translation process within a CAT system is typically broken down into several distinct stages:
#### 1.2.1 File format check
The initial step involves verifying the format of the source file to ensure compatibility with the CAT system [50](#page=50).
#### 1.2.2 Resource assignment
Following the format check, the necessary resources are assigned. This may include style guides, glossaries, or previous translation memories, depending on the project requirements [50](#page=50).
#### 1.2.3 Segmentation
The source text is then segmented into smaller, manageable units, typically sentences or phrases. This segmentation allows the CAT tool to process the text efficiently and leverage translation memory data [50](#page=50).
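As a rough illustration of segmentation, the sketch below splits text into sentence-like units with a single regular expression. The function name `segment` and the splitting rule are assumptions made for this example; real CAT tools apply configurable, language-specific segmentation rules that are more elaborate than this.

```python
import re


def segment(text: str) -> list[str]:
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


segments = segment("Open the file. Check the format! Is it supported?")
for number, unit in enumerate(segments, start=1):
    print(number, unit)
```

Each resulting unit is what a translation memory would store and look up as one segment.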
#### 1.2.4 Translation
This is the core stage where the actual translation takes place. The CAT system assists the human translator by providing suggestions from translation memories, termbases, and machine translation engines, while the translator actively works on producing the final translation [50](#page=50).
> **Tip:** Establishing a clear routine and adhering to established steps is crucial for a successful translation project utilizing a CAT system [49](#page=49).
---
# Understanding translation memory files
A translation memory (TM) file is a structured text file, typically in XML format, that stores translation and linguistic data [66](#page=66).
### 2.1 What is a translation memory file?
A translation memory file is essentially a structured text file, not a proprietary or inaccessible format. While file extensions like `.tmx` or `.xliff` might suggest that specialized software is required, these files can be opened and read in a standard text editor. The defining characteristic of a TM file is its well-defined structure, which lets it represent complex data while remaining straightforward to read and process [66](#page=66).
### 2.2 Information stored in translation memory files
TM files store a variety of linguistic data and metadata. The core information includes:
* **Segments:** Both the original source text and its corresponding translation (target text) [67](#page=67).
* **Language:** The source and target languages of the translation [67](#page=67).
In addition to these essential components, TM files may also store supplementary data, such as:
* Author of the translation [67](#page=67).
* Usage count (how many times a segment has been used) [67](#page=67).
* Creation and modification dates and times [67](#page=67).
* The tool used for creation [67](#page=67).
* Domain or field of the text [67](#page=67).
* Alternate translations [67](#page=67).
* Notes or comments related to the translation [67](#page=67).
### 2.3 Typical formats of translation memory files
The two most prevalent formats for translation memory files in the industry are XLIFF (XML Localisation Interchange File Format) and TMX (Translation Memory eXchange). Both of these are based on the XML (Extensible Markup Language) format [68](#page=68).
While XML-based formats are common, translation memory data can also be stored in simpler spreadsheet formats like Microsoft Excel (XLS) or comma-separated value (CSV) text files. However, these spreadsheet formats have a limitation: they typically store less detailed information about each translation unit, usually only the source segment and its target-language translation [68](#page=68).
#### 2.3.1 Advantages of using XML for TM files
The widespread adoption of XML for XLIFF and TMX formats is due to several key advantages it offers over plain text files:
* **Ease of parsing:** XML's well-defined structure makes it simple for software to read and interpret [68](#page=68).
* **Semantic meaning:** XML tags name the data they enclose, so the meaning and purpose of each element is explicit [68](#page=68).
* **Software ecosystem:** A robust set of software tools is available that is built around the XML format for tasks like validation, importing, parsing, and searching [68](#page=68).
* **Interoperability:** The well-defined structure of XML files facilitates seamless data exchange and interaction between different applications and systems [68](#page=68).
> **Tip:** Understanding that TM files are structured text files, often XML, demystifies them and highlights their accessibility with standard text editors.
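To make the structure concrete, here is a minimal sketch that reads a small TMX-style document with Python's standard `xml.etree.ElementTree` module. The sample content (tool name, dates, segments) is invented for illustration, and only a handful of TMX elements and attributes are shown.

```python
import xml.etree.ElementTree as ET

# Invented sample data; a real TM would hold many <tu> elements.
TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="ExampleTool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="none" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu creationdate="20240101T120000Z" usagecount="3">
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="de"><seg>Speichern Sie die Datei.</seg></tuv>
    </tu>
  </body>
</tmx>"""

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

root = ET.fromstring(TMX_SAMPLE)
units = []
for tu in root.iter("tu"):
    by_language = {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.findall("tuv")}
    units.append(
        {
            "segments": by_language,
            "usagecount": tu.get("usagecount"),
            "creationdate": tu.get("creationdate"),
        }
    )

print(units[0]["segments"]["en"], "->", units[0]["segments"]["de"])
```

The segments, languages, usage count, and creation date listed in section 2.2 all appear here as ordinary elements and attributes.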
---
# Comprehensive localization involves adapting various elements beyond text
Comprehensive localization extends far beyond simple text translation, encompassing a wide array of elements to ensure a product or website resonates with diverse global audiences and overcomes cultural barriers. Effective localization aims to make content appealing and usable for various user groups by adapting visual, numerical, and regulatory aspects.
### 3.1 Adaptation of non-textual elements
To provide a superior user experience, numerous changes beyond translation are necessary.
#### 3.1.1 Colors
The interpretation of colors varies significantly across cultures. For instance, red might signify danger in one region, while white could represent death, and orange might convey mourning. Thorough research into color symbolism within the target market is crucial before commencing localization efforts.
#### 3.1.2 Layout
Different languages require varying amounts of space to convey the same concepts. A flexible layout is essential to accommodate text of different lengths that results from translation.
> **Tip:** Text expansion after translation can range from 30% to 100%, necessitating a design that can accommodate these variations.
#### 3.1.3 Visuals
Photographs and other visual elements must be adapted to align with local cultures. Images that are appealing in one culture, such as a blonde mother hugging her children, may not impress audiences in another country, such as China, and could even be offensive in regions like the Middle East.
#### 3.1.4 Units of measurement
Most countries adhere to the metric system. To enhance clarity and comprehension, units of measurement need to be converted to the standard used in the target locale.
#### 3.1.5 Contracts and agreements
When conducting business internationally, adherence to local regulations is paramount. Compliance with these rules is vital to prevent legal complications, potential penalties, or even the banning of a website.
#### 3.1.6 Currency units
Currency amounts also require localization. This involves not only displaying the correct currency symbol and name but also converting amounts so that the equivalent value is shown accurately. One hundred dollars cannot simply be relabeled as one hundred pounds sterling; the converted equivalent should be given alongside the original, such as "one hundred dollars (sixty-five pounds)".
> **Example:** When localizing for the United Kingdom, a price of one hundred dollars could be shown together with its sterling equivalent, for example "USD 100 (GBP 65)".
#### 3.1.7 Paper size
Printed documents may be designed for specific paper sizes, such as European A4 (210 by 297 millimeters or 8.27 by 11.7 inches). This differs from American letter-size paper (8.5 by 11 inches). Minor discrepancies in paper size can affect document formatting and page breaks.
#### 3.1.8 Date formats
Understanding and adapting to different date formats is crucial. For instance, the sequence '4/5/15' could mean April 5th in the United States or May 4th in the United Kingdom. These variations can lead to significant misunderstandings if not correctly localized.
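The ambiguity can be shown with a short sketch that parses the same string under both conventions using Python's standard `datetime` module. The variable names are chosen for this example only.

```python
from datetime import datetime

raw = "4/5/15"

# Month-first reading, as commonly used in the United States.
us_reading = datetime.strptime(raw, "%m/%d/%y")
# Day-first reading, as commonly used in the United Kingdom.
uk_reading = datetime.strptime(raw, "%d/%m/%y")

print(us_reading.date().isoformat())  # 2015-04-05
print(uk_reading.date().isoformat())  # 2015-05-04
```

Storing dates in an unambiguous form such as ISO 8601 and formatting them per locale only at display time avoids this class of error.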
---
# Language and cultural considerations in localization
Language and cultural considerations are paramount in localization to ensure content is accurate, appropriate, and resonates with the target audience.
### 4.1 Core translation aspects
The translation of text is a time-consuming but essential part of localization. This applies to various media:
* **Video, audio, and film:** Translation is required for music lyrics and spoken words, implemented through subtitles or dubbing.
* **Printed and digital media:** All text within documentation and error messages needs translation.
* **Logos and images:** Text within logos and images may require alteration or replacement with generic icons.
* **Layout and design:** The design of websites and written content might need adjustments to accommodate differences in character sizes and translation lengths.
* **Audio materials:** Localization must account for differences in language variety, register, and dialect.
> **Tip:** Understanding the nuances of target languages, including their specific grammatical rules and stylistic conventions, is critical for effective translation.
### 4.2 Writing systems and conventions
Different writing systems present unique challenges and require careful consideration during localization:
* **Scripts and characters:** Writing systems utilize diverse scripts, which can be symbols, logograms, syllabograms, or letters.
* **Writing direction:** The direction of text flow varies, including left-to-right (European languages), right-to-left (Arabic and Hebrew), boustrophedon (alternating directions), and vertical writing (some Asian languages).
* **Complex text layout:** Some languages feature complex text layouts where character shapes change based on context. Capitalization requirements can also differ significantly between languages.
* **Sorting rules:** Different writing systems and languages have distinct rules for sorting text.
* **Numeral systems:** Translators must be aware that some languages use different numeral systems.
* **Grammar and pluralization:** Variations in pluralization and other grammatical rules across languages necessitate meticulous attention to detail.
* **Punctuation:** The usage of punctuation can differ; for instance, French uses guillemets (« ») in some publications, akin to English double quotes.
### 4.3 Formatting and data conventions
Localization must adapt to local conventions for various data formats:
* **Number formats:** Consideration is needed for grouping of digits and decimal separators.
* **Time and date formats:** This includes adapting to different calendar systems and standard time formats.
### 4.4 Economic and logistical considerations
Economic and logistical conventions vary globally and impact localization efforts:
* **Physical and technical standards:** This includes variations in paper sizes, preferred storage media, broadcast TV systems, measurement systems, battery sizes, electric current and voltage standards, phone number formats, delivery services, postal codes, and postal address formats.
* **Financial conventions:** Localization must account for currency symbols, their position, and the use of currency markers.
* **Service providers:** Variations may exist in third-party providers for payment services, weather reports, and online maps.
* **Time zones:** Translators need to carefully consider variations in time zones.
> **Example:** When localizing for Germany, one would need to use a comma for the decimal separator (e.g., 3,14) and a period for grouping digits (e.g., 1.234.567), unlike in the United States where a period is used for decimals and commas for grouping.
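The separator swap in the example above can be sketched without relying on installed system locales. The helper `format_number` below is a hypothetical function written for this illustration; production code would normally use a dedicated localization library rather than string replacement.

```python
def format_number(value: float, decimal_sep: str, group_sep: str, decimals: int = 2) -> str:
    """Format a number with the given decimal and grouping separators."""
    # Start from the US-style form, e.g. 1,234,567.89
    us_style = f"{value:,.{decimals}f}"
    # Swap separators through a temporary marker so they do not collide.
    return (
        us_style.replace(",", "\x00")
        .replace(".", decimal_sep)
        .replace("\x00", group_sep)
    )


german = format_number(1234567.891, decimal_sep=",", group_sep=".")
american = format_number(1234567.891, decimal_sep=".", group_sep=",")
print(german)    # 1.234.567,89
print(american)  # 1,234,567.89
```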
### 4.5 Legal and regulatory compliance
Legal requirements necessitate product customization or changes to ensure compliance within specific countries:
* **Privacy laws:** Adherence to local privacy legislation is crucial.
* **Disclaimers and labeling:** There may be requirements for additional disclaimers on packaging or websites, and different consumer labeling regulations.
* **Export and security regulations:** Compliance with regulations on encryption and export restrictions is necessary.
* **Legal procedures:** Conformity with subpoena procedures or internet censorship regulations may be required.
* **Accessibility:** Meeting local accessibility requirements is important.
* **Taxation:** This includes understanding and complying with local tax collection rules, such as customs duties, value-added tax (VAT), and sales tax.
---
# Principles of internationalization and localization for websites
Internationalization and localization are fundamental concepts in adapting websites for global audiences, focusing on creating content that is easily adaptable to various languages and cultural contexts.
### 5.1 Internationalization principles
Internationalization, often abbreviated as i18n (the 18 stands for the number of letters between the initial "i" and the final "n"), is the process of designing and developing a website to facilitate its straightforward adaptation to different languages and cultural preferences. This involves building the website with future localization in mind, rather than as an afterthought.
#### 5.1.1 Key principles for internationalization
* **Unicode standard**: Employing the Unicode standard is essential for ensuring compatibility with a wide array of writing systems, thereby enabling the representation of diverse languages. Unicode provides a unique number for each character, regardless of the platform, program, or language.
* **Separation of content and code**: Maintaining a clear distinction between website content and its underlying source code is crucial. This separation allows for easier translation of text without necessitating extensive modifications to the programming.
* **Flexible user interface (UI)**: A flexible UI design is vital for accommodating texts of varying lengths, a common challenge when adapting to different languages. It also supports languages with different reading directions (e.g., right-to-left).
* **Locale-specific formats**: Adapting to the specific formats for dates, times, and numbers relevant to each locale is critical for cultural appropriateness and user comprehension.
* **Culturally neutral images and icons**: Selecting images and icons that are universally understood or providing region-specific alternatives ensures inclusivity and avoids potential cultural misunderstandings.
> **Tip:** Internationalization is about building the foundation for localization. A well-internationalized website significantly reduces the effort and cost of adapting it to new markets.
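The separation of content and code can be sketched with a simple message catalog: the code refers to message keys, and the translatable strings live in a per-locale table. The names `MESSAGES` and `translate`, and the sample strings, are assumptions for this example rather than any particular framework's API.

```python
# Translatable strings live in data, keyed by locale and message key.
MESSAGES = {
    "en": {"greeting": "Welcome, {name}!", "cart": "Your cart"},
    "de": {"greeting": "Willkommen, {name}!"},
}


def translate(key: str, locale: str, default_locale: str = "en", **params: str) -> str:
    """Look up a message, falling back to the default locale when missing."""
    catalog = MESSAGES.get(locale, {})
    template = catalog.get(key, MESSAGES[default_locale][key])
    return template.format(**params)


print(translate("greeting", "de", name="Ana"))  # Willkommen, Ana!
print(translate("cart", "de"))                  # falls back to English
```

Adding a new language then means adding a catalog entry, with no change to the program logic.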
---
# Understanding translation memory software and metadata
This section delves into the functionality of translation memory (TM) software and the crucial role of metadata in managing translation resources and ensuring quality.
### 6.1 Overview of translation memory software
Translation memory (TM) software divides source texts into segments. Each segment is associated with metadata that can identify the translator, the date, and the time of its creation. This allows translators to prioritize more recent translations or to discard segments containing outdated terminology. Language service providers also benefit from TM software, as it aids in the effective management of their TM resources [84](#page=84).
A potential challenge arises when transferring TM data between different formats, as the loss of vital metadata can lead to interoperability issues and restrict users to specific software tools [84](#page=84).
### 6.2 The nature of metadata
Metadata is fundamentally "data about data," providing supplementary information about digital content and underlying processes. As defined by Berners-Lee in the context of the World Wide Web, it is "machine understandable information about web resources or other things". There are three primary categories of metadata [85](#page=85):
* **Descriptive metadata:** This type of metadata describes the content itself [85](#page=85).
* **Structural metadata:** This category describes how objects or components are organized [85](#page=85).
* **Administrative metadata:** This pertains to technical information, such as the file type of a resource [85](#page=85).
### 6.3 Metadata within Computer-Assisted Translation (CAT) tools
In CAT tools, segments from the source language are aligned with their corresponding target language counterparts to facilitate reuse within the TM. The TM tool manages the translation workflow, presenting translators with both source and target texts through a user interface. Crucially, it automatically builds the TM by saving pairs of source and target segments as translation units [86](#page=86).
When a previously translated segment reappears, the TM software will suggest the existing translation to the translator. The system can also propose partial or "fuzzy" matches, based on a similarity percentage between a new source segment and existing source segments in the memory, depending on the translator's pre-set parameters [86](#page=86).
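The idea of a similarity threshold can be sketched with Python's standard `difflib.SequenceMatcher`. This is only a stand-in: commercial TM tools use their own matching algorithms, and the function name `best_match`, the sample memory, and the 0.75 threshold are assumptions for this example.

```python
from difflib import SequenceMatcher

# Invented sample memory: source segment -> stored translation.
memory = {
    "Save the file.": "Speichern Sie die Datei.",
    "Close the window.": "Schließen Sie das Fenster.",
}


def best_match(new_segment: str, tm: dict[str, str], threshold: float = 0.75):
    """Return (source, target, score) for the most similar stored segment, or None."""
    best = None
    for source, target in tm.items():
        score = SequenceMatcher(None, new_segment.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (source, target, score)
    return best


fuzzy = best_match("Save the files.", memory)
print(fuzzy)
print(best_match("Print the report.", memory))  # no candidate above the threshold
```

Raising the threshold yields fewer but more reliable suggestions; lowering it surfaces more candidates that need heavier editing.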
> **Tip:** Metadata plays a key role in filtering previous translations, ensuring that more recent or more reliable material is prioritized for reuse [87](#page=87).
TM tools are now widely adopted for specialized translation and localization tasks due to their proven ability to reduce costs, save time, and enhance the consistency of translations [87](#page=87).
---
# Translation file formats
This section summarizes key file formats used in translation workflows for managing linguistic data and facilitating interoperability between tools.
### 7.1 TermBase eXchange (TBX)
TermBase eXchange (TBX) is a standardized XML-based format designed for the exchange of terminology databases. It is also known by its acronym DXLT, which stands for Default XLT format, where XLT refers to XML representations of Lexicons and Terminologies. The primary purpose of TBX is to allow glossaries to be transferred between different translation tools [92](#page=92).
The format is based on the international standard ISO 12200, specifically the Machine-Readable Terminology Interchange Format (MARTIF) [92](#page=92).
#### 7.1.1 Standards-based Access service to multilingual Lexicons and Terminologies (SALT)
SALT is an initiative associated with Brigham Young University (BYU) that supports the maintenance and organization of multilingual lexicons and terminologies; it is the group responsible for TBX [92](#page=92).
---
# Localization goes beyond translation to adapt content for local audiences
Localization is a comprehensive process of adapting content for specific local audiences, extending far beyond simple linguistic translation. While translation focuses on accurately converting text from a source language to a target language, respecting grammar and syntax, localization involves a deeper cultural and contextual adaptation. This adaptation is crucial for effectively reaching and engaging local markets.
### 8.1 Understanding the distinction between translation and localization
Translation is the foundational step of rewriting content into a different language while preserving the original meaning. It is essential for documents such as user manuals, medical texts, technical publications, and scientific journals, where accuracy is paramount.
Localization, conversely, encompasses translation but also includes adapting the content to suit the cultural nuances, preferences, and legal requirements of a specific local audience. This is particularly relevant for digital content like websites, mobile applications, software, video games, and multimedia, as well as voiceovers.
### 8.2 The scope of localization
Localization acknowledges that even within a single language, significant regional variations exist. For instance, Spanish spoken in Argentina, Mexico, and Spain will have distinct differences, much like English in the United States, Australia, and Canada. Therefore, a successful marketing strategy must account for these local versions and dialects.
> **Tip:** Recognizing these linguistic and cultural variations is key to tailoring your message effectively for each target market.
### 8.3 Key components of successful localization
Achieving effective localization requires more than just a skilled team of translators. It necessitates collaboration with local marketers and consultants to ensure that cultural sensitivities and local laws are respected within each market. Regular translation alone may not be sufficient for a client's business to thrive in local markets; localization is essential to build trust with the local public.
Selling in a foreign country involves not only overcoming language barriers but also developing a customized message tailored to each local audience.
### 8.4 Localization in practice: The KitKat example
Cultural barriers can significantly impede the understanding of an original message. A prime example of successful localization is Nestlé's KitKat campaign in Japan. Instead of a direct translation of the slogan "Have a break, have a KitKat," the company adapted it to "Kitto Katsu," which translates to "surely win" in Japanese. Furthermore, they introduced a range of exotic chocolate bar flavors to appeal to local tastes.
> **Example:** This strategic adaptation not only resonated with the Japanese market but also demonstrated an understanding of how to leverage local expressions and preferences. This approach made the Japanese KitKat campaign a clear localization success.
---
# Human roles in machine translation processes
This section details the crucial roles humans play within machine translation (MT) workflows, focusing on pre-editing to optimize source content for better MT output.
### 9.1 The spectrum of human involvement in MT
Humans can actively participate in the MT process in several key ways:
* **Pre-editing:** Modifying the source text before it enters the MT engine to improve the quality of the machine-generated translation.
* **Post-editing:** Reviewing and correcting the output produced by the MT engine after the translation has been generated.
### 9.2 Pre-editing explained
Pre-editing involves the revision of technical documentation *before* it is processed by an MT system. The primary goal is to enhance the source material to achieve a superior raw translation output from the MT engine. Effective pre-editing can significantly decrease or even entirely eliminate the need for post-editing.
#### 9.2.1 The role of the pre-editor
Ideally, a specialized human editor undertakes pre-editing. This editor analyzes the text from the perspective of an MT engine, anticipating potential errors in the generated output. The pre-editor modifies the text to facilitate MT by:
* Reducing sentence length.
* Avoiding complex or ambiguous syntactic structures.
* Ensuring consistency in terminology.
* Using articles appropriately.
#### 9.2.2 Tools and techniques in pre-editing
Beyond structural edits, pre-editors utilize various tools and techniques:
* **Automated Revision Tools:** These include spell-checking the source text against a project-specific glossary.
* **Advanced Grammar-Checking Tools:** Deploying sophisticated tools to identify and correct grammatical issues.
* **Tagging Untranslatable Elements:** Identifying and marking parts of the source document that should not be translated by the MT system.
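Tagging untranslatable elements can be sketched as a mask-and-restore step around the MT call. The placeholder pattern, the `<x id="…"/>` marker, and the function names `protect` and `restore` are assumptions chosen for this illustration; actual tools define their own markers and rules.

```python
import re

# Assumed pattern: {named_placeholders} and printf-style %s / %d codes.
PLACEHOLDER = re.compile(r"\{[A-Za-z_]+\}|%[sd]")


def protect(text: str) -> tuple[str, list[str]]:
    """Replace untranslatable elements with numbered markers."""
    protected: list[str] = []

    def stash(match: re.Match) -> str:
        protected.append(match.group(0))
        return f'<x id="{len(protected)}"/>'

    return PLACEHOLDER.sub(stash, text), protected


def restore(text: str, protected: list[str]) -> str:
    """Put the original elements back in place of their markers."""
    for index, original in enumerate(protected, start=1):
        text = text.replace(f'<x id="{index}"/>', original)
    return text


source = "Hello {user_name}, you have %d new messages."
masked, items = protect(source)
print(masked)
print(restore(masked, items))
```

The masked text is what would be sent for translation; the markers are swapped back afterwards so that codes and variables survive unchanged.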
> **Tip:** Pre-editing techniques are not exclusive to MT workflows; many organizations incorporate similar processes into their localization best practices for human translation projects to improve downstream quality and productivity.
---
# Controlled language rules for writing
This section outlines specific rules for controlled language to ensure clarity, consistency, and ease of translation in technical writing.
### 10.1 Rule 7: Repeat nouns instead of using pronouns
This rule advocates for explicitly repeating nouns rather than using pronouns to refer back to them. This approach enhances clarity, especially in technical documentation where precision is paramount, by avoiding potential ambiguity that pronouns can sometimes introduce.
* **Write:** You must check the spelling of your text before you publish your text.
* **Do not write:** You must check the spelling of your text before publishing it.
> **Tip:** Repeating nouns, while seemingly redundant, significantly reduces the cognitive load on the reader by ensuring the subject of the sentence is always clear.
### 10.2 Rule 8: Use articles to identify nouns
This rule emphasizes the importance of using articles (like "a," "an," or "the") before nouns when they are being referred to. This grammatical convention helps to clearly identify and specify the noun being discussed.
* **Write:** Test the installation.
* **Do not write:** Test installation.
> **Example:** Using an article like "the" in "Test the installation" clearly indicates that a specific installation is being referred to, whereas omitting it in "Test installation" might be interpreted as a general command about the concept of testing installations.
### 10.3 Rule 9: Use words from a general dictionary
This rule promotes the use of commonly understood words found in a general dictionary. It advises against using obscure, archaic, or highly specialized vocabulary that might not be familiar to a broad audience or easily translatable.
* **Write:** Avoid ambiguity.
* **Do not write:** Eschew obfuscation.
> **Tip:** When in doubt about a word's commonality, consider if it's likely to be found in a standard English dictionary accessible to a wide range of users.
### 10.4 Rule 10: Use only words with correct spelling
This rule is fundamental to maintaining professionalism and ensuring effective communication. Texts containing spelling errors can lead to confusion, mistrust, and significantly complicate the translation process, potentially altering the intended meaning.
* **Write:** Texts that contain spelling errors complicate the translation process.
* **Do not write:** Texts that contein speling misstakes complicate the translation procces.
> **Tip:** Employing spell-check tools is essential, but a final human review is also crucial to catch errors that automated checkers might miss, especially in context.
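Some of these rules lend themselves to simple automated checks. The sketch below flags pronouns (Rule 7) and words missing from a word list (Rule 10). The pronoun set, the tiny `DICTIONARY`, and the function `check_sentence` are hypothetical stand-ins for this example; a real checker would use a full dictionary and more careful linguistic analysis.

```python
import re

# Deliberately small, assumed sets for illustration only.
PRONOUNS = {"it", "they", "them"}
DICTIONARY = {
    "you", "must", "check", "the", "spelling", "of", "your", "text", "before",
    "publish", "publishing", "texts", "that", "contain", "errors",
    "complicate", "translation", "process",
}


def check_sentence(sentence: str) -> list[str]:
    """Return a list of rule violations found in the sentence."""
    issues = []
    for word in re.findall(r"[A-Za-z']+", sentence.lower()):
        if word in PRONOUNS:
            issues.append(f"Rule 7: pronoun '{word}'")
        elif word not in DICTIONARY:
            issues.append(f"Rule 10: unknown word '{word}'")
    return issues


print(check_sentence("You must check the spelling of your text before publishing it."))
print(check_sentence("Texts that contein speling misstakes complicate the translation procces."))
```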
---
# Guidelines for achieving different quality levels in post-editing
Post-editing guidelines are tailored based on the quality of the machine translation (MT) output and the desired end quality of the content. Two primary quality levels are considered: "good enough" (or "fit for purpose") and quality similar to human translation (or "publishable quality").
### 11.1 "Good enough" quality (light post-editing)
This quality level ensures the content is comprehensible and accurately conveys the meaning of the source text, even if it lacks stylistic polish. The text might sound computer-generated, with unusual syntax or imperfect grammar, but the core message remains accurate.
#### 11.1.1 Key objectives for "good enough" quality:
* **Semantic accuracy:** Prioritize translating the meaning correctly.
* **Information integrity:** Ensure no information is added or omitted unintentionally.
* **Cultural appropriateness:** Edit any offensive, inappropriate, or culturally unacceptable content.
* **Minimal intervention:** Utilize as much of the raw MT output as possible.
* **Basic error correction:** Apply fundamental rules for spelling.
#### 11.1.2 Exclusions for "good enough" quality:
* Stylistic corrections are not required.
* Sentence restructuring solely for improved natural flow is unnecessary.
> **Tip:** For "good enough" quality, focus on the message's accuracy and comprehensibility, minimizing edits that only affect style or flow.
### 11.2 Quality similar to human translation (full post-editing)
This level aims for content that is not only comprehensible and accurate but also stylistically fine, with normal syntax, correct grammar, and proper punctuation. While the style might not reach the peak performance of a native-speaker human translator, it should be polished.
#### 11.2.1 Key objectives for quality similar to human translation:
* **Comprehensive correctness:** Achieve grammatically, syntactically, and semantically correct translation.
* **Terminology management:** Ensure key terminology is translated accurately and that untranslated terms comply with the client's "Do Not Translate" list.
* **Information integrity:** Verify that no information has been accidentally added or omitted.
* **Cultural appropriateness:** Edit any offensive, inappropriate, or culturally unacceptable content.
* **Extensive utilization of MT output:** Use as much of the raw MT output as possible.
* **Thorough error correction:** Apply rules for spelling, punctuation, and hyphenation.
* **Formatting adherence:** Ensure all formatting is correct.
> **Example:** A user manual requiring a high degree of accuracy and clarity for a global audience would necessitate full post-editing to achieve publishable quality. In contrast, internal company notes that just need to convey basic information might only require light post-editing.
---
# File formats for translation data exchange
This section explores various file formats designed for exchanging translation-related data between different tools and stages of the localization process [88](#page=88) [89](#page=89) [90](#page=90) [91](#page=91) [92](#page=92).
### 12.1 Translation Memory eXchange (TMX)
TMX is a file format that facilitates the transfer of translation memories between different translation tools. A translation memory (TM) itself is a database comprising source text segments and their corresponding translations in one or more target languages [89](#page=89).
* **Purpose:** To enable the sharing and migration of translation memories, ensuring continuity and reusability of previously translated content across various localization software [89](#page=89).
* **Maintaining Organization:** The OSCAR Special Interest Group at LISA (the Localization Industry Standards Association) is responsible for maintaining TMX [89](#page=89).
### 12.2 XML Localization Interchange File Format (XLIFF)
XLIFF is a format designed for the transfer of localizable data extracted from original files. It supports moving data through different stages of the localization workflow, from extraction to the final reintegration of localized content into its source format [90](#page=90).
* **Purpose:** To streamline the localization process by providing a standardized way to handle translatable content, separating it from the original file structure and enabling easier management by translators and tools [90](#page=90).
* **Maintaining Organization:** OASIS, through its XLIFF Technical Committee, maintains XLIFF [90](#page=90).
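As a sketch of how extracted content travels in XLIFF, the example below reads a small XLIFF 1.2-style document with Python's standard `xml.etree.ElementTree`. The file name and segments are invented, and only the basic `file`, `trans-unit`, `source`, and `target` elements are shown.

```python
import xml.etree.ElementTree as ET

# Invented sample; the second unit has not been translated yet.
XLIFF_SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="manual.txt" source-language="en" target-language="de" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Test the installation.</source>
        <target>Testen Sie die Installation.</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Avoid ambiguity.</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

root = ET.fromstring(XLIFF_SAMPLE)
file_element = root.find("x:file", NS)
pairs = [
    (
        unit.get("id"),
        unit.findtext("x:source", namespaces=NS),
        unit.findtext("x:target", namespaces=NS),  # None when untranslated
    )
    for unit in root.iterfind(".//x:trans-unit", NS)
]

print(file_element.get("original"), file_element.get("target-language"))
for unit_id, source, target in pairs:
    print(unit_id, source, "->", target)
```

The `original` attribute records which source file the units came from, which is what allows the localized text to be merged back into that file at the end of the workflow.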
### 12.3 Open Lexicon Interchange Format (OLIF)
OLIF focuses on the transfer of terminological and lexical data between translation tools. It is particularly suited for natural language processing (NLP) data, such as machine translation lexicons [91](#page=91).
* **Purpose:** To facilitate the exchange of specialized terminology and linguistic resources, supporting advanced NLP applications [91](#page=91).
* **Maintaining Organization:** The OLIF Consortium is responsible for OLIF and works closely with the SALT group [91](#page=91).
### 12.4 TermBase eXchange (TBX)
TBX, also known as DXLT (Default XLT format), is used for the transfer of glossaries between translation tools. It is based on the ISO 12200 standard, also known as MARTIF (Machine-Readable Terminology Interchange Format) [92](#page=92).
* **Purpose:** To enable the consistent sharing and application of approved terminology across different projects and translation environments, ensuring terminological accuracy [92](#page=92).
* **Maintaining Organization:** The SALT (Standards-based Access service to multilingual Lexicons and Terminologies) group at BYU manages TBX [92](#page=92).
> **Tip:** Understanding these formats is crucial for efficient translation workflow management, enabling seamless data transfer and integration between diverse localization tools and systems [88](#page=88) [89](#page=89) [90](#page=90) [91](#page=91) [92](#page=92).
---
# Translation memory tools and their functionalities
Translation memory (TM) tools store previously translated segments to facilitate efficient and consistent future translations [65](#page=65).
### 13.1 Core functionalities of translation memory tools
Translation memory tools encompass both off-line and on-line functions that assist translators throughout the translation process [62](#page=62) [63](#page=63) [64](#page=64) [65](#page=65).
#### 13.1.1 Off-line functions
Off-line functions primarily deal with the management and retrieval of translation data.
##### 13.1.1.1 Import
The import function allows for the transfer of text and its translation from external text files into the TM. This can be done from a "raw format" where a source text and its translation are imported, sometimes requiring reprocessing. Alternatively, a "native format" can be used, which is the TM's own file format for saving translation memories [62](#page=62).
##### 13.1.1.2 Analysis
Analysis involves several steps to prepare text for translation and TM integration.
* **Textual parsing:** This crucial step accurately recognizes punctuation to differentiate between sentence-ending periods and periods in abbreviations, effectively acting as a form of pre-editing. Marked-up texts, often found in documents processed by translator aid programs, benefit from this stage. Special text elements might be identified, with some not requiring translation (e.g., proper names, codes), while others may need conversion [62](#page=62).
* **Linguistic parsing:** This involves reducing words to their base form to create lists for term bank retrieval. Syntactic parsing can also extract multi-word terms or phraseology by normalizing word order variations, identifying which words can form a phrase [62](#page=62).
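The textual parsing step can be illustrated with a small sentence splitter that refuses to break after a known abbreviation. This is only a sketch: the abbreviation list and the function name `split_sentences` are assumptions, and commercial tools use richer, language-specific segmentation rules.

```python
import re

# Illustrative abbreviation list; a real tool would use a larger, language-specific one.
ABBREVIATIONS = {"e.g.", "i.e.", "etc.", "dr.", "mr.", "mrs.", "fig.", "no."}


def split_sentences(text):
    """Split after ., ! or ? followed by whitespace, unless the token is an abbreviation."""
    sentences, start = [], 0
    for match in re.finditer(r"[.!?]\s+", text):
        end = match.start() + 1  # include the punctuation mark itself
        # Last whitespace-delimited token before the candidate break, e.g. "Dr." or "manual."
        last_token = text[start:end].split()[-1].lower()
        if last_token in ABBREVIATIONS:
            continue  # the period belongs to an abbreviation, not a sentence end
        sentences.append(text[start:end].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences
```

A known limitation of this approach is that an abbreviation that really does end a sentence (for example "etc." followed by a new sentence) is not split, which is one reason segmentation often needs manual review.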
##### 13.1.1.3 Retrieval
TMs offer various types of matches for retrieving stored translations.
* **Exact match:** Also known as "100% matches," these occur when the current source segment is identical character-by-character to a stored segment. This indicates the exact same sentence has been translated previously [64](#page=64).
* **In-Context Exact (ICE) match or Guaranteed Match:** This is an exact match that occurs within the same context, meaning it is located in the identical position within a paragraph. Context can be defined by surrounding sentences and attributes like document file names, dates, and permissions [64](#page=64).
* **Fuzzy match:** When a match is not exact, it is considered a "fuzzy" match. Some systems assign percentage scores to fuzzy matches (greater than 0% and less than 100%). These percentages are not directly comparable across different systems without specifying the scoring methodology [64](#page=64).
* **Concordance:** This feature allows users to select one or more words within a source segment, and the system retrieves segment pairs that match the search criteria. It is particularly useful for finding translations of terms and idioms when a terminology database is unavailable [64](#page=64).
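The match types above can be sketched in a few lines. The example below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for a fuzzy-match score; this is an assumption for illustration, since each TM system uses its own scoring method, which is exactly why percentages are not comparable across tools. The toy memory, the 70% threshold, and the function names are also hypothetical.

```python
from difflib import SequenceMatcher

# Toy translation memory: source segment -> stored translation (invented examples).
TM = {
    "Open the file menu.": "Öffnen Sie das Menü Datei.",
    "Save the file before closing.": "Speichern Sie die Datei vor dem Schließen.",
    "Close the window.": "Schließen Sie das Fenster.",
}


def similarity(a, b):
    """Character-based similarity in percent (stand-in for a tool-specific score)."""
    return round(SequenceMatcher(None, a, b).ratio() * 100)


def lookup(segment, threshold=70):
    """Return (kind, score, stored_source, translation); kind is 'exact', 'fuzzy' or 'none'."""
    if segment in TM:
        return ("exact", 100, segment, TM[segment])
    best = max(TM, key=lambda stored: similarity(segment, stored))
    score = similarity(segment, best)
    if score >= threshold:
        return ("fuzzy", score, best, TM[best])
    return ("none", score, None, None)


def concordance(words):
    """Return all stored pairs whose source contains the given word(s), ignoring case."""
    needle = words.lower()
    return [(src, tgt) for src, tgt in TM.items() if needle in src.lower()]
```

An ICE match would additionally compare context, such as the neighbouring segments or the file name, before accepting an exact match; that check is left out here to keep the sketch short.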
##### 13.1.1.4 Updating
A TM is updated with a new translation once it has been accepted by the translator. When updating a database, considerations arise regarding previous contents. TMs can be modified by changing or deleting entries, and some systems permit storing multiple translations for the same source segment [64](#page=64).
#### 13.1.2 On-line functions
On-line functions are active during the translation process, providing immediate assistance.
##### 13.1.2.1 Segmentation
Segmentation aims to identify the most useful translation units. It is a form of shallow parsing performed monolingually, and alignment is based on the resulting segments. If translators correct the segmentation manually, later versions of the document may fail to match the TM, because the program repeats its original segmentation errors. Translators typically work sentence by sentence, although the translation of one sentence may depend on the surrounding ones [63](#page=63).
##### 13.1.2.2 Alignment
Alignment is the process of establishing translation correspondences between source and target texts. Effective alignment algorithms can provide feedback to segmentation and even correct initial segmentation [63](#page=63).
##### 13.1.2.3 Term extraction
Term extraction can utilize a pre-existing dictionary as input. When identifying new terms, it can employ parsing based on text statistics. These statistics are valuable for estimating the workload of a translation job, aiding in planning and scheduling. Translation statistics typically count words and assess repetition within the text [63](#page=63).
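As a rough illustration of the translation statistics mentioned above, the following sketch counts words and internal repetitions in a list of source segments. The function name and the definition of a repetition (every occurrence of a segment after its first, compared case-insensitively) are assumptions; real tools apply their own counting rules.

```python
from collections import Counter


def job_statistics(segments):
    """Count words and internal repetitions for a list of source segments."""
    counts = Counter(segment.strip().lower() for segment in segments)
    total_words = sum(len(segment.split()) for segment in segments)
    # A repetition is every occurrence of a segment after its first one.
    repeated_segments = sum(n - 1 for n in counts.values())
    repeated_words = sum((n - 1) * len(text.split()) for text, n in counts.items())
    return {
        "segments": len(segments),
        "words": total_words,
        "repeated_segments": repeated_segments,
        "repeated_words": repeated_words,
        "repetition_rate": repeated_words / total_words if total_words else 0.0,
    }
```

Figures like these feed directly into workload estimates: the higher the repetition rate, the less new text the translator has to produce.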
##### 13.1.2.4 Export
Export functionality transfers translated text from the TM into an external text file. Ideally, import and export functions should be inverse operations [63](#page=63).
##### 13.1.2.5 Automatic translation and substitution
TM tools often offer automatic retrieval and substitution of translations. As a translator progresses through a document, TM systems are searched, and their results are displayed automatically. With automatic substitution, if an exact match is found for a segment in a new document version, the software will insert the old translation. This carries a risk of repeating previous translation errors if the translator doesn't verify the output against the source text [65](#page=65).
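Automatic substitution and its risk can be sketched as follows. The function name `pretranslate` and the status labels are hypothetical; the point is that every inserted exact match is marked for review instead of being treated as final.

```python
def pretranslate(segments, tm):
    """Insert stored translations for exact matches and flag them for review.

    segments: list of source segments; tm: dict mapping source -> stored translation.
    """
    result = []
    for source in segments:
        if source in tm:
            # Exact match: reuse the old translation, but mark it as unreviewed
            # so that a stored error is not propagated silently.
            result.append({"source": source, "target": tm[source], "status": "needs-review"})
        else:
            result.append({"source": source, "target": "", "status": "untranslated"})
    return result
```

Keeping a separate review status is one simple way to make sure the translator still checks substituted segments against the source text.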
##### 13.1.2.6 Networking
Networking allows a group of translators to collaborate on a text more efficiently than working individually, as translations made by one become available to others. Sharing TMs before the final translation offers an opportunity for team members to correct each other's mistakes [65](#page=65).
##### 13.1.2.7 Text memory
"Text memory" is the foundational concept behind the proposed LISA OSCAR xml:tm standard and comprises author memory and translation memory. In translation memory, unique identifiers are maintained during translation to ensure the target document is precisely aligned at the text unit level. If the source document is later modified, unchanged text units can be directly transferred to the new target version without translator intervention, embodying the concept of "exact" or "perfect" matching. The xml:tm standard also supports mechanisms for in-document leveraged and fuzzy matching [65](#page=65).
> **Tip:** Understanding the distinction between exact matches, ICE matches, and fuzzy matches is crucial for accurately assessing the effort required for a translation task and for leveraging the TM effectively [64](#page=64).
>
> **Tip:** Be cautious with automatic substitution; always review translated segments, especially if they were exact matches from previous projects, to avoid propagating errors [65](#page=65).
---
# Translation memory alignment process
The translation memory (TM) alignment process is a method for creating new translation memories from existing source and translated documents that are not already in TM format [98](#page=98).
### 14.1 The need for alignment
Translation memory systems are crucial in the translation and localization industry for reusing previously translated text, saving time and money while ensuring consistency. However, legacy translation materials may not be available in TM format for several reasons [97](#page=97):
* Translations were performed by in-country offices without TM systems [98](#page=98).
* Linguistic vendors did not deliver a TM as part of the handover [98](#page=98).
* A TM was provided, but its quality was poor, and improvements were only made to the translated files, not the TM [98](#page=98).
In such cases, the alignment process allows for the creation of TMs from this legacy text, preventing the loss of existing work and the need to start from scratch [98](#page=98).
### 14.2 What is alignment?
Alignment is the process of taking a source file and its corresponding translation and matching segments to each other. This pairing of source and target segments builds a repository of translation units, which are then saved as a TM for use in future translation projects [99](#page=99).
### 14.3 The alignment process
The alignment process typically begins with automated alignment tools available on the market [100](#page=100).
1. **Loading files:** A set of source and target files are loaded into the tool and linked, often based on their filenames [100](#page=100).
2. **Automatic alignment:** An automatic alignment process is then run for each source-target file pair [100](#page=100).
3. **Matching segments:** The alignment tools analyze the structure of both the source and target files, matching source text with probable translations on a sentence-by-sentence basis [100](#page=100).
Over the years, alignment tools have become increasingly sophisticated, and the results of automated alignment are generally very good. Some tools can also generate a report indicating the quality of the alignment based on internal algorithms and a quality score. These reports provide an indication of how successful the alignment was [100](#page=100).
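The segment-matching step can be sketched with a simple length-based dynamic program. This is not the algorithm of any particular tool: it assumes that a sentence and its translation have similar character lengths, and the penalty values and the function name `align` are arbitrary choices for illustration. Published methods in this family use a probabilistic length model instead of a raw length difference.

```python
def align(src, tgt):
    """Length-based sentence alignment via dynamic programming.

    Returns a list of (source_sentences, target_sentences) pairs ("beads").
    Allowed beads: 1-1, 1-0, 0-1, 2-1 and 1-2.
    """
    SKIP_PENALTY = 50   # assumed cost for leaving a sentence unpaired
    MERGE_PENALTY = 10  # assumed extra cost for 2-1 and 1-2 beads

    def cost(s_group, t_group):
        s_len = sum(len(s) for s in s_group)
        t_len = sum(len(t) for t in t_group)
        if not s_group or not t_group:
            return SKIP_PENALTY + s_len + t_len
        penalty = MERGE_PENALTY if len(s_group) + len(t_group) > 2 else 0
        return abs(s_len - t_len) + penalty

    n, m = len(src), len(tgt)
    inf = float("inf")
    best = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if best[i][j] == inf:
                continue
            for di, dj in ((1, 1), (1, 0), (0, 1), (2, 1), (1, 2)):
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                candidate = best[i][j] + cost(src[i:ni], tgt[j:nj])
                if candidate < best[ni][nj]:
                    best[ni][nj] = candidate
                    back[ni][nj] = (i, j)
    # Walk back from the end to recover the cheapest sequence of beads.
    beads, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    return beads[::-1]
```

The total cost of the best path could serve as a crude quality indicator, similar in spirit to the alignment reports mentioned above: a high cost suggests many skipped or merged sentences that a human should check.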
---
# Translation for SEO and effective international web presence
This topic explores the crucial role of skilled translators in enhancing a client's global online visibility through effective Search Engine Optimization (SEO) within the realm of website localization. It outlines key considerations for translators aiming to optimize content for international markets.
### 15.1 Key considerations for translators
#### 15.1.1 Keyword research
Thorough keyword research in the target language and region is essential to identify terms and phrases relevant to the local audience. This includes considering linguistic variations, synonyms, and colloquial expressions that users might employ in their search queries.
#### 15.1.2 Cultural relevance
Understanding cultural nuances and preferences is vital for choosing keywords that resonate with the target audience. Literal translations should be avoided if they do not capture the intended meaning or sound unnatural in the target language.
#### 15.1.3 Localized content
Translated content must be not only linguistically accurate but also culturally appropriate. Adapting content to align with local customs, traditions, and market trends enhances its relevance for the target audience.
#### 15.1.4 Metadata optimization
Translating and optimizing meta titles, meta descriptions, and URL slugs requires special attention. Crafting compelling and concise meta descriptions that incorporate relevant keywords is key to encouraging click-throughs.
#### 15.1.5 Multilingual link building
Collaboration with web developers and marketers is important for building a network of high-quality, multilingual backlinks. Identifying reputable local websites and influencers for potential collaborations can contribute to improved search engine rankings.
#### 15.1.6 Content structure and formatting
Ensuring that translated content maintains a user-friendly structure and formatting is crucial. Utilizing headers, bullet points, and other formatting elements enhances readability and SEO, as search engines value well-organized content.
#### 15.1.7 Mobile optimization
Recognizing the growing importance of mobile search necessitates ensuring that translated content is mobile-friendly. Optimizing images and other media for fast loading times on mobile devices positively impacts SEO rankings.
#### 15.1.8 Regular updates
Staying informed about changes in search engine algorithms and adapting SEO strategies accordingly is vital. Regularly updating translated content to reflect current trends ensures sustained visibility in international markets.
#### 15.1.9 Analytics and reporting
Working closely with clients to monitor website analytics and assess the performance of localized content is important. Providing insights and recommendations based on data analysis helps to continually refine SEO strategies.
#### 15.1.10 Communication with clients
Establishing clear communication channels with clients is essential to understand their business goals, target audience, and specific SEO objectives. Collaborating on a strategy that aligns translation efforts with broader marketing initiatives leads to a comprehensive international SEO approach.
> **Tip:** Effective translation for SEO goes beyond linguistic accuracy; it involves deep cultural understanding and strategic keyword integration to connect with local audiences and improve search engine rankings.
> **Example:** A German company selling software might find that directly translating their English product descriptions into German is insufficient. They should research German keywords related to their software's function and adapt the marketing copy to resonate with German business practices and terminology, rather than just a word-for-word translation.
---
# The role of professional translators in website optimization for foreign markets
Professional translators are indispensable in optimizing websites for foreign markets by ensuring that metadata is not only accurately translated but also culturally relevant and effectively positioned for search engines and users.
### 16.1 The significance of web page metadata
Metadata, which includes elements like meta titles and meta descriptions embedded in a webpage's code, is critical for a website's visibility and search engine ranking.
### 16.2 The translator's crucial role in metadata optimization
When translating metadata for foreign markets, professional translators become central figures in optimizing and positioning web pages. Their involvement addresses several key areas:
#### 16.2.1 Search engine visibility
Metadata helps search engine algorithms understand and index webpage content. Accurate translation of this metadata makes the content accessible to a global audience, thereby improving search engine visibility in foreign markets.
#### 16.2.2 User click-through rates
Well-crafted meta titles and descriptions, translated by skilled professionals, can significantly influence user click-through rates. Translators ensure that the translated content is linguistically accurate, compelling, and culturally relevant, which encourages users to click on the provided links.
#### 16.2.3 Local relevance
Professional translators possess an understanding of the cultural nuances and preferences of the target audience. By localizing metadata, translators ensure that the content aligns with local expectations, making it more appealing and relevant to users in foreign markets.
#### 16.2.4 Keyword optimization
Effective Search Engine Optimization (SEO) relies heavily on the use of relevant keywords. A translator, with expertise in keyword research for the target language, can optimize metadata by incorporating region-specific terms, which increases the probability of the webpage appearing in relevant search results.
#### 16.2.5 Global brand consistency
For businesses expanding into international markets, maintaining brand consistency is paramount. Professional translators ensure that the translated metadata aligns with the brand's tone and message, thereby presenting a cohesive global image.
#### 16.2.6 Adherence to character limits
Search engines typically impose character limits on meta titles and descriptions. A translator's skill in creating concise yet impactful translations ensures that the content fits within these limitations, preventing truncation in search engine results.
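A length check of this kind is easy to automate. The sketch below is illustrative only: the limits of 60 and 160 characters are commonly cited rules of thumb, not values taken from the source or from official search engine documentation, and the function name `check_metadata` is hypothetical.

```python
# Assumed limits: commonly cited rules of thumb, not official values.
LIMITS = {"title": 60, "description": 160}


def check_metadata(title, description, limits=LIMITS):
    """Return a warning for each field that exceeds its assumed character limit."""
    warnings = []
    for field, text in (("title", title), ("description", description)):
        limit = limits[field]
        if len(text) > limit:
            over = len(text) - limit
            warnings.append(f"{field}: {len(text)} characters, {over} over the limit of {limit}")
    return warnings
```

Translations into languages that tend to run longer than the source are the ones most likely to trigger such warnings, so a check like this is most useful when run on every target language, not just the original.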
#### 16.2.7 Credibility and trust
Inaccurate or poorly translated metadata can undermine a website's credibility. Professional translators safeguard the integrity of the content, contributing to the establishment of trust with users in foreign markets.
#### 16.2.8 Adaptation to market trends
Markets and trends are constantly evolving. Translators who stay updated on linguistic and cultural shifts can revise and adapt translated metadata to reflect current market trends, thereby supporting sustained optimization.
### 16.3 Conclusion: The collaborative approach
In essence, the collaboration between professional translators and web developers or marketers is crucial for a comprehensive approach to website optimization in foreign markets. Translators bridge linguistic and cultural divides, ensuring that metadata is not just translated but optimized for the specific nuances of each target audience, ultimately enhancing a website's global competitiveness.
> **Tip:** Consider metadata translation as a critical component of international SEO strategy, not just a localization task.
>
> **Example:** A meta description for an electronics retailer in Spain should not simply translate "Shop the latest gadgets" but rather incorporate culturally relevant terms and perhaps mention shipping to Spanish-speaking regions if applicable, like "Descubre los últimos gadgets tecnológicos y envíos a toda España."
---
# Managing glossaries in Wordfast Anywhere
This section details how to effectively use, view, and add terms to glossaries within the Wordfast Anywhere (WFA) platform to enhance translation efficiency and consistency.
### 17.1 Using a glossary
When WFA identifies a term from an active glossary within a source segment, it highlights this term with a blue background. These highlighted terms function as placeables, meaning they can be manipulated using navigation icons (Previous/Next) or keyboard shortcuts (Ctrl+Alt+Right/Ctrl+Alt+Left). They can also be selected with the mouse or by typing the initial letter followed by Tab.
A key feature is that when using the Copy icon or the Ctrl+Alt+Down shortcut, the corresponding translation from the glossary is copied into the target segment. The auto-suggest feature, enabled by default, simplifies copying target terms by proposing them as you type the first letter of the target term or the first three letters of the source term.
#### 17.1.1 Viewing glossary information
You can preview the translation of highlighted terms by activating the glossary panel. This can be done via the keyboard shortcut Ctrl+Alt+H or by selecting the "Show/Hide Glossary" button from the "View" tab.
To gain more insight into a term, such as additional information entered in comment fields or the F1, F2, and F3 fields, hover your mouse cursor over the source term in the segment. This action will display a tooltip with the available information.
### 17.2 Adding terms to the glossary
WFA facilitates the dynamic addition of terms encountered in the source text, along with their corresponding translations derived during the translation process. This practice reinforces the translator's knowledge base and reduces the need for repeated research on the same terms.
#### 17.2.1 The process of adding a new term
1. **Select the source term:** Click on the term in the source text or use the Tab key (or Shift+Tab) to navigate to it. The selected source term will be highlighted with a red border.
2. **Select the target term:** Click on the corresponding term in the target segment. This selected target term will then be highlighted with a blue background.
3. **Invoke the Glossary Dialog Box:** Press Ctrl+Alt+T or click the "Add Term" button.
4. **Populate the fields:**
   * If the term consists of a single word, it should automatically appear in the "Source" and "Target" fields.
   * For terms comprising multiple words, you may need to paste the text from your clipboard or enter it manually into the respective fields.
5. **Add supplementary information (optional):**
   * You can include a comment to record contextual details, such as the circumstances under which a specific translation was used.
   * The F1, F2, and F3 fields are available for storing information like word role, context, grammatical form, or other relevant textual data.
6. **Save the term:** Click "Save" to confirm and add the new term to the glossary.
> **Tip:** Dynamically adding terms as you encounter them in the source text is an efficient method to build and maintain a project-specific or personal glossary, thereby improving future translation speed and consistency.
---
# Controlled natural languages for technical documentation
Controlled natural languages (CNLs) are subsets of natural languages that restrict grammar and vocabulary to reduce ambiguity and complexity. They primarily serve two purposes: enhancing readability for human users, such as non-native speakers, and enabling reliable automatic semantic analysis.
### 18.1 Types and applications of controlled languages
The first type of controlled language, often termed "simplified" or "technical" languages, is used in industries to improve the quality of technical documentation and potentially simplify semi-automatic translation. Examples include Caterpillar Technical English, Simplified Technical English, and IBM's Easy English. These languages impose restrictions on writers through general rules like keeping sentences short, avoiding pronouns, using only approved words from a dictionary, and exclusively employing the active voice.
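Rules like these lend themselves to automated checking. The sketch below tests three of them (sentence length, pronouns, approved vocabulary) on a single sentence. Everything in it is a toy assumption: the 20-word limit, the pronoun subset, and the tiny approved dictionary do not come from any real controlled language, and the active-voice rule is not checked because it needs grammatical analysis.

```python
import re

MAX_WORDS = 20  # assumed sentence-length limit
PRONOUNS = {"it", "they", "them", "this", "that", "these", "those"}  # illustrative subset
APPROVED = {"stop", "the", "engine", "remove", "filter", "clean", "with",
            "water", "before", "open", "cover"}  # toy dictionary


def check_sentence(sentence, approved=APPROVED):
    """Return a list of rule violations for one sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    issues = []
    if len(words) > MAX_WORDS:
        issues.append(f"too long: {len(words)} words (max {MAX_WORDS})")
    for word in words:
        if word in PRONOUNS:
            issues.append(f"pronoun: {word}")
        elif word not in approved:
            issues.append(f"unapproved word: {word}")
    return issues
```

For example, "Clean it with water." is flagged for the pronoun, which pushes the writer toward the unambiguous "Clean the filter with water."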
#### 18.1.1 Caterpillar Fundamental English
Caterpillar Inc., a global heavy equipment manufacturer, employs Caterpillar Fundamental English (CFE) to ensure consistency and high quality in its extensive technical documentation. CFE restricts its vocabulary to approximately 850 words, facilitating both authoring and translation of documents covering complex subsystems like engines, hydraulics, and drive systems.
#### 18.1.2 Examples of controlled natural languages
Numerous controlled natural languages exist, including:
* ASD Simplified Technical English
* Attempto Controlled English
* Aviation English
* Basic English
* ClearTalk
* Common Logic Controlled English
* Distributed Language Translation Esperanto
* E-Prime
* Français fondamental
* Gellish Formal English
* Interlingua-IL sive Latino sine flexione
* ModeLang
* Newspeak
* Processable English (PENG)
* Seaspeak
* Semantics of Business Vocabulary and Business Rules
* Special English
* Plain language movement (Lenguaje claro)
#### 18.1.3 Companies utilizing controlled languages
Many companies integrate controlled languages into their documentation processes:
* Avaya: Avaya Controlled English (ACE)
* Boeing: Simplified Technical English (STE), ASD-STE100
* Caterpillar: Caterpillar Technical English (CTE), Caterpillar Fundamental English (CFE)
* Dassault Aerospace: Français Rationalisé
* European Aeronautic Defence and Space Company (EADS): Simplified Technical English (STE), ASD-STE100
* Ericsson: Ericsson English
* General Motors (GM): Controlled Automotive Service Language (CASL)
* IBM: Easy English
* Kodak: International Service Language
* Nortel: Nortel Standard English (NSE)
* Océ: Controlled English
* Rolls-Royce: Simplified Technical English (STE), ASD-STE100
* Saab Systems: Simplified Technical English (STE), ASD-STE100
* Scania: Scania Swedish
* Sun Microsystems: Sun Controlled English
* Xerox: Xerox Multilingual Customized English
### 18.2 Grammar rules for controlled languages
Grammar rules for controlled languages are language-specific, meaning there are no universally optimal rules for all languages. However, implementing general rules can significantly reduce ambiguities in texts across many languages, making them more suitable for machine translation.
> **Tip:** The primary goal of controlled languages in technical documentation is to achieve clarity and consistency, which directly benefits both human readers and automated processing systems.
#### 18.2.1 CLOUT rule set
The CLOUT™ rule set, an acronym for Controlled Language Optimized for Uniform Translation, was developed by Uwe Muegge. These rules exemplify the kind of restrictions applied to minimize ambiguities in written text.
---
# Evolution of technology in the translation industry
The evolution of technology in the translation industry has been driven by the increasing demand for translated content and the need for greater efficiency and consistency in the localization process [8](#page=8).
### 19.1 General overview
The development of Translation Memory (TM) tools was significantly influenced by several key factors [8](#page=8):
* **Increasing volume of documentation:** The rise in technical equipment and the economic growth of companies led to a surge in the amount of documentation requiring translation [8](#page=8).
* **Repetitive and evolving content:** The growing need to translate similar documents, updated versions, and content with diverse communicative aims necessitated more efficient handling of repeated segments [8](#page=8).
* **Electronic content reproduction:** The necessity to reproduce electronic content across various formats also contributed to the development of technologies that could manage and repurpose text efficiently [8](#page=8).
### 19.2 Historical context and development
The source material for this topic (pages 1-8) covers the background and a general overview, setting the stage for the history of Translation Memory systems from their first steps to the emergence of commercial developers. The guiding questions "Where do we come from?" and "Did we really improve?" frame the topic as a retrospective, prompting a comparison of past methods with current tools [2](#page=2) [3](#page=3) [6](#page=6) [7](#page=7) [8](#page=8).
> **Tip:** Understanding the driving forces behind technological adoption, such as increased document volume and content repetition, is crucial for appreciating the value and impact of tools like Translation Memory.
---
# Computer-assisted translation: definition and basic components
Computer-assisted translation (CAT) is a set of specialized software applications designed to efficiently assist translators in their tasks. The primary goal of a CAT system is to automatically and quickly provide translators with all the resources they might need for their work [32](#page=32).
### 20.1 The difference between CAT and machine translation
It is crucial to distinguish between machine translation (MT) and computer-assisted translation (CAT) [36](#page=36).
* **Machine Translation (MT):** The translation is performed entirely by a machine, such as Google Translate [36](#page=36).
* **Computer-Assisted Translation (CAT):** The translation is performed by a human who uses various translation tools [36](#page=36).
MT can be integrated into CAT systems, as many CAT systems have access to MT engines [36](#page=36).
> **Tip:** Using CAT tools can significantly increase a translator's productivity compared to not using them, as modern software simplifies many tasks. The ability to use CAT tools is a required skill in many agencies and institutions that hire translators [35](#page=35).
### 20.2 Core components of CAT tools
Essential CAT tools, as identified by Berinstein and Mermaud, include:
#### 20.2.1 Translation project management software
This software is vital for controlling the workflow of translation projects and includes functionalities such as:
* Information flow control [37](#page=37).
* Translation assignment management [37](#page=37).
* Quality control processes [37](#page=37).
* Content analysis [37](#page=37).
* Generation of reports, including those on full matches, fuzzy matches, and repetitions within and across files [37](#page=37).
* Word count calculations [37](#page=37).
* Final delivery to the client [37](#page=37).
#### 20.2.2 Translation memory software
Translation memory (TM) software is fundamental for storing past translations and ensuring consistency. Its key functions are:
* Storing translations [38](#page=38).
* Establishing terminological and phraseological consistency [38](#page=38).
* Enabling the retrieval of translation units, which directly enhances productivity [38](#page=38).
Common file extensions for translation memories include:
* **`.tmx` (Translation Memory eXchange):** This is the open, standard format that is compatible with almost all translation tools, such as Trados, memoQ, and Wordfast [38](#page=38).
* **`.sdltm`:** This is a proprietary format used by SDL Trados Studio [38](#page=38).
* **`.txt` / `.csv`:** These formats are sometimes used for exporting memories in plain text or for manipulation in spreadsheet software like Excel [38](#page=38).
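As an illustration of the plain-text option, the sketch below writes translation units to CSV and reads them back with Python's `csv` module. The two-column layout with a language header row and the function names are assumptions; no tool prescribes this exact layout.

```python
import csv
import io


def tm_to_csv(units, source_lang, target_lang):
    """Serialize (source, target) pairs as CSV text with a language header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([source_lang, target_lang])
    writer.writerows(units)
    return buffer.getvalue()


def csv_to_tm(text):
    """Parse the CSV text back into (languages, list of (source, target) pairs)."""
    rows = list(csv.reader(io.StringIO(text)))
    return tuple(rows[0]), [tuple(row) for row in rows[1:]]
```

Using a CSV library instead of joining strings by hand matters here, because segments routinely contain commas and quotation marks that must be escaped to survive the round trip.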
> **Tip:** For clients, the presence of repetitions in a text can influence the cost of translation, implying that translators using CAT tools might work more efficiently on such texts. CAT tools help identify these repetitions, which can be leveraged for cost savings and consistency [34](#page=34).
---
# Understanding translation memory file formats
This section explores the various file formats used for storing translation memories, focusing on their structure, the data they contain, and the advantages of XML-based formats like TMX and XLIFF [68](#page=68).
### 21.1 Core information stored in translation memories
Translation memory files primarily store a collection of translation units, each comprising a source segment and its corresponding target translation. Beyond these core components, translation memories can also contain a wealth of additional metadata to enrich the translation process. This metadata can include [67](#page=67):
* Language pairs [67](#page=67).
* Creation and modification dates and times [67](#page=67).
* Author information [67](#page=67).
* Usage counts (how many times a segment has been used) [67](#page=67).
* The tool used for creation or modification [67](#page=67).
* Domain or field of expertise associated with the content [67](#page=67).
* Alternate translations for a given source segment [67](#page=67).
* Notes or comments related to the translation unit [67](#page=67).
### 21.2 Common translation memory file formats
The landscape of translation memory file formats is dominated by a few key types, each with its own strengths and weaknesses [68](#page=68).
#### 21.2.1 XML-based formats: TMX and XLIFF
The two most prevalent formats in the industry are TMX (Translation Memory eXchange) and XLIFF (XML Localization Interchange File Format). Both leverage the Extensible Markup Language (XML) format, offering significant advantages for data management and interoperability [68](#page=68).
##### 21.2.1.1 Advantages of XML for translation memory
XML's widespread adoption in translation memory is due to several key benefits it provides over simpler text-based formats [68](#page=68):
* **Structured Data:** XML files have a well-defined hierarchical structure, making them easy for software to parse and process [68](#page=68).
* **Semantic Tagging:** XML uses semantic tags whose names clearly indicate the meaning and purpose of the data they enclose, enhancing human readability [68](#page=68).
* **Tool Support:** A vast ecosystem of software tools is built around XML for tasks such as validation, importing, parsing, and searching [68](#page=68).
* **Interoperability:** The well-defined structure of XML files facilitates data exchange and interaction between different applications and systems [68](#page=68).
##### 21.2.1.2 Structure of TMX and XLIFF files
Both TMX and XLIFF files follow a consistent structure, typically divided into two main parts: a header and a body [69](#page=69).
* **Header:** This section contains metadata about the file itself and the localization process it represents. This includes information about the translation memory's origin, language pairs, creation dates, and any specific settings or configurations. The semantic naming of XML tags makes these headers largely human-readable [69](#page=69) [70](#page=70) [71](#page=71).
* **Body:** This is the core of the file, housing the actual translation units. Each translation unit contains the source text and its corresponding target text, along with any associated metadata like notes or usage statistics [72](#page=72) [73](#page=73) [74](#page=74).
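The header-and-body layout can be seen in a minimal TMX 1.4 document. The sketch below parses one with Python's standard library; the tool name, format name, date, and segment text are invented for illustration, while the element and attribute names follow TMX 1.4. The helper `read_tmx` is hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Minimal TMX document; tool name, date and segments are invented for illustration.
TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="ExampleTool" creationtoolversion="1.0" segtype="sentence"
          o-tmf="ExampleFormat" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu tuid="1" creationdate="20240101T120000Z" usagecount="3">
      <note>Approved by reviewer.</note>
      <tuv xml:lang="en"><seg>Close the window.</seg></tuv>
      <tuv xml:lang="de"><seg>Schließen Sie das Fenster.</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_tmx(text):
    """Return (header attributes, list of translation units) from TMX text."""
    root = ET.fromstring(text)
    header = dict(root.find("header").attrib)
    units = []
    for tu in root.iterfind("body/tu"):
        # Map each language code to the text of its segment.
        segments = {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.iterfind("tuv")}
        units.append({
            "attributes": dict(tu.attrib),
            "note": tu.findtext("note"),
            "segments": segments,
        })
    return header, units


header, units = read_tmx(TMX_SAMPLE)
```

The header carries file-level metadata such as the source language and creating tool, while each `<tu>` in the body carries its own metadata (creation date, usage count, notes) next to the language variants.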
#### 21.2.2 Spreadsheet formats: XLS and CSV
While less common for robust translation memory management, translation memory data can also be stored in spreadsheet formats such as Microsoft Excel (.xls) or comma-separated values text files (.csv) [68](#page=68).
##### 21.2.2.1 Limitations of spreadsheet formats
The primary drawback of using XLS and CSV formats is their limited capacity to store detailed information about each translation unit. Typically, only the source and target segments, along with the language, are stored, omitting much of the rich metadata available in XML formats. While these files may be smaller in size, they sacrifice depth and detail [68](#page=68).
> **Tip:** For professional translation workflows, XML-based formats like TMX and XLIFF are strongly recommended due to their comprehensive data storage capabilities and superior interoperability. Spreadsheet formats are best suited for simple data exchange or small, informal projects.
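To make the metadata gap concrete, the following sketch converts a two-column CSV (source, target) into a minimal TMX document. It is an illustration under stated assumptions: the column order, the helper name `csv_to_tmx`, and the placeholder header values are all hypothetical. Because the CSV carries only the two segments, the resulting `<tu>` elements have no attributes such as creation date or author.

```python
import csv
import io
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def csv_to_tmx(csv_text, src_lang, tgt_lang):
    """Wrap source/target rows in TMX markup; languages come from the caller."""
    tmx = ET.Element("tmx", version="1.4")
    # Header values are placeholders chosen for this sketch.
    ET.SubElement(tmx, "header", {
        "creationtool": "csv_to_tmx-sketch", "creationtoolversion": "0.1",
        "segtype": "sentence", "o-tmf": "csv", "adminlang": src_lang,
        "srclang": src_lang, "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        tu = ET.SubElement(body, "tu")  # no metadata available to attach
        for lang, text in ((src_lang, row[0]), (tgt_lang, row[1])):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")
```

The language codes have to be supplied by the caller because a plain CSV row does not record them.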
---
# Sentence alignment process
The sentence alignment process involves matching source language sentences with their corresponding target language translations to create a bilingual corpus suitable for translation memory (TM) systems. This process is crucial for leveraging existing translations and improving translation efficiency [100](#page=100).
### 22.1 Automated alignment
The initial alignment is typically performed with automated tools. These tools analyze the structure of both the source and target files and match each source sentence with its probable translation. Modern alignment tools have become highly sophisticated and often produce very good results. Some tools also report a quality score, based on internal algorithms, that indicates how successful the alignment was [100](#page=100).
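As a rough illustration of how an automatic aligner can pair sentences, the sketch below uses a simplified length-based heuristic with dynamic programming. It is not the algorithm of any particular tool: the allowed pairings (1:1, 2:1, 1:2), the character-length cost, and the merge penalty of 5 are all assumptions made for this example.

```python
import math

# (source sentences, target sentences) per pairing -> penalty (arbitrary values)
BEADS = {(1, 1): 0, (2, 1): 5, (1, 2): 5}


def bead_cost(src_part, tgt_part, penalty):
    """Cost = difference in total character length plus a merge penalty."""
    src_len = sum(len(s) for s in src_part)
    tgt_len = sum(len(t) for t in tgt_part)
    return abs(src_len - tgt_len) + penalty


def align(src, tgt):
    """Return the cheapest sequence of (source sentences, target sentences)."""
    n, m = len(src), len(tgt)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == math.inf:
                continue
            for (di, dj), penalty in BEADS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                c = cost[i][j] + bead_cost(src[i:ni], tgt[j:nj], penalty)
                if c < cost[ni][nj]:
                    cost[ni][nj], back[ni][nj] = c, (i, j)
    if cost[n][m] == math.inf:
        raise ValueError("no alignment possible with the allowed pairings")
    beads, i, j = [], n, m
    while (i, j) != (0, 0):          # walk back from the end
        pi, pj = back[i][j]
        beads.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    return beads[::-1]


SRC = ["Remove the printer from the carton.",
       "Remove the plastic wrapping.",
       "Connect the power cable."]
TGT = ["Nehmen Sie den Drucker aus dem Karton und entfernen Sie die Kunststoffverpackung.",
       "Schließen Sie das Netzkabel an."]
```

With this sample, `align(SRC, TGT)` pairs the first two source sentences with the single merged target sentence, because that grouping gives the smallest length difference. Real aligners use richer signals (tags, punctuation, dictionaries), so this should be read only as a sketch of the idea.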
### 22.2 Evaluation and development of alignment metrics
Because sentence alignment is so important, significant international efforts have been dedicated to developing metrics for evaluating it. Notable projects include:
* **Project ARCADE (1995-1996):** Aimed to create a bilingual French-English corpus suitable for alignment tasks and evaluation.
* **MULTEXT-East Project:** Aligned six translations of George Orwell's novel "1984" with the English original, with manual validation of the alignments.
* **Egypt Statistical Machine Translation Toolkit:** Contributed to the development of alignment technologies.
* **GIZA++:** Used for training statistical translation models, which rely on aligned data.
### 22.3 Linguistic verification
After the automated alignment, a linguist performs linguistic verification to ensure accuracy. This involves reviewing each segment, approving correct matches, and correcting or deleting incorrect ones.
> **Tip:** Incorrect matches can arise when the translation requires structural changes, such as combining two source sentences into one target sentence for better flow. In such cases, the alignment tool might misalign subsequent segments.
Once a correction is made, the linguist can re-run the automatic alignment from that point onwards to update subsequent incorrect matches. After verification and correction, the approved segments are exported in a TM format for use.
### 22.4 Factors influencing alignment results
Several factors can significantly improve the quality of alignment results:
#### 22.4.1 File format consistency
Source and target files should be in the same format. When the formats differ (for example, InDesign versus Word), formatting and variable information may be converted into tags differently. Alignment tools can use these tags as guides, but inconsistent tags hinder accurate segment matching.
#### 22.4.2 File version consistency
It is essential that the source and target files are the same version. If a source file is updated after translation with additions or deletions, and the translated file is not updated accordingly, the alignment process becomes more complex.
#### 22.4.3 Quality of translated files
The alignment process generally does not include a linguistic review of the existing translations. Therefore, it is critical that the client is satisfied with the quality of the initial translations. A linguist can review the files during the alignment process, but this increases the overall time required for the task.
### 22.5 Alignment tools
A variety of alignment tools are available for use:
* WinAlign
* SDL Align
* FarkasAndras' open source aligner (free)
* AlignFactory light (built into MemoQ)
* Abbyy Aligner
* LF Aligner
* Hunalign
* Transit NXT aligner
* Plus Tools
* TextAlign
* Bligner
* bitext2tmx
---
# Understanding and using term bases and glossaries in translation
This topic explores the functionality, benefits, and practical application of term bases and glossaries in professional translation workflows.
### 23.1 What is a term base or glossary?
A term base, also known as a glossary, is a database designed to store individual words or phrases pertinent to a specific subject area. These entries are typically bilingual or even multilingual, providing equivalent terms across different languages.
### 23.2 How do term bases and glossaries work?
Term bases are often integrated features within Computer-Assisted Translation (CAT) tools. They allow users to import pre-existing glossaries, add new terms, and update existing ones during the translation process. It is also possible to merge multiple bilingual glossaries into a single multilingual resource and to designate terms as "forbidden" to prevent their use.
### 23.3 How to create and use a term base
#### 23.3.1 Content creation
When establishing a term base, identifying key terminology is paramount. To ensure high quality, it is essential to use final source texts, approved translations, and thoroughly researched contextual information.
#### 23.3.2 Project integration
A term base can be created specifically for a new translation project or imported from previous translation projects.
#### 23.3.3 Maintenance
Once established, term bases require ongoing maintenance to incorporate changes in source texts, translations, or contextual information. Neglecting this maintenance can negatively impact translation quality.
> **Tip:** Always use final source texts and approved translations when creating a term base to ensure accuracy and relevance.
### 23.4 Benefits of using term bases
#### 23.4.1 Increased consistency
Well-constructed term bases ensure consistency in core messaging, which is critical for organizations working on multiple projects or with several collaborators.
#### 23.4.2 Improved translation quality
By managing terminology and defining forbidden terms, term bases prevent the use of unwanted words or expressions, thereby enhancing overall translation quality.
#### 23.4.3 Speeding up translation
The terminology management features in CAT tools provide quick and easy access to essential language resources, accelerating the translation process.
#### 23.4.4 Correct usage and spelling
Term bases help ensure the correct spelling of product or company names, which may be case-sensitive. They also inform translators about terms that should remain in the source language and not be translated.
### 23.5 Terminology management in the translation process
#### 23.5.1 Role of Language Service Providers (LSPs)
LSPs should include term bases and specific instructions with translation assignments, directing translators to align their work with the glossary. LSPs can also request that translators contribute new terms to the glossary for review, potentially transforming the glossary into a valuable client asset. Providing translators with all necessary terminology leads to the application of client-preferred terms, saving time and money on proofreading and increasing client satisfaction through consistent output.
#### 23.5.2 Verification for Project Managers (PMs)
On large projects, a term base allows Project Managers to verify terminology consistency even if they do not speak the target language. Inconsistent translations can be returned to the translator for correction, further reducing proofreading time and costs.
> **Example:** A Project Manager receives a translation for a technical manual. By cross-referencing the translation against the provided term base, they identify several instances where a specific product component name has been translated incorrectly. The PM can then send the translation back to the translator with a clear instruction to correct the terminology according to the glossary.
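The kind of check described in this example can be sketched in a few lines. The snippet is a simplified illustration, not the behavior of any specific CAT tool: the `TERM_BASE` layout, the sample terms, and the `check_terminology` helper are hypothetical, and the plain substring matching ignores inflection and compounding.

```python
import re

# Hypothetical term base: source term -> preferred and forbidden target terms.
TERM_BASE = {
    "power cable": {"preferred": "Netzkabel", "forbidden": ["Stromkabel"]},
}


def check_terminology(source, target, term_base):
    """Return a list of human-readable terminology issues for one segment."""
    issues = []
    target_lower = target.lower()
    for src_term, entry in term_base.items():
        # Only check terms that actually occur in the source segment.
        pattern = r"\b" + re.escape(src_term) + r"\b"
        if not re.search(pattern, source, re.IGNORECASE):
            continue
        if entry["preferred"].lower() not in target_lower:
            issues.append(
                f"missing preferred term '{entry['preferred']}' for '{src_term}'")
        for bad in entry.get("forbidden", []):
            if bad.lower() in target_lower:
                issues.append(f"forbidden term '{bad}' used for '{src_term}'")
    return issues
```

A check of this kind does not require the reviewer to read the target language, which is why it suits the PM scenario: the function only reports whether the glossary terms appear.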
### 23.6 Case study: Wordfast Anywhere (WFA)
#### 23.6.1 Imposing specific terminology
The use of incorrect terminology can undermine an otherwise competent translation. Clients often possess well-defined jargon, compiled into glossaries, which they supply to translators to enforce a particular terminology. This approach, common in technical translation, aims to harmonize the translator's linguistic skills with the client's terminological requirements.
#### 23.6.2 Translator-generated glossaries
Clients may also request that translators create a glossary of terms encountered during their research or translation process. This glossary building can occur before translation, during an initial research phase, or concurrently with the translation itself.
#### 23.6.3 Utilizing client-provided glossaries
Frequently, clients furnish a bilingual glossary already compiled from prior translations. The translator's responsibility is to adhere strictly to this glossary and, where appropriate, contribute their own additions.
#### 23.6.4 Application in general translation
For more general translation tasks, especially when a translator is still developing their fluency in a source language, WFA's glossary function can be used to catalog terminology discovered during the translation process.
#### 23.6.5 WFA's glossary function
Wordfast Anywhere is equipped with a glossary function designed to support translators in all these scenarios. This glossary is typically a simple tab-delimited text document that can be uploaded to and downloaded from WFA, and shared with other CAT tools as needed.
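Because such a glossary is plain tab-delimited text, it can be read with a few lines of code. The sketch below assumes a layout of source term, target term, and an optional comment per line; this column order is an assumption for illustration and is not taken from the WFA documentation. `load_glossary` is a hypothetical helper.

```python
def load_glossary(text):
    """Parse tab-delimited lines into {source term: (target term, comment)}."""
    glossary = {}
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or incomplete lines
        comment = fields[2] if len(fields) > 2 else ""
        glossary[fields[0]] = (fields[1], comment)
    return glossary


# Invented sample content: two entries separated by a blank line.
SAMPLE_GLOSSARY = "power cable\tNetzkabel\tapproved by client\n\nprinter\tDrucker\n"
```

The simplicity of the format is what makes it easy to exchange between tools: any program that can split lines on tab characters can consume it.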
---
# Understanding and applying post-editing in translation
Post-editing is the process of human translators amending machine-generated translations to achieve an acceptable final product.
### 24.1 The concept and purpose of post-editing
Post-editing (or postediting) involves human translators amending machine-generated translation to achieve an acceptable final product. A person who post-edits is called a post-editor. This process is distinct from editing, which refers to the improvement of human-generated text and is often known as revision in the translation field.
Post-editing is employed when raw machine translation is not good enough but a full human translation is not strictly required. Industry recommendations suggest using post-editing when it can at least double the productivity of manual translation, and potentially quadruple it in cases of light post-editing.
#### 24.1.1 Pre-editing vs. Post-editing considerations
The decision between pre-editing and post-editing depends on the specific project's needs and resources. Pre-editing offers a greater Return on Investment (ROI) when a technical or user document is to be translated into more than three languages, and it becomes particularly worthwhile for dozens of languages. The rationale is to use one resource before Machine Translation (MT) rather than numerous resources (one per target language) after it.
However, pre-editing is not always the optimal or necessary approach. If the source quality is already high, verified by human review and automatic checks, and the MT engine is well-tuned with domain dictionaries and Translation Memories, then a light post-editing process might suffice to ensure the translations are comprehensible.
> **Tip:** The "tipping point" where money is better spent on pre-editing versus post-editing is a crucial consideration for optimizing translation workflows.
#### 24.1.2 Tools for source content creation and pre-editing
Various tools can facilitate source content creation and pre-editing for MT:
* **Source content memory:** This provides feedback to writers, identifying similar content produced by different individuals and highlighting variations, thus promoting consistency in writing style across writers and products over time.
* **Generic pre-editing plugins or automated rules:** These can assist writers in reformulating source text before MT.
* **Simplified Technical English (STE) or Controlled Language tools:** These tools automate the formalization of rules for writing for localization, including requirements for short sentences, active voice, and standard word order.
* **Program or client-specific custom tools:** These identify spelling, grammar, and preferred terminology. An example tool mentioned is Grammarly.
> **Example:** Controlled Language tools help ensure that source text is structured in a way that MT engines can process more effectively, leading to better raw output that requires less post-editing.
### 24.2 The post-editing process and quality levels
Post-editing involves correcting machine translation output to meet a quality level negotiated in advance between the client and the post-editor. The extent of post-editing required varies by project, so expectations must be defined early. Time, quality, and cost are the primary factors influencing the post-editing strategy.
#### 24.2.1 Light post-editing
Light post-editing entails minimal intervention by the post-editor, just enough for the end user to understand the text. The expectation is that the client will use this output for inbound purposes only, often when the text is needed urgently or has a short time span.
#### 24.2.2 Full post-editing
Full post-editing requires a greater level of intervention to achieve a level of quality negotiated between the client and the post-editor. The expectation is that the outcome will be a text that is not only understandable but also stylistically appropriate for assimilation and dissemination, suitable for both inbound and outbound purposes.
#### 24.2.3 Quality expectations and evolving MT capabilities
At the highest end of full post-editing, the expectation is a quality level indistinguishable from human translation. Historically, it was assumed that translators spent less effort working directly from the source text than post-editing MT output. However, with advancements in MT and AI, this is changing. For certain language pairs and tasks, and with customized engines trained on high-quality domain-specific data, some clients now ask translators to post-edit instead of translating from scratch, anticipating similar quality at a reduced cost.
> **Tip:** Always clarify the required quality level with the client before starting a post-editing task to manage expectations and ensure the final output meets their needs.
### 24.3 Efficiency and challenges in post-editing
While studies suggest that post-editing is generally faster than translating from scratch, regardless of language pair or translator experience, there is no universal agreement on the exact time savings. Industry reports indicate time savings of around 40%, whereas some academic studies suggest that savings under actual working conditions are more likely between 0% and 20%. Professionals have also reported productivity losses when the necessary corrections take more time than translating from scratch.
> **Tip:** Be aware that post-editing efficiency can be difficult to predict, and actual time savings can vary significantly.
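The figures quoted in this topic mix two scales: productivity multipliers ("double", "quadruple") and time savings in percent. The short sketch below converts between them, assuming that productivity is measured as words per hour, so that a multiplier `p` corresponds to a time saving of `1 - 1/p`. The function names are hypothetical.

```python
def time_saving(speedup):
    """Fraction of time saved when productivity is multiplied by `speedup`."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1 - 1 / speedup


def speedup(saving):
    """Productivity multiplier that corresponds to a fractional time saving."""
    if saving >= 1:
        raise ValueError("time saving must be below 100%")
    return 1 / (1 - saving)


# Doubling productivity saves 50% of the time; quadrupling saves 75%.
# A 40% time saving is roughly a 1.67x multiplier; 20% is a 1.25x multiplier.
# A negative saving (post-editing slower than translating) gives a value below 1.
```

On this scale, the reported savings of 0% to 40% all fall short of the "at least double" threshold mentioned earlier, which illustrates why the expected gain should be measured per project rather than assumed.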
---
# Controlled languages in companies and their rules
Controlled languages are a crucial tool for companies aiming to improve clarity, consistency, and machine-translation readiness in their technical documentation and global communications. These languages employ a set of predefined rules to simplify and standardize writing, thereby reducing ambiguity and facilitating more accurate translations.
### 25.1 Examples of controlled languages in companies
Several prominent companies have developed or adopted their own controlled languages to manage their technical documentation and international communication needs:
* **Kodak:** International Service Language
* **Nortel:** Nortel Standard English (NSE)
* **Océ:** Controlled English
* **Rolls-Royce:** Simplified Technical English (STE), specifically conforming to ASD-STE100
* **Saab Systems:** Simplified Technical English (STE), also aligning with ASD-STE100
* **Scania:** Scania Swedish
* **Sun Microsystems:** Sun Controlled English
* **Xerox:** Xerox Multilingual Customized English
### 25.2 General principles of controlled languages
The rules for controlled languages are language-specific, since no single set of rules can be optimal for all languages. However, a common goal is to minimize ambiguities in texts, making them more suitable for machine translation. The CLOUT™ rule set (Controlled Language Optimized for Uniform Translation), developed by Uwe Muegge, exemplifies such rules.
> **Tip:** Texts written in a controlled language are ideal for machine translation because they are less ambiguous.
### 25.3 Core rules for controlled language (based on the CLOUT™ rule set)
The following rules are examples of guidelines found in controlled language rule sets, designed to enhance clarity and prevent misinterpretation:
#### 25.3.1 Sentence length
* **Rule 1:** Write sentences that are shorter than 25 words.
  * **Write:** The author performs the following tasks: Collect the necessary information. Analyze and evaluate the information. Write a structured draft.
  * **Do not write:** Authors will approach any writing project by collecting the necessary information first, and after carefully analyzing and evaluating it, they will create a structured draft.
#### 25.3.2 Single idea per sentence
* **Rule 2:** Write sentences that express only one idea.
  * **Write:** Authors who optimize their texts for easy comprehension facilitate the translation process. These texts enable machine translation systems to produce better translation results.
  * **Do not write:** By optimizing their texts for easy comprehension, authors facilitate the translation process, and doing so enables machine translation systems to create better translation results.
#### 25.3.3 Consistency in expression
* **Rule 3:** Write the same sentence if you want to express the same content.
  * **Write:** Printer Installation. 1) Remove the printer from the carton. 2) Remove the plastic wrapping.
  * **Do not write:** Instructions for installing the printer. After unpacking the printer from the shipping carton, take the printer out of the plastic bag.
#### 25.3.4 Grammatical completeness
* **Rule 4:** Write sentences that are grammatically complete.
  * **Write:** Do you wish to continue the installation of the software?
  * **Do not write:** Continue installing software?
#### 25.3.5 Simple grammatical structure
* **Rule 5:** Write sentences that have a simple grammatical structure.
  * **Write:** Show that you can organize your thoughts by using a simple sentence structure in your texts.
  * **Do not write:** You, in your texts, to show that you can organize your thoughts, should use a simple sentence structure.
#### 25.3.6 Active voice
* **Rule 6:** Write sentences in the active form.
  * **Write:** The program manager will send a summary of all questions to the responsible coworkers.
  * **Do not write:** A summary of questions will be sent to the responsible individuals.
#### 25.3.7 Noun repetition instead of pronouns
* **Rule 7:** Write sentences that repeat the noun instead of using a pronoun.
  * **Write:** You must check the spelling of your text before you publish your text.
  * **Do not write:** You must check the spelling of your text before publishing it.
#### 25.3.8 Use of articles
* **Rule 8:** Write sentences that use articles to identify nouns.
  * **Write:** Test the installation.
  * **Do not write:** Test installation.
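Some of these rules are mechanical enough to check automatically. As a minimal sketch, the snippet below flags sentences that violate Rule 1 (fewer than 25 words). The sentence splitter is deliberately naive (it splits after `.`, `!`, or `?` followed by whitespace) and the function names are hypothetical; commercial controlled-language checkers are considerably more sophisticated.

```python
import re

MAX_WORDS = 24  # Rule 1: a sentence must be shorter than 25 words


def split_sentences(text):
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def long_sentences(text, max_words=MAX_WORDS):
    """Return the sentences whose whitespace-separated word count is too high."""
    return [s for s in split_sentences(text) if len(s.split()) > max_words]


# The two versions from Rule 1 above.
GOOD = ("The author performs the following tasks: Collect the necessary "
        "information. Analyze and evaluate the information. "
        "Write a structured draft.")
BAD = ("Authors will approach any writing project by collecting the necessary "
       "information first, and after carefully analyzing and evaluating it, "
       "they will create a structured draft.")
```

Applied to the Rule 1 examples, `long_sentences(BAD)` returns the 25-word sentence, while `long_sentences(GOOD)` returns an empty list.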
---
# Translation memory files and their importance
Translation memory (TM) files are crucial tools that enhance efficiency and ensure consistency in the translation process by leveraging previously translated content [75](#page=75).
### 26.1 The importance of translation memory files
Translation memory files are primarily used within Computer-Assisted Translation (CAT) or Translation Environment Tools (TEnT) to significantly improve a translator's efficiency. When a TM file is loaded into such software, it allows translators to utilize their past work. If a segment in the current translation has been previously translated, even partially, the tool will automatically flag this match or partial match, alerting the translator [75](#page=75).
Beyond efficiency, TM files are vital for maintaining consistency. Throughout a translator's career, they work on numerous projects for diverse clients, some of which may demand specific terminology or phrasing. By employing "client-based" or "project-based" translation memories, translators can guarantee accuracy and maintain a consistent style across all their work [75](#page=75).
> **Tip:** Leveraging prior translations through TM files not only saves time but also ensures that a consistent tone and terminology are used, especially important for ongoing projects or clients with specific style guides.
#### 26.1.1 Metadata and TM files
On a surface level, TM software segments texts, and associated metadata allows each segment to be traced back to the translator, the date, and the time of its creation. This metadata enables translators to opt for more recent material to leverage or to discard segments containing outdated terminology. Furthermore, language service providers can effectively manage their TM resources using this information. However, potential loss of crucial metadata during format transfers can restrict users to specific software tools due to interoperability issues [84](#page=84).
Metadata, in essence, is data that describes other data, offering supplementary information about digital content and processes. In the context of the web, it has been defined as "data about data," or more precisely, "machine understandable information about web resources or other things". There are three primary types of metadata: descriptive metadata, which describes content; structural metadata, which outlines the organization of objects or components; and administrative metadata, which provides technical details such as file type [85](#page=85).
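To illustrate how segment-level metadata supports choosing more recent material, the sketch below filters TMX translation units by creation date. It assumes that each `<tu>` carries the TMX `creationdate` attribute (format `YYYYMMDDThhmmssZ`) and a `creationid` attribute. The sample content, the translator IDs, and the `units_since` helper are invented for illustration.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

TMX_DATE = "%Y%m%dT%H%M%SZ"  # e.g. 20240115T093000Z

SAMPLE = """<tmx version="1.4">
  <header creationtool="ExampleTool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="ExampleFormat" adminlang="en-US"
          srclang="en-US" datatype="plaintext"/>
  <body>
    <tu creationdate="20190301T120000Z" creationid="translator_a">
      <tuv xml:lang="en-US"><seg>Test installation.</seg></tuv>
    </tu>
    <tu creationdate="20240115T093000Z" creationid="translator_b">
      <tuv xml:lang="en-US"><seg>Test the installation.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def units_since(tmx_text, cutoff):
    """Return (creationid, creation time) for units created at or after cutoff."""
    recent = []
    for tu in ET.fromstring(tmx_text).iter("tu"):
        stamp = tu.get("creationdate")
        if stamp is None:
            continue  # unit has no date metadata; it cannot be ranked by age
        created = datetime.strptime(stamp, TMX_DATE).replace(tzinfo=timezone.utc)
        if created >= cutoff:
            recent.append((tu.get("creationid"), created))
    return recent
```

Units without a `creationdate` are skipped here, which mirrors the point made above: once metadata is lost in a format transfer, this kind of filtering is no longer possible.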
### 26.2 Common translation memory file formats: TMX and XLIFF
TMX (Translation Memory eXchange) and XLIFF (XML Localization Interchange File Format) are both industry-standard, XML-based file types commonly used in translation. While they share similarities, including some inline markup elements, they possess distinct structures and features [76](#page=76).
#### 26.2.1 Key differences between TMX and XLIFF
* **Purpose:** XLIFF was designed to store extracted text and facilitate data transfer throughout the localization process, whereas TMX was created specifically for exchanging translation memory data between different tools [76](#page=76).
* **Language Support:** TMX can accommodate an unlimited number of languages within a single document. In contrast, XLIFF is structured to work with only one source and one target language at a time [76](#page=76).
* **Inline Code Handling:** TMX exclusively uses encapsulation methods for inline codes, embedding native codes within distinct elements. XLIFF supports both encapsulation (using elements similar to TMX) and the placeholder method, where native codes are removed to a separate "Skeleton" file and replaced by short, referencing elements akin to OpenTag [76](#page=76).
* **File Structure:** In TMX files, the collection of `<tu>` (translation unit) elements has no specific order, and the format has no inherent mechanism to reconstruct the original file [76](#page=76).
* **Additional Data Types:** XLIFF incorporates additional data types and fields not present in TMX, such as pretranslation, history tracking, versioning, and support for binary objects [76](#page=76).
* **Timestamping:** TMX files can store time and date information at the translation unit level, a capability that XLIFF files do not possess [76](#page=76).
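The language-support difference can be seen in the markup itself. In the sketch below, which assumes an XLIFF 1.2 document with invented content, the single language pair is declared once on the `<file>` element and every `<trans-unit>` holds one `<source>` and one `<target>`. `read_xliff` is a hypothetical helper built on Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 elements live in this namespace.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

XLIFF_SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="manual.txt" source-language="en-US" target-language="de-DE"
        datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Test the installation.</source>
        <target>Testen Sie die Installation.</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def read_xliff(text):
    """Return (language pair, unit id, source text, target text) tuples."""
    rows = []
    for file_el in ET.fromstring(text).findall("x:file", NS):
        # One source and one target language per <file>.
        pair = (file_el.get("source-language"), file_el.get("target-language"))
        for unit in file_el.iterfind(".//x:trans-unit", NS):
            rows.append((pair, unit.get("id"),
                         unit.findtext("x:source", namespaces=NS),
                         unit.findtext("x:target", namespaces=NS)))
    return rows
```

A TMX translation unit, by comparison, can hold any number of language variants side by side, so a reader for TMX would return a mapping of language codes to segments rather than a fixed source/target pair.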
#### 26.2.2 Which format is better: TMX or XLIFF?
Both TMX and XLIFF are robust and widely supported by most translation software tools, making them powerful choices. The selection often depends on the specific project requirements or the translation software being used. Frequently, translators may be provided with a TM file for a particular job, negating the need for a choice. Ultimately, both formats can effectively serve their purpose, and using translation memory in any format is vastly superior to not using it at all. Many tools also allow users to download their translation memory in either format [77](#page=77).
However, when given the option for a new translation project, some authors favor TMX for two primary reasons [78](#page=78):
* **Time-stamped Translation Units:** Translation units in TMX can be time-stamped, allowing for later productivity analysis of the work performed [78](#page=78).
* **Multiple Target Languages:** TMX files can store multiple target languages within a single file [78](#page=78).
Conversely, if the ability to reconstruct or rebuild the original file using the TM file is a significant requirement, XLIFF proves to be a much more capable format in this regard [78](#page=78).
---
# The concept and process of localization
Localization is the comprehensive process of adapting a product or content to a specific target locale, going beyond mere translation to encompass cultural, linguistic, and technical considerations to ensure it resonates with local users.
### 27.1 Understanding localization
Companies expanding internationally must do more than ensure business success in new markets; they need to appear "local" to stand out and gain traction. Achieving this "local" feel requires strategic planning and a diverse skillset from both internal teams and external suppliers. The core objective of localization is to make a product feel as if it was specifically created for the target locale, irrespective of its actual geographical origin, culture, or language. This process is essential for increasing engagement and improving the chances of sales growth for clients in global markets.
> **Tip:** Localization aims to ensure a product feels native to its target audience, fostering a sense of identification and understanding that can significantly impact user adoption and commercial success.
### 27.2 Key areas of localization
Localization is applied to a wide range of products and content types, including but not limited to:
* Websites
* Video games
* Movies
* Product information
* Mobile applications
* Software
* Whitepapers
* Tech support pages
* Help files
* Newsletters
### 27.3 Linguistic considerations in localization
Language adaptation is often the most time-consuming aspect of localization. This involves several critical elements:
#### 27.3.1 Text and media translation
* **Subtitles and Dubbing:** For video, audio, and film, translation of spoken words and music lyrics is typically handled through subtitles or dubbing.
* **Printed and Digital Materials:** All text within printed materials and digital media, including documentation and error messages, requires translation.
* **Logos and Images:** Graphics or logos that contain text may need to be altered or replaced with more generic icons if the text is not suitable for the target locale.
#### 27.3.2 Design and layout adaptations
* **Content Formatting:** Website design or written content might need adjustments to accommodate differences in character sizes and varying translation lengths.
* **Writing Systems and Scripts:** Different languages use distinct scripts (symbols, logograms, syllabograms, letters). The direction of writing can vary significantly, from left-to-right (common in European languages) to right-to-left (Arabic, Hebrew) or boustrophedon. Some Asian languages also support vertical writing.
* **Complex Text Layout:** Some languages have complex text layouts where character shapes change based on context.
* **Capitalization:** Capitalization exists in some writing systems and not in others.
* **Text Sorting:** Different writing systems and languages have their own rules for sorting text.
#### 27.3.3 Grammatical and stylistic nuances
* **Numeral Systems:** Translators must be aware that some languages use different numeral systems.
* **Grammar Rules:** Attention to detail is crucial, as rules for pluralization and other grammatical aspects can differ considerably between languages.
* **Punctuation:** The usage of punctuation can vary; for example, French publications sometimes use guillemets, which are similar to double quotation marks in English.
* **Variety, Register, and Dialect:** For audio materials, localization must consider variations in dialect, register, and the specific variety of the language used.
#### 27.3.4 Date, time, and number conventions
* **Number Formats:** Consideration must be given to writing conventions for number formats, including digit grouping and the decimal separator.
* **Time and Date Formats:** Localization requires adapting time and date formats, potentially including the use of different calendars.
* **Standard Data:** Local data standards for the target audience must be considered.
* **Time Zones:** Translators must carefully account for variations in time zones.
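The number and date conventions above can be made concrete with a small sketch. The `CONVENTIONS` table below is hand-written for two locales and is purely illustrative; the helper names are hypothetical, and production software would normally draw on maintained locale data (such as the Unicode CLDR) rather than a hard-coded table.

```python
from datetime import date

# Hand-written conventions for illustration only.
CONVENTIONS = {
    "en-US": {"group": ",", "decimal": ".", "date": "{m:02d}/{d:02d}/{y}"},
    "de-DE": {"group": ".", "decimal": ",", "date": "{d:02d}.{m:02d}.{y}"},
}


def format_number(value, locale_code, places=2):
    """Format a number with the locale's grouping and decimal separators."""
    conv = CONVENTIONS[locale_code]
    text = f"{value:,.{places}f}"  # Python's default: 1,234,567.89
    # Swap separators through a placeholder so they do not overwrite each other.
    return (text.replace(",", "\0")
                .replace(".", conv["decimal"])
                .replace("\0", conv["group"]))


def format_date(day, locale_code):
    """Format a date in the locale's customary numeric order."""
    return CONVENTIONS[locale_code]["date"].format(
        d=day.day, m=day.month, y=day.year)
```

With this table, the same value 1234567.891 is rendered as `1,234,567.89` for `en-US` and `1.234.567,89` for `de-DE`, and 5 March 2024 becomes `03/05/2024` and `05.03.2024` respectively. The date example shows why unlocalized numeric dates are easy to misread.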
### 27.4 Technical and economic considerations in localization
Beyond linguistic aspects, localization involves adapting to technical, economic, and legal standards of the target locale:
#### 27.4.1 Economic and technical standards
* **Economic Conventions:** These can vary widely, encompassing paper sizes, preferred storage media, broadcast television systems, phone number formats, delivery services, and postal address formats.
* **Currency:** Localization requires adapting currency symbols, their positioning, and the use of currency markers.
* **Measurement Systems:** Differences in measurement systems must be addressed.
* **Electrical Standards:** Localization may involve adapting to different standards for battery sizes, electric current, and voltage.
* **Third-Party Services:** Variations in payment service providers, weather report formats, and the presentation of online maps from third-party providers need to be considered.
#### 27.4.2 Legal and regulatory compliance
* **Varied Legal Requirements:** Different countries have unique legal frameworks, necessitating customization of the product or even complete changes to ensure compliance.
* **Specific Compliance Areas:** This can include:
  * Compliance with privacy laws
  * Additional disclaimers on packaging or websites
  * Different consumer labeling regulations
  * Regulations concerning encryption and export restrictions
  * Conformity with subpoena procedures or internet censorship policies
  * Accessibility requirements
  * Tax collections (e.g., customs duties, value-added tax, sales tax)
> **Example:** Localizing a software product for Germany might require translating all user interfaces and documentation, adapting date and number formats to German conventions (e.g., using commas for decimal separators), ensuring compliance with strict data privacy laws like GDPR, and potentially modifying payment gateways to accommodate local preferences.
---
# Challenges in the translation industry
The translation industry faces significant challenges driven by globalization, technological advancements, and evolving market demands, necessitating adaptation and the adoption of new tools and workflows [21](#page=21).
### 28.1 Globalization and Market Demands
Multinational companies increasingly develop products for global markets, aiming for simultaneous release across all local markets (simship). This trend is coupled with faster time-to-market schedules, requiring products to be designed for global adaptability without country-specific re-designs (internationalization or I18N). Consequently, products and documentation must be localized (L10N) to suit the language and cultural nuances of target markets [21](#page=21).
### 28.2 Technological Advancements and Software Localization
The development and specialization of computer software, particularly general office applications and translation/localization software, present ongoing challenges. These translation tools encompass Translation Memory (TM), Alignment, Terminology Management, Terminology Extraction, Software Localization, Project Management, and Machine Translation. The integration of plug-ins and interfaces, a growing number of features and variants, and the frequency of software updates create compatibility issues and require continuous upgrading of both software and user skills [22](#page=22).
### 28.3 Evolution of Electronic File Formats
The proliferation of diverse electronic file formats across various domains such as Office, Desktop Publishing (DTP), Markup, and Software presents a significant hurdle. The continuous development of new formats and modifications to existing ones in new software versions require translators to constantly update their technical know-how. File preparation and post-processing have emerged as new areas of activity for translators, demanding adaptation of old workflows and translation strategies due to new tools [23](#page=23).
### 28.4 Computer-Assisted Translation (CAT) Tools
Computer-Assisted Translation (CAT) is defined as a series of computer applications designed to efficiently assist translators in their tasks. The primary goal of CAT systems is to provide translators with automatic and rapid access to all necessary resources for their work [32](#page=32).
#### 28.4.1 Introduction and Objectives of CAT
The objectives of studying CAT include understanding its definition and components, familiarizing oneself with the translation process using CAT systems, and gaining an initial understanding of the main CAT systems available [30](#page=30).
#### 28.4.2 Definition and Basic Components of CAT
CAT aims to increase translator productivity by facilitating tasks that would otherwise be redundant. Agencies and institutions demand proficiency in CAT tools. The European Commission highlights mastery of CAT and terminology tools, alongside common office software, as a crucial translation capability. It is important to distinguish between Machine Translation (MT), where a machine performs the translation (e.g., Google Translate), and CAT, where a human translates with the aid of various tools. MT can be integrated into CAT systems, as many systems offer access to MT engines [35](#page=35) [36](#page=36).
#### 28.4.3 Essential CAT Tools
Essential CAT tools, as identified by Berinstein and Mermaud, include:
1. **Translation Project Management Software:** This software controls information flow, assignment of translations, quality control, content analysis, report generation (including full and fuzzy matches, intra- and cross-file repetitions), word counts, and final delivery to clients [37](#page=37).
2. **Translation Memory (TM) Software:** TM software stores previous translations to ensure terminological and phraseological consistency. It enables the retrieval of translation units, enhancing productivity. Typical TM file extensions include `.tmx` (Translation Memory eXchange, an open standard format compatible across tools like Trados, MemoQ, Wordfast), `.sdltm` (a proprietary format for SDL Trados Studio), and `.txt` / `.csv` for plain text or Excel manipulation [38](#page=38).
> **Tip:** Using `.tmx` files ensures compatibility when working with different CAT tools.
3. **Terminology Management Software:** This tool is used for creating glossaries from ongoing translations. Examples include RWS Trados - MultiTerm, Wordfast, and MemSource. Standard formats for terminology exchange are `.tbx` (TermBase eXchange) and proprietary formats like `.sdltb` (MultiTerm's terminology databases) [41](#page=41).
4. **Alignment Software:** This software creates translation memories by matching original texts with their corresponding translations. It identifies concordances, and each segment requires manual confirmation. Alignment is particularly useful for creating a TM from existing translated documents (e.g., Word files) that were not previously managed in a CAT tool, allowing for their reuse in future projects [43](#page=43) [46](#page=46).
5. **Localization Tools:** These tools are specifically designed for translating software, video games, or websites [47](#page=47).
#### 28.4.4 The Translation Process with a CAT System
A recommended routine for translating with a CAT system involves several steps to achieve the final translation [49](#page=49):
1. **File Format Check:** Initial verification of the file format to ensure compatibility with the CAT tool [50](#page=50).
2. **Resource Assignment:** Allocation of relevant translation memories, term bases, and other linguistic resources to the project [50](#page=50).
3. **Segmentation:** The CAT tool breaks down the source text into smaller units called segments [50](#page=50) [51](#page=51).
4. **Translation:** The translator works on translating each segment, leveraging TM matches and term bases [50](#page=50).
5. **TM Update:** Upon completion or segment approval, the translated segments are used to update and expand the Translation Memory [52](#page=52).
6. **Review:** A post-translation review phase to check for errors and ensure quality [52](#page=52).
7. **Final Revision:** A final quality assurance step before delivery [52](#page=52).
#### 28.4.5 CAT Systems
Prominent CAT systems include:
* **RWS TRADOS:** Widely considered the most used CAT system, Trados provides a comprehensive environment for managing translation projects from inception to final revision, including project creation, TM and term base development. It is frequently requested by clients [54](#page=54).
* **WORDFAST:** Available in both online and desktop versions, Wordfast was historically a free CAT system. It integrates an MT engine for translation suggestions and includes alignment and glossary functions [55](#page=55).
* **MemoQ:** Developed in 2004 to compete with Trados, MemoQ has gained significant popularity and is offered by Kilgray Translation Technologies with various products tailored to translator needs [56](#page=56).
* **Déjà Vu X3:** Created by Atril, this program offers similar functionality to other CAT tools, allowing project managers to evaluate, prepare, and control projects from start to finish across different language pairs [57](#page=57).
---
# Considerations for localization across different writing systems and languages
This section delves into the intricacies of localization, focusing on the challenges and considerations arising from diverse writing systems and languages within computer-aided translation tools and workflows.
### 29.1 History and evolution of translation memory systems
The development of translation memory (TM) systems has a rich history, evolving from early experimental approaches to sophisticated commercial software.
#### 29.1.1 Early steps in translation memory
* **1960-1965: Federal Armed Forces Translation Agency, Mannheim, Germany:** This period saw the development of a text-related glossary approach where translators underlined English words for which they needed German equivalents. These words were then processed by a computer after morphological reduction, and words found in the database were printed in text-related glossaries [9](#page=9).
* **1960-1965: European Coal and Steel Community, Luxembourg:** This initiative focused on automatic dictionary look-up with context. Translators would indicate words needing assistance, the entire sentence was keypunched, and the computer searched for matching sentences in its database to provide the requested items with context. Newly generated translations were added back to the database [11](#page=11).
* **1980-1990: Interactive Translation System (ITS), Alan Melby, Brigham Young University, USA:** This system introduced a three-level approach to Computer-Aided Translation (CAT). Level 1 involved editor and terminology management. Level 2 provided electronic source text, text analysis (dynamic concordance), automatic terminology look-up, and synchronized bilingual text files. Level 3 aimed for integration with machine translation systems [13](#page=13).
#### 29.1.2 First commercial systems
* **1984: TRADOS (TRAnslation & DOcumentation Software):** Established in Germany, TRADOS transitioned from a translation provider to a software developer. Key products included TED (Translation Editor including the first Translation Memory) in 1988, MultiTerm (terminology management) in 1990, and Translator's Workbench in 1992. They later migrated to the Windows platform in 1993 and were acquired by SDL International in 2005 [14](#page=14).
* **1984: STAR (Software Translation Artwork Recording):** Based in Switzerland, STAR offered translation and documentation services alongside software development. Their first version of Transit (DOS) with TermStar (terminology management) was released in 1991, followed by a Windows version in 1994 [15](#page=15).
* **1993: ATRIL:** This Spanish company launched its first translation memory tool for Windows 3.1 in 1993, featuring an interface for MS Word. A system redesign in 1996 resulted in a 32-bit Windows software integrated translation environment [16](#page=16).
* **1992: IBM Germany:** IBM released Translation Manager/2 (TM/2) in 1992, later developing a Windows version. This system was notable for including linguistic resources for 19 languages, such as lemmatizers and morphological data [17](#page=17).
#### 29.1.3 Current market situation
Prominent CAT tools in the current market include Across, DéjàVu, MemoQ, MultiTrans, SDL Trados, SDLX, and Wordfast [18](#page=18).
### 29.2 Translation workflow with CAT tools
The translation process using a CAT tool involves several key stages, from initial setup to final output.
#### 29.2.1 Initial setup and segmentation
* The first translation using a TM system begins with an empty memory [19](#page=19).
* When the source text is opened or imported, it is segmented into "translation units" based on predefined rules, typically using punctuation as a primary delimiter, with user-defined exceptions for elements like abbreviations [19](#page=19).
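The segmentation rule described above (break at punctuation, except after listed abbreviations) can be sketched as follows. This is a simplified illustration under stated assumptions: the `ABBREVIATIONS` set and the `segment` function are hypothetical, and real CAT tools use richer, configurable rule sets.

```python
import re

# Hypothetical exception list: abbreviations whose final period must not end a segment.
ABBREVIATIONS = {"e.g.", "i.e.", "Dr.", "etc."}


def segment(text, abbreviations=ABBREVIATIONS):
    """Split text into segments at sentence-final punctuation followed by whitespace."""
    segments = []
    start = 0
    for match in re.finditer(r"[.!?]\s+", text):
        candidate = text[start:match.start() + 1]
        last_word = candidate.split()[-1]
        if last_word in abbreviations:
            continue  # exception rule applies: do not break here
        segments.append(candidate.strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments
```

For the input `"Open the file, e.g. a DOCX. Then translate it! Done?"` the period after `e.g.` is skipped, so the first segment runs to `DOCX.` rather than stopping at the abbreviation.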
#### 29.2.2 Translation and memory interaction
* The currently active segment is automatically searched in the Translation Memory [20](#page=20).
* If an identical or similar segment is found, the associated translation is displayed and can be selected, modified, and inserted into the target text [20](#page=20).
* If no match is found, the translator enters a new translation, which is then stored in the TM along with the source segment, becoming available for future identical or similar segments [20](#page=20).
* The TM is populated incrementally during the translation process [20](#page=20).
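The lookup-then-store cycle above can be sketched with a small in-memory class. The `TranslationMemory` class, its method names, and the 0.75 threshold are assumptions for illustration, and `difflib.SequenceMatcher` merely stands in for whatever similarity measure a given CAT tool actually uses.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy in-memory TM: starts empty and grows as segments are confirmed."""

    def __init__(self):
        self.units = {}  # source segment -> target segment

    def lookup(self, segment, threshold=0.75):
        """Return (score, source, target) for the best match, or None if none qualifies."""
        best = None
        for source, target in self.units.items():
            score = SequenceMatcher(None, segment, source).ratio()
            if score >= threshold and (best is None or score > best[0]):
                best = (score, source, target)
        return best

    def store(self, source, target):
        """Save a confirmed translation so later segments can reuse it."""
        self.units[source] = target
```

A score of 1.0 corresponds to an identical segment, while lower scores above the threshold correspond to similar segments that the translator would still need to edit before inserting.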
#### 29.2.3 Post-translation processes
After the initial translation, the workflow continues with updating resources, revision, TM generation, and a final review [52](#page=52).
### 29.3 Challenges in the translation industry related to localization
The globalization of markets and the speed of product development present significant challenges for the translation industry, necessitating efficient localization practices.
#### 29.3.1 Globalization and time-to-market
* Multinational companies target global markets with products intended for simultaneous release (simship) [21](#page=21).
* The time-to-market pressure demands faster product introduction schedules [21](#page=21).
* Products must be designed for internationalization (I18N), meaning they require minimal re-design for each local market [21](#page=21).
* Localization (L10N) involves adapting products and documentation to the specific language and culture of target markets [21](#page=21).
#### 29.3.2 Software development and CAT tools
* The development and specialization of computer software, particularly general office applications and specialized translation/localization software, have increased [22](#page=22).
* CAT tools encompass a range of functionalities, including TM, alignment, terminology management, terminology extraction, software localization, project management, and machine translation integration [22](#page=22).
* The proliferation of plug-ins, interfaces, and features, coupled with frequent updates and variations, leads to compatibility problems and a continuous need for upgrading both software and user skills [22](#page=22).
#### 29.3.3 Electronic file formats
* The diversity of electronic file formats (Office, DTP, Markup, Software, etc.) and their continuous evolution pose a significant challenge [23](#page=23).
* New software versions often modify existing formats, requiring translators to engage in file preparation and post-processing [23](#page=23).
* This necessitates continuous updating of technical know-how and adaptation of translation strategies and workflows [23](#page=23).
### 29.4 Definition and components of Computer-Aided Translation (CAT)
Computer-Aided Translation (CAT) is defined as a suite of specialized computer applications designed to efficiently assist translators in their work. The primary goal of a CAT system is to provide translators with rapid access to all necessary resources [32](#page=32).
> **Tip:** It is important to distinguish between Machine Translation (MT) and CAT. MT involves translation performed by a machine, such as Google Translate, while CAT involves a human translator using various tools. MT can be integrated into CAT systems [36](#page=36).
#### 29.4.1 Essential CAT tools
Essential CAT tools, as identified by Berinstein and Mermaud, include:
1. **Project Management Software:** Manages information flow, assigns tasks, controls quality, analyzes content, generates reports (e.g., for full/fuzzy matches, repetitions), counts words, and handles final delivery [37](#page=37).
2. **Translation Memory Software:** Stores translations to ensure terminological and phraseological consistency and facilitates the retrieval of translation units for increased productivity [38](#page=38).
* **Typical extensions:** TMX (Translation Memory eXchange) is the standard open format compatible across most tools. SDLTM is a proprietary format for SDL Trados Studio. TXT and CSV are sometimes used for plain text or Excel manipulation [38](#page=38).
3. **Terminology Management Software:** Facilitates the creation of glossaries from ongoing translations, with examples like RWS Trados Multiterm and Wordfast.
* **Formats:** TBX (TermBase eXchange) is the standard open format for glossary exchange. SDLTB is a proprietary format for SDL Trados Studio [41](#page=41).
4. **Alignment Software:** Creates translation memories from original texts and their translations by identifying correspondences between segments, requiring manual confirmation of each segment. This is useful for converting existing documents (e.g., Word files) into a TM format [43](#page=43) [46](#page=46).
5. **Localization Tools:** Specifically designed for translating software, video games, and websites [47](#page=47).
### 29.5 The translation process with a CAT system
A recommended routine for using a CAT system involves a structured approach to achieve the final translation [49](#page=49).
#### 29.5.1 Workflow stages
The typical workflow includes:
* File format check [50](#page=50).
* Resource assignment [50](#page=50).
* Segmentation [50](#page=50).
* Translation [50](#page=50).
* Resource updating [52](#page=52).
* Revision [52](#page=52).
* TM generation [52](#page=52).
* Final review [52](#page=52).
### 29.6 Major CAT systems
Several CAT tools are prevalent in the industry:
* **RWS TRADOS:** Considered the most widely used system, it provides a comprehensive environment for translation projects from creation to finalization, including TM and terminology databases. Clients frequently request its use [54](#page=54).
* **WORDFAST:** Initially a free CAT system, it offers online and desktop versions. It includes an integrated machine translation engine and features for alignment and glossary management [55](#page=55).
* **MemoQ:** Developed in 2004 to compete with Trados, MemoQ has gained significant popularity. Kilgray Translation Technologies offers various products catering to different translator needs [56](#page=56).
* **Déjà Vu X3:** Created by Atril, Déjà Vu X3 is a CAT program for project evaluation, preparation, and control across available language combinations [57](#page=57).
### 29.7 Structure and information within Translation Memory files
Translation Memory (TM) files are essentially structured text files, often in XML format, containing translation and linguistic data. They are not "black boxes" and can be opened with standard text editors [66](#page=66).
#### 29.7.1 Stored information
TM files store:
* **Main information:** Segments (source and target), language, creation dates, and times [67](#page=67).
* **Additional data:** Author, usage count, change dates and times, creation tool, domain (field), alternate translations, and notes [67](#page=67).
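The two groups of fields listed above map naturally onto a record type. The sketch below is one possible in-memory shape, not the layout of any particular TM format; the class name `TranslationUnit` and all field names are chosen for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class TranslationUnit:
    # Main information
    source: str
    target: str
    source_lang: str
    target_lang: str
    created: datetime
    # Additional data
    author: str = ""
    usage_count: int = 0
    changed: Optional[datetime] = None
    creation_tool: str = ""
    domain: str = ""
    alternates: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)

    def record_use(self):
        """Increment the usage count each time the unit is reused."""
        self.usage_count += 1


unit = TranslationUnit(
    source="Save the file.",
    target="Datei speichern.",
    source_lang="en",
    target_lang="de",
    created=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    domain="software",
)
unit.record_use()
```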
#### 29.7.2 Typical TM file formats
The most common industry formats are XLIFF and TMX, both XML-based. Spreadsheet formats like Excel (.XLS) or comma-separated values (.CSV) are also used, though they store less data per translation unit [68](#page=68).
XML's advantages for TM files include:
* **Easy parsing:** Due to its well-defined structure [68](#page=68).
* **Semantic tags:** Element names indicate the meaning of the data they enclose, for example `<seg>` for segment text in TMX [68](#page=68).
* **Interoperability:** A well-defined structure facilitates data exchange between different applications and systems [68](#page=68).
* **Tool support:** Numerous software tools are built around XML for validation, import, parsing, and searching [68](#page=68).
#### 29.7.3 File structure (Header and Body)
TMX and XLIFF files typically consist of:
* **Header:** Contains metadata about the file and the localization process. Examples for TMX and XLIFF headers are provided [70-71](#page=70-71) [69](#page=69).
* **Body:** Contains the core data, including translation units and segments. Examples for TMX and XLIFF bodies are illustrated [73-74](#page=73-74) [72](#page=72).
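As a rough illustration of the header/body split, the sketch below parses a tiny hand-written TMX document with Python's standard `xml.etree.ElementTree`. The sample content, the tool name `ExampleTool`, and the helper `read_tmx` are invented for this example; only the element and attribute names follow TMX 1.4.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; the values are placeholders, not output from a real tool.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="ExampleTool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="example" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu creationdate="20240101T120000Z">
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="de"><seg>Datei speichern.</seg></tuv>
    </tu>
  </body>
</tmx>
"""

# ElementTree exposes xml:lang under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx(text):
    """Return (header attributes, list of {language: segment} dicts)."""
    root = ET.fromstring(text)
    header = dict(root.find("header").attrib)
    units = []
    for tu in root.find("body").iter("tu"):
        units.append({tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.iter("tuv")})
    return header, units


header, units = read_tmx(SAMPLE_TMX)
```

Here `header` carries the file-level metadata and `units` carries the translation units, mirroring the two parts described above.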
### 29.8 Importance and differences between TMX and XLIFF
TM files are crucial for improving efficiency and ensuring consistency in translation. They allow translators to leverage prior work, benefit from exact or fuzzy matches, and maintain consistency through client- or project-based memories [75](#page=75).
#### 29.8.1 Key differences between TMX and XLIFF
Both TMX and XLIFF are XML-based industry standards with similarities, but they differ in purpose and structure [76](#page=76):
* **Purpose:** XLIFF was designed to store extracted text and facilitate data transfer throughout the localization process, while TMX focuses on exchanging TM data between tools [76](#page=76).
* **Language Support:** TMX can accommodate any number of languages in a single document, whereas XLIFF is designed for one source and one target language [76](#page=76).
* **Inline Codes:** TMX uses encapsulation for inline codes, while XLIFF supports both encapsulation and a placeholder method where native codes are moved to a separate Skeleton file [76](#page=76).
* **Order and Reconstruction:** A collection of `<tu>` elements in TMX has no specific order and offers no mechanism for rebuilding the original file. XLIFF is better suited to reconstructing the original file [76](#page=76) [78](#page=78).
* **Additional Data:** XLIFF includes data types and fields not present in TMX, such as pretranslation, history, versioning, and binary objects [76](#page=76).
* **Time/Date Data:** TMX files can store time and date data at the translation unit level, which XLIFF files cannot [76](#page=76).
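To make the language-support difference concrete, the sketch below reads a small hand-written XLIFF document. It assumes XLIFF 1.2 (the source does not name a version), and the sample content and the helper `read_xliff` are invented for illustration. Note that the language pair is declared once on the `<file>` element and that each `<trans-unit>` carries an `id`.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2; other versions use a different URI.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hand-written sample, not output from a real tool.
SAMPLE_XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="manual.txt" source-language="en" target-language="de" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Save the file.</source>
        <target>Datei speichern.</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Close the window.</source>
        <target>Fenster schließen.</target>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def read_xliff(text):
    """Return ((source_lang, target_lang), [(id, source, target), ...]) in file order."""
    root = ET.fromstring(text)
    file_el = root.find("x:file", NS)
    pair = (file_el.get("source-language"), file_el.get("target-language"))
    units = [
        (
            tu.get("id"),
            tu.findtext("x:source", namespaces=NS),
            tu.findtext("x:target", namespaces=NS),
        )
        for tu in file_el.iterfind("x:body/x:trans-unit", NS)
    ]
    return pair, units


pair, xliff_units = read_xliff(SAMPLE_XLIFF)
```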
#### 29.8.2 Choosing between TMX and XLIFF
Both formats are powerful and widely supported. The choice often depends on the specific project, tool, or provided TM files [77](#page=77).
* **Preference for TMX:** Some authors prefer TMX for its ability to store time-stamped translation units (useful for productivity analysis) and its support for multiple target languages within a single file [78](#page=78).
* **Preference for XLIFF:** If the ability to reconstruct or rebuild the original file is a priority, XLIFF is the preferred format [78](#page=78).
Regardless of the format, using translation memory is significantly more beneficial than not using it [77](#page=77).
### 29.9 Metadata in CAT Tools
Metadata, defined as "data about data," provides additional information about digital content and processes [85](#page=85).
#### 29.9.1 Types of metadata
Metadata can be categorized into:
* **Descriptive metadata:** Describes content [85](#page=85).
* **Structural metadata:** Describes the organization of objects or components [85](#page=85).
* **Administrative metadata:** Describes technical information, such as file type [85](#page=85).
#### 29.9.2 Metadata and TM
In TM software, metadata associated with segments can trace them back to the translator, date, and time of creation. This allows for the selection of more recent or relevant material and the deletion of outdated terminology. Effective management of TM resources by language service providers relies on this metadata [84](#page=84).
> **Caution:** The potential loss of important metadata during format transfers can restrict users to specific software tools due to interoperability issues [84](#page=84).
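The use of segment metadata for TM maintenance can be sketched as two small filters. The dictionary keys (`source`, `target`, `translator`, `created`) and the function names are assumptions for this example, not fields of any specific tool.

```python
from datetime import datetime

# Hypothetical TM entries carrying translator and creation-time metadata.
ENTRIES = [
    {"source": "Log in", "target": "Einloggen", "translator": "A", "created": datetime(2015, 3, 1)},
    {"source": "Log in", "target": "Anmelden", "translator": "B", "created": datetime(2023, 6, 15)},
    {"source": "Log out", "target": "Abmelden", "translator": "B", "created": datetime(2023, 6, 15)},
]


def latest_per_source(entries):
    """Keep only the most recently created entry for each source segment."""
    latest = {}
    for entry in entries:
        current = latest.get(entry["source"])
        if current is None or entry["created"] > current["created"]:
            latest[entry["source"]] = entry
    return latest


def prune_older_than(entries, cutoff):
    """Drop entries created before the cutoff date."""
    return [entry for entry in entries if entry["created"] >= cutoff]
```

Both filters depend entirely on the `created` field, which illustrates why losing metadata in a format transfer makes this kind of maintenance impossible.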
---
Localization extends beyond simple translation to adapt content for specific target audiences, considering cultural nuances, local laws, and regional variations.
### 29.1 The scope of localization
Localization is a comprehensive process that aims to give a product the feel and look of being specifically created for the target locale, regardless of its location, culture, or language. While translation is a core component, localization encompasses much more to ensure a product or content resonates with local audiences and meets their expectations.
#### 29.1.1 Applications of localization
Localization is widely applied to various types of content and products, including:
* Websites.
* Mobile apps.
* Software.
* Video games.
* Multimedia content.
* Voiceovers.
* Product information.
* Whitepapers.
* Tech support pages.
* Help files.
* Newsletters.
* User manuals.
* Medical documents.
* Technical publications.
* Scientific journals.
* Literature.
#### 29.1.2 Localization vs. Translation
While translation focuses on converting content from a source language to a target language while respecting grammar and syntax, localization goes further by adapting the message to local audiences. Translation is considered a step within the broader localization process. Companies need to localize to gain the trust of the local public and create a customized message for each local audience, which is crucial for success in foreign markets.
### 29.2 Key considerations in localization
Localization requires careful attention to a multitude of factors beyond just linguistic accuracy to effectively meet cultural expectations and local regulations.
#### 29.2.1 Cultural adaptation
Localization involves adapting content to respect cultural aspects, local laws, and regional variations, even within the same language. This includes understanding local beliefs, traditions, and the connotations of various elements like animals, food, gestures, and colors. Companies must maintain a unique brand voice globally while adapting campaigns to local markets.
> **Tip:** Successful localization often involves working with local marketers and consultants to ensure cultural sensitivity and compliance with local laws.
##### 29.2.1.1 Examples of cultural adaptation
* **KitKat's Slogan in Japan:** The slogan "Have a break, have a KitKat" was changed to "Kitto Katsu," meaning "surely win," and a variety of exotic chocolate bars were introduced to cater to local tastes, leading to a successful localization campaign.
* **Coca-Cola in China:** The brand name "Coca-Cola" was adapted to "kekou kele," which translates to "delicious happiness," to resonate with the local market. This involved collaborating with local experts and specialists to develop a new name and a localized marketing strategy.
* **Visuals:** Photos need to be adapted to local cultures, as elements like "blond moms hugging their kids" may not impress a Chinese audience or could offend customers in the Middle East.
* **Colors:** Colors have different meanings across cultures; for instance, red can signify danger, white can mean death, and orange can express mourning and loss in some countries.
* **Political Issues:** Localization requires sensitivity to political matters such as disputed borders and geographical naming disputes.
* **Aesthetics and Social Factors:** Translators should consider local customs, superstitions, religions, social taboos, aesthetics, the appropriateness of colors and images, local architecture, socioeconomic status, clothing, and ethnicity.
#### 29.2.2 Linguistic and writing system considerations
Adapting to different writing systems and languages involves more than just translating words; it requires understanding unique conventions.
* **Writing Systems and Scripts:** Different writing systems use distinct scripts, which can be symbols, logograms, syllabograms, or letters.
* **Writing Direction:** Languages can have varied writing directions, including left-to-right (European languages), right-to-left (Arabic, Hebrew), or boustrophedon (alternating directions). Some Asian languages can be written vertically.
* **Complex Text Layout:** Some languages require complex text layouts where characters change shape based on context.
* **Capitalization:** The need for capitalization varies; some languages require it, while others do not.
* **Sorting Rules:** Different writing systems and languages have distinct rules for text sorting.
* **Numeral Systems:** Some languages employ different numeral systems.
* **Grammar and Pluralization:** Grammar rules, including pluralization, vary significantly across languages, requiring close attention to detail.
* **Punctuation:** The usage of punctuation can differ; for example, French uses guillemets (« … ») in some publications where English uses double quotation marks.
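Writing direction, one of the items above, can be estimated from the text itself. The sketch below uses the Unicode bidirectional class of each character via the standard `unicodedata` module; the function name `base_direction` and the choice to default to left-to-right are assumptions for this example, and the approach is a simplification of the full Unicode bidirectional algorithm.

```python
import unicodedata


def base_direction(text):
    """Return 'rtl' or 'ltr' based on the first strongly directional character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):   # right-to-left letters (e.g. Hebrew, Arabic)
            return "rtl"
        if bidi == "L":           # left-to-right letters (e.g. Latin)
            return "ltr"
    # Assumption: fall back to left-to-right when no strong character is found.
    return "ltr"
```

A layout engine or template could use such a check to decide whether a text block needs right-to-left alignment; digits and spaces are skipped because they carry no strong direction of their own.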
#### 29.2.3 Technical and formatting considerations
Beyond language, various technical and formatting aspects need localization for user-friendliness and accuracy.
* **Layout and Text Length:** Different languages require varying amounts of space to express the same concepts. A flexible layout is necessary to accommodate text expansion, which can range from 30% to 100% when translating from English into other languages.
* **Units of Measurement:** Most countries use the metric system, requiring conversion of measurements to ensure content is easy to follow.
* **Currency Units:** Currency amounts must be localized, including conversion and indicating equivalent amounts in different currencies.
* **Date and Time Formats:** Differences in date formats (e.g., MM/DD/YY vs. DD/MM/YY) are crucial. Time zones also need careful consideration.
* **Paper Size:** Document design might be based on specific paper sizes (e.g., A4 vs. US Letter), impacting formatting and page breaks.
* **Number Formats:** Writing conventions for number formats, including digit grouping and decimal separators, need to be considered.
* **Logos and Images:** Logos and images containing text may require alteration or replacement with more generic icons.
* **Audio and Video:** For video, audio, and film localization, the translation of lyrics or spoken words is done through subtitles or dubbing, requiring precise timing and synchronization.
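Two of the points above, date formats and text expansion, lend themselves to a short sketch. The `DATE_PATTERNS` table, the locale tags, and the function names are assumptions for illustration; production code would normally rely on locale data instead of a hand-written table.

```python
from datetime import date

# Illustrative patterns only.
DATE_PATTERNS = {
    "en-US": "%m/%d/%y",   # MM/DD/YY
    "en-GB": "%d/%m/%y",   # DD/MM/YY
}


def format_date(value, locale_tag):
    """Render a date using the pattern registered for the locale tag."""
    return value.strftime(DATE_PATTERNS[locale_tag])


def expansion_ratio(source, target):
    """Relative growth of the target text, e.g. 0.30 for a 30% longer string."""
    return (len(target) - len(source)) / len(source)


def fits_layout(source, target, max_expansion=1.0):
    """Check a translated string against an expansion budget (default 100%)."""
    return expansion_ratio(source, target) <= max_expansion
```

The same calendar date, 4 March 2024, comes out as `03/04/24` under the first pattern and `04/03/24` under the second, which is why an unlocalized date can mislead readers. Short interface strings can also exceed even a 100% budget: the four-letter "Save" becomes the nine-letter "Speichern" in German.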
#### 29.2.4 Legal and regulatory considerations
Compliance with local regulations is a critical aspect of localization to avoid legal issues and penalties.
* **Contracts and Agreements:** Businesses operating in foreign countries must adhere to local regulations regarding contracts and agreements.
* **Privacy Laws:** Compliance with privacy laws is essential.
* **Disclaimers:** Different requirements may exist for disclaimers on packaging or websites.
* **Consumer Labeling:** Regulations on consumer labeling can vary.
* **Encryption and Export Restrictions:** Compliance with regulations on encryption and export restrictions is necessary.
* **Subpoena Procedures and Censorship:** Localization may involve changes to conform with subpoena procedures or internet censorship requirements.
* **Accessibility:** Accessibility requirements must be met.
* **Tax Collections:** Tax considerations, including customs duties, value-added tax, and sales tax, need to be addressed.
* **Identification Numbers:** Consideration should be given to numbers assigned by governments, such as national identification numbers, Social Security Numbers, and passport numbers.
#### 29.2.5 Economic and service provider variations
Economic conventions and service providers can also differ significantly by country.
* **Economic Conventions:** These include variations in paper sizes, preferred storage media, broadcast TV systems, phone number formats, delivery services, postal codes, and postal address formats.
* **Payment Services:** Providers of payment services may vary.
* **Weather Reports and Maps:** The presentation of weather reports and online maps from third-party providers can differ.
* **Electric Current and Voltage:** Standards for electric current and voltage need to be considered.
### 29.3 Localization process for specific media
#### 29.3.1 Video game localization
Localizing video games aims to provide players with content they can fully understand. The process typically involves:
1. **Audit of materials:** Reviewing all localization materials, including text files, documentation, instructions, and artwork. Playing the game in the source language helps translators understand the story, dialogues, and menus.
2. **Localization:** The actual translation and adaptation of content, which can take weeks or months depending on the team size and material volume.
3. **Programming:** Editors or developers integrate the translated and localized texts into the game.
4. **Quality Control:** Verifying the localized game for grammatical errors, spelling mistakes, wayward text, inconsistencies, and system issues like graphics and sound problems.
5. **Manufacturer’s approval:** Representatives confirm that the localized content meets the original game's requirements.
#### 29.3.2 Movie localization
Movie localization is a cost-effective method for distributing films to global audiences. The primary methods are:
* **Dubbing:** Voice actors replace original dialogue in different languages. Timing is critical to match character movements and speech.
* **Subtitling:** Spoken lines are translated and displayed at the bottom of the screen. Precision is required due to character and time limitations on screen, and subtitles must be synchronized with actions and dialogue.
Localizers must understand the target audience's cultural perceptions and the connotations of various elements to ensure effective adaptation.
#### 29.3.3 Website localization
Localizing a brand's or product's website is crucial for entering new markets with different cultures, languages, and socioeconomic conditions. The goal is to appear local rather than foreign by adapting the website to the local market's needs .
#### 29.3.4 Mobile app and software localization
Mobile apps and software require localization to gain traction in other markets and attract more users. This ensures users can easily follow instructions, navigate, and use the program effectively.
---
Localization is the process of adapting a product or content to a specific locale or market, which involves more than just linguistic translation. This section explores the multifaceted considerations required for effective localization, particularly focusing on differences across writing systems and languages, and the evolving role of human translators in machine translation workflows.
### 29.1 Internationalization and Localization Principles for Websites
Internationalization (i18n) is the design and development of a website to allow for easy adaptation to different languages and cultural preferences. Key principles include:
* **Unicode Standard:** Essential for ensuring compatibility with diverse writing systems and languages.
* **Separation of Content and Code:** Keeping content distinct from source code facilitates translation without extensive coding alterations.
* **Flexible User Interface (UI):** Designing a UI that can accommodate varying text lengths and different reading directions (e.g., right-to-left scripts).
* **Date, Time, and Number Formats:** Adapting these to locale-specific conventions is crucial for cultural relevance.
* **Images and Icons:** Selecting culturally neutral visuals or providing region-specific alternatives promotes inclusivity.
The localization (L10n) process for websites involves:
* **Translation of Content:** Converting text and multimedia elements while considering linguistic nuances and cultural sensitivities.
* **Adaptation of Graphics and Multimedia:** Ensuring visuals are culturally appropriate for the target audience.
* **Adjustment of Layout and Design:** Modifying the visual arrangement to accommodate language-specific text lengths and font styles.
* **Integration of Local Regulations:** Complying with legal requirements concerning content, privacy, and accessibility.
* **Testing and Quality Assurance:** Rigorous testing for functionality, linguistic accuracy, and cultural appropriateness.
Websites present unique localization challenges compared to other audiovisual products due to dynamic content, SEO considerations, the need for extreme cultural sensitivity as public-facing platforms, and the requirement for continuous updates.
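
Two of the internationalization principles above, separation of content and code and locale-specific formats, can be illustrated with a short sketch. It uses only the Python standard library. The `MESSAGES` catalog, the `FORMATS` table, and the function names are hypothetical; a production site would normally rely on an established i18n library and maintained locale data rather than hand-written tables.

```python
from datetime import date

# Translatable strings live in a catalog, not in the program logic.
MESSAGES = {
    "en-US": {"greeting": "Welcome, {name}!", "total": "Total: {amount}"},
    "de-DE": {"greeting": "Willkommen, {name}!", "total": "Summe: {amount}"},
}

# Assumed locale conventions: date pattern, thousands separator, decimal separator.
FORMATS = {
    "en-US": {"date": "%m/%d/%Y", "thousands": ",", "decimal": "."},
    "de-DE": {"date": "%d.%m.%Y", "thousands": ".", "decimal": ","},
}


def translate(locale: str, key: str, **values: str) -> str:
    """Look up a message by key and fill in its placeholders."""
    return MESSAGES[locale][key].format(**values)


def format_date(locale: str, value: date) -> str:
    """Render a date with the pattern registered for the locale."""
    return value.strftime(FORMATS[locale]["date"])


def format_number(locale: str, value: float) -> str:
    """Format with two decimals using the locale's separators."""
    conv = FORMATS[locale]
    whole, frac = f"{value:,.2f}".split(".")
    return whole.replace(",", conv["thousands"]) + conv["decimal"] + frac


for loc in ("en-US", "de-DE"):
    print(translate(loc, "greeting", name="Ana"))
    print(format_date(loc, date(2024, 4, 5)))
    print(translate(loc, "total", amount=format_number(loc, 1234567.5)))
```

Adding a language in this design means adding entries to the two tables; the functions stay unchanged, which is the point of keeping content separate from code.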
### 29.2 Translation for SEO and Metadata Optimization
Effective localization is critical for global online visibility, with translators playing a vital role in Search Engine Optimization (SEO). Key considerations for translators aiming to optimize content for international markets include:
* **Keyword Research:** Identifying relevant terms and phrases in the target language and region, including variations, synonyms, and colloquial expressions.
* **Cultural Relevance:** Understanding cultural nuances to select keywords that resonate with the target audience and avoid unnatural literal translations.
* **Localized Content:** Ensuring translations are not only linguistically accurate but also culturally appropriate, aligning with local customs and market trends.
* **Metadata Optimization:** Translating and optimizing meta titles, meta descriptions, and URL slugs, crafting compelling descriptions with relevant keywords to encourage click-throughs.
* **Multilingual Link Building:** Collaborating to build high-quality, multilingual backlinks from reputable local websites and influencers.
* **Content Structure and Formatting:** Maintaining a user-friendly structure with elements like headers and bullet points to enhance readability and SEO value.
* **Mobile Optimization:** Ensuring translated content is mobile-friendly and media loads quickly on mobile devices.
* **Regular Updates:** Adapting SEO strategies to search engine algorithm changes and regularly updating translated content.
* **Analytics and Reporting:** Monitoring website analytics and providing data-driven insights to refine SEO strategies.
* **Communication with Clients:** Understanding business goals, target audiences, and SEO objectives to align translation efforts with marketing initiatives.
Metadata (meta titles, meta descriptions, tags) is crucial for search engine algorithms to understand and index webpage content, directly impacting visibility and ranking in foreign markets. Professional translators enhance user click-through rates by ensuring translated metadata is linguistically accurate, compelling, and culturally relevant. They ensure local relevance by understanding cultural nuances, optimize metadata with region-specific keywords, maintain global brand consistency, adhere to character limits, build credibility and trust by safeguarding content integrity, and adapt metadata to current market trends. Collaboration between translators and web developers/marketers is essential for optimizing web pages in foreign markets, bridging linguistic and cultural gaps.
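
The character limits and region-specific keywords mentioned above lend themselves to an automated check once metadata has been translated. The following sketch is illustrative only: the limits of 60 characters for titles and 160 for descriptions are commonly quoted rules of thumb rather than figures from this guide, and the field names, the sample German metadata, and the `check_metadata` helper are hypothetical.

```python
# Assumed limits (rules of thumb, not fixed by any search engine).
LIMITS = {"title": 60, "description": 160}


def check_metadata(meta: dict[str, str], keywords: list[str]) -> list[str]:
    """Report fields that are missing, too long, or contain none of the target keywords."""
    issues = []
    for field, limit in LIMITS.items():
        text = meta.get(field, "")
        if not text:
            issues.append(f"{field}: missing")
            continue
        if len(text) > limit:
            issues.append(f"{field}: {len(text)} characters (limit {limit})")
        if not any(k.lower() in text.lower() for k in keywords):
            issues.append(f"{field}: no target keyword found")
    return issues


german_meta = {
    "title": "Wanderschuhe für Damen online kaufen",
    "description": "Leichte und wasserdichte Wanderschuhe für Damen. Kostenloser Versand.",
}
print(check_metadata(german_meta, ["Wanderschuhe"]))
```

An empty result means only that the mechanical constraints are met. Whether the text is compelling and culturally appropriate is still a judgment for the translator.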
### 29.3 Human Roles in Machine Translation: Pre-editing and Post-editing
The integration of Machine Translation (MT) introduces new workflows where human intervention remains crucial. Humans play key roles in pre-editing and post-editing.
#### 29.3.1 Pre-editing
Pre-editing involves revising technical documentation *before* it undergoes MT to improve the quality of the raw MT output. This process aims to reduce or eliminate the post-editing workload.
* **Purpose:** To facilitate MT by minimizing potential output errors.
* **Techniques:**
    * Reducing sentence length.
    * Avoiding complex or ambiguous syntactic structures.
    * Ensuring term consistency.
    * Using articles.
    * Running automated revision tools (spell-check, grammar-check).
    * Tagging elements not to be translated.
* **Controlled Natural Languages (CNLs):** Subsets of natural languages with restricted grammar and vocabulary to reduce ambiguity and complexity. CNLs improve readability for humans and enable reliable automatic semantic analysis. Examples include Caterpillar Technical English and Simplified Technical English.
    * **Caterpillar Fundamental English:** Uses a restricted vocabulary of approximately 850 words to support consistent, high-quality authoring and translation.
    * Various companies utilize CNLs, such as Avaya Controlled English (ACE) by Avaya and Simplified Technical English (STE) by Boeing.
* **Controlled Language Rules (Examples):** The CLOUT™ rule set, developed by Uwe Muegge, provides rules to reduce ambiguity for MT. These include:
    1. Write sentences shorter than 25 words.
    2. Express only one idea per sentence.
    3. Repeat the same sentence structure for the same content.
    4. Write grammatically complete sentences.
    5. Use a simple grammatical structure.
    6. Write in the active form.
    7. Repeat nouns instead of using pronouns.
    8. Use articles to identify nouns.
    9. Use words from a general dictionary (avoid obscure vocabulary).
    10. Use only words with correct spelling.
* **When to Consider Pre-editing:** Pre-editing ROI is typically achieved when a document is translated into more than three languages. It is also beneficial when source quality is already high and MT engines are finely tuned.
* **Tools for Pre-editing:** Content memory, generic pre-editing plugins, automated rules from CNLs, and program/client-specific custom tools can aid writers.
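
Some of the controlled language rules above can be checked automatically before a text is sent to MT. The sketch below covers only rule 1 (sentences shorter than 25 words) and rule 7 (repeat nouns instead of using pronouns). It is a simplified illustration, not an implementation of the CLOUT rule set: the sentence splitter is a naive regular expression, the word pattern handles only unaccented English, and the `PRONOUNS` list and function names are assumptions made for the example.

```python
import re

MAX_WORDS = 25  # rule 1: sentences must be shorter than this
# Small, assumed sample of English pronouns for rule 7.
PRONOUNS = {"it", "they", "them"}


def split_sentences(text: str) -> list[str]:
    """Naive split after ., ! or ? followed by whitespace (abbreviations are not handled)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def check_sentence(sentence: str) -> list[str]:
    """Return the rule violations found in one sentence."""
    words = re.findall(r"[A-Za-z']+", sentence)
    findings = []
    if len(words) >= MAX_WORDS:
        findings.append(f"rule 1: {len(words)} words")
    used = sorted({w.lower() for w in words} & PRONOUNS)
    if used:
        findings.append("rule 7: pronoun(s) " + ", ".join(used))
    return findings


source = "Remove the cover. Clean it with a dry cloth."
for s in split_sentences(source):
    print(s, "->", check_sentence(s))
```

For the sample text, the second sentence is flagged because of "it"; a pre-editor would rewrite it as "Clean the cover with a dry cloth." so that the MT engine does not have to resolve the pronoun.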
#### 29.3.2 Post-editing
Post-editing (or postediting) is the process of amending machine-generated translation to achieve an acceptable final product. A post-editor is the person performing this task. It is distinct from editing human-generated text (revision).
* **Purpose:** To correct raw MT output when it's not sufficient but full human translation isn't required.
* **Productivity:** Post-editing can at least double, and potentially quadruple (with light post-editing), the productivity of manual translation. However, efficiency is hard to predict, with studies showing varying time savings.
* **Post-editing Strategies:** Strategies depend on project-specific expectations regarding time, quality, and cost.
    * **Light Post-editing:** Minimal intervention to make the text understandable for inbound purposes, often when text is needed urgently or has a short lifespan. The focus is on comprehensibility and accuracy, not stylistic perfection.
    * **Full Post-editing:** Greater intervention to achieve a negotiated level of quality, resulting in a text that is understandable and stylistically appropriate for assimilation and dissemination (inbound and outbound). This aims for quality indistinguishable from human translation.
* **Guidelines for Achieving Quality:**
    * **"Good Enough" Quality (Light Post-editing):** Comprehensible, accurate (same meaning as source), but not necessarily stylistically compelling. May have unusual syntax or imperfect grammar. Key guidelines include aiming for semantically correct translation, ensuring no information is added or omitted, editing offensive content, using as much raw MT output as possible, applying basic spelling rules, and avoiding stylistic-only corrections or sentence restructuring for flow.
    * **Quality Similar/Equal to Human Translation (Full Post-editing):** Comprehensible, accurate, stylistically fine (though maybe not native-speaker level), with normal syntax, correct grammar, and punctuation. Key guidelines include aiming for grammatically, syntactically, and semantically correct translation, ensuring key terminology is correctly translated, no information is added/omitted, editing offensive content, using raw MT output where possible, applying correct spelling/punctuation/hyphenation, and ensuring correct formatting.
* **Key to Successful Post-editing: Quick Decision Making:** Linguists must quickly decide whether to post-edit MT suggestions or translate from scratch, opting for translation from scratch if post-editing would take longer. Some providers ask linguists to move on if they can't find errors within seconds to ensure efficiency.
* **Over-editing:** Avoid making purely preferential or unnecessary amendments, like replacing a word with a synonym when both are viable.
* **Under-editing:** Avoid leaving errors like mistranslations, punctuation errors, robotic-sounding text, or unapproved terminology.
* **Post-editing and the Language Industry:** Post-editing is a developing profession. While it offers efficiency gains, it is often paid at lower rates than conventional translation. The market size is growing, with advances in MT (partly driven by feedback from post-edited text) leading to improved MT quality and increased use of post-editing. Crowdsourcing platforms and translation management systems facilitate post-editing [227](#page=227), [228](#page=228).
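
One way to make over-editing and under-editing visible is to compare the raw MT output with the post-edited text. The sketch below uses `difflib.SequenceMatcher` from the Python standard library to estimate how much of the raw output was kept. This is an illustrative measure chosen for the example, not a metric prescribed by this guide; the sample sentences and the names `retained_ratio` and `changed_words` are hypothetical.

```python
from difflib import SequenceMatcher


def retained_ratio(raw_mt: str, post_edited: str) -> float:
    """Word-level similarity between raw MT and the post-edited text (1.0 = unchanged)."""
    return SequenceMatcher(None, raw_mt.split(), post_edited.split()).ratio()


def changed_words(raw_mt: str, post_edited: str) -> list[tuple[str, str]]:
    """List (before, after) pairs for every span the post-editor touched."""
    a, b = raw_mt.split(), post_edited.split()
    matcher = SequenceMatcher(None, a, b)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


raw = "Press the button for start the machine."
edited = "Press the button to start the machine."
print(round(retained_ratio(raw, edited), 2))
print(changed_words(raw, edited))
```

A ratio close to 1.0 with few changed spans is consistent with light post-editing, while many changes that only swap synonyms may point to over-editing. The numbers are a prompt for review, not a verdict; a human reviewer still has to judge whether each change was necessary.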
### 29.4 Examples of Localization Failures
Failures in localization can lead to significant financial losses and damage brand reputation [164](#page=164), [165](#page=165). Notable examples include:
* **HSBC's "Assume Nothing" Tagline:** Mistranslated as "Do Nothing" in various countries, costing USD 10 million for correction.
* **Pepsi's Chinese Slogan:** "Pepsi Brings You Back to Life" was translated as "Pepsi Brings Your Ancestors Back from the Grave," causing backlash.
* **NASA's Mars Orbiter:** A metric-imperial unit mix-up led to the loss of the spacecraft and a USD 125 million loss.
* **Canadian Maple Leaf Coin:** An inscription error on coins worth approximately 30 million Canadian dollars required recalls.
* **London Olympics Ticket Website:** A mistranslation in Welsh directed users to the wrong website.
* **Siri's Gender Bias:** Virtual assistant responses reinforced gender stereotypes in certain languages, implying men exclusively held certain job positions in Chinese.
---
Glossary
| Term | Definition |
|------|------------|
| Computer-Assisted Translation (CAT) System | A translation tool that assists human translators by providing features such as translation memory, terminology management, and quality assurance checks, thereby streamlining the translation workflow. |
| Translation Process | The systematic series of steps undertaken by a translator to convert text from a source language into a target language, often involving pre-translation checks, the actual translation, and post-translation review. |
| File Format Check | An initial step in the translation process that verifies the compatibility and integrity of the source file format to ensure it can be processed correctly by the CAT system and other translation tools. |
| Resource Allocation | The process of assigning necessary resources, such as translators, project managers, and specialized software, to a translation project to ensure efficient and timely completion. |
| Segmentation | The division of a source text into smaller, manageable units, typically sentences or phrases, which are then processed individually within a CAT system for translation and storage in translation memory. |
| Translation | The core activity of converting text from a source language to a target language, performed by a human translator, often with the aid of a CAT system. |
| Translation Memory (TM) File | A structured text file, typically in XML format, that stores translation and linguistic data, including source and target segments, language information, and creation dates. |
| XML (Extensible Markup Language) | A text-based markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable, providing a well-defined structure for representing complex data. |
| Segment | A unit of text within a translation memory file, typically consisting of a source segment (the original text) and its corresponding target segment (the translated text). |
| XLIFF (XML Localization Interchange File Format) | A popular XML-based file format used for storing translation memory data, designed to facilitate the exchange of localized content between different translation tools and workflows. |
| TMX (Translation Memory eXchange) | Another widely adopted XML-based file format for translation memory, enabling the storage and exchange of bilingual text segments and associated metadata between various translation software applications. |
| Spreadsheet Files (XLS, CSV) | File formats like Excel (XLS) or comma-separated values (CSV) that can also store translation memory data, though they generally store less detailed information per translation unit compared to XML-based formats. |
| Semantic Tags | Tags within an XML file that provide meaning and context to the data they enclose, making the file easier to understand and process. |
| Localization | The process of adapting content and elements beyond mere text translation to make it appealing and usable for diverse local markets and audiences. |
| Cultural Barriers | Obstacles arising from differences in cultural norms, values, and perceptions that can hinder effective communication and user experience in a localized product or service. |
| User Experience (UX) | The overall experience a person has when interacting with a product, system, or service, which can be significantly improved by adapting various elements beyond text during localization. |
| Currency Conversion | The process of exchanging one currency for another, often necessary in localization to display equivalent monetary values in the target market's currency, such as converting $100 to £65. |
| Date Formats | The various ways dates are written and understood in different regions, requiring adaptation in localization to avoid misinterpretation, for example, distinguishing between April 5th and May 4th. |
| Text Expansion | The phenomenon where translated text can increase in length compared to the source text, often ranging from 30% to 100%, necessitating flexible layout and design in localized documents and software. |
| Units of Measurement | Systems used to quantify physical quantities, such as length, weight, and volume, which must be converted to local standards (e.g., metric system) for clarity and ease of understanding in localization. |
| Subtitling | The process of displaying translated text on screen to accompany audio in a video, film, or other media, often used for localization when dubbing is not feasible or desired. |
| Dubbing | The process of replacing the original audio track of a video, film, or other media with a translated version, requiring synchronization with lip movements and cultural appropriateness. |
| Writing Systems | The diverse methods used to represent language visually, encompassing different scripts, characters (symbols, logograms, syllabograms, letters), and writing directions (left-to-right, right-to-left, boustrophedon, vertical). |
| Text Layout | The arrangement and presentation of text, which can be complex and vary significantly between languages, affecting character shapes, capitalization rules, and sorting orders. |
| Numeral Systems | The distinct sets of symbols and rules used to represent numbers, which can differ across languages and require adaptation during localization. |
| Pluralization Rules | The grammatical variations in forming the plural of nouns and other word forms, which differ significantly between languages and must be accurately translated. |
| Punctuation | The use of symbols to structure and clarify written text, with conventions varying across languages, such as the use of guillemets in French instead of double quotes. |
| Economic Conventions | Localized practices and standards related to commerce and daily life, including paper sizes, storage media, broadcast systems, phone number formats, postal codes, currency symbols, and measurement systems. |
| Time Zones | Geographical regions that observe a uniform standard time, necessitating careful consideration during localization to ensure accurate scheduling and communication. |
| Regulatory Compliance | Adherence to the laws and regulations of a specific country or region, which may require product customization or significant changes to meet legal requirements like privacy laws, labeling standards, and export restrictions. |
| Internationalization (i18n) | The process of designing and developing a website in a manner that facilitates its straightforward adaptation to various languages and cultural preferences, ensuring broad accessibility. |
| Unicode Standard | A universal character encoding standard that ensures compatibility with a wide array of writing systems, enabling the accurate representation of diverse languages and symbols. |
| Separation of Content and Code | A development principle where textual content is kept distinct from the underlying source code, simplifying the translation process by minimizing the need for extensive modifications to the programming. |
| Flexible User Interface (UI) | A user interface design that can accommodate variations in text length and adapt to different reading directions, such as right-to-left languages, ensuring usability across diverse linguistic contexts. |
| Locale-Specific Formats | The practice of adjusting the presentation of dates, times, and numbers to align with the conventions and expectations of a particular geographical or cultural region, enhancing user relevance. |
| Translation Memory (TM) Software | Software that divides texts into segments and uses metadata to trace each segment back to its translator, date, and time of creation, allowing for the reuse of previously translated content. |
| Metadata | Data that describes other data, providing additional information about digital content and processes. It can include details like the translator, date, time, file type, and organizational structure. |
| Translation Unit | A pair of aligned segments, one in the source language and one in the target language, saved together within a translation memory database. |
| Fuzzy Match | A suggestion provided by translation memory software for a segment that is not an exact match but shares a significant percentage of similarity with a previously translated segment. |
| Descriptive Metadata | A type of metadata that describes the content of digital information. |
| Structural Metadata | A type of metadata that describes the organization of digital objects or components. |
| Administrative Metadata | A type of metadata that describes technical information about digital content, such as its file type or creation date. |
| Interoperability | The ability of different software tools or systems to exchange and use information effectively, which can be a challenge when transferring translation memory data between formats. |
| TBX | TermBase eXchange format, also known as DXLT (Default XLT format), is a file format designed for transferring glossaries between different translation tools. It is based on the ISO 12200 standard, which is the Machine-Readable Terminology Interchange Format (MARTIF). |
| DXLT | Default XLT format, which is an XML representation of Lexicons and Terminologies. It is an alternative designation for the TBX (TermBase eXchange) format, used for transferring glossaries between translation tools. |
| ISO 12200 | The international standard for Machine-Readable Terminology Interchange Format (MARTIF). This standard forms the basis for file formats like TBX, enabling the exchange of terminology data between different systems. |
| MARTIF | Machine-Readable Terminology Interchange Format, specified by ISO 12200. It provides a standardized way to represent and exchange terminology information, facilitating interoperability between translation and terminology management tools. |
| SALT | Standards-based Access service to multilingual Lexicons and Terminologies. This service, provided by BYU, is related to the organization and maintenance of multilingual terminology resources. |
| Source Language | The original language of the content that is being translated or localized. |
| Target Language | The language into which the content is being translated or localized. |
| Local Versions and Dialects | Variations of a language spoken in specific regions or by particular groups, which may differ in vocabulary, pronunciation, and idiomatic expressions. |
| Pre-editing | The process of revising technical documentation before it undergoes machine translation (MT) to improve the quality of the raw output. This involves making the source text easier for an MT engine to process, thereby reducing the need for post-editing. |
| Post-editing | The process of revising the output generated by a machine translation (MT) system. The goal is to correct errors and improve the fluency and accuracy of the translated text to meet quality standards. |
| Machine Translation (MT) | An automated process that uses computer software to translate text or speech from one language to another without direct human intervention during the translation itself. |
| Source Text | The original text that is to be translated into another language. In the context of pre-editing, this is the text that is revised to improve its suitability for machine translation. |
| Raw Output | The initial translation produced by a machine translation system before any human revision or post-editing has been performed. |
| Specialized Human Editor | An individual with expertise in translation and an understanding of how machine translation engines process text. This professional can anticipate potential errors and optimize the source text for better MT results. |
| Sentence Length Reduction | A pre-editing technique where long and complex sentences in the source text are broken down into shorter, simpler ones to facilitate machine translation. |
| Ambiguous Syntactic Structures | Grammatical constructions in the source text that can be interpreted in multiple ways, potentially leading to errors in machine translation. Pre-editing aims to simplify these structures. |
| Term Consistency | The practice of using the same translation for specific terms throughout a document or project. Ensuring term consistency in the source text is crucial for accurate machine translation. |
| Automated Revision Tools | Software applications used to check and improve text quality, such as spell checkers and grammar checkers. These tools are often employed during the pre-editing process. |
| Project-Specific Glossary | A curated list of terms and their approved translations relevant to a particular project. This glossary is used by automated revision tools to ensure consistent terminology. |
| Localization Best Practices | Established methods and procedures followed by organizations to adapt products, services, and content for specific international markets, often including pre-editing and other quality assurance steps. |
| Pronoun Substitution Rule | A controlled language rule that mandates the repetition of a noun instead of using a pronoun to enhance clarity and avoid potential ambiguity in written text. |
| Article Usage Rule | A controlled language rule that requires the use of articles (e.g., "a," "an," "the") to clearly identify nouns, thereby improving the precision and understandability of sentences. |
| General Dictionary Rule | A controlled language rule that promotes the use of common, widely understood words found in a general dictionary, discouraging the use of obscure or specialized vocabulary that might hinder comprehension. |
| Spelling Accuracy Rule | A controlled language rule that emphasizes the importance of using only words with correct spelling to ensure that written content is easily understood and does not introduce complications, particularly in translation processes. |
| Raw MT output | The initial, unedited text generated by a machine translation system before any human intervention or revision. |
| Publishable quality | A high standard of translation quality, equivalent to that produced by a human translator and subsequently revised, suitable for public dissemination or publication. |
| Good enough quality | A lower standard of translation quality that ensures the message is comprehensible and accurate, but may not be stylistically perfect or sound entirely natural. |
| Fit for purpose | A quality level where the translated content effectively serves its intended use, even if it does not meet the highest stylistic or linguistic standards. |
| Light post-editing | A type of post-editing focused on making minimal corrections to machine-translated text, typically addressing errors that impede comprehension or accuracy, without extensive stylistic refinement. |
| Full post-editing | A comprehensive post-editing process aimed at achieving a quality level comparable to human translation, involving extensive revisions to grammar, syntax, style, and terminology. |
| Semantically correct translation | A translation where the meaning of the source text is accurately conveyed in the target language, ensuring no information is lost or misrepresented. |
| Stylistically compelling | Refers to text that is not only accurate and comprehensible but also possesses a natural flow, engaging tone, and appropriate linguistic nuances, akin to high-quality human writing. |
| Do Not Translate terms | A specific list provided by a client that designates certain terms or phrases that should not be translated and should remain in their original language in the translated output. |
| TMX | Translation Memory eXchange. This format facilitates the transfer of translation memories between different translation tools, enabling the reuse of previously translated segments. A translation memory itself is a database storing source text segments and their corresponding translations in various target languages. |
| XLIFF | XML Localisation Interchange File Format. This standard is designed for the transfer of localizable data extracted from original files. It supports the movement of content through different stages of the localization process, including the final merging of translated data back into the source format. |
| OLIF | Open Lexicon Interchange Format. This format is specifically designed for the exchange of terminological and lexical data between translation tools. It is particularly suited for Natural Language Processing (NLP) applications, such as machine translation lexicons, and serves a similar purpose to TBX but with a stronger focus on NLP data. |
| Translation Memory (TM) | A structured collection of source text segments and their corresponding translations in one or more target languages. TMs are used by translation tools to suggest previously translated content, ensuring consistency and improving efficiency in the translation process. |
| Glossary | A collection of terms and their definitions, often specific to a particular domain or project. In the context of translation, glossaries are crucial for maintaining terminological consistency across documents and ensuring accurate translation of specialized vocabulary. |
| CAT Tool | Computer-Assisted Translation tool, a software application that assists human translators in the translation process by providing features such as translation memory, terminology management, and quality assurance checks. |
| Import | The function within a CAT tool used to transfer text and its corresponding translation from an external file into the translation memory database. This can be done from various formats, including raw and native formats. |
| Analysis | A process within CAT tools that involves parsing and analyzing source texts to identify elements like punctuation, proper names, and specialized text, which aids in pre-editing and preparing the text for translation. |
| Textual Parsing | The initial stage of analysis that focuses on correctly recognizing punctuation and other textual elements to distinguish between different uses, such as sentence endings and abbreviations, often involving markup. |
| Linguistic Parsing | A more advanced stage of analysis that involves reducing words to their base forms for term retrieval from a term bank and syntactically analyzing phrases to normalize word order and identify multi-word expressions. |
| Alignment | The task of establishing correspondences between segments in a source text and their corresponding translations in a target text, ensuring that the translation memory accurately links source and target units. |
| Term Extraction | The process of identifying and collecting specific terms or phrases from a source text, often with the aid of dictionaries or text statistics, to build or augment terminology databases. |
| Export | The function within a CAT tool used to transfer translated text from the translation memory database into an external text file, typically as the inverse operation of importing. |
| Exact Match | A type of retrieval from a translation memory where the current source segment is identical, character by character, to a segment already stored in the TM, often referred to as a "100% match". |
| In-Context Exact (ICE) Match | An exact match that not only involves identical source and target segments but also occurs in the same contextual environment, such as the same location within a paragraph or with similar surrounding attributes. |
| Automated Alignment Tool | Software used in the alignment process that analyzes source and target files to automatically match sentences and build a preliminary Translation Memory. |
| Legacy Material | Previous translations that are not already in a Translation Memory format and need to be processed through alignment to be leveraged in future projects. |
| Linguistic Vendor | A company or individual that provides translation services, which may or may not deliver Translation Memories as part of their project deliverables. |
| Search Engine Optimization (SEO) | The practice of enhancing a website's visibility in search engine results pages to attract more organic traffic. This involves optimizing content, technical aspects, and building authority. |
| Keyword Research | The process of identifying relevant terms and phrases that potential customers use when searching for products or services online. This is crucial for tailoring content to local search behavior. |
| Cultural Relevance | The degree to which content aligns with the customs, traditions, and preferences of a specific target audience, ensuring that translations are not only linguistically accurate but also culturally appropriate and resonant. |
| Localized Content | Website content that has been adapted to a specific target market, considering linguistic accuracy, cultural nuances, local customs, traditions, and market trends to enhance its relevance and appeal. |
| Metadata Optimization | The process of translating and refining meta titles, meta descriptions, and URL slugs to be compelling, concise, and keyword-rich, encouraging click-throughs from search engine results pages. |
| Multilingual Link Building | The strategic effort to acquire backlinks from reputable websites in various languages. This collaboration with local websites and influencers helps improve search engine rankings in international markets. |
| Content Structure and Formatting | The organization and presentation of translated content using elements like headers, bullet points, and paragraphs. Well-structured content enhances readability and is valued by search engines for SEO. |
| Mobile Optimization | Ensuring that translated web content is designed and formatted to function effectively and load quickly on mobile devices, which is increasingly important for search engine rankings due to the prevalence of mobile search. |
| Analytics and Reporting | The process of monitoring website performance data, such as traffic sources and user engagement, to assess the effectiveness of localized content and SEO strategies, providing insights for continuous refinement. |
| International Web Presence | The visibility and accessibility of a website across different countries and languages. This is achieved through effective localization and SEO strategies tailored for global audiences. |
| Search Engine Visibility | The degree to which a webpage is discoverable and appears in relevant search engine results for a given query, significantly influenced by accurate and optimized metadata. |
| User Click-Through Rates | The percentage of users who click on a specific link in search engine results after viewing it, which can be positively impacted by compelling and culturally relevant translated meta titles and descriptions. |
| Local Relevance | The degree to which content, including metadata, aligns with the cultural nuances, preferences, and expectations of a specific target audience in a foreign market, making it more appealing. |
| Keyword Optimization | The strategic incorporation of relevant and region-specific terms within metadata to improve a webpage's chances of appearing in targeted search results, a task requiring expertise in the target language. |
| Global Brand Consistency | The maintenance of a unified brand tone, message, and identity across all international markets, ensuring that translated metadata accurately reflects the brand's core values and communication style. |
| Character Limits | Restrictions imposed by search engines on the maximum length of meta titles and descriptions, requiring translators to craft concise yet impactful translations that avoid truncation in search results. |
| Credibility and Trust | The perception of reliability and authenticity that a website conveys to users, which can be undermined by inaccurate translations and enhanced by professional, integrity-preserving localization efforts. |
| Market Trends | Evolving linguistic and cultural shifts within specific markets that can influence user search behavior and content preferences, necessitating the adaptation of translated metadata for sustained optimization. |
| Source Segment | The portion of the original text that is currently being translated. Terms recognized from the glossary within this segment are highlighted. |
| Target Segment | The portion of the translated text where the translation of the source segment is entered. |
| Placeables | Terms identified from the glossary within the source segment, highlighted in blue, which can be easily navigated and manipulated using specific shortcuts or mouse actions. |
| Auto-suggest Feature | A default setting in Wordfast Anywhere that proposes target terms as the user types the first few letters of either the source or target term, aiding in rapid translation entry. |
| Glossary Panel | A dedicated panel within Wordfast Anywhere that displays glossary information, allowing users to preview translations and access additional details about terms. It can be activated via a keyboard shortcut or a menu option. |
| Comment Field | A designated area within the Glossary Dialog Box where translators can add notes or contextual information about a specific term and its translation, serving as a reminder for future reference. |
| F1, F2, F3 Fields | Additional fields available in the Glossary Dialog Box for storing specific types of information related to a term, such as its grammatical form, context, or word role, to enhance the glossary's utility. |
| Glossary Dialog Box | A pop-up window invoked to add new terms to the glossary. It presents fields for the source term, target term, and optional comment or F1-F3 fields for additional data. |
| Controlled Natural Language (CNL) | A subset of a natural language that restricts grammar and vocabulary to minimize ambiguity and complexity, often used to improve readability for humans or enable reliable automatic semantic analysis. |
| Simplified Technical English | A type of controlled language designed to improve the quality of technical documentation and facilitate (semi-)automatic translation by imposing restrictions on writers, such as limits on sentence length, rules for pronoun usage, and an approved vocabulary. |
| Caterpillar Technical English (CTE) | A controlled language developed by Caterpillar Inc. to ensure consistency and high quality in technical documentation and translation. It succeeded Caterpillar Fundamental English and uses a considerably larger controlled vocabulary. |
| Caterpillar Fundamental English (CFE) | An earlier controlled language from Caterpillar Inc., featuring a restricted vocabulary of approximately 850 words and aimed at supporting the authoring and translation of extensive technical documentation for complex machinery. |
| ASD Simplified Technical English (ASD-STE) | A specific standard for controlled technical English, widely adopted in industries like aerospace, which provides a set of rules to ensure clarity and reduce ambiguity in technical documentation. |
| CLOUT™ | An acronym for Controlled Language Optimized for Uniform Translation, representing a set of rules developed to reduce ambiguities in texts across many languages, thereby improving suitability for machine translation. |
| Ambiguity | The quality of being open to more than one interpretation; uncertainty of meaning, which controlled natural languages aim to eliminate or significantly reduce in technical documentation. |
| Semantic Analysis | The process of understanding the meaning of words, phrases, and sentences in a language, which controlled natural languages are designed to facilitate for reliable automatic processing. |
| Computer-Assisted Translation (CAT) Tools | Software applications designed to aid human translators in the translation process, often by leveraging features like translation memory and terminology management. These tools aim to increase efficiency and consistency. |
| Technical Equipment Documentation | The written materials, manuals, guides, and specifications that accompany machinery, software, or other technological products, detailing their operation, maintenance, and features. The increasing volume of this documentation drove technological advancements in translation. |
| Electronic Content | Information that exists in a digital format and can be reproduced or displayed across various media and platforms. The need to translate and manage electronic content in different formats spurred the development of specialized translation technologies. |
| Communicative Functions | The specific purposes or intentions behind a piece of text, such as informing, persuading, instructing, or entertaining. Documents with diverse communicative functions require nuanced translation approaches, which technology has helped to streamline. |
| Computer-Assisted Translation (CAT) | A set of computer applications specifically designed to efficiently assist the translator in their task, aiming to automatically and rapidly provide all necessary resources for their work. |
| Translation Memory eXchange (TMX) | An open standard format for storing and exchanging translation memory data, ensuring compatibility between various CAT tools such as Trados, memoQ, and Wordfast. |
| Project Management Software | Software used in CAT workflows to manage the translation process, including controlling information flow, assigning tasks, ensuring quality control, analyzing content, generating reports on matches and repetitions, performing word counts, and managing final delivery to the client. |
| Full Matches | Occurrences within a document where a segment is an exact match to a segment already present in the translation memory, indicating that the entire segment has been translated before. |
| Fuzzy Matches | Occurrences within a document where a segment is similar but not identical to a segment in the translation memory, requiring the translator to review and adapt the existing translation. |
| Intra-file Repetitions | Identical or similar segments that appear multiple times within the same document, which can be leveraged by CAT tools for efficiency. |
| Cross-file Repetitions | Identical or similar segments that appear across different documents, allowing for consistency and efficiency when translating related projects. |
| Header | The section of a translation memory file that contains metadata about the file itself and the localization process, providing context and administrative information. |
| Body | The main section of a translation memory file that contains the core translation data, specifically the translation units and their associated source and target segments. |
| Sentence Alignment Process | The procedure of matching corresponding sentences between a source text and its translation, enabling the creation of translation memories or comparable corpora. |
| Automated Alignment | The process of using software tools to automatically match sentences in a source document with their corresponding translations in a target document based on filename linkage and textual analysis. |
| Linguistic Verification | The manual review and correction of automatically generated sentence alignments by a human linguist to ensure accuracy and fix any mismatches or synchronization issues. |
| Source Files | The original documents or texts that serve as the basis for translation. |
| Target Files | The documents or texts that contain the translations of the source files. |
| Quality Score | A metric generated by alignment tools, often based on internal algorithms, to indicate the confidence level or success rate of the automated sentence alignment. |
| Corpus | A collection of written or spoken texts, often used for linguistic research, analysis, or training machine translation models. |
| Tags | Formatting or variable information within files that is converted during the translation process, which alignment tools can use as a guide for matching segments. |
| Segments | Individual units of text, typically sentences or phrases, that are aligned between source and target documents. |
| Term Base | A database that stores single words or expressions pertaining to a specific subject, often in a bilingual or multilingual format, used to ensure consistency and accuracy in translation projects. |
| CAT Tools | Computer-Assisted Translation tools that integrate features like term bases and glossaries to assist translators in managing terminology and streamlining the translation process. |
| Consistency | The state of maintaining uniformity in the core message and terminology across multiple translation projects, especially when several collaborators are involved, which is facilitated by the use of term bases. |
| Forbidden Terms | Specific words or expressions that translators are instructed not to use in their translations, managed within term bases to prevent the use of unwanted terminology. |
| Terminology Management | The systematic process of identifying, collecting, organizing, and standardizing terms and their definitions within a specific domain, crucial for effective translation and communication. |
| Bilingual Glossary | A glossary containing terms and their translations in two languages, commonly used to ensure that specific terminology is translated accurately between those two languages. |
| Multilingual Glossary | A glossary that includes terms and their translations in more than two languages, facilitating translation across multiple language pairs within a single resource. |
| LSP | Short for Language Service Provider: an organization that offers translation and localization services, often utilizing term bases and glossaries to manage client-specific terminology and ensure project consistency. |
| Project Manager (PM) | The individual responsible for overseeing a translation project, who can use term bases to verify the consistency of terminology even if they do not speak the target language. |
| Controlled Language | A subset of natural language that has been restricted in its grammar and vocabulary to simplify writing and improve the accuracy and consistency of machine translation and other automated processes. |
| Return on Investment (ROI) | A performance measure used to evaluate the efficiency of an investment. In this context, it is used to assess whether pre-editing or post-editing is more cost-effective. |
| Revision | The process of improving human-generated text, as distinct from post-editing, which specifically addresses machine-generated output. |
| International Service Language | A specific type of controlled language developed by Kodak, designed to facilitate communication across international service teams. |
| Nortel Standard English (NSE) | A controlled language developed by Nortel, which imposes grammatical and stylistic rules on English text to ensure consistency and reduce misinterpretation. |
| Simplified Technical English (STE) | A controlled language, notably used by Rolls-Royce and Saab Systems, that adheres to a strict set of rules to simplify technical writing and enhance comprehension. |
| ASD-STE100 | The official specification document for Simplified Technical English, providing the detailed rules and guidelines for its implementation. |
| CLOUT™ rule set | The set of rules of Controlled Language Optimized for Uniform Translation (CLOUT), developed by Uwe Muegge to minimize ambiguities in texts, making them more suitable for machine translation. |
| Machine Translation | The use of computer software to translate text or speech from one language to another automatically. |
| Grammatically Complete Sentence | A sentence that contains both a subject and a predicate and expresses a complete thought, adhering to the structural rules of a language. |
| Active Form | A grammatical construction where the subject of the sentence performs the action of the verb, as opposed to the passive form where the subject receives the action. |
| Pronoun | A word that can function as a noun substitute, such as "it," "he," "she," or "they." |
| Article | A word, such as "a," "an," or "the," that precedes a noun and specifies its grammatical definiteness. |
| Computer-Assisted Translation (CAT) Tool | Software designed to assist human translators in the translation process. These tools often integrate features like translation memory, terminology management, and quality assurance checks to enhance productivity and accuracy. |
| Translation Environment Tool (TEnT) | An alternative term for a Computer-Assisted Translation (CAT) tool, emphasizing its role in providing a comprehensive environment for translation workflows. |
| Inline Markup Elements | Special codes or tags within a document that represent formatting or other non-translatable content. Both TMX and XLIFF handle these elements, though with different methods. |
| Encapsulation Method | A technique used in file formats to enclose native codes (inline markup) within specific elements, ensuring they are preserved during translation. TMX primarily uses this method. |
| Placeholder Method | A technique where native codes are removed from the main text and replaced with short elements that refer to them. This method is supported by XLIFF, often in conjunction with a "Skeleton file." |
| Target Locale | The specific geographical region, country, or cultural group for which a product or service is being adapted and made available. |
| Engagement | The degree to which a message or product captures the attention and interest of the target audience, often enhanced by tailoring content to local expectations and cultural relevance. |
| Cultural Adaptation | The process of modifying content, design, and functionality to align with the customs, values, beliefs, and traditions of a specific culture or target audience. |
| Legal Requirements | The set of laws and regulations that a product or service must comply with in a specific country or region, which may necessitate customization or changes to fit regulatory compliance, privacy laws, labeling, encryption, and tax collection. |
| Simship | The simultaneous release of products across all local markets, aiming for a unified global launch. |
| Time-to-market | The speed at which products are introduced to the market, with faster schedules becoming increasingly common in the industry. |
| Internationalisation (I18N) | The process of designing and developing products in a way that avoids the need for re-design for each specific local market. |
| Localisation (L10N) | The adaptation of products and their accompanying documentation to the specific language and cultural nuances of target markets. |
| Terminology Extraction | The automated process of identifying and extracting key terms from source texts to build or enrich terminology databases. |
| Software Localisation | The adaptation of software applications to a specific language and culture, including text, graphics, and user interface elements. |
| Project Management (Translation) | The oversight and coordination of all aspects of a translation project, from initial assignment and resource allocation to quality control and final delivery. |
| Computer-Aided Translation (CAT) | A system that uses software tools to assist human translators in the translation process, encompassing features like translation memory, terminology management, and project management. |
| Translation Unit (TU) | A pair of source and target text segments that have been translated and stored together in a translation memory, representing a complete translation of a specific piece of text. |
| Exact Match (100% Match) | A situation where a source segment in the current document exactly matches a segment already stored in the translation memory, allowing for direct reuse of the previous translation. |
| Concordance | A feature in translation tools that allows a translator to search the translation memory for specific words or phrases within source segments, retrieving all occurrences and their corresponding translations. |
| Term Base (Glossary) | A database containing single words or expressions related to a specific subject, often bilingual or multilingual, used to ensure consistent and correct terminology in translations. |
| Localization (L10N) | The process of adapting a product or content to a specific local market, which goes beyond simple translation to include cultural, legal, and technical adaptations to meet the needs of local users. |