Transit NXT translation memories w/ different number of source/target segments
Thread poster: Gary Hess
Gary Hess Local time: 00:07 German to English + ...
Mar 16, 2023
I am trying to create a custom QA tool for my own use and was analyzing some .DEU and .ENG files. Sometimes there is a mismatch between the source and target files, e.g. source file has 30 segments and target file has 31 segments. I would assume that some other translator during the translation process split 1 segment into 2 segments (that would explain the discrepancy).
I have a technical question: How does Transit NXT know which segments belong to one another? I have looked at the XML quite a bit, but I can't figure it out yet.
BTW: I loaded a pair of mismatched .DEU and .ENG files into Xbench, but it doesn't correctly align the mismatched segments either.
wotswot France Local time: 00:07 Member (2011) French to English
Misaligned language pairs
Mar 16, 2023
What I do is open the two files in two separate windows of a powerful text editor (like Notepad++), place them side by side, then find and delete the offending segment.
Segment lines begin with Seg SegID=n, where n is a number, and end with /Seg.
wotswot France Local time: 00:07 Member (2011) French to English
Follow-up to my previous message
Mar 16, 2023
Segment lines begin with Seg SegID=n (where n is the segment's number) and end with /Seg
[Edited at 2023-03-16 16:19 GMT]
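The manual side-by-side comparison could be automated roughly as follows. This is a minimal sketch, not Transit NXT's own logic: it assumes the opening tag is literally `<Seg SegID="n">` (the thread only gives `Seg SegID=n`, so the angle brackets and quoting are a guess), and `seg_ids` and `unpaired_ids` are hypothetical helper names.

```python
import re

# Assumed tag shape: <Seg SegID="12"> or <Seg SegID=12>; adjust to the real files.
SEG_OPEN = re.compile(r'<Seg\s+SegID="?(\d+)"?')


def seg_ids(text):
    """Return the SegID numbers found in a language file's text, in file order."""
    return [int(m.group(1)) for m in SEG_OPEN.finditer(text)]


def unpaired_ids(source_text, target_text):
    """Return (IDs only in source, IDs only in target), each sorted."""
    src, tgt = set(seg_ids(source_text)), set(seg_ids(target_text))
    return sorted(src - tgt), sorted(tgt - src)


# Tiny made-up sample, not real Transit NXT data.
deu = '<Seg SegID="1">Hallo</Seg>\n<Seg SegID="2">Welt</Seg>\n'
eng = '<Seg SegID="1">Hello</Seg>\n<Seg SegID="2">World</Seg>\n<Seg SegID="3">!</Seg>\n'
print(unpaired_ids(deu, eng))
```

The IDs it reports are the candidates to inspect by hand in the editor; whether deleting them is the right fix still depends on the actual content of the segments.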
Gerald Dennett United Kingdom Local time: 23:07 German to English + ...
Re-align
Mar 16, 2023
You need to perform an alignment on the offending pair of files. Otherwise the pair will be ignored in any TM.
Gerald
Gary Hess Local time: 00:07 German to English + ...
TOPIC STARTER
How to do this automatically?
Mar 16, 2023
I should have said that I want to write a program to recognize and interpret the mismatch automatically. I can edit the file manually, but there must be something inside the XML that points to the correct alignment. I want to figure out how Transit NXT manages this misalignment.
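One way a program could locate where two files drift apart, assuming the segment IDs have already been read from each file into lists, is to diff the two ID sequences. This is only a sketch of a generic sequence diff using Python's `difflib`; it says nothing about how Transit NXT itself pairs segments, and `id_drift` and the sample lists are made up for illustration.

```python
from difflib import SequenceMatcher


def id_drift(source_ids, target_ids):
    """Return the non-matching regions between two SegID sequences.

    Each item is (tag, source_slice, target_slice), where tag is one of
    difflib's opcodes: 'replace', 'delete' or 'insert'.
    """
    matcher = SequenceMatcher(a=source_ids, b=target_ids, autojunk=False)
    return [
        (tag, source_ids[i1:i2], target_ids[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Made-up IDs: the target has one extra segment (4) that the source lacks.
source_ids = [1, 2, 3, 5, 6]
target_ids = [1, 2, 3, 4, 5, 6]
for tag, src, tgt in id_drift(source_ids, target_ids):
    print(tag, src, tgt)
```

An empty result means the two ID sequences are identical; anything else marks the position where a segment was added, dropped, or renumbered on one side.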
Hans Lenting Netherlands Member (2006) German to Dutch
Reverse engineering
Mar 17, 2023
Gary Hess wrote:
I should have said that I want to write a program to recognize and interpret the mismatch automatically. I can edit the file manually, but there must be something inside the XML that points to the correct alignment. I want to figure out how Transit NXT manages this misalignment.
Have you already tried creating a project with one segment and splitting that segment, to see what happens in the XML? Silly question perhaps, since you seem to know how to write code...
Gary Hess Local time: 00:07 German to English + ...
TOPIC STARTER
Maybe it's really an error...
Mar 17, 2023
I tried your idea on a project (joining and splitting some segments to look at the results). The number of segments is actually never mismatched after these steps. So maybe the files in question do indeed have an error.
Thanks for all the suggestions!