-
Download UiPath Studio and its browser extension, then enable the extension in your browser.
-
Download the LangSync.zip file.
-
Unzip it and move the extracted folder to the UiPath directory (the path you chose during installation).
-
Collect the English sentences you want to use for training and add them to a text file. Put each sentence on its own line (separated by '\n'), and insert a '$' character after every 5000 characters to stay within the length limit.
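The preprocessing above can be scripted. The following is a minimal sketch, not part of LangSync itself. It assumes the '$' marker may sit on its own line and that chunks should break at sentence boundaries rather than in the middle of a sentence. The names `build_input_text` and `write_input_file` are hypothetical helpers introduced for illustration.

```python
from pathlib import Path

CHUNK_LIMIT = 5000  # characters per chunk, as stated in the step above
MARKER = "$"        # chunk separator, as stated in the step above


def build_input_text(sentences, limit=CHUNK_LIMIT, marker=MARKER):
    """Join sentences with newlines and add `marker` before a chunk would exceed `limit`.

    Assumption: the marker goes on its own line between whole sentences.
    """
    lines = []
    used = 0  # characters in the current chunk, newlines included
    for sentence in sentences:
        sentence = sentence.strip()
        if not sentence:
            continue  # skip blank entries
        if len(sentence) + len(marker) + 2 > limit:
            raise ValueError("a single sentence does not fit in one chunk")
        if used + len(sentence) + 1 > limit:
            lines.append(marker)    # close the current chunk
            used = len(marker) + 1  # count the marker line conservatively
        lines.append(sentence)
        used += len(sentence) + 1   # +1 for the trailing newline
    return "\n".join(lines) + "\n"


def write_input_file(sentences, path="input.txt"):
    """Write the preprocessed text to `path` (hypothetical helper)."""
    Path(path).write_text(build_input_text(sentences), encoding="utf-8")
```

Calling `write_input_file(my_sentences, "input.txt")` would produce a file ready to copy into the LangSync directory. If the workflow expects the marker at an exact character offset instead, adjust the sketch accordingly.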
-
After completing the preprocessing in step 4, name the text file input.txt and use it to replace the existing input.txt file in the LangSync directory.
-
Open UiPath Studio and find the process named ParallelCorpus. Click it, then click "Open Main Workflow" in the dialog that appears.
-
Run the automation by clicking the "Debug File" button at the top left and then selecting "Run File".
-
The default source and target languages are English and Hindi. To change them, follow these steps:
-> Scroll down until you see the "Browser URL" element.
-> Double-click the URL and change the 'sl' (source language) and 'tl' (target language) values to your desired language codes.
-> Find your language codes here.
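To illustrate what changing 'sl' and 'tl' does to the URL, here is a small sketch using Python's standard library. The URL `https://translate.example.com/...` is a placeholder, not the actual URL in the workflow, and `set_language_pair` is a hypothetical helper. The codes `en`, `hi`, and `fr` are the ISO 639-1 codes for English, Hindi, and French; check that the workflow's URL uses the same kind of code before relying on them.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_language_pair(url, source_lang, target_lang):
    """Return `url` with its 'sl' and 'tl' query parameters replaced."""
    parts = urlsplit(url)
    # Keep the other query parameters and their order unchanged.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["sl"] = source_lang
    query["tl"] = target_lang
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL with the default English-to-Hindi pair.
example = "https://translate.example.com/?sl=en&tl=hi&op=translate"
print(set_language_pair(example, "en", "fr"))
# prints https://translate.example.com/?sl=en&tl=fr&op=translate
```

In UiPath Studio itself the change is made by hand in the "Browser URL" element; the sketch only shows which parts of the URL are affected.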
-
The output parallel dataset for your language can be found in the output.txt file in the LangSync directory. Execution time depends on the size of the input file. You can keep track of the progress with the help of the count.txt file.
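Progress can also be checked from a script while the automation runs. The sketch below assumes that count.txt holds a single integer; the actual format of the file is not documented here, so treat that as an assumption. `read_count` is a hypothetical helper, and it returns `None` whenever the file is missing or does not contain a plain integer.

```python
from pathlib import Path


def read_count(path="count.txt"):
    """Return the integer stored in the count file, or None if unavailable.

    Assumption: the file contains one integer and nothing else.
    """
    file = Path(path)
    if not file.exists():
        return None
    text = file.read_text(encoding="utf-8").strip()
    try:
        return int(text)
    except ValueError:
        return None  # file is empty or in a different format
```

For example, `read_count("LangSync/count.txt")` could be called periodically and the result printed; the relative path is illustrative and depends on where the LangSync directory lives.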
-
Below are sample input and output files for your reference.
Sample Input
Sample Output