Offline Translation with Python

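Recently I had a small task: translating a very large number of documents. They had already been run through OCR and saved as txt files, but for confidentiality reasons they could not be sent to any online translation service, so the translation had to happen offline. Googling did not turn up a good offline translation solution, so I decided to use the Youdao Dictionary desktop client instead. The idea is to have Python simulate keyboard and mouse input: read each file, put its text on the system clipboard, paste it into the client, copy the result back, and save the translated text to a file.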
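The plan above can be sketched independently of how the translation itself is obtained. This is only a minimal sketch, not the script used in this post: `translate_tree` and its `translate` parameter are hypothetical names, and `translate` stands in for whatever ends up driving the Youdao client.

```python
from pathlib import Path


def translate_tree(src_root, dst_root, translate):
    """Translate every .txt file under src_root into the same relative path under dst_root.

    `translate` is any callable that takes a string and returns a string
    (hypothetical: in this post it would be the clipboard-driven function).
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    written = []
    for src_file in sorted(src_root.rglob('*.txt')):
        text = src_file.read_text(encoding='utf-8')
        dst_file = dst_root / src_file.relative_to(src_root)
        # Create the destination folder on demand.
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        dst_file.write_text(translate(text), encoding='utf-8')
        written.append(dst_file)
    return written
```

Passing the translator in as a parameter means the file handling can be tried out with a dummy function such as `str.upper` before any GUI automation is involved.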

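Without further ado: since the approach relies on copy and paste, the code needs access to the system clipboard. In Python you can simply install pyperclip: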

python -m pip install pyperclip

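Simulating keyboard input from Python requires the PyUserInput library, but the pyhook package it uses does not support Python 3, so you first need to download pyhook_py3k and build it for Python 3: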

pyhook_py3k download: https://github.com/Answeror/pyhook_py3k

swig.exe, which the build needs, can be downloaded from: http://www.swig.org/download.html

Unpack the pyhook_py3k folder and run the following commands inside it:

python setup.py build_ext --swig=<your path>\swig.exe
pip install .

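The build may fail with an error about missing VC build tools. If so, search for the tools named in the error message, install them, and the build should then run normally.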

Once that is installed, run:

python -m pip install PyUserInput
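With PyUserInput installed, the shortcuts this approach depends on (Ctrl+A, Ctrl+C) all follow the same pattern: hold Control, tap a key, release Control. Here is a minimal sketch of that pattern. The helper name `ctrl_combo` is hypothetical; it only calls `press_key`, `tap_key`, `release_key` and `control_l_key`, the same `PyKeyboard` members the full script uses.

```python
import time


def ctrl_combo(keyboard, key, delay=0.01):
    """Hold left Control, tap `key`, then release Control.

    `keyboard` is expected to behave like pykeyboard.PyKeyboard.
    The short pauses give the target application time to register each event.
    """
    keyboard.press_key(keyboard.control_l_key)
    try:
        time.sleep(delay)
        keyboard.tap_key(key)
        time.sleep(delay)
    finally:
        # Always release Control, even if tap_key raises,
        # so the modifier is not left stuck down.
        keyboard.release_key(keyboard.control_l_key)


# Usage (requires PyUserInput and a desktop session):
#   from pykeyboard import PyKeyboard
#   ctrl_combo(PyKeyboard(), 'a')   # select all
#   ctrl_combo(PyKeyboard(), 'c')   # copy
```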

Full code:

# encoding=utf-8
import os
import time

import pyperclip
from pymouse import PyMouse
from pykeyboard import PyKeyboard


def get_trans_result(string_to_be_trans):
    """Send text through the Youdao window and return the translation via the clipboard.

    Uses the module-level mouse `m` and keyboard `k`. All screen coordinates
    are specific to the author's window layout and must be adjusted.
    """
    print('To translate: ' + string_to_be_trans)
    pyperclip.copy(string_to_be_trans)

    # Click the input box, select everything in it and delete it.
    m.click(500, 400, 1, 1)
    k.press_key(k.control_l_key)
    time.sleep(0.01)
    k.tap_key('a')
    time.sleep(0.01)
    k.release_key(k.control_l_key)
    time.sleep(0.01)
    k.press_key(k.backspace_key)
    time.sleep(0.01)
    k.release_key(k.backspace_key)
    time.sleep(0.01)

    # Right-click the input box, then left-click a context-menu entry
    # (presumably "Paste").
    m.click(500, 400, 2, 1)
    time.sleep(0.01)
    m.click(550, 435, 1, 1)

    # Give the client time to translate.
    time.sleep(3)

    # Click the output box, select all, copy.
    m.click(500, 800, 1, 1)
    k.press_key(k.control_l_key)
    time.sleep(0.01)
    k.tap_key('a')
    k.release_key(k.control_l_key)
    time.sleep(0.01)
    k.press_key(k.control_l_key)
    time.sleep(0.01)
    k.tap_key('c')
    time.sleep(0.01)
    k.release_key(k.control_l_key)
    m.click(500, 800, 2, 1)
    time.sleep(0.01)
    m.click(550, 810, 1, 1)
    trans_result = pyperclip.paste()
    print('Translation: ' + trans_result)
    time.sleep(0.2)
    return trans_result


def translate_chunk(chunk):
    """Translate one chunk; if the clipboard still holds the source text, wait and retry once."""
    translated = get_trans_result(chunk)
    if translated == chunk:
        time.sleep(2)
        translated = get_trans_result(chunk)
    return translated


if __name__ == '__main__':
    # Five seconds to bring the Youdao window to the foreground.
    time.sleep(5)

    m = PyMouse()
    k = PyKeyboard()

    for folder in os.listdir('./src'):
        for filename in os.listdir('./src/' + folder):
            if os.path.splitext(filename)[1] != '.txt':
                continue
            print('filename: ' + filename)
            src_path = './src/' + folder + '/' + filename
            string_to_be_trans = ''
            final_string = ''
            line_num = 0

            with open(src_path, 'r', encoding='utf-8') as file_to_be_trans:
                for line in file_to_be_trans.readlines():
                    line = line.strip()
                    # Remove ASCII control characters.
                    for i in range(0, 32):
                        line = line.replace(chr(i), '')
                    # Skip empty lines (check before the newline is appended).
                    if line == '':
                        continue
                    line_num = line_num + 1
                    string_to_be_trans = string_to_be_trans + line + '\n'

                    # Translate five lines at a time.
                    if line_num == 5:
                        final_string = final_string + translate_chunk(string_to_be_trans)
                        string_to_be_trans = ''
                        line_num = 0
                        time.sleep(0.05)

            # Translate the trailing chunk of fewer than five lines.
            if string_to_be_trans != '':
                final_string = final_string + translate_chunk(string_to_be_trans)

            os.makedirs('./dst/' + folder, exist_ok=True)

            # Translate the file name too; retry once if the result is
            # suspiciously long (likely stale clipboard content).
            stem = filename.replace('.txt', '')
            filename_zh = get_trans_result(stem).strip() + '.txt'
            if len(filename_zh) > len(stem) + 15:
                filename_zh = get_trans_result(stem).strip() + '.txt'
            with open('./dst/' + folder + '/' + stem + '---' + filename_zh, 'w', encoding='utf-8') as trans_ed_file:
                trans_ed_file.write(final_string)
            os.remove(src_path)
            time.sleep(0.1)
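One detail in the script worth isolating is the batching: lines are cleaned, empty ones are dropped, and the rest are sent to the translator five at a time, with a final group of fewer than five lines still needing to be sent. Below is a minimal sketch of that step as a pure function that can be checked without the GUI. The names `clean_line` and `batches` are hypothetical and do not appear in the script.

```python
def clean_line(line):
    """Strip surrounding whitespace and drop ASCII control characters (code points 0-31)."""
    return ''.join(ch for ch in line.strip() if ord(ch) >= 32)


def batches(lines, size=5):
    """Yield newline-terminated chunks of up to `size` non-empty, cleaned lines."""
    batch = []
    for raw in lines:
        line = clean_line(raw)
        if not line:
            continue  # skip lines that are empty after cleaning
        batch.append(line)
        if len(batch) == size:
            yield '\n'.join(batch) + '\n'
            batch = []
    if batch:
        # Emit the trailing partial chunk so no lines are lost.
        yield '\n'.join(batch) + '\n'
```

Each yielded chunk could then be passed to `get_trans_result`, for example `for chunk in batches(open(path, encoding='utf-8')): ...`.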

Time was short and the task was urgent, so the code is pretty ugly. I have a lot going on lately, so I am leaving it as is 😂
