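I often run into text files with character-encoding problems (for example, subtitle files in my native language, Persian). These files were created on Windows and saved with an unsuitable encoding (apparently ANSI), so they show up as unreadable gibberish.
On Windows, this is easy to fix with Notepad++ by converting the encoding to UTF-8.
After the conversion, the text is correct and readable.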
I've searched a lot for a similar solution on GNU/Linux, but unfortunately the suggested solutions (e.g. this question) don't work. Most often I've seen people suggest iconv and recode, but I have had no luck with these tools. I've tested many commands, including the following, and all have failed:
$ recode ISO-8859-15..UTF8 file.txt
$ iconv -f ISO8859-15 -t UTF-8 file.txt > out.txt
$ iconv -f WINDOWS-1252 -t UTF-8 file.txt > out.txt
None of these worked!
I'm using Ubuntu 14.04 and I'm looking for a simple solution (either GUI or CLI) that works just as Notepad++ does.
One important aspect of being "simple" is that the user is not required to determine the source encoding; rather, the tool should detect the source encoding automatically, and the user should only have to provide the target encoding. Nevertheless, I would also be glad to know about a working solution that requires the source encoding to be provided.
If someone needs a test case to try out different solutions, the above example is accessible via this link.
The working solution I found uses the Microsoft Visual Studio Code text editor, which is freeware and available for Linux.
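Open the file whose encoding you want to convert in VS Code. At the bottom of the window there are a few buttons. One of them shows the file encoding.
Clicking this button pops up a menu at the top with two items. Select the "Reopen with Encoding" option from this menu.
This opens another menu with a list of encodings. Now select "Arabic (Windows 1256)".
This fixes the garbled text.
Now click the encoding button again, and this time select the "Save with Encoding" option.
Then select the "UTF-8" option in the new menu.
This saves the corrected file with UTF-8 encoding.
Done! :)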
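For a command-line equivalent of the steps above, here is a minimal sketch. It assumes the file really is encoded as Windows-1256, the encoding that fixed it in VS Code. The file names sample.txt and sample.utf8.txt are placeholders, and the sample bytes are made up for illustration, not taken from the original subtitle file.

```shell
# Hypothetical sample: the Persian word "سلام" stored as Windows-1256 bytes
# (octal escapes for 0xD3 0xE1 0xC7 0xE3). Replace sample.txt with your own file.
printf '\323\341\307\343\n' > sample.txt

# Assumption: the source encoding is Windows-1256, as selected in VS Code.
# iconv reads the bytes as Windows-1256 and writes them back out as UTF-8.
iconv -f WINDOWS-1256 -t UTF-8 sample.txt > sample.utf8.txt

# Show the converted text.
cat sample.utf8.txt
```

With -f WINDOWS-1252 or -f ISO8859-15, the sample bytes above would come out as accented Latin letters rather than Persian, which is consistent with the commands in the question not producing readable text.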