Converting the encoding of a text file

Posted by Seyed Mohammad in Dev


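I often come across text files with character-encoding problems (for example, subtitle files in my native language, Persian). These files were created on Windows and saved with an unsuitable encoding (apparently ANSI), so they show up as garbled, unreadable text.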

On Windows, this is easy to fix with Notepad++ by converting the encoding to UTF-8.

After that conversion, the text is correct and readable.

I've searched a lot for a similar solution on GNU/Linux, but unfortunately the suggested solutions (e.g. this question) don't work. Most often I've seen people suggest iconv and recode, but I have had no luck with these tools. I've tested many commands, including the following, and all have failed:

$ recode ISO-8859-15..UTF8 file.txt
$ iconv -f ISO8859-15 -t UTF-8 file.txt > out.txt
$ iconv -f WINDOWS-1252 -t UTF-8 file.txt > out.txt

None of these worked!

I'm using Ubuntu 14.04 and I'm looking for a simple solution (either GUI or CLI) that works just as Notepad++ does.

One important aspect of being "simple" is that the user is not required to determine the source encoding; the tool should detect it automatically, and the user should only have to provide the target encoding. Nevertheless, I would also be glad to know about a working solution that requires the source encoding to be provided.

If someone needs a test-case to examine different solutions, the above example is accessible via this link.

Seyed Mohammad

The working solution I found is to use the Microsoft Visual Studio Code text editor, which is free and available for Linux.

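Open the file whose encoding you want to convert in VS Code. At the bottom of the window there are a few buttons; one of them shows the file encoding.

Clicking this button pops up a menu at the top of the window with two items. Choose the "Reopen with Encoding" option from this menu.

This opens another menu with a list of encodings. Now select "Arabic (Windows 1256)".

This fixes the garbled text.

Now click the encoding button again, and this time choose the "Save with Encoding" option.

Then select "UTF-8" in the new menu.

This saves the corrected file in UTF-8 encoding.

Done! :)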
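For readers who prefer the command line, the two VS Code steps above (reopen as Windows 1256, save as UTF-8) can probably be reproduced with iconv by naming Windows-1256 as the source encoding instead of the Latin encodings tried in the question. This is only a sketch under that assumption: the file names are hypothetical, and the sample bytes are a hand-made stand-in (the Persian word "سلام" in Windows-1256), not the actual subtitle file.

```shell
# Hypothetical sample file: the word "سلام" as raw Windows-1256 bytes
# (0xD3 0xE1 0xC7 0xE3, written here as octal escapes) plus a newline.
printf '\323\341\307\343\n' > sample_cp1256.txt

# Decode the bytes as Windows-1256 and write them back out as UTF-8.
# The source encoding is an assumption; it must match how the file was saved.
iconv -f WINDOWS-1256 -t UTF-8 sample_cp1256.txt > sample_utf8.txt
```

If the source encoding is guessed wrong (for example WINDOWS-1252), iconv still succeeds but maps the same bytes to Latin letters, which would explain why the earlier commands produced unreadable output rather than an error.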


Edited on 2021-03-26
