I have a text file that uses various characters in the 128+ range in currently non-standard ways. The file command just says Non-ISO extended-ASCII.
From the context I can recognise these:
Octal 201: u + umlaut
204: a + umlaut
216: A + umlaut
224: o + umlaut
341: double s
(There are many others, which I suspect are graphical symbols, not characters.)
In addition, an example:
example: E0X A ANCIENT.IMG 2 0 C:\DOS\DISKOPT.EXE A: /O /Sa /M2
─┬─ ┬ ──┬──────── ┬ ─ ──────┬────────── ──────┬─────
│ │ │ │ │ │
load E0X ─┘ └─────────┐ │ │ │
│ │ │ │ │
with ANCIENT.IMG ┘ │ │ │ │
│ │ │ │
for drive A: ──────────┘ │ │ │
│ │ │
let DISKOPT work ──────────│──────────┴──────────────────┘
│
and write the result back to disk if finished.
(The graphical chars are octal 263, 277, 302, 304, 331.)
And here is the link to the file: e0x.arj. It is the E0X.ENG file, but I guess it is the same encoding in all the text files.
Which character set is this, and how can I make it readable on a modern computer?
Most probably the character positions you mention are octal numbers: 201 (which is customarily written as 0201 to make it clear it's octal) is decimal 129, or 0x81.
Those characters are consistent with several DOS codepages.
If it's German, I'd bet that it's 437 or 850. Any editor should be able to read that text file and write it in a different character set.
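You can verify the guess without any special tools, since Python ships a cp437 codec. A minimal sketch (the octal values are the ones listed in the question; the expected letters are my assumption of what "u + umlaut" etc. should decode to):

```python
# Decode the question's octal byte values under MS-DOS codepage 437.
samples = {0o201: "ü", 0o204: "ä", 0o216: "Ä", 0o224: "ö", 0o341: "ß"}

for byte, expected in samples.items():
    decoded = bytes([byte]).decode("cp437")
    print(f"octal {byte:o} -> {decoded}")
    assert decoded == expected
```

Every byte matches the letters recognised from context, which is strong evidence for codepage 437 (they also match 850, where these positions are identical).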
For example, you can read it with Notepad++ and save it in UTF-8 if you are sure you need that.
P.S.: After reading the file that you attached, I can see that the E0X.ENG charset is MS-DOS codepage 437. You can see it converted to UTF-8 at https://pastebin.com/LdnQCpk4.
If you run Linux, you can automate the conversion with GNU recode. If you run DOS, this recode utility https://docs.seneca.nl/Smartsite-Docs/Features-Modules/Features/Tools/Recode-commandline-utility.html should do the same.
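If you'd rather not install recode, the same conversion is a few lines of Python. A sketch, assuming the source file is E0X.ENG (here a tiny stand-in file with the question's bytes is fabricated so the script runs standalone; E0X.TXT is a name I made up for the output):

```python
# Convert a CP437 text file to UTF-8, as an alternative to GNU recode.
from pathlib import Path

# Stand-in for the real E0X.ENG: the five bytes listed in the question.
Path("E0X.ENG").write_bytes(bytes([0o201, 0o204, 0o216, 0o224, 0o341]))

text = Path("E0X.ENG").read_bytes().decode("cp437")   # raw bytes -> Unicode text
Path("E0X.TXT").write_text(text, encoding="utf-8")    # re-encode as UTF-8

print(text)  # üäÄöß
```

On the real file this also turns the box-drawing bytes (octal 263, 277, 302, 304, 331) into the proper Unicode line-drawing characters, so the diagrams survive the conversion.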